JOURNAL OF SEMANTICS: AN INTERNATIONAL JOURNAL FOR THE INTERDISCIPLINARY STUDY OF THE SEMANTICS OF NATURAL LANGUAGE
VOLUME 6,
1988
SWETS & ZEITLINGER B.V. LISSE- THE NETHERLANDS- 1991
Reprinted with permission of Foris Publications, Dordrecht, by
SWETS & ZEITLINGER B.V. LISSE- THE NETHERLANDS- 1991
JOURNAL OF SEMANTICS, VOLUME 6 (1988)
CONTENTS
Articles
MARK ARONSZAJN, Thought and circumstance
NICHOLAS ASHER and HAJIME WADA, A computational account of syntactic, semantic and discourse principles for anaphora resolution
MANFRED BIERWISCH, Tools and explanations of comparison, Part I and Part II
ÖSTEN DAHL, The role of deduction rules in semantics
SIMON C. GARROD and ANTHONY J. SANFORD, Discourse models as interfaces between language and the spatial world
KEES HENGEVELD, Illocution, mood and modality in a functional grammar of Spanish
JACK HOEKSEMA, The semantics of non-boolean "and"
ROLF MAYER, Motion imperatives
JOSEF PERNER and ALAN GARNHAM, Conditions for mutuality
JOHAN ROORYCK, Restrictions on dative cliticization in French causatives
PIETER SEUREN, Presupposition and negation
Book reviews
BART GEURTS, Hiyan Alshawi, Memory and context for language interpretation
MANFRED KRIFKA, Gerhard Heyer, Generische Kennzeichnungen
PIETER A.M. SEUREN, Collins COBUILD English Language Dictionary
Journal of Semantics 6: 1-18
THE ROLE OF DEDUCTION RULES IN SEMANTICS
ÖSTEN DAHL
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
ABSTRACT
The distinction between 'partial' and 'total' interpretations (models) is discussed and related to the distinction between proof-theoretical and model-theoretical treatments of logic. It is claimed that there is a parallel between the construction of a proof based on a set of premises and e.g. the production of a natural-language text which is based on information in some kind of data-base. The main part of the paper is devoted to a discussion of the relations between the deduction rules traditionally associated with the existential quantifier and notions pertaining to the theory of reference such as specificity and referentiality/attributivity. Two types of specificity are distinguished, which can be connected with 'Existential Elimination' and 'Existential Introduction', respectively. A distinction is further made between trivial and non-trivial 'Existential Introduction', where only the latter kind involves erasure of 'coreference links'. It is argued that an analogous treatment of the referential-attributive distinction is a way of making sense of Donnellan's suggestion that the latter may depend on the description's role in an argument. Finally, the notions of 'external anchoring' and 'stability of individual concepts' are related to the distinctions made earlier in the paper.
DEDUCTION RULES
The idea of using 'partial' rather than 'total' interpretations or models in logical semantics, which has been around for a rather long time (see e.g. Hintikka 1969), has become quite popular recently in the guise of 'situation semantics' (e.g. Barwise and Perry 1983), with situations, i.e. partial models, being seen as serious alternatives to 'possible worlds', i.e. total models. In a discussion of the relation between logic and computerized data-base systems, Reiter (1978) introduces a distinction between 'open world' and 'closed world' evaluation, which is basically equivalent to that between 'partial' and 'total' models. Sowa (1983) applies Reiter's distinction to knowledge representation. A connection hinted at by Sowa, which I want to develop here, is the close relation between the partial-total distinction and the traditional distinction in logic between proof theory and model theory.
Traditionally, proof theory is seen as the study of the ways in which theorems may be derived from a set of axioms, or, on a more liberal view, as the general study of what statements can be concluded to be true, given a set of assumptions or premises. The relation to partial models is easily seen
if we contemplate the fact that a set of assumptions is nothing but a partial assignment of truth-values to the set of sentences that can be asserted about some universe of discourse. Saying that something follows from a set of premises is thus completely analogous to saying that something holds in a partial model.
A proof in mathematics or logic is a sequence of formulas, or sentences, which obey certain well-defined constraints, relating to the concept of logical consequence. From the linguistic point of view, a proof is a special kind of text. Actually, one might claim that proof theory is the only successful formal theory of texts, since only in proof theory has it been possible to define 'texthood' rigorously for a text type. The possible links between proof theory and a theory of texts or discourses were pointed out almost two decades ago by the logician John Corcoran¹, but have not to my knowledge been taken up in a serious way by any linguist, probably because it has seemed that the properties of proofs that interest logicians are too far removed from the issues that are central to the linguistic theory of discourse. It would appear that the relation of logical consequence, which is central to the theory of proofs, has little relevance for everyday discourses. I want to claim that this view is mistaken.
Deductive inferences are by definition 'truth-preserving'. This means that in one sense of the word 'information', a statement derived by deduction does not contain any information that is not contained in the premises. In a proof, no sentence may appear that does not follow logically from the assumptions made at that point in the proof or is itself introduced explicitly as a new, temporary assumption. In other words, a proof contains no information (in the 'logical' sense) which is not contained in the axioms or assumptions, and can, if we like, be seen as a partial 'rephrasing' of the assumptions.
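The idea that a set of assumptions is a partial assignment of truth-values can be made concrete with a minimal Python sketch. The dictionary encoding, the function names and the sample sentences are assumptions of this illustration, not part of the paper; 'closed world' evaluation is rendered here simply as treating every unlisted sentence as false.

```python
from typing import Dict, Optional

# A partial model: only some sentences receive a truth value.
PartialModel = Dict[str, bool]


def evaluate_open(model: PartialModel, sentence: str) -> Optional[bool]:
    """Open-world evaluation: an unlisted sentence has no truth value (None)."""
    return model.get(sentence)


def evaluate_closed(model: PartialModel, sentence: str) -> bool:
    """Closed-world evaluation: an unlisted sentence counts as false."""
    return model.get(sentence, False)


# Hypothetical premises, read as a partial assignment of truth values.
premises: PartialModel = {
    "John lives in England": True,
    "Mary lives in England": False,
}

print(evaluate_open(premises, "John loves Mary"))    # undetermined
print(evaluate_closed(premises, "John loves Mary"))  # treated as false
```

Under the open-world reading the premises leave "John loves Mary" undecided, which is what makes the model partial; the closed-world reading turns the same premises into a total model.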
In this way, a proof is like a summary or an abstract of a book or an article, which may be seen as an alternative rendering of the information found in the latter. In fact, an abstract writer can be said to obey the same constraints as a writer of a proof: if he adds a thought of his own, that is, something that cannot be concluded from the original text, he breaks the rules of the game. Of course, this is not to deny that there are differences. The proof constructor's aim is to prove a theorem from a set of axioms, which can be assumed to be known to his readers, whereas the abstract writer's is to communicate the gist of the contents of some work to people who have not read it. In this sense, the abstract conveys new information in a way that the proof does not. However, this should not distract us from the important similarities between the two kinds of activities.
In fact, we can generalize the analogy further. An abstract differs from most other kinds of text in its essential relation to another text - the book or article that is being abstracted. Although it is true that the 'data-bases' people use when talking about the world are often expressed in the same language as
the one they are speaking, we would not like to restrict a theory of text generation to those cases. We need not do so, either, if we extend our concept of a 'partial model' from denoting a truth-value assignment to a set of sentences to denoting a similar assignment to any set of representations of reality, whatever form they are expressed in. If someone is given the assignment to write a description of how to get from Stockholm to Copenhagen, it matters little if he bases his description on a map or on the text in Michelin's Guide. Both may be seen as partial representations of reality from which one 'deduces' the information needed.
In discourse theory, it has been popular to view the processing of a text in terms of the construction of 'mental models' or 'discourse representations': when someone listens to or reads a description of some part of reality, he builds a representation of that part of reality in a stepwise fashion as the discourse goes along. The proof theory analogy suggests a different perspective where one looks at things rather from the point of view of the speaker or writer: how can a text be constructed on the basis of a representation of some part of reality? This is of course nothing but a rephrasing of the problem of what is called 'computer-based text generation', i.e. the construction of programs for automatic compilation of reports written in natural language on the basis of computer data-bases. Notice that since such reports normally provide the answers to a specific set of questions, text generators may be seen as special cases of question-answer systems, and there is an obvious link between such systems and proof theory in that what a question-answer system does when it gets a question is to try to test, i.e. prove, for each possible answer to the question, whether it follows from the information in the data-base or not.
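The link between question-answer systems and proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: facts are encoded as (subject, relation, object) triples, the 'prover' is deliberately trivial (a fact follows only if it is stored), and the names derivable and answer_who as well as the sample facts are hypothetical.

```python
from typing import List, Set, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object)


def derivable(fact: Fact, database: Set[Fact]) -> bool:
    """Deliberately trivial 'prover': a fact follows only if it is stored."""
    return fact in database


def answer_who(relation: str, obj: str, database: Set[Fact]) -> List[str]:
    """Answer 'Who <relation> <obj>?' by testing every candidate answer."""
    candidates = sorted({s for s, _, _ in database} | {o for _, _, o in database})
    return [c for c in candidates if derivable((c, relation, obj), database)]


# Hypothetical data-base used only for this sketch.
database: Set[Fact] = {
    ("John", "lives in", "England"),
    ("John", "loves", "Mary"),
    ("Mary", "lives in", "Scotland"),
}

print(answer_who("loves", "Mary", database))
```

A more realistic system would replace derivable with a genuine inference procedure; the outer loop over candidate answers is the part that corresponds to the proof-theoretic view described above.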
Since extensive work has been done in the area of text generation, I cannot be said to be entering virgin ground, but I want to pursue a line of thought that I have not seen discussed anywhere, viz. to explore to what extent the rules of deduction formulated by logicians in proof theory can serve as a basis for a formal semantics for natural language discourse, and to show that it is indeed possible to throw some light on some classical questions of natural language semantics in this way.
Deduction rules as formulated in modern treatments of proof theory are often based on principles due to Gentzen (1934-5), where the rules were given in pairs, in such a way that for every logical constant, there was one 'introduction rule' and one 'elimination rule'. For example, we may postulate a 'Conjunction Introduction Rule' saying that whenever the assumptions contain the propositions α and β, we are allowed to assert α ∧ β, and the converse 'Conjunction Elimination Rule', to the effect that given the proposition α ∧ β we may assert any of the conjuncts, i.e. either α or β.
The most complex and also most interesting from our point of view among the deduction rules usually postulated are those pertaining to the quantifiers of predicate logic. The ones that we shall discuss here are the 'Existential Introduction Rule' and the 'Existential Elimination Rule', that is, the rules that govern the use of the existential quantifier. The '∃ Introduction Rule' is the rule that allows us to prefix an existential quantifier to a formula, simultaneously replacing all occurrences of a specified free variable with a variable bound by the quantifier. This is the simpler rule of the two, and the most interesting one from our point of view, but to obtain some background, we shall first look at the second one, the '∃ Elimination Rule'.
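The paired introduction and elimination rules can be illustrated with a minimal Python sketch. Formulas are encoded as nested tuples, which is an assumption of this example and not a notation used in the paper; the function names are likewise hypothetical. For simplicity, exists_intro generalizes over an individual term rather than a free variable.

```python
def and_intro(a, b, assumptions):
    """Conjunction Introduction: from a and b among the assumptions, assert (a and b)."""
    if a in assumptions and b in assumptions:
        return ("and", a, b)
    raise ValueError("both conjuncts must be among the assumptions")


def and_elim(conjunction):
    """Conjunction Elimination: from (a and b), either conjunct may be asserted."""
    if not (isinstance(conjunction, tuple) and len(conjunction) == 3
            and conjunction[0] == "and"):
        raise ValueError("not a conjunction")
    return conjunction[1], conjunction[2]


def exists_intro(formula, term, variable):
    """Existential Introduction: replace every occurrence of `term` by
    `variable` and prefix an existential quantifier binding it."""
    def substitute(f):
        if f == term:
            return variable
        if isinstance(f, tuple):
            return tuple(substitute(part) for part in f)
        return f
    return ("exists", variable, substitute(formula))


conjunction = and_intro("p", "q", {"p", "q"})
print(conjunction)
print(and_elim(conjunction))
print(exists_intro(("loves", "john", "mary"), "john", "x"))
```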
EXISTENTIAL ELIMINATION AND (T-)SPECIFICITY
'∃ Elimination' is the process by which you substitute an unbound individual term for the occurrences of a bound variable in an existentially quantified expression, simultaneously deleting the quantifier. As this process is usually described, it cannot be performed just like that: the choice of an arbitrary individual term would not guarantee the truth of the resulting formula. Therefore, '∃ Elimination' is only allowed to be used as a temporary step in the proof of a formula which does not contain the individual term. Furthermore, the individual term used must not have occurred earlier in the proof. These conditions are intended to ensure that the choice of the individual term does not influence the validity of the argument. It seems that the conditions could be relaxed in a system where the extensions of individual terms are not determined in advance: in such a system, there could be an '∃ Elimination Rule' allowing one to create new individual terms and substitute them for existentially bound variables. The domain of such a rule would be exactly those variables that correspond to 'specific' indefinite noun phrases in the sense of 'specific' which corresponds to Geach's concept of 'namely-riders', i.e. those noun phrases which can be amplified by a tag of the form '... namely X'. Consider e.g. the classical specificity ambiguity in (1), the two readings of which can be formalized as (2a-b).
(1) Mary wants to marry someone.
(2) (a) (∃x)(Want(Mary, Marry(Mary, x)))
    (b) Want(Mary, (∃x)(Marry(Mary, x)))
The (a) reading admits a 'namely'-rider: we can continue '... namely John'. In the (b) reading, this addition does not make sense. Correspondingly, '∃ Elimination' is applicable only to (2a), since it can only be applied to existentially quantified formulas where the quantifier has scope over the whole remaining formula.
What I have said here about the relations between the deduction rule '∃ Elimination' and specificity in natural language is hardly original or controversial. It is of course trivial that two truth-conditionally different readings of a sentence differ in their logical consequences and therefore in the roles they play in a logical argument. What I shall claim now is that there are ambiguities which are not truth-conditional and which proof theory may throw light on.
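The scope condition on '∃ Elimination' discussed in this section can be checked mechanically. The following Python sketch assumes a nested-tuple encoding of (2a) and (2b) and a hypothetical naming scheme c1, c2, ... for newly created individual terms; it implements the relaxed rule that creates a fresh term, not the textbook rule with its discharge conditions.

```python
from itertools import count


def substitute(formula, old, new):
    """Replace every occurrence of `old` in a nested-tuple formula by `new`."""
    if formula == old:
        return new
    if isinstance(formula, tuple):
        return tuple(substitute(part, old, new) for part in formula)
    return formula


def exists_elim(formula, used_terms):
    """Strip an outermost existential quantifier and substitute a term
    that has not occurred earlier (i.e. is not in `used_terms`)."""
    if not (isinstance(formula, tuple) and len(formula) == 3
            and formula[0] == "exists"):
        raise ValueError("the quantifier must have scope over the whole formula")
    _, variable, body = formula
    fresh = next(f"c{i}" for i in count(1) if f"c{i}" not in used_terms)
    return fresh, substitute(body, variable, fresh)


# (2a): the quantifier has widest scope; (2b): it is embedded under Want.
reading_a = ("exists", "x", ("want", "mary", ("marry", "mary", "x")))
reading_b = ("want", "mary", ("exists", "x", ("marry", "mary", "x")))

print(exists_elim(reading_a, {"mary"}))
try:
    exists_elim(reading_b, {"mary"})
except ValueError as error:
    print("not applicable to (2b):", error)
```

The fresh term plays the role of the X in a '... namely X' continuation, which is available for (2a) only.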
EXISTENTIAL INTRODUCTION AND (P-)SPECIFICITY
The term 'specificity' is often used not only to refer to the property that distinguishes the (a) reading from the (b) reading of (1) but also for distinguishing different kinds of uses of indefinite noun phrases in cases where there is no apparent scope ambiguity involved. Thus, (3) could be uttered in the contexts (4a) and (4b), where (4a) would be said to illustrate a 'specific' use.
(3) Some child has not received his ice-cream cone.
(4) (a) At a party, the host is distributing ice-cream cones to a group of children. He thinks he has given them to everyone, but his wife sees one unhappy face among the children and utters (3).
    (b) At another party, the host knows that he had bought exactly one cone for each child. He thinks he has given everyone a cone but sees that there is still one left on the plate, and so utters (3).
The difference between the two 'readings' of (3) is usually said to depend on 'whether the speaker has a specific individual in mind' or 'whether the speaker is referring to a specific individual'. It is thus very similar to Donnellan's (1966) distinction between referential and attributive readings of definite noun phrases, and could perhaps be claimed to be simply its counterpart for indefinite NPs. In any case, much of what has been said about the referential-attributive distinction carries over directly to the specificity distinction in (3). Like the referential-attributive distinction and unlike the ambiguity in (1), the 'specificity' exemplified in (3) does not obviously correspond to a difference in truth-conditions, although arguments to that effect have been made, e.g. by Kasher and Gabbay (1976), who - unconvincingly - claim that the (a) reading is true only if the speaker can correctly identify the referent of someone.² Donnellan says in his classical paper that the referential-attributive distinction reflects a pragmatic ambiguity. Accepting this for the time being, let us refer to the two kinds of specificity as 'T-specificity' (for 'truth-conditional') and 'P-specificity' (for 'pragmatic'), respectively.
An attempt to formalize P-specificity in a pragmatic framework is made in Groenendijk and Stokhof (1981). G&S build their proposal on a notion of an 'epistemic model', in which one defines, for each predicate and each language user, the set of possible denotations of that predicate relative to that language user's beliefs about the world. This involves statements of the following form (where 'has the information' should be read as 'is of the opinion' rather than 'knows that'):
'x has the information that the predicate o is true of the individuals a, b and c'
(3), according to the view of G&S, would be used P-specifically if there is exactly one individual in the speaker's epistemic model which is both a member of the set of possible denotations of child and of that of has not received an ice-cream cone. G&S say that their notion of 'specific reference' is 'objective in this sense that if two language users have the same information about the denotation of the expressions involved and use the same sentence, it can never happen that one of them refers P-specifically to an object without the other also referring specifically to the same object', whereas the traditional notion of 'having something in mind' is 'purely subjective, and therefore completely uninteresting from the point of view of conversational analysis'.
There are a number of rather difficult problems with this kind of treatment, among others those connected with the individuation of objects of belief. The problem I want to discuss at this point is the question of whether we really want the notion of P-specificity to be 'objective' in G&S's sense. Notice to begin with that the objectivity involved is a bit dubious in practice, since it is entirely dependent on the speaker's beliefs, which are accessible only to himself. One obvious consequence of G&S's view is that e.g. the speaker in (4a) just cannot use (3) non-P-specifically, even if she wants to, since she knows who she is referring to. However, I think that there are fairly clear cases when a speaker may use an indefinite noun phrase non-P-specifically even if he knows the identity of its referent. Suppose e.g. that the husband in (4a) does not accept his wife's statement, claiming that the child she is talking of has already eaten his cone. She might then utter (5).
(5) You know that you bought seven cones, and there are seven children. But there is still one cone on the table. So some child did not receive his cone.
I think that it is reasonable to say that this is a non-P-specific use of an indefinite noun phrase, by a person who knows the identity of its referent. If we admit that this is possible, we must look for a treatment of P-specificity that is at least in principle independent of the speaker's beliefs.
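The belief-based criterion attributed to Groenendijk and Stokhof above can be sketched as a set computation. This is a strong simplification made for illustration: each language user's information about a predicate is reduced to a single set of candidate individuals, and the function name, the dictionary layout and the children's names other than Bill are hypothetical.

```python
from typing import Dict, List, Set

# For each language user and predicate: the individuals that, according to
# that user's information, may fall under the predicate.
EpistemicModel = Dict[str, Dict[str, Set[str]]]


def p_specific_by_beliefs(model: EpistemicModel, speaker: str,
                          predicates: List[str]) -> bool:
    """True iff exactly one individual is a candidate for every predicate."""
    candidate_sets = [model[speaker][p] for p in predicates]
    return len(set.intersection(*candidate_sets)) == 1


children = {"Bill", "Ann", "Tom"}  # Ann and Tom are invented for the sketch
model: EpistemicModel = {
    # The wife in (4a) has seen which child is without a cone.
    "wife": {"child": children, "has not received a cone": {"Bill"}},
    # The host in (4b) cannot exclude any child.
    "host": {"child": children, "has not received a cone": set(children)},
}

predicates = ["child", "has not received a cone"]
print(p_specific_by_beliefs(model, "wife", predicates))
print(p_specific_by_beliefs(model, "host", predicates))
```

On this criterion the wife's use of (3) is P-specific whatever kind of discourse she is engaged in, which is exactly the consequence that example (5) calls into question.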
What I want to claim now is that there is a direct relation between P-specificity and the deductive rule of '∃ Introduction'. Let us ask the following question: On the basis of what kind of information do the speakers in the two different situations (a) and (b) make the statement (3)? Possible answers are found in (6a-b).
(6) (a) My husband is distributing cones to the children. I see that Bill looks unhappy and has no cone in his hand. I conclude that Bill has not received his ice-cream cone.
    (b) I bought seven cones. There are seven children. There is still one cone left.
Notice that (6a) - but not (6b) - contains a statement which is identical to (3) except for having a proper name instead of an indefinite noun phrase in the subject position. That means that one may go from (6a) to (3) by an operation which could be said to be the natural-language counterpart to '∃ Introduction'. In (6b), on the other hand, no such step is possible. The general idea could then be formulated as follows: a person using an indefinite noun phrase P-specifically is making a statement on the basis of a data-base from which that statement is derivable using a logical operation equivalent to '∃ Introduction'.
Before elaborating the details of this idea, let us first see how it differs from the proposal of G&S, and in particular, how it can be used to explain the fact that a speaker can use a noun phrase non-P-specifically even if he knows the identity of the referent.
What is crucial is the interpretation of the expression 'to make a statement on the basis of a data-base'. The point is that the data-base involved here need not be identical with the speaker's beliefs. The character of the data-base depends on the nature of the 'language-game' that the speaker engages in. Two main types of language-game, and correspondingly, two main types of discourse and text, can be labelled 'descriptive' and 'argumentative'. In a (purely) descriptive discourse, the speaker describes an object or a situation about which he has information. In a (purely) argumentative discourse, the speaker gives arguments to support or prove some thesis. The language-games in which such discourses occur typically obey different constraints as to what can be asserted. In the most extreme form of an argumentative discourse, a mathematical proof, all assumptions are supposed to be known to and accepted by all participants in the language-game. Nothing can be asserted that does not follow directly from these assumptions. In such a case, the data-base on the basis of which assertions are made is clearly not what the speaker believes: it is the assumptions commonly agreed upon. Even if everyday arguments are governed by less rigid rules, it is still the case that one expects claims to be made on the basis of information that is accepted by everyone. Therefore, it makes sense also in (5) to say that the statement that someone has not received his ice-cream cone is not made on the basis of the speaker's beliefs but on the facts that are commonly known to the speaker and the hearer. In other words, cases of indefinite noun phrases being used non-P-specifically by speakers who know the reference of the noun phrase are to be expected when the noun phrases occur as parts of statements that are presented as conclusions in an argument.
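The first formulation of the criterion, with the data-base kept distinct from the speaker's beliefs, can be sketched as follows. The encoding of (6a) and (6b) as (subject, predicate) pairs and the names witness_for and is_p_specific are assumptions made for this illustration.

```python
from typing import Optional, Set, Tuple

Fact = Tuple[str, str]  # (subject, predicate)


def witness_for(predicate: str, database: Set[Fact]) -> Optional[str]:
    """Return a named individual from which 'Someone <predicate>' would
    follow by existential introduction, or None if there is none."""
    witnesses = sorted(s for s, p in database if p == predicate)
    return witnesses[0] if witnesses else None


def is_p_specific(predicate: str, database: Set[Fact]) -> bool:
    """P-specific relative to `database` iff the statement is derivable
    from it by existential introduction."""
    return witness_for(predicate, database) is not None


TARGET = "has not received his ice-cream cone"

# (6a): what the wife believes in the descriptive situation (4a).
database_6a: Set[Fact] = {
    ("my husband", "is distributing cones to the children"),
    ("Bill", "looks unhappy"),
    ("Bill", "has no cone in his hand"),
    ("Bill", TARGET),
}

# (6b): the commonly accepted facts, also appealed to in the argument (5).
database_6b: Set[Fact] = {
    ("I", "bought seven cones"),
    ("the children", "are seven"),
    ("one cone", "is still left"),
}

print(is_p_specific(TARGET, database_6a))
print(is_p_specific(TARGET, database_6b))
```

The same speaker can thus come out as P-specific or not depending on which data-base the language-game makes relevant: her own beliefs in a descriptive discourse, the shared assumptions in an argumentative one.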
P-SPECIFICITY AND COREFERENCE LINKS
What we have seen so far is how the two concepts of specificity - T-specificity and P-specificity - may each be linked up with one of the two deduction rules associated with the existential quantifier. There is a problem, however: it turns out that there is a trivial way of fulfilling the condition on P-specificity, under which it will collapse into T-specificity. To any T-specific individual term, we may apply '∃ Elimination' or Skolemization, which will give us a sentence containing an individual term. To this sentence, we can then apply '∃ Introduction', taking us back to the original sentence. It thus appears that any T-specific individual term will fulfill the conditions for P-specificity trivially. Is there any way of saving the proposal? It appears that what should be done is to formulate some further condition on the individual terms to which '∃ Introduction' can apply.
Sowa (1983) introduces a system for knowledge representation called 'conceptual graphs', which builds on the 'existential graphs' used by C.S. Peirce for logical representations. In Sowa's system, individuals are represented by nodes, which may be connected by 'coreference links' indicating which nodes stand for the same individual. Existential quantification is implicit: any individual node is by default assumed to be existentially quantified. There is a set of inference rules, by means of which graphs may be derived from each other, and two of which are of special interest here: the ones that involve erasure and insertion of coreference links. About these, Sowa says (1983: 155): 'Peirce's rules for drawing and erasing coreference links replace the standard rules of universal instantiation and existential generalization.' He does not give any concrete demonstration of the equivalence of the Peircean rules and the standard rules of inference, however, and we shall soon see that there is indeed a crucial difference between them, with direct bearing on our topic.
To make the discussion more concrete, we shall introduce 'data-bases' of a simple kind, which will be represented by sets of sentences, containing basically proper names and verbs (sometimes with prepositions added). Two
examples of such data-bases are given in (7) and (8). For obvious reasons, we shall refer to them as 'first-order data-bases'. The assumption is that we do not have any independent information about the individuals mentioned in the texts.
(7) John lives in England. John loves Mary. Mary lives in Scotland.
(8) John loves Mary. Mary lives in Scotland.
Let us now define 'destructive ∃ Introduction' as an operation which generates a copy of a first-order data-base with the only difference that the indefinite pronoun someone has been substituted for one occurrence of a proper noun in the original data-base. Letting destructive ∃ Introduction apply to John in John loves Mary in (7) and (8) respectively, we obtain (7') and (8').
(7') John lives in England. Someone loves Mary. Mary lives in Scotland.
(8') Someone loves Mary. Mary lives in Scotland.
Using NN as a 'dummy' proper name, we can then use '∃ Elimination' to get to (7") and (8").
(7") John lives in England. NN loves Mary. Mary lives in Scotland.
(8") NN loves Mary. Mary lives in Scotland.
We can now see that although we have seemingly performed identical operations on (7) and (8), there is an important difference in the effects. (8) and (8") can be said to be isomorphic - in a way, they describe the same situation or, equivalently, contain the same information, given the assumption that the names are just arbitrary labels. Going from (7) to (7"), on the other hand, there is a clear loss of information: we no longer know that the person who loves Mary lives in England. We may accordingly distinguish two kinds of applications of (destructive) '∃ Introduction': trivial ∃ Introduction, as in (8), which does not entail loss of information, and non-trivial ∃ Introduction, as in (7), which does, and sharpen the condition on P-specificity by saying that it must involve non-trivial ∃ Introduction.
Exactly what information is contained in (7) that makes ∃ Introduction destructive there but not in (8)? We may say that it is the information that the referents of the subjects of the two first sentences in (7) are identical. This information is not expressed as a separate proposition but rather follows from the fact that the same proper name (John) is used twice. In a system like that of Sowa, this kind of information could be expressed by a 'coreference link', which would then get lost in going from (7) to (7"), quite in agreement with Sowa's statement above about the correspondence between ∃ Introduction and erasure of coreference links. But notice that in (8) there is no coreference link to erase - in other words, ∃ Introduction is equivalent to coreference link erasure only in the non-trivial cases.
In order to illustrate this in a more concrete way we shall introduce a notation which is equivalent to the graph representations Sowa uses, but which retains a clausal form (in fact, quite similar to the 'Discourse Representation Structures' of Kamp 1981). As before, we shall use data-bases which are essentially sets of sentences, the main difference being that instead of proper names we use subscripted pronouns. Coreference relations between the pronouns are not indicated primarily by using identical subscripts but by special statements of identity, corresponding to Sowa's coreference links. (7) and (8) will thus correspond to (9) and (10), respectively:
(9) he1 lives in England. he2 loves her3. she4 lives in Scotland. he1 = he2. she3 = she4.
(10) he1 loves her2. she3 lives in Scotland. she2 = she3.
We can now see that in order to represent (7") in this format, we have to delete the statement 'he1 = he2' from (9), yielding (9'):
(9') he1 lives in England. he2 loves her3. she4 lives in Scotland. she3 = she4.
Both (8) and (8"), on the other hand, will be represented as (10) - this is then what we called the trivial case of ∃ Introduction.
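The difference between trivial and non-trivial applications of destructive ∃ Introduction can be tried out on the data-bases (7) and (8). The Python sketch below assumes that a first-order data-base is a list of sentences and that information is lost exactly when the replaced name occurs elsewhere in the data-base; the function names are hypothetical.

```python
from typing import List


def words(sentence: str) -> List[str]:
    """Split a simple sentence into words, ignoring the final full stop."""
    return sentence.replace(".", "").split()


def destructive_exists_intro(database: List[str], name: str,
                             index: int) -> List[str]:
    """Copy the data-base, replacing one occurrence of `name` (in the
    sentence at position `index`) by the indefinite pronoun 'Someone'."""
    if name not in words(database[index]):
        raise ValueError(f"{name!r} does not occur in sentence {index}")
    copy = list(database)
    copy[index] = copy[index].replace(name, "Someone", 1)
    return copy


def is_trivial(database: List[str], name: str, index: int) -> bool:
    """Trivial iff `name` occurs in no other sentence, so that no
    coreference information is lost by the replacement."""
    return all(name not in words(sentence)
               for i, sentence in enumerate(database) if i != index)


db7 = ["John lives in England.", "John loves Mary.", "Mary lives in Scotland."]
db8 = ["John loves Mary.", "Mary lives in Scotland."]

print(destructive_exists_intro(db7, "John", 1))
print(is_trivial(db7, "John", 1))
print(is_trivial(db8, "John", 0))
```

The second occurrence of John in (7) is what the clausal notation makes explicit as the identity statement he1 = he2; is_trivial detects its presence without representing it separately.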
THE REFERENTIAL-ATTRIBUTIVE DISTINCTION
We shall now turn to a discussion of the distinction between referential and attributive uses of definite descriptions as introduced by Donnellan (1966):
'A speaker who uses a definite description attributively in an assertion states something about whoever or whatever is the so-and-so. A speaker who uses a definite description referentially in an assertion, on the other hand, uses the description to enable the audience to pick out whom or what he is talking about and states something about that person or thing. In the first case the definite description might be said to occur essentially, for the speaker wishes to assert something about whatever or whoever fits that description; but in the referential use the definite description is merely one tool for doing a certain job - calling attention to a person or thing - and in general any other device for doing the same job, another description or a name, would do as well.' (1966: 285)
Donnellan rejects the possibility that the distinction is a function of the speaker's beliefs, using an example that is somewhat similar to the one we used above:
'To use the Smith murder case again, suppose that Jones is on trial for the murder and that I and everyone else believe him guilty. Suppose that I comment that the murderer of Smith is insane, but instead of backing this up, as in the example previously used, by citing Jones' behavior in the dock, I go on to outline reasons for thinking that anyone who murdered poor Smith in that particularly horrible way must be insane. If now it turns out that Jones was not the murderer after all, but someone else was, I think I can claim to have been right if the true murderer is after all insane. Here, I think, I would be using the definite description attributively, even though I believe that a particular person fits the description.'
Thus, according to Donnellan's analysis, the distinction - at least in argumentative texts - does not so much depend on what is in the speaker's mind as on the role of the description in the argument.⁴ We shall illustrate this by spelling out the 'referential' and the 'attributive' arguments explicitly. (Of course, these arguments are not logically valid but should rather be seen as fairly typical examples of everyday reasoning, in which, however, rules of deduction may play an important role.)
(11) The 'referential' argument:
(a) Someone is sitting in the dock
(b) Someone murdered Smith
(c) The man in the dock shows symptoms of insanity
(d) The man in the dock is insane
(e) The man in the dock is Smith's murderer
(f) Smith's murderer is insane
(12) The 'attributive' argument:
(a) Someone murdered Smith
(b) Smith was murdered in a cruel way
[(c) The man in the dock is Smith's murderer]
(d) Smith's murderer is insane
The difference here seems to be that the referential argument crucially involves a statement which provides an independent way of identifying Smith's murderer. If this statement should turn out to be false, the argument will collapse. In the attributive case, on the other hand, the identification statement can well be deleted without affecting the validity of the argument in any way, as I have indicated by putting (12c) in square brackets.
How is this related to what we have said above about the relation between specificity and the use of deduction rules? There turns out to be a very direct relation, but to see it clearly we have to rewrite the essential premises in the above arguments in a way that does not involve definite descriptions.
(11')
(a) he1 is sitting in the dock
(b) he2 murdered him3
(c) he3 = Smith
(d) he4 shows symptoms of insanity
(e) he4 = he1
(f) he5 is insane
(g) he5 = he1
(h) he2 = he1
(12')
(a) he1 murdered him2
(b) he2 = Smith
(c) he3 was murdered in a cruel way
(d) he3 = he2
We can see that according to the definition of P-specificity given above, someone in the statement Someone murdered Smith would be P-specific relative to (11) but not to (12), since the application of ∃ Introduction would be non-trivial only in (11), due to the presence of the coreference link (11'h). We might therefore suggest the following condition on the referential use of definite descriptions:
13 ( 1 3)
A definite description is used referentially if and only i f the existen tial statement it presupposes is derivable by non-trivial 3 I ntro duction .
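The contrast between (11') and (12') can be made concrete with a small sketch. It assumes, as one reading of the text, that the relevant part of a data-base is its set of coreference links, and that ∃ Introduction on a discourse referent counts as non-trivial exactly when it erases at least one link involving that referent. The names `link`, `REFERENTIAL_LINKS`, `ATTRIBUTIVE_LINKS`, `erased_links` and `is_nontrivial_intro` are illustrative only and are not part of the notation used in the paper.

```python
def link(a, b):
    """A coreference link is an unordered pair of discourse referents."""
    return frozenset({a, b})

# Coreference links read off (11'), the 'referential' argument (assumed reading).
REFERENTIAL_LINKS = {
    link("he3", "Smith"),  # (c)
    link("he4", "he1"),    # (e)
    link("he5", "he1"),    # (g)
    link("he2", "he1"),    # (h): the murderer is the man in the dock
}

# Coreference links read off (12'), the 'attributive' argument (assumed reading).
ATTRIBUTIVE_LINKS = {
    link("he2", "Smith"),  # (b)
    link("he3", "he2"),    # (d)
}

def erased_links(links, referent):
    """Links that are lost when `referent` is replaced by a bound variable."""
    return {l for l in links if referent in l}

def is_nontrivial_intro(links, referent):
    """Hypothetical test: non-trivial iff at least one link is erased."""
    return bool(erased_links(links, referent))

# 'Someone murdered Smith' generalizes over the murderer referent:
# he2 in (11'), he1 in (12').
referential_case = is_nontrivial_intro(REFERENTIAL_LINKS, "he2")
attributive_case = is_nontrivial_intro(ATTRIBUTIVE_LINKS, "he1")
```

On this reading, condition (13) classifies Smith's murderer as referential in the first argument and attributive in the second, because only the first data-base contains a link that identifies the murderer independently.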
EXTERNAL ANCHORING AND STABILITY OF INDIVIDUAL CONCEPTS
To what has been said in this paper so far the following objection may be made: It is all very well to say that a person who uses an indefinite NP P-specifically bases that use on a data-base from which the statement he makes is derivable by something like '∃ Introduction'. But how are we to apply this criterion in practice? How do we know whether the knowledge of the speaker is such that we can apply a logical operation 'equivalent to ∃ Introduction'? More specifically: it is one thing if we have a set of sentences which contain some individual constant - it is then uncontroversial that we can make these sentences into existential claims by '∃ Introduction'. But isn't the problem really what the conditions are for using individual constants or proper names?
Without really claiming to be able to answer those questions, I would like to discuss the relation of the notion of P-specificity to some other concepts, viz. that of external referential anchoring and stability in individual concepts. Consider a maximally pure case of a fictional text, say the fairy-tale about Cinderella. Such a text clearly defines a 'data-base' or 'relational structure' involving a set of individuals with certain properties and standing in certain relations to each other. But the elements in this structure have no relation - as far as we know - to anything in the real world, or to any other structure outside of it. In the terminology of Chastain (1975), the text is 'referentially segregated', or in the terms of other people, it has no 'referential anchoring'. Still, of course, there are internal coreference links in the story, but none that go outside of it. It sometimes happens that we read a story, believing it to be fictional, and later on find out that it was in fact about real people and real events.
What this illustrates is that having referential anchoring is something that is in principle outside the data-base as such - it is something that may or may not be added to it. We shall illustrate how this might work, using our recently introduced notation for data-bases. Suppose that someone tells me the 'story' in (14).

(14)
An old man visited a small girl yesterday. He brought her a present.

The information in (14) could be represented as (14').
(14')
he1 is an old man. she2 is a small girl.
he3 visited her4 yesterday. he3 = he1. she4 = she2.
he5 brought her6 a present. he5 = he1. she6 = she2.

Now, I happen to know about a person who lives in Stockholm and is a teacher. This knowledge could be represented as (15).

(15)
he6 lives in Stockholm. he7 is a teacher. he6 = he7.

Suppose now that I find out that the story I heard really concerns the teacher in Stockholm. Notice that this knowledge is separate, in principle, from both (14') and (15): my total knowledge would have to be represented as in (16).

(16)
he1 is an old man. she2 is a small girl.
he3 visited her4 yesterday. he3 = he1. she4 = she2.
he5 brought her6 a present. he5 = he1. she6 = she2.
he6 lives in Stockholm. he7 is a teacher. he6 = he7.
he1 = he6.

In this structure, the story and my previous knowledge show up as sub-structures. The point is that the external anchoring of the discourse referents in the story shows up as an ordinary internal coreference link in the total structure. This fact makes it possible to link up the notion of external referential anchoring with the notion of P-specificity. Let us again consider (9) with a slight modification: we have separated out the part of it that underlies the sentence Someone loves Mary:

(9'')
he1 lives in England.
[ he2 loves she3. ]
she4 lives in Scotland.
he1 = he2. she4 = she3.
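The idea that external anchoring is nothing more than a coreference link crossing the boundary of a sub-data-base can be sketched as follows. The sketch assumes that a sub-data-base is identified by the set of discourse referents it contains and that only direct links are inspected; the function name `anchored_referents` and the variable names are hypothetical. The example data follow (9''), where the enclosed part contains he2 and she3.

```python
def anchored_referents(sub_referents, links):
    """Return the referents of a sub-data-base that have a coreference
    link to some referent outside that sub-data-base."""
    anchored = set()
    for a, b in links:
        if a in sub_referents and b not in sub_referents:
            anchored.add(a)
        if b in sub_referents and a not in sub_referents:
            anchored.add(b)
    return anchored

# (9''): the enclosed part is 'he2 loves she3'; the coreference
# statements at the bottom of the list connect it to the rest.
ENCLOSED = {"he2", "she3"}
LINKS_9 = [("he1", "he2"), ("she4", "she3")]

with_links = anchored_referents(ENCLOSED, LINKS_9)

# A 'referentially segregated' data-base, such as a pure fairy-tale,
# has no link that leaves it.
segregated = anchored_referents(ENCLOSED, [])
```

Adding or removing a link changes the result without touching the statements inside the sub-data-base, which mirrors the claim that anchoring is something added to a data-base from outside.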
We can see that if we look at the enclosed part of (9'') as a data-base of its own, its referents have 'external anchoring' due to the coreference statements at the bottom of the list. The presence of these coreference statements was also precisely the basis for saying that someone in the sentence Someone loves Mary was P-specific when based on this data-base. In other words, 'having external referential anchoring' and 'being P-specific' are closely related notions.
Let us now return to the question of the conditions for using constants and/or proper names. The rule of ∃ Elimination, by which a 'new' constant is introduced to refer to an individual, resembles the way in which 'arbitrary' proper names may sometimes be introduced in everyday language, e.g. in discourses like the following:

(17)
Suppose a man, let us call him John Smith, meets a woman, let us call her Susan Brown, and that John Smith falls in love with Susan Brown ...

(17), of course, might grow into quite a long story. What distinguishes this use of proper names from other, more ordinary uses? One thing is immediately clear: regarding (17) and its continuation as a data-base, it will be 'referentially segregated': there is no link from John Smith and Susan Brown to anything outside it. (The same will be true, of course, of proper names used of fictional characters in novels.) This contrasts with the 'normal' use of proper names: a person mentioned in a news item, for instance, will be identifiable also outside that news item. Again, we see that the identifiability of individuals outside the data-base is crucial: this condition singles out P-specific indefinite noun phrases, but it also distinguishes arbitrary and non-arbitrary uses of proper names.
Let us now look at proper names from a slightly different angle, and recall the old discussion about the question whether a proper name has a sense in addition to a reference. Some people have claimed that proper names are abbreviated descriptions. The well-known objection to this theory is that in
most cases, we know so much about an individual that there is a very large set of possible descriptions that each would serve as well as an identification of the individual in question, and thus the choice of one of them as the 'meaning' of a proper name that is used of that individual appears arbitrary. In his discussion of this problem, Kripke (1972) mentions that there are cases when the reference of a name is determined by a description, e.g. when the name Neptune was introduced to refer to a hypothesized planet 'which caused such and such discrepancies in the orbits of certain other planets'. I want to look at somewhat similar cases, when a name is given to a (sometimes hypothetical) historical person for whom there is basically only one possible identifying description. A good example is the Indian linguist P

CONCLUSION

In this paper, I have argued for the relevance of deduction rules in formal semantics of natural language. More specifically, recall that many semanticists have equated the meaning of a sentence with the set of propositions that are logical consequences of that proposition. Even if one does not make that identification, it is clear that whenever two sentences have the same truth-conditions, they have the same logical consequences, which also means that if we apply the same logical deduction rules to two sentences with the same truth-conditions, we should get the same results. The upshot of this is that the study of the truth-conditions of linguistic expressions is equivalent to the study of their role as premises in logical arguments. In this paper, I have tried to show the relevance of proof theory for semantics by showing that there are distinctions commonly made in semantics that can
be better understood if we look at sentences as the conclusions of logical arguments, i.e. as being derived via deductive rules from sets of propositions (data-bases). Thus, I have argued that the concept of specificity for indefinite noun phrases may be understood in two ways, one truth-theoretical and the other proof-theoretical, the referential/attributive distinction being the counterpart to the latter for definite noun phrases. The continuation of this argument would be that deduction could be seen as a general model for the relation between a discourse and the knowledge base of its producer. There are also other aspects of proof theory, more specifically concerning the hierarchical structure of proofs, that I think are relevant to discourse theory, but I have neither the time nor the space to go into these questions here.

ACKNOWLEDGEMENT

I am indebted to Kari Fraurud for valuable comments on a draft of this paper.

NOTES

1. Class lectures, summer 1971.

2. A similar claim is in fact made by Donnellan in his 1978 paper. Talking about an (invented) quotation from the Watergate journalists Woodward and Bernstein where the first sentence goes
We now had a telephone call from a man high in the inner circle.
Donnellan says that if the man Woodward and Bernstein 'had in mind' did not call them but wrote them a letter, the sentence would express a falsehood 'and would not be saved if SOME man high in the inner circles did in fact call them at the time in question'. I do not find this convincing.

3. Such a generalized operation of ∃ Elimination could in fact be regarded as a special case of 'Skolemization', the process by which existential quantifiers are eliminated by replacing them with function symbols (not earlier used in the system). In the cases where the existential quantifier is not within the scope of any universal quantifiers, the function symbol will be zero-place, and thus equivalent to an individual constant.

4. Klein (1981) proposes an alternative analysis of the referential/attributive distinction, where the referential use is supposed to be the default case and the attributive interpretation arises only in those contexts where the implicature that the speaker has evidence for the presumption that the description is instantiated (generated by the maxim 'Do not say that for which you have no adequate evidence') is cancelled (or the maxim is blatantly violated). On the basis of this definition, Klein suggests that the description Smith's murderer in the case discussed in the main text is in fact used referentially rather than attributively, 'since there is no reason for anyone in that context to doubt that the speaker had adequate evidence for believing that the murderer of Smith exists'. To me, it seems clear that Klein's distinction is rather different from Donnellan's, although Donnellan's formulations are a bit vague in places and it might be argued that Klein's interpretation of the referential-attributive distinction is compatible at least in spirit with some of them.
REFERENCES

Barwise, J. and J. Perry, 1983: Situations and Attitudes. MIT Press. Cambridge, Mass.
Chastain, C., 1975: Reference and Context. In Gunderson (ed.) pp. 194-269.
Cole, P. (ed.), 1978: Pragmatics. (= Syntax and Semantics, Vol. 9.) Academic Press. New York.
Davidson, D. and G. Harman (eds.), 1972: Semantics of Natural Language. Reidel. Dordrecht.
Donnellan, K., 1966: Reference and definite descriptions. Philosophical Review 75: 281-304.
Donnellan, K., 1978: Speaker References, Descriptions and Anaphora. In Cole (ed.) pp. 47-68.
Gallaire, H. and J. Minker (eds.), 1978: Logic and Data-Bases. Plenum Press. New York.
Geach, P., 1962: Reference and Generality. Cornell Univ. Press. Ithaca.
Gentzen, G., 1934-35: Untersuchungen über das logische Schliessen. Mathematische Zeitschrift 39: 176-210, 405-431.
Groenendijk, J. and M. Stokhof, 1981: A Pragmatic Analysis of Specificity. In Heny (ed.) pp. 153-190.
Gunderson, K. (ed.), 1975: Language, Mind, and Knowledge. Univ. of Minnesota Press. Minneapolis.
Heny, F. (ed.), 1981: Ambiguities in Intensional Contexts. Reidel. Dordrecht.
Hintikka, J., 1969: Models for Modalities. Reidel. Dordrecht.
Kamp, H., 1980: A theory of truth and semantic representation. In Groenendijk, Janssen, and Stokhof (eds.) pp. 277-322.
Kasher, A. and D. Gabbay, 1976: On the Semantics and Pragmatics of Specific and Non-Specific Indefinite Expressions. Theoretical Linguistics 3: 145-190.
Klein, E., 1981: Defensible descriptions. In Heny (ed.) pp. 83-102.
Kripke, S., 1972: Naming and necessity. In Davidson and Harman (eds.) pp. 252-355.
Reiter, R., 1978: On closed world data-bases. In Gallaire and Minker (eds.) pp. 55-76.
Sowa, J., 1983: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley. Reading, Mass.
Journal of Semantics 6: 19-40
THE SEMANTICS OF NON-BOOLEAN "AND"1
JACK HOEKSEMA
ABSTRACT
The meaning of "and" in noun phrase conjunctions differs from its ordinary Boolean interpretation in other cases of conjunction, such as sentential and predicate conjunction. More precisely, this is the case when the noun phrases conjoined are referring terms. A regular Boolean interpretation is still possible whenever two or more quantificational NPs are conjoined. Disjunction is always a Boolean operation. A semantics based on the notion of set formation is provided to deal with conjunctions of referring terms and compared to other proposals in this area, such as Link's lattice-theoretical approach. The present proposal has certain advantages, including the fact that it does not require conjunction to be an associative operation.

1. In this paper I present an approach to noun phrase conjunction which I view as a natural development of the account given in an earlier paper of mine, called "Plurality and Conjunction" (Hoeksema 1983). Some aspects of that earlier paper were unsatisfactory, and are revised here; furthermore, some general conclusions are drawn concerning the ontology of natural language. Although this paper, like its predecessor, concentrates on the semantics of conjoined noun phrases, it has clear implications for the general theory of plurality which has been emerging in the works of Godehard Link, Remko Scha, David Dowty, Jan Lønning, Fred Landman, Craige Roberts, and a good many others. I will point out some of these implications at appropriate points. The analysis proposed here is indebted to earlier work by Partee and Rooth and van Benthem on type raising.

2. One of the interesting features of noun phrase conjunction is that it does not always behave in a Boolean manner. The algebraic approach to the semantics of the coordinative connectives of English originated by George Boole in his 1854 book An Investigation of the Laws of Thought and carried out more recently in great depth in Keenan and Faltz' (1985) opus Boolean
Semantics for Natural Language has been surprisingly fruitful. The logic of the sentential connectives and, or, not has been generalized by Keenan and Faltz, Gazdar (1980), and others to nonsentential categories in the obvious way, by associating these categories semantically with Boolean algebras. For example, verb phrases can be interpreted as members of some Boolean algebra (let us say, of "properties") since all the laws which hold for sentential connectives appear to be valid here as well. Among other things, we have commutativity: walk and talk is equivalent with talk and walk, associativity: (eat and drink) and be merry is equivalent with eat and (drink and be merry) and idempotency: sleep and sleep is equivalent to sleep, if we ignore the more emphatic nature of the former expression. The laws of Boolean algebra can also be seen at work in the case of coordination and negation of adjectival, adverbial and other modifiers, relative clauses, sentential complements, and so on.
Nevertheless, some descriptive problems remain for the Boolean account of coordination. In particular, it is well known that conjunctions of proper names, definite descriptions, and existential quantifiers do not behave according to general Boolean principles. For example, (1) does not entail (2):

(1) Henry and Lynne drank all my liquor.
(2) Henry drank all my liquor.

Yet normally we can replace a conjunction by one of the conjuncts without changing the truth-value, as in (3):

(3) Henry ate and drank.
    Henry drank.

This observation is commonly related to the fact that the conjunction of two singular terms usually counts as a plural term in English, cf. (4):

(4) A man and a woman {were/*was} arrested.

Again, this is a non-Boolean property, since in general the Boolean operations do not change the category of their arguments. However, as was noted in Hoeksema (1983), certain singular quantifiers are exceptions to this pattern. Sentences such as:

(5) a. Every day and every night was spent in bed.
    b. Every man but no woman was upset.
    c. No peasant and no pauper was ever President.
    d. Many a day and many a night has passed by.
are perfectly acceptable, even though their verbs are singular.2 Perhaps not entirely surprisingly, in these cases the conjunction obeys the Boolean "laws of thought".3 So, for instance, (7) entails (8):

(7) Every man and every woman solved the crossword puzzle.
(8) Every man solved the crossword puzzle.
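The entailment from (7) to (8) can be checked mechanically in a small finite model. The sketch below assumes the generalized-quantifier treatment taken up in section 3, on which a noun phrase denotes a collection of sets of individuals, and it models Boolean conjunction as intersection of such collections. The universe and all function names (`powerset`, `every`, `name`, `boolean_and`, `entails`) are invented for illustration.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def every(noun, universe):
    """'every N': the sets X that include the whole noun extension."""
    return {X for X in powerset(universe) if noun <= X}

def name(individual, universe):
    """A proper name read as a quantifier: the sets containing the bearer."""
    return {X for X in powerset(universe) if individual in X}

def boolean_and(q1, q2):
    """Boolean conjunction of two quantifiers is their intersection."""
    return q1 & q2

def entails(q1, q2):
    """q1 entails q2 if every set in q1 is also in q2."""
    return q1 <= q2

E = frozenset({"henry", "tim", "lynne", "grace"})   # hypothetical universe
MAN = frozenset({"henry", "tim"})
WOMAN = frozenset({"lynne", "grace"})

# (7) entails (8) on the Boolean reading.
seven_entails_eight = entails(
    boolean_and(every(MAN, E), every(WOMAN, E)), every(MAN, E))

# If 'Henry and Lynne' were read the same way, (1) would entail (2),
# contrary to the judgment reported in the text.
boolean_names_entail = entails(
    boolean_and(name("henry", E), name("lynne", E)), name("henry", E))
```

Intersection always licenses dropping a conjunct, so the Boolean reading makes the right prediction for the quantified noun phrases and the wrong one for the proper names, which is the asymmetry the text sets out to explain.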
When we replace every man by a proper name, say Tim, and every woman by Grace, say, no such implication is valid. Given these facts, the question arises which noun phrases behave like the proper names4 and which behave like every man or no woman, and why they show this behavior. I believe there is a systematic account for these facts, based on the semantic properties of the noun phrases involved. Most if not all other accounts of conjunction that I am aware of either fail to note these facts or would have to treat them as cases of arbitrary variation.

3. In Hoeksema (1983), I made use of the theory of generalized quantifiers to provide a formal characterization of the class of expressions which behave essentially like proper names with regard to conjunction. A generalized quantifier can be construed, among other things, as a second-order predicate, denoting a collection of sets. For instance, the denotation of no angel is taken to be the collection of all sets of individuals which do not contain any angels, and likewise the denotation of every Mormon is the collection of all sets of individuals to which every Mormon belongs, etc. Many linguistically important classes of noun phrases can be characterized in terms of the formal properties of these collections of sets, such as closure properties, minimal members, and the like. The noun phrases which pattern with proper names in conjunction always denote collections of sets whose minimal members are singleton sets. I have called such noun phrases atomic. Consider now the following definitions in the manner of Barwise and Cooper (1981):

(9)
⟦John⟧ = {X ⊆ E : j ∈ X}
⟦the Pope⟧ = {X ⊆ E : ⟦pope⟧ ⊆ X & Card(⟦pope⟧) = 1}
⟦a doctor⟧ = {X ⊆ E : ⟦doctor⟧ ∩ X ≠ ∅}
⟦no doctor⟧ = {X ⊆ E : ⟦doctor⟧ ∩ X = ∅}
⟦every angel⟧ = {X ⊆ E : ⟦angel⟧ ⊆ X}

Proper names are atomic, since their denotation has a single minimal element in every model, namely the singleton set of the individual we are talking about. For singular definite descriptions, we have the same result, albeit that here we may have different singletons in different situations. In the case of ⟦a doctor⟧, there can be more than one minimal element, but every minimal element must be the singleton of a doctor. In the jargon of Boolean algebra, the first two cases illustrate ultrafilters, and the third one a union of ultrafilters. An atomic quantifier, then, denotes a union of ultrafilters at every model. In the case of ⟦no doctor⟧, there is a single minimal element, the empty set. Since the empty set is not a singleton set, no doctor is not atomic. Finally, ⟦every angel⟧ has as its minimal member the set of angels. This could be a singleton set, in which case ⟦every angel⟧ = ⟦the angel⟧. However, the minimal member is not a singleton in every model, and so every angel does not qualify as an atomic quantifier. Having made these semantic observations, one can proceed to give an interpretation for and which distinguishes the two classes of quantifiers. This is what I did in my (1983).

4. A more interesting account would also give some independent reason why natural language conjunction is sensitive to the distinction between atomic and nonatomic quantifiers. Within the Montagovian tradition, in which NPs are always viewed as second-order predicates, such an independent reason is not forthcoming, since the domain of interpretation for NPs has a uniform Boolean structure and it seems mysterious why that structure is not used for all cases of conjunction. Another problem with my previous account is that it seems odd that conjunction should be sensitive to an essentially intensional property like atomicity. The Boolean connectives have always appeared to be among the most clearcut cases of purely extensional operators. Now we must abandon that view, it appears. Yet intuitively, and is not at all similar to modal operators such as necessarily or possibly.
A deeper explanation is arrived at when we return to an older view, according to which the singular terms we have called atomic quantifiers denote not quantifiers, but individuals (that is, elements of type e, in Montague's terminology), rather than second-order properties (which have type ⟨⟨e,t⟩,t⟩). As pointed out in Partee and Rooth (1983), there is no reason to assume that the domain of individuals is a full-fledged Boolean algebra, and so there is no reason to expect that the Boolean connectives have their regular Boolean interpretation in this domain. It is intuitively natural to assume that individuals do not have complements and that there are no disjunctive individuals. This does not mean, of course, that the domain of individuals has no structure whatsoever. (On the contrary, we shall
assume, with Link, Landman, and others, that the domain of individuals is endowed with a part-of relation, but more about that later.) Interpreting proper names and definite descriptions as individuals should not seem problematic or perverse. The case of singular indefinites like a nurse or some doctor, however, is not quite as clear-cut, since the Fregean tradition treats them as existential quantifiers. While it might appear that it would fly in the face of a whole logical tradition to treat singular indefinites as referring expressions, the situation is actually not that hopeless, given the recent emergence of a new approach to indefinites in work by Hans Kamp and Irene Heim (cf. e.g. Kamp (1981), Heim (1982)). This approach, referred to by Kamp as Discourse Representation Theory and by Heim as File-Change Semantics, makes crucial use of the assumption that indefinites are not quantifiers, but variables. The main advantage of this new theory is that it offers a more explanatory account of donkey-sentences and discourse anaphora. According to this point of view, indefinites do not refer to a specified member of the universe, but rather to a so-called "discourse referent" or place-holder. Since such place-holders may correspond to more than one real object, we get the quantificational flavor of indefinites back, but in an indirect way. For my purposes, it is not necessary to dwell on the details or the motivation of this theory. Its most important aspect, from the point of view of this paper, is that it treats the noun phrases Ronald Reagan, the President, and some White House cowboy on a par as referring expressions. The manner in which they refer, to be sure, is different in each case, but none of them has quantificational force at the level of discourse representation. In this respect, they differ crucially from such noun phrases as every monkey or many a bookworm.5

5. Now that we have established that there is a basic distinction between referring terms on the one hand and quantified noun phrases on the other hand with regard to their semantic behaviour in conjunctions, let us consider exactly how this difference works out in a formal system. The basic model I assume is a structure ⟨E, { }, ⟦ ⟧⟩, where E is a set of entities closed under the operation of group formation (indicated here by { }) and ⟦ ⟧ is the interpretation function. Groups are defined as sets with two or more members. We can think of E as being constructed from a set of basic individuals I by letting E be the closure of I under group formation. It is clear that even if I is finite, E will be infinite, since we can iterate group formation, using the same basic elements: if a,b in I, then {a,b} in E, and {{a,b},a} in E and {{{a,b},a},a} in E, etc. In Hoeksema (1983) I proposed to allow only groups which can be derived from I by a finite
number of applications of group formation, since presumably we don't need anything more.
Before I say more about groups, let me briefly dismiss what I consider to be a red herring. Link (1983) has argued against using sets as denotations for plural objects because sets are abstract objects, yet the denotation of a plural NP like my parents is not abstract at all, but very concrete. Instead, he proposes to use sums, which are supposed to be as concrete as their parts. With Landman (1987) and Cresswell (1985), I fail to see why sets are more abstract than individuals or sums. If I kick or kiss my parents, have I, in doing so, touched a set or a sum? It seems rather arbitrary to make any exclusive choice here. While certain sets undoubtedly have to be viewed as purely abstract objects (i.e. many mathematically interesting sets such as the set of all real numbers or the power set of the irrational numbers), there is no compelling reason, it seems to me, to restrict the notion "set" to abstract objects and to use a different term for collections of concrete objects. Note also that Link's sums have the same problem as sets, in that they are sometimes concrete ("Ronald and Nancy") and sometimes abstract ("the square root of 2 and the cardinality of Q").
Returning to the main topic, it is straightforward to interpret a conjunction of referring terms as denoting group formation of the entities referred to. If "Adam" refers to individual a and "Eve" to individual b, then "Adam and Eve" refers to the group {a,b}. This kind of conjunction is non-Boolean, since group formation is not a Boolean operation. Depending on its proper parsing, a noun phrase such as Tom and Dick and Harry could correspond to any of the three groups in (10):

(10) a. {t,d,h}
     b. {{t,d},h}
     c. {t,{d,h}}

If NP-conjunction were a Boolean operation, these three interpretations would be equivalent. In the present framework, all three groups are distinct. For more discussion of this, cf. Section 7 below. I note also that conjunctions of plural referring terms are interpreted in exactly the same way as conjunctions of singular terms. For instance, the NP the Democrats and the Republicans denotes a group of two groups, just as the President and the Secretary-of-State denotes a group consisting of two individuals. It is important to distinguish such cases of reference to a group from cases of quantification over the members of that group, as in every Democrat and every Republican.
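The three groupings in (10) can be written out directly with nested immutable sets. This is only a sketch: it assumes that basic individuals are represented by strings and groups by `frozenset` values with at least two members, and the helper names `group` and `atoms` are hypothetical. The `atoms` function stands in for a flattening, union-style interpretation, under which the three parses could not be told apart.

```python
def group(*members):
    """Group formation: a set with two or more members."""
    g = frozenset(members)
    if len(g) < 2:
        raise ValueError("a group needs at least two distinct members")
    return g

def atoms(entity):
    """Basic individuals underlying an entity (what a union-style,
    associative interpretation would keep)."""
    if isinstance(entity, frozenset):
        return frozenset().union(*(atoms(m) for m in entity))
    return frozenset({entity})

t, d, h = "tom", "dick", "harry"

flat = group(t, d, h)            # (10a)
left = group(group(t, d), h)     # (10b)
right = group(t, group(d, h))    # (10c)

# Group formation keeps the three parses apart ...
all_distinct = len({flat, left, right}) == 3
# ... while flattening to atoms collapses them.
same_atoms = atoms(flat) == atoms(left) == atoms(right)

# A collective predicate can hold of a group without holding of its
# members, as with (1) and (2).
drank_all_my_liquor = {group("henry", "lynne")}
group_reading = group("henry", "lynne") in drank_all_my_liquor
member_reading = "henry" in drank_all_my_liquor
```

Because `frozenset` values are hashable, groups can themselves be members of groups, which is what iterated group formation requires.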
6. Interesting mixed cases of reference to groups and quantification over their parts are partitive constructions like every one of the Republicans. The partitive construction directly exploits the part-of relation between groups and their members. In fact, they use more than we have considered so far, because they can also be used with mass-determiners, in which case they also make use of the part-of relation between individuals and their proper parts, which I have not mentioned so far. It seems to me that the object of partitive of must be a referring term, and cannot be a quantified noun phrase. So we have partitives such as most of the cake and some of the boys, but not *three of no boys or *seven of every student, where partitive of is followed by a quantificational noun phrase.6 Count determiners only cooccur with group-level entities, whereas mass determiners can also cooccur with individual-level entities. So we have several of the cakes and three of the women but not *several of the cake or *three of the woman, for obvious reasons.
There is an interesting distinction here between mass determiners and count determiners with respect to quantifiers. As I said, partitive of appears to select referring terms, not quantifiers. However, when the determiner is a mass-determiner, this claim appears to be falsified, since we find acceptable cases like try to eat some of every dish or of no dish did she eat very much. Such cases, I maintain, involve wide scope of the quantifier. The of really operates on a variable, and so we expect to find the mass determiners, but not the count determiners, in these cases. This is because variables refer to individuals, not groups, and only mass determiners apply to expressions which concern parts of individuals.
This account of partitives is somewhat different from that of Barwise and Cooper (1981) and Ladusaw (1982). I believe that the general framework of this paper offers some minor conceptual advantages over these earlier accounts. According to Barwise and Cooper as well as Ladusaw, partitive of can only apply to definite noun phrases. We have already seen that there are counterexamples to this claim. Ladusaw himself notes cases like one of three students.7 Moreover, and more importantly, it seems odd that the partitive construction should make reference to the notion definiteness, since definiteness, just like atomicity, is not an extensional property. According to Barwise and Cooper's definition, a generalized quantifier Q is definite just in case it denotes a proper principal filter in every model on which it is defined. To see whether a noun phrase is definite, it is not enough to check its denotation in a particular model. This makes the partitive construction an essentially intensional construction, which is just as odd as the supposition of my (1983) paper that NP-conjunction is intensional. My present account, which claims that definite NPs correspond to individuals and groups, avoids this problem.
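The point that atomicity and definiteness cannot be read off a single model can be illustrated with every angel in three small models. The sketch makes two assumptions beyond the text: a quantifier counts as atomic in a model when all its minimal members are singletons, and a principal filter counts as proper when the set generating it is non-empty. The universe and the names `powerset`, `every`, `minimal_members`, `looks_atomic` and `is_proper_principal_filter` are illustrative only.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def every(noun, universe):
    """'every N': the sets X that include the whole noun extension."""
    return {X for X in powerset(universe) if noun <= X}

def minimal_members(quantifier):
    """Members with no proper subset that is also a member."""
    return {X for X in quantifier if not any(Y < X for Y in quantifier)}

def looks_atomic(quantifier):
    """Atomicity test restricted to one model (assumed definition)."""
    return all(len(X) == 1 for X in minimal_members(quantifier))

def is_proper_principal_filter(quantifier, universe):
    """True if the quantifier is exactly the supersets of a non-empty set."""
    if not quantifier:
        return False
    generator = frozenset.intersection(*quantifier)
    supersets = {X for X in powerset(universe) if generator <= X}
    return bool(generator) and quantifier == supersets

E = frozenset({"a", "b", "c"})               # hypothetical universe
MODELS = {
    "one angel": frozenset({"a"}),
    "two angels": frozenset({"a", "b"}),
    "no angels": frozenset(),
}

results = {
    label: (looks_atomic(every(angels, E)),
            is_proper_principal_filter(every(angels, E), E))
    for label, angels in MODELS.items()
}
```

In the one-angel model every angel passes both tests, in the two-angel model it fails the atomicity test, and in the angel-less model it fails both, so a verdict for the noun phrase requires quantifying over models.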
7.
$A_0 = A$
$A_{n+1} = \mathrm{pow}(A_n) - \{\emptyset\}$
$A_\omega = \mathrm{pow}(\bigcup_{m<\omega} A_m) - \{\emptyset\}$
This definition gives us all the sets that can be constructed out of A, except for those that contain the empty set. Note that $A_\omega$ contains no individuals: the smallest elements are singletons of individuals. This means that we can use set union instead of group formation as the basic operation in the interpretation of conjunction. Again, there is a drawback to this: set union is associative. Union (or rather: the Boolean join-operation) is also employed by Lønning (1986). Apart from the empirical problem of nonassociativity, the choice of this operation to model conjunction is also rather startling given that we would expect or, and not and, to express union. Another unattractive feature of the way in which Landman has set up his domain is the fact that it distinguishes all of the following sets: $\{a\}$, $\{\{a\}\}$,
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
The model presented here compares favorably, I believe, with other models proposed in the literature on plurals, such as Link's (1983) lattice model and Landman's (1987) domain of graded types, as well as Lønning's (1986) Boolean model. For instance, Link's system features an associative summation operation. This is an undesirable property. For the purposes of NP conjunction it does not seem appropriate to postulate associativity. This is most obvious in the case of highly connotative conjunctions such as [Bob Dylan and [Simon and Garfunkel]]. While it seems true that Bob Dylan and Simon and Garfunkel wrote many hits in the 1960s, it does not follow that Dylan and Simon or Garfunkel wrote many hits. Likewise, there seems to be a true ambiguity in sentences like Karttunen and Peters and Ritchie never published in the same journal. Either this sentence claims that the journals in which Karttunen and Peters published are not the same as those in which Ritchie published, or else it claims that Karttunen did not publish in journals where Peters and Ritchie published.8 This can be captured naturally by postulating a structural ambiguity between [Karttunen and [Peters and Ritchie]] and [[Karttunen and Peters] and Ritchie]. Associativity of conjunction has the unpleasant consequence of making this structural ambiguity irrelevant to semantic interpretation. (For some further criticisms of Link's theory, see Landman (1987).) Landman's model is an ordered pair $\langle A_\omega, \cup \rangle$, where $A_\omega$ is based on a set A of basic individuals through the following inductive definition:
$\{\{\{a\}\}\}$, $\{\{\{\{a\}\}\}\}$, etc.
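The inductive definition of Landman's domain can be made concrete for a small finite A. The following Python sketch is only an illustration under stated assumptions, not Landman's own formulation: the function names `nonempty_subsets` and `graded_levels` are hypothetical, sets are modelled as `frozenset`s, and only the finite levels $A_0$, $A_1$ and $A_2$ are built. It shows that an individual, its singleton, and the singleton of its singleton all come out as distinct objects.

```python
from itertools import combinations


def nonempty_subsets(elements):
    """Return pow(elements) minus the empty set, as a set of frozensets."""
    items = list(elements)
    return {
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    }


def graded_levels(individuals, depth):
    """Build A_0 = A and A_{n+1} = pow(A_n) - {empty set}, up to the given depth."""
    levels = [frozenset(individuals)]
    for _ in range(depth):
        levels.append(frozenset(nonempty_subsets(levels[-1])))
    return levels


# Hypothetical two-element set of basic individuals.
levels = graded_levels({"a", "b"}, 2)

singleton = frozenset({"a"})                # {a}, an element of A_1
double_singleton = frozenset({singleton})   # {{a}}, an element of A_2
```

Here `levels[1]` has three members and `levels[2]` has seven, and the individual `"a"` itself is not a member of `levels[1]`.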
(11) a. The cards below 7 and the cards from 7 up are red.
     b. The cards below 7 and the cards from 7 up are shuffled.
In both cases, the predicate can be distributed over the conjuncts, so that we can infer (12a) and (12b) from (11a) and (11b) respectively.
(12) a. The cards below 7 are red and the cards from 7 up are red.
     b. The cards below 7 are shuffled and the cards from 7 up are shuffled.
However, in the case of (11a), we can infer a lot more, namely that every card is red. In the case of (11b), of course, no inferences about individual cards can be made; indeed, it makes little sense to say of an individual card that it is shuffled. This difference is represented in LP by assigning the following translations to (11a) and (11b):
(11') a. $^*\mathrm{Red}((\sigma x: x < 7) + (\sigma x: x \geq 7))$
      b. $^*\mathrm{SH}(\uparrow(\sigma x: x < 7) + \uparrow(\sigma x: x \geq 7))$
This is a spurious structure that does not seem to be needed anywhere in the semantics of plurals and indeed I suspect that it has no basis in cognition either. While we seem to be perfectly capable of conceiving of collections, collections of collections and so on in everyday life, we do not seem to distinguish, except in our more mathematical moments, elements from their singleton sets and the singletons of their singletons. By requiring that groups have at least two members we avoid this problem. (As an historical aside, I note here that some versions of set theory, such as the one in Quine's New Foundations, have equated elements with their unit sets, and hence do not distinguish between the singletons listed above either.) If we are not going to make a distinction between singletons and their members, as I propose, then certain other features of Landman's account must be rejected as well, such as his operators $\uparrow$ and $\downarrow$, which send entities to their unit sets and vice versa. Landman uses these operators to distinguish group-level readings from distributive readings. Link and Landman make use of a formal metalanguage LP in which predicates are marked by a star for distributivity. More precisely, if $[\![P]\!]$ is some set of individuals, then $[\![{}^*P]\!]$ is the closure of that set under summation. This means that whenever *P(x + y) is the case, we have *P(x) and *P(y). In other words, starred predicates distribute over the parts of the sums they apply to. Now there are various degrees to which a predicate can be said to be distributive. Consider for example the difference between the following two sentences:
(In these formulas, the symbol $\sigma$ stands for 'sum'.) From (11'b) we can derive
(11'') b. $^*\mathrm{SH}(\uparrow(\sigma x: x < 7))$ and $^*\mathrm{SH}(\uparrow(\sigma x: x \geq 7))$
(ax: boy(x)) + (ay: girl(y)) (ax ; boy(x)) + t (ay: girl(y)) t (ax: boy(x)) + (ay: girl(y)) t (ax: boy(x)) + t (ay : girl(y)) t (ax: boy(x)) + (ay: girl (y)) t ( t (ax: boy(x)) + (ay: girl(y)) t ((ax : boy(x)) + t (ay : girl(y)) t ( l (ax: boy(x)) + t (ay: girl(y))
Instead, I treat such conjunctions as having a single logical form. The several readings that arise in various contexts are not produced by the translations of the conjoined noun phrases, but arise from lexical entailments or pragmatic implicatures associated with the predicates. If you will, these can be spelled out in the form of meaning postulates, as in Scha (1981), Hoeksema (1983), or Dowty (1986). This approach may not be fully satisfactory, for reasons pointed out in Roberts (1987), but it has some rather attractive features.9 For instance, distinctions between predicates that distribute all the way down to the individuals, such as be red in (11a), and predicates which distribute down to the immediate constituents of the groups they apply to, like be shuffled in (11b), are not hard to handle. We just state: if a group $\{X, Y, \ldots, Z\} \in [\![\text{be shuffled}]\!]$, where X, Y, ..., Z are groups, then X, Y, ..., Z are in $[\![\text{be shuffled}]\!]$ as well. I note that this automatically accounts for Landman's observation that the statement that the cards below 7 and the ones from 7 up are shuffled does not entail that the cards below 10 and the ones from 10 up are shuffled, even though both collections make up the same deck of cards. The observation is accounted for because a group $\{X, Y\}$ need not have W and Z as proper parts, even though $X \cup Y = W \cup Z$. In the case of (11a),
because the predicate *SH applies to a sum. However, since expressions of the form $\uparrow X$ denote singleton sets, which cannot be taken to be sums of other sets (given that the empty set is excluded from the domain), the predicate cannot be distributed over its parts. This result is the main motivation for having the $\uparrow$ operator in LP. This way of characterizing the complex patterns of collective and distributive readings is rather counterintuitive in my opinion. A simple conjunction of noun phrases like the boys and the girls becomes at least 8-way ambiguous (or more if we add spurious repetitions of the $\uparrow$ operator) on this account since we can have each of the following representations in LP:
Bill, Pete, Hank and Dan lifted a piano. It was very heavy.
Roberts notes that on the relevant reading, where each of the four men lifts a (possibly different) piano, the noun phrase a piano is not a good antecedent for the pronoun in the second sentence. In a theory such as Kamp's Discourse Representation Theory, this calls for some kind of overt marking of distributivity, so as to block such anaphoric links. However, the facts here are not straightforward. For me, many cases of distributive readings are not at all incompatible with subsequent anaphora. For instance, the following pieces of discourse strike me as being rather natural:
The men were told to keep a diary. It would help them remember their present plight.
They all have a car. Unfortunately, they don't let anyone else drive it.
X and Y are the groups of cards below 7 and from 7 up, which consist of individual cards. The predicate $[\![\text{be shuffled}]\!]$ does not apply to individuals, and so the predicate will not distribute all the way down. In the case of $[\![\text{be red}]\!]$, we drop the requirement that X and Y be groups, and so this predicate will apply all the way down to the individual cards, as required. An attractive aspect of this point of view, which is not present in the Link/Landman theories, was pointed out in Dowty (1986), namely, that it needs no special machinery to handle mixed conjunctions of distributive and collective predicates, as in We met in a bar and had a good time. It is not a knock-down argument, since Landman shows how to treat such cases by using type-shifting rules, but it is rather appealing to have a theory which does not need any extra provisions for such cases. I further note here that collective/distributive ambiguities can even be found when the subject does not refer to a group but to an individual. For example, from the fact that Marie weighs less than 110 lbs, we may safely conclude that her arms or her legs weigh less than 110 lbs, but from the fact that Dick weighs more than 150 lbs, we can conclude nothing about the weights of his body parts. So 'weighs less than X' is a distributive predicate, whereas 'weighs more than X' is not. However, for some purposes, an explicit marking of distributivity such as an operator makes available may be useful, as in the case of discourse pronomina in a Kamp-style account. This point was made in Roberts (1987) and is well-taken. If a singular indefinite is used as part of a distributive predicate, it becomes unavailable as an antecedent for a pronoun in the following sentence (unless a wide-scope reading is represented). To see this, consider the following discourse:
Both keep a diary and have a car have obvious distributive readings, yet in each case the pronoun can be used to pick up the indefinite. Hence the status of this argument is not clear. I have nothing against the use of a distributivity operator as a device to mark the readings of a predicate. What I do want to dispute, however, is that distributive/collective ambiguities of conjoined noun phrases should be ascribed to different translations or logical forms of these conjunctions. Whenever there is such an ambiguity, the source can be found in the predication involved.
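The meaning-postulate treatment of (11a) and (11b) can be illustrated on a toy deck. The sketch below is an assumption-laden illustration rather than part of the formal proposal: cards are modelled as integers, groups as `frozenset`s, and the helper `close` (a hypothetical name) closes an extension under the relevant postulate, with or without the requirement that the parts be groups.

```python
def is_group(entity):
    """Groups are modelled as frozensets; individual cards are plain integers."""
    return isinstance(entity, frozenset)


def close(extension, parts_must_be_groups):
    """Close a predicate extension under a distributivity postulate.

    Whenever a group is in the extension, its immediate parts are added too.
    With parts_must_be_groups=True this only happens for groups of groups.
    """
    closed = set(extension)
    pending = list(extension)
    while pending:
        entity = pending.pop()
        if not is_group(entity):
            continue
        if parts_must_be_groups and not all(is_group(p) for p in entity):
            continue
        for part in entity:
            if part not in closed:
                closed.add(part)
                pending.append(part)
    return closed


low = frozenset({4, 5, 6})          # "the cards below 7" in a toy deck
high = frozenset({7, 8, 9})         # "the cards from 7 up"
subject = frozenset({low, high})    # the group denoted by the conjunction

shuffled = close({subject}, parts_must_be_groups=True)   # like "be shuffled"
red = close({subject}, parts_must_be_groups=False)       # like "be red"

# A different division of the same deck: below 6 and from 6 up.
regrouped = frozenset({frozenset({4, 5}), frozenset({6, 7, 8, 9})})
```

On this model `shuffled` contains the two conjunct groups but no individual card and no part of `regrouped`, while `red` reaches every card, even though both divisions have the same union.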
Returning to conjunction, I want to point out an interesting consequence of the present proposal: coreferential NPs cannot be conjoined. The reason is that the group consisting of some individual a and a itself is not defined (unlike, for instance, the join of a set with itself). To me, this seems a reasonable result, given that most such conjunctions are in fact unacceptable. For instance, while we can express John's similarity to Mary by saying either (13a) or (13b), only (14a) is a possible way of expressing John's similarity to himself:
(13) a. John is similar to Mary.
     b. John and Mary are similar.
(14) a. John is similar to himself.
     b. *John and John are similar.
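The two properties of group formation relied on here, partiality and non-associativity, can be sketched as follows. This is a minimal illustration under my own modelling assumptions: groups are `frozenset`s, and the names `group` and `conjoinable` are hypothetical.

```python
def group(*members):
    """Form a group; undefined (raises) unless there are two distinct members."""
    distinct = frozenset(members)
    if len(distinct) < 2:
        raise ValueError("group formation needs at least two distinct members")
    return distinct


def conjoinable(a, b):
    """True if the first-order (group-forming) conjunction of a and b is defined."""
    try:
        group(a, b)
        return True
    except ValueError:
        return False


# Two bracketings of the same three names yield different groups.
left = group(group("Karttunen", "Peters"), "Ritchie")
right = group("Karttunen", group("Peters", "Ritchie"))

# Set union, by contrast, is associative and loses the bracketing.
k, p, r = frozenset({"Karttunen"}), frozenset({"Peters"}), frozenset({"Ritchie"})
union_left = (k | p) | r
union_right = k | (p | r)
```

With these definitions `conjoinable("John", "Mary")` holds while `conjoinable("John", "John")` does not, and `left` differs from `right` although `union_left` equals `union_right`.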
The situation is more complex than that, however, because other kinds of conjunctions of coreferential elements are actually fine, as the following examples illustrate:
(15) The Morning Star and the Evening Star are the same planet.
(16) Cicero and Tully are one and the same person.
Cases like these do not count as counterexamples to the present theory. Rather, they indicate that the notion of 'individual' involved is more sophisticated than one might have supposed. Indeed, in order to make sense of such examples, it seems necessary to appeal to the intentional objects invoked by philosophers like Husserl and semanticists like Landman (see his 1986, 1987). It seems most attractive to introduce two distinct intentional objects, the Morning Star and the Evening Star, or Cicero and Tully, which may correspond to only one real-world entity. Because there are two inten-
8.
(17) a. His aged servant and the subsequent editor of his collected papers was with him at his deathbed.
     b. His aged servant and the subsequent editor of his collected papers were with him at his deathbed.
As a matter of fact, as things stand right now, the first type of conjunction, called appositional conjunction by Quirk et al. (1972), cannot be handled by our semantic approach at all. I will come back to these examples later on. Van Eijck deals with these cases by introducing an individual-level discourse marker for the first example, of which the properties named by the two NPs and the VP are predicated, while the second case gets a group-level marker, the members of which are the marker for the aged servant and the one for the editor of the papers. This brings me to an issue not yet addressed here. The group structure imposed here on the domain of individuals must be extended to the domain of discourse referents: just as we have individuals and groups, we need individual and group discourse referents. The latter have been introduced into the literature on discourse representations in van Eijck (1983) in an attempt to deal with plural pronouns. A more recent ap-
tional objects, we get the plural agreement on the predicate. At this point, no doubt some would like to invoke Occam's Razor, which forbids us to multiply entities beyond necessity. Why this need for intentional objects on top of the real-world objects which any truth-conditional theory of semantics must postulate? One might point out that it is quite natural to invoke intentional objects in the case of fiction. Consider for instance the strange story of Dr. Jekyll and Mr. Hyde. First we are introduced to two individuals, one outstanding and respectable, the other dangerous and despicable. After a while, we learn that Dr. Jekyll is in fact Mr. Hyde, and so the two entities collapse into one. In the meantime, we are aware this is fiction, and we know that Dr. Jekyll and Mr. Hyde have no counterpart or counterparts in the real world. So we have two individuals who are really one who are really none. An austere referential semantics would not do justice to this and similar cases. It is not my purpose here to formulate a theory of intentional objects. All I want to suggest here is that we need such a theory to understand what is going on in examples such as (15) and (16). One possibility that one might pursue here is that intentional objects are discourse referents in the sense of the Kamp/Heim theory of discourse anaphora. Landman (1986) argues against this construal, but the matter is not settled in my opinion. The conjunction facts which ought to be accounted for are actually quite complex. For instance, it has been noted in the literature that a conjunction of two singular terms can combine with a plural and a singular predicate, depending on whether the terms are taken to be coreferential or not. Some examples taken from van Eijck (1983:99) illustrate this phenomenon:
plication can be found in Hoeksema (1986), where plural referents are employed in the analysis of relative clauses with split antecedents, such as Perlmutter and Ross' (1970) example
(18) A man entered the room and a woman went out who were quite similar.
(19) A man and a woman entered. They embraced. He was short, she was tall.
we get the following representation:
(20) u, v, [u,v]
     man(u)
     woman(v)
     enter([u,v])
     embrace([u,v])
     short(u)
     tall(v)
The embedding function f, which assigns values in the actual model to the discourse referents, must have some obvious properties: if $f(u) = a$ and $f(v) = b$, then $f([u,v]) = \{a,b\}$. More generally, $f([x,y,\ldots,z]) = \{f(x), f(y), \ldots, f(z)\}$.
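The behaviour required of the embedding function can be sketched in a few lines. The code is an illustration only, with several assumptions of mine: simple referents are strings, group referents are tuples of referents, groups in the model are `frozenset`s, and the model itself (the dictionary `model`) is invented to verify the conditions of (20).

```python
def embed(referent, assignment):
    """Extend an assignment for simple referents to group referents.

    A group referent [x, y, ..., z] is mapped to the group of the values
    of its members.
    """
    if isinstance(referent, tuple):
        return frozenset(embed(part, assignment) for part in referent)
    return assignment[referent]


assignment = {"u": "a", "v": "b"}           # f(u) = a, f(v) = b
group_value = embed(("u", "v"), assignment)  # f([u,v])

# A hypothetical model in which the conditions of (20) can be checked.
pair = frozenset({"a", "b"})
model = {
    "man": {"a"},
    "woman": {"b"},
    "enter": {pair},
    "embrace": {pair},
    "short": {"a"},
    "tall": {"b"},
}
conditions = [
    ("man", "u"),
    ("woman", "v"),
    ("enter", ("u", "v")),
    ("embrace", ("u", "v")),
    ("short", "u"),
    ("tall", "v"),
]
verified = all(embed(ref, assignment) in model[pred] for pred, ref in conditions)
```

The recursion also covers group referents whose members are themselves group referents, should those be needed.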
Such sentences have always posed a formidable problem for compositional semantics, but can be handled quite straightforwardly in Kamp's discourse representation theory. The two singular indefinite NPs each introduce a new discourse referent to the representation. The representation can then be optionally extended by adding group-level referents consisting of elements that are already present. For instance, suppose we have the referents u and w in the representation, then we can add [u,w] to this representation. (The optionality is added only to avoid overloading the representations with too many discourse referents.) This new group marker can then function as the referent of a plural pronoun or, in the case of split relatives, be available to predicate something of, such as the property of being similar in the case of (18). For the case of conjunctions of indefinites, I propose a similar scheme. The conjoined elements themselves give rise to discourse referents and the conjunction adds a group-level referent. This time, however, the addition is obligatory, because it is this plural object that the verb phrase is predicated of. For a little piece of discourse such as
8. TYPE LIFTING
$f: \langle e \rangle \to \langle\langle e,t \rangle, t \rangle : f(a) = \lambda P[P(a)]$
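The lifting rule can be mimicked directly with higher-order functions. In the following sketch, properties are modelled as Python predicates and the extension `mortals` is an invented example; the name `lift` is mine and nothing here goes beyond the rule as stated.

```python
def lift(individual):
    """Map an individual a to the second-order predicate that holds of P iff P(a)."""
    return lambda prop: prop(individual)


mortals = {"socrates", "plato"}   # hypothetical extension of "is mortal"


def mortal(x):
    return x in mortals


first_order = mortal("socrates")          # the predicate applied to the subject
second_order = lift("socrates")(mortal)   # the lifted subject applied to the predicate
```

Both ways of combining subject and predicate give the same truth value, which is the equivalence the rule is meant to capture.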
Type-lifting makes it possible to conjoin referring terms with quantifiers and to interpret disjunctions of referring terms. For example, it is now
At this point, it is appropriate to reflect on what has been proposed so far. To account for the fact that conjunctions of NPs do not generally behave like intersections of sets, it was suggested that the older theory, which holds that some NPs refer to individuals, and others behave more like higher-order objects, like quantifiers, might well be correct. However, it is not enough to just fall back on this older theory; after all, when Montague first proposed his unified account of NP interpretation, this was seen as a major step forward, because it offered (1) a single semantic type for what syntactically appears to behave as a single category, and (2) a solution to the problem that referring terms can be coordinated with quantificational terms, as in the President and some senators, 1965 and every preceding year, Hannah or any of her sisters, and so on, which seems to indicate that the semantic type of quantifiers and referring terms is really the same. It is rather interesting to see that conjunction can be used to argue for as well as against the distinction between referring and quantificational terms. In order to take care of these problems, it seems best to follow a growing number of semanticists who enjoy the benefits of higher-order interpretations without entirely giving up on the simpler interpretations which first-order theories provide, by making use of type-raising. Type-raising is a common device in much current work in categorial grammar, such as Dowty's (to appear) study of non-constituent coordination and Steedman's (1985, to appear) work on extraction and coordination. For our purposes, it will not be necessary to consider a general theory of type-raising. All we need is a rule which relates elements of type $\langle e \rangle$ to second-order predicates of type $\langle\langle e,t \rangle, t \rangle$.
The basic motivation behind such a rule is an observation made by various logicians, including Ramsey and Geach, long before Montague, namely that a subject-predicate combination such as Socrates is mortal can be viewed in two different, but equivalent ways: either the predicate is taken to give a property of the subject, in this case by ascribing mortality to Socrates, or else the subject is taken to provide a property of the predicate, in this case by ascribing to mortality the property that is applied to Socrates. In the latter case the subject is interpreted as a property of a predicate, which makes it a second-order predicate. The use of referring terms as second-order predicates is modelled simply by the following rule which maps individuals into the ultrafilters they generate:
possible to interpret the conjunction the Pope and every other Catholic as follows:
the Pope → $\iota x: \mathrm{Pope}(x)$ (first-order analysis)
         → $\lambda P[P(\iota x: \mathrm{Pope}(x))]$ (second-order analysis)
every other Catholic → $\lambda P[\forall x: \mathrm{Catholic}(x) \;\&\; \neg\mathrm{Pope}(x) \to Px]$
the Pope and every other Catholic → $\lambda P[P(\iota x: \mathrm{Pope}(x)) \;\&\; \forall x: \mathrm{Catholic}(x) \;\&\; \neg\mathrm{Pope}(x) \to Px] = \lambda P[\forall x: \mathrm{Catholic}(x) \to Px]$
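The equivalence claimed in the last line can be checked mechanically on a small finite model. The sketch below makes several assumptions that are not in the text: a four-element domain, a unique Pope who is himself a Catholic, properties modelled as subsets of the domain, and the helper names `lift`, `every` and `boolean_and`.

```python
from itertools import combinations

domain = ["pope", "anna", "ben", "zoe"]   # hypothetical individuals
catholics = {"pope", "anna", "ben"}       # assumed: the Pope is a Catholic
popes = {"pope"}                          # assumed: exactly one Pope


def lift(individual):
    """Second-order analysis of a referring term: the set of its properties."""
    return lambda prop: prop(individual)


def every(restrictor):
    """Generalized quantifier: true of a property that holds of every member."""
    return lambda prop: all(prop(x) for x in restrictor)


def boolean_and(q1, q2):
    """Boolean conjunction of two generalized quantifiers."""
    return lambda prop: q1(prop) and q2(prop)


the_pope_and_every_other_catholic = boolean_and(
    lift("pope"), every(catholics - popes)
)
every_catholic = every(catholics)

# Compare the two quantifiers on every property definable over the domain.
all_properties = [
    set(combo)
    for size in range(len(domain) + 1)
    for combo in combinations(domain, size)
]
agree = all(
    the_pope_and_every_other_catholic(lambda x, ext=ext: x in ext)
    == every_catholic(lambda x, ext=ext: x in ext)
    for ext in all_properties
)
```

The equality depends on the assumption that the Pope is in the extension of Catholic; without it the two quantifiers can come apart.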
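(i) $a \Rightarrow a$
(ii) $a \; \langle a,b \rangle \Rightarrow b$
(iii) if $a \; A \Rightarrow b$, then $A \Rightarrow \langle a,b \rangle$
(iv) if $A \Rightarrow b$, then $B \; A \; C \Rightarrow B \; b \; C$
(v) if $A \Rightarrow B$ and $B \Rightarrow C$, then $A \Rightarrow C$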
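where a, b are any types, and A, B, C any sequences of types, and the order of functors and their arguments is irrelevant. The earlier type-lifting $\langle e \rangle \Rightarrow \langle\langle e,t \rangle, t \rangle$ now turns out to be a special case which can be shown to follow from the Lambek calculus:
(I) $e \; \langle e,t \rangle \Rightarrow t$ (case of ii)
(II) $e \Rightarrow \langle\langle e,t \rangle, t \rangle$ (by I and iii)
Non-Boolean conjunction is an operator of type $\langle e, \langle e,e \rangle\rangle$: it takes an argument of type e, and then another argument of type e, to yield a value of type e. It can be raised to type $\langle T, \langle T,T \rangle\rangle$, where $T = \langle\langle e,t \rangle, t \rangle$, in the Lambek calculus, cf.:
(a) $\langle e \rangle \; \langle e, \langle e,e \rangle\rangle \Rightarrow \langle e,e \rangle$ (by ii)
(b) $\langle e,e \rangle \; \langle e \rangle \Rightarrow \langle e \rangle$ (permutation-variant of ii)
(c) $\langle e \rangle \; \langle e, \langle e,e \rangle\rangle \; \langle e \rangle \Rightarrow \langle e \rangle$ (by iv and v from (a) and (b))
(d) $\langle e \rangle \; \langle e, \langle e,e \rangle\rangle \; \langle e \rangle \; \langle e,t \rangle \Rightarrow \langle t \rangle$ (by iv, v from (c) and (II))
(e) $\langle e, \langle e,e \rangle\rangle \; \langle e \rangle \; \langle e,t \rangle \Rightarrow \langle e,t \rangle$ (by iii from (d))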
As pointed out by a reviewer, one might also type-raise the non-Boolean conjunction operator itself, if a more general perspective on type-lifting, such as that of the Lambek calculus (Lambek 1958, van Benthem 1986), is adopted. In that case one may get non-Boolean conjunctions of higher-order expressions, e.g. noun phrases of type $\langle\langle e,t \rangle, t \rangle$, as well as mixed conjunctions of elements of type $\langle e \rangle$ with elements of type $\langle\langle e,t \rangle, t \rangle$. In van Benthem's version of the Lambek calculus, the rules of type change are:
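(f) $\langle e,t \rangle \; \langle T \rangle \Rightarrow t$ (by ii)
(g) $\langle e, \langle e,e \rangle\rangle \; \langle e \rangle \; \langle e,t \rangle \; \langle T \rangle \Rightarrow t$ (by iv and v from (e) and (f))
(h) $\langle e, \langle e,e \rangle\rangle \; \langle e,t \rangle \; \langle T \rangle \Rightarrow \langle e,t \rangle$ (by iii from (g))
(i) $\langle e, \langle e,e \rangle\rangle \; \langle e,t \rangle \; \langle T \rangle \; \langle T \rangle \Rightarrow t$ (by iv, v from (h) and (f))
(j) $\langle e, \langle e,e \rangle\rangle \; \langle T \rangle \; \langle T \rangle \Rightarrow \langle\langle e,t \rangle, t \rangle = T$ (by iii from (i))
(k) $\langle e, \langle e,e \rangle\rangle \Rightarrow \langle T, \langle T,T \rangle\rangle$ (by two applications of iii from (j))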
As the reader pointed out, the semantics for type-change in van Benthem (1986) will associate with the type-lifted non-Boolean conjunction the following interpretation: $\lambda P \lambda \Phi \lambda \Pi . \Phi(\lambda x . \Pi(\lambda y . P(\{x,y\})))$, where $\Phi$ and $\Pi$ range over denotations of type T and P is a variable over type $\langle e,t \rangle$. To see this, note that every application of rule (ii) corresponds to application of a variable and every application of rule (iii) to the abstraction over a variable. So steps (a) and (b) correspond to applying basic (not type-lifted) non-Boolean "and" to x and y, resulting in the doubleton {x,y}. Steps (e), (h) and (j) correspond to the three abstractions in the translation of "and". What is crucial here is that the denotation of "and" after type-lifting is not stipulated ad hoc, but follows from van Benthem's semantics for type shifting. For a conjunction of two quantificational noun phrases, such as every soldier and every officer, type-lifted non-Boolean conjunction will produce the set of all properties of all pairs of a soldier and an officer. This gives the right interpretation for sentences such as every soldier and every officer met, no soldier and no officer have danced together etc. (cf. also Footnote 1). For cases of mixed conjunctions (i.e. where a quantificational noun phrase is conjoined with a referring term), we now have the option to lift the type of the referring term to that of a generalized quantifier and then apply Boolean conjunction, or else to lift the type of the conjunction operator for one of its arguments. In the latter case we get the set of properties of all pairs consisting of the Pope and a Catholic as the denotation of the conjunction the Pope and every other Catholic. A major unsolved problem with the type-lifting approach is that we must block the use of type lifting in situations where it is not needed, in particular in the cases of non-Boolean conjunction discussed in the beginning of this paper.
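The interpretation of the lifted operator can be tried out on a small model. What follows is a sketch under assumptions of my own: doubletons are `frozenset`s, the two conjuncts are passed before the property (the order of the abstractions is a matter of currying), the extensions are invented, and the helper names `group`, `lift`, `every` and `lifted_and` are hypothetical. Only universal and referring conjuncts are exercised.

```python
def group(x, y):
    """Doubleton {x, y}; assumes x and y are distinct entities."""
    return frozenset({x, y})


def lift(individual):
    """Referring term as a generalized quantifier."""
    return lambda prop: prop(individual)


def every(restrictor):
    """Universal generalized quantifier over a finite restrictor set."""
    return lambda prop: all(prop(x) for x in restrictor)


def lifted_and(first, second):
    """Lifted non-Boolean "and": the property is checked on each pair {x, y}."""
    return lambda prop: first(lambda x: second(lambda y: prop(group(x, y))))


soldiers = {"s1", "s2"}
officers = {"o1", "o2"}
all_pairs = {group(s, o) for s in soldiers for o in officers}

# "every soldier and every officer met", with every soldier-officer pair in "met"
everyone_met = lifted_and(every(soldiers), every(officers))(
    lambda g: g in all_pairs
)

# The same sentence after removing one pair from the extension of "met"
fewer_pairs = all_pairs - {group("s1", "o2")}
not_everyone_met = lifted_and(every(soldiers), every(officers))(
    lambda g: g in fewer_pairs
)

# Mixed case: a lifted referring term conjoined with a quantifier
pope_pairs = {group("pope", c) for c in {"anna", "ben"}}
mixed = lifted_and(lift("pope"), every({"anna", "ben"}))(
    lambda g: g in pope_pairs
)
```

In the mixed case the property is checked on each pair consisting of the Pope and one of the other Catholics, which matches the description of the denotation given above.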
Partee and Rooth (1983) suggest a processing strategy, according to which one uses the lowest types possible. However, the facts they had in mind (scopal readings of disjunctions) are of a different status than the ones I am concerned with here. In particular, it seems simply wrong to say that Bill and Harry was watching TV is a possible English sentence, and that its oddness stems from the fact that the sentence Bill and Harry were watching TV is the preferred variant because of some processing strategy. In fact, there are cases where a Boolean interpretation is enforced, such as in the 'X as well as Y' construction, cf. John as well as Harry works on this problem. Here there seems to be no processing problem at all. So we must invoke a
rule to use minimal types only, but this rule appears to be not a processing strategy but rather a principle of grammar.
9.
(21) a. My great opponent and the hero of my youth has passed away.
     b. A great man and a good father has passed away.
     c. A great man and the best magician in New Jersey has passed away.
     d. *Dr. Jekyll and Mr. Hyde has passed away.
     e. *Charles Dodgson and Lewis Carroll has passed away.
Even though two names can refer to the same entity, appositional conjunction is disallowed. The only way to account for this that comes to my mind is by stipulating that different proper names always correspond to different discourse markers in the discourse representation structure, whereas descriptions may be represented by a single marker. Hence there will always be the possibility of first-order conjunction for proper names, even in cases of coreference, and the availability of such a conjunction will block the higher-order analysis needed for appositional conjunction. As we have seen, some such blocking principle is needed anyway to rule out Boolean conjunctions of non-coreferential singular terms. However, making this distinction seems an ad hoc proposal at the moment, and moreover, it still fails to cover the facts entirely, since it would predict that appositive conjunction of a proper name with a definite description would be possible. As far as I am aware, this is not the case.
(22) a. *John and my best friend is sick.
     b. *My hero and Houdini has passed away.
     c. *Amy and a long-time lover lies buried here.
It might be supposed that type-lifting will also give us a handle on appositional conjunction. Conjunction of the noun phrases his aged servant and the subsequent editor of his collected papers is blocked at the first-order level, since group formation of a and b is not defined in case a = b. Lifting the type and applying regular Boolean conjunction gives us appositional conjunction for free. However, things are not this simple. It should be noted that appositional conjunction is restricted to definites and indefinites; proper names do not seem to be conjoinable in this way. So we have the following pattern:
10.
Department of Linguistics
University of Pennsylvania
619 Williams Hall
Philadelphia, PA 19104-6305
USA
To conclude, let me sum up the main claims of this paper. It was found that quantified noun phrases and referring terms behave differently in conjunctions. This finding makes sense if we take quantified noun phrases to denote generalized quantifiers, while holding on to the view that referring expressions denote individuals. The domain of individuals was sketched with its group structure and compared with alternative proposals. For indefinites, the Kamp/Heim theory of discourse representations was adopted, which has the desirable property of treating indefinites as referring terms instead of existential quantifiers. Since this theory was originally motivated for an entirely different set of phenomena, this paper provides some additional support for it. Finally, it was shown how type raising can be used to relate referring terms to generalized quantifiers so that it is possible to interpret disjunctions of referring terms and mixed conjunctions of referring and quantified expressions. This approach seems preferable over the more unified Montagovian approach, which treats all NPs as having the same logical type. This brings me to a matter discussed by Ed Keenan (1982) in a paper called "Eliminating the Universe (A Study in Ontological Perfection)". According to Keenan, a semantics for L is ontologically perfect just in case the elements of its ontology are possible denotations for expressions in L. He argues that it is desirable to have an ontologically perfect semantics, for "[o]therwise the denotations of some expressions would be defined in terms of semantic things which we cannot refer to in the language and so in some sense cannot know". While I think we don't have to be able to refer to an object in order to know it (in fact, it is not necessary to speak any language at all to know some objects, as prelinguistic babies seem to show), ontological perfection seems to be a desirable feature.
It is closely related to Occam's Razor and similar requirements of parsimony in scientific methodology. The semantics for a fragment of English in Montague's PTQ (Montague 1974) is not ontologically perfect, since it introduces objects without having any expressions denote them. Instead, the objects are needed to build up sets of objects, as denotations for the predicates, and sets of sets of objects, for the NPs and so on. Keenan proposes to eliminate objects and use properties instead as the primitive elements of his Boolean semantics. This paper suggests yet another road to ontological bliss, not by eliminating the universe, but by re-introducing referring terms.
NOTES

1. Earlier versions of this paper have been presented at the University of Washington and Stanford University. I am indebted to the audiences at these presentations as well as D. Dowty, C. Roberts and J. Lønning and an anonymous reviewer for comments and criticisms.

2. In many of these cases, both plural and singular agreements are possible. Exactly what causes this variation is not clear to me, but it would seem that the singular agreement is caused by the Boolean nature of the conjunction in these cases (hence semantically motivated) and the plural agreement is due to the formal analogy of these conjunctions with the much more common non-Boolean variety (hence syntactically driven). In the area of agreement, such variation is not uncommon, and usually hard to account for in a rigorous manner. To be sure, the existence of this variation is often taken to be evidence for a syntactic account of number agreement, since there appear to be no semantic differences. However, the position that number agreement is a purely syntactic phenomenon, a position commonly taken in GPSG-studies of agreement and conjunction, such as Sag, Gazdar, Wasow, and Weisler (1985), seems unnecessarily weak. My position is that most facts about number agreement can only be explained (as opposed to described) semantically, but that there remains some arbitrariness which must be ascribed to syntactic encoding. This general position is also taken in Sadock (1983).

3. David Dowty has drawn my attention to the existence of cases where the conjunction of two quantifiers does not behave in a Boolean manner, but rather in the manner of branching quantifiers:
(i) No farmer and no student were ever alike.
Unlike the rather similar examples discussed in Barwise (1979), these cannot be explained away quite as easily by invoking the logical properties of reciprocal predicates, as in the appendix of Hoeksema (1983). For an analysis of these cases in terms of type-lifting of the non-Boolean conjunction operator, see Section 8.

4. In the case of proper names, it should be noted that there is a rather common use of conjoined proper names as singular expressions, namely when the conjunction as a whole is used as a single proper name. Examples of this special use of conjunctions are brandnames (e.g. Strawbridge and Clothier is having a sale; Bolt, Beranek and Newman has hired a linguist; Johnson and Johnson sells baby products), reference to publications by the names of the authors (as in Dowty, Wall and Peters is out of print), etc. Semantically, the internal structure of these names is irrelevant. Each name refers to a single individual-level entity, and not to a group. Of course, this entity may have a certain historical relationship to a group, such as the relationship between a firm and its founders, or that between a paper and its authors, but that relationship does not take part in the interpretation of the complex names under consideration. In this respect, such names are on a par with other complex names, such as booktitles, quotes, or placenames like Bird-in-Hand, Pa. and White Plains, N.Y.

5. Lønning (1986) and Roberts (1987) have also appealed to the distinction between quantificational and referring terms to explain certain differences in distributivity of conjoined expressions.

6. An unsolved problem for almost all accounts of partitives is the non-occurrence of conjoined NPs after partitive of, as illustrated by the ungrammaticality of examples like one of you and me, two of Tom, Dick and Harry, none of Janice and me, etc. Only the account given in Keenan and Stavi (1986) has no difficulty with such cases, because these authors analyze the partitive construction as involving a complex determiner of the form "det of det" combined with a common-noun phrase. While two of the boys or several of my friends allows such a parsing, the cases involving conjunctions do not. However, that account seems flawed for other reasons (cf. Hoeksema 1984 for some discussion).

7. Even more problematic cases like one of every three students actually seem to involve an idiomatic interpretation. Clearly, such expressions do not quantify over all triples of students, but rather over all triples in a given partition of the domain. Note that we could have used one in every three students as well. Otherwise, it is not possible to replace partitive of by in.

8. Of course there is a third reading, according to which none of the three men ever published in the same journal, which does not concern us here.

9. A problem which arises is that either meaning postulates must be allowed to be optional, which is inconsistent, or else rather pervasive lexical ambiguity must be assumed, given the large number of verbs which give rise to both distributive and collective readings.

Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
REFERENCES

Barwise, J. & Cooper, R. 1981: Generalized quantifiers and natural language. Linguistics and Philosophy 4: 159-219.
Bäuerle, R., Schwarze, C. & Stechow, A. von (eds.) 1983: Meaning, Use and Interpretation of Language. De Gruyter, Berlin.
Benthem, J. van 1986: Essays in Logical Semantics. D. Reidel, Dordrecht.
Boole, G. 1854: An Investigation of the Laws of Thought on Which are Founded the Mathematical Theories of Logic and Probabilities. [Reprint: Dover, New York 1951.]
Cresswell, M. 1985: Review of Landman and Veltman (eds.), 1984, Varieties of Formal Semantics. In: Linguistics 23.
Dowty, D. 1986: A note on collective predicates, distributive predicates, and All. In: Marshall, Miller and Zhang (eds.), Proceedings of the Third Eastern States Conference on Linguistics. Department of Linguistics, Ohio State University, Columbus. Pp. 97-115.
Dowty, D. to appear: Type-raising, functional composition and non-constituent coordination. In: R.T. Oehrle, E. Bach and D. Wheeler (eds.), Categorial Grammars and Natural Language Structures. D. Reidel, Dordrecht.
Eijck, J. van 1983: Discourse representation theory and plurality. In: A. ter Meulen (ed.), Studies in Modeltheoretic Semantics. Foris Publications, Dordrecht. Pp. 85-106.
Flickinger, D.P., Macken, M. & Wiegand, N. (eds.) 1982: Proceedings of the First West Coast Conference on Formal Linguistics. Linguistics Department, Stanford University, Stanford.
Gazdar, G. 1980: A cross-categorial semantics for coordination. Linguistics and Philosophy 3: 407-409.
Groenendijk, J., Janssen, T. & Stokhof, M. (eds.) 1981: Formal Methods in the Study of Language. Mathematisch Centrum, Amsterdam.
Heim, I. 1982: The Semantics of Definite and Indefinite Noun Phrases. Doctoral dissertation, University of Massachusetts, Amherst.
Hoeksema, J. 1983: Plurality and conjunction. In: A. ter Meulen (ed.), Studies in Modeltheoretic Semantics. Foris, Dordrecht.
Hoeksema, J. 1984: Partitives. MS, Rijksuniversiteit Groningen.
Hoeksema, J. 1986: An account of relative clauses with split antecedents. In: M. Dalrymple, J. Goldberg, K. Hanson, M. Inman, C. Pinon and S. Wechsler (eds.), Proceedings of the West Coast Conference on Formal Linguistics, vol. 5: 68-86.
Kamp, H. 1981: A theory of truth and semantic representation. In: Groenendijk et al.
Keenan, E.L. 1982: Eliminating the universe (a study in ontological perfection). In: Flickinger et al., pp. 71-81.
Keenan, E.L. & Faltz, L. 1985: Boolean Semantics for Natural Language. D. Reidel, Dordrecht.
Keenan, E.L. & Stavi, Y. 1986: A semantic characterization of natural language determiners. Linguistics and Philosophy 9-3: 253-326.
Ladusaw, W.A. 1982: Semantic constraints on the English partitive construction. In: Flickinger et al., pp. 231-242.
Lambek, J. 1958: The mathematics of sentence structure. American Mathematical Monthly 65: 154-170.
Landman, F. 1986: Towards a Theory of Information. The Status of Partial Objects in Semantics. Foris Publications, Dordrecht.
Landman, F. 1987: Groups. MS, University of Massachusetts, Amherst.
Link, G. 1983: The logical analysis of plurals and mass terms: a lattice-theoretical approach. In: R. Bäuerle et al.
Lønning, J.T. 1986: Collective readings of definite and indefinite noun phrases. In: P. Gaerdenfors (ed.), Generalized Quantifiers: Linguistic and Logical Approaches. D. Reidel, Dordrecht.
Montague, R. 1974: Formal Philosophy, edited by R.H. Thomason. Yale University Press, New Haven.
Partee, B. & Rooth, M. 1983: Generalized conjunction and type ambiguity. In: Bäuerle et al.
Perlmutter, D.M. & Ross, J.R. 1970: Relative clauses with split antecedents. Linguistic Inquiry 1: 361-383.
Roberts, C. 1987: Modal Subordination, Anaphora and Distributivity. Doctoral dissertation, University of Massachusetts, Amherst.
Sadock, J.M. 1983: The necessary overlapping of grammatical components. In: J.F. Richardson, M. Marks, A. Chukerman (eds.), Papers from the Parasession on the Interplay of Phonology, Morphology and Syntax. Chicago Linguistic Society, Chicago, pp. 198-221.
Sag, I., Gazdar, G., Wasow, T. & Weisler, S. 1985: Coordination and how to distinguish categories. Natural Language and Linguistic Theory 3: 117-171.
Scha, R. 1981: Distributive, collective and cumulative quantification. In: Groenendijk et al., pp. 483-512.
Steedman, M. 1985: Dependency and coordination in the grammar of Dutch and English. Language 61: 523-568.
Steedman, M. to appear: Combinatory grammars and parasitic gaps. Natural Language and Linguistic Theory.
Journal of Semantics 6: 41-55

RESTRICTIONS ON DATIVE CLITICIZATION IN FRENCH CAUSATIVES*

JOHAN ROORYCK

ABSTRACT

Causative constructions in French display restrictions as to the cliticization of lexical datives onto the causative. In altogether different frameworks, Fauconnier (1983), Burzio (1986) and Goodall (1987) have related this restriction to the ergative-inergative distinction. However, the inability to formally define ergative verbs in French, as well as further restrictions on the cliticization of datives in causative constructions show that this hypothesis fails to account for the data observed. A thematic condition on dative cliticization in causatives adequately describes the restrictions noted.

1. INTRODUCTION

Recent work on causative 'restructuring' constructions1 in French (Fauconnier 1983; Tasmowski 1984; Burzio 1986) draws attention to the fact that syntactically similar verbs differ with respect to the cliticization of their animate indirect object or dative complement when inserted into the causative construction. This difference appears most strikingly when verbs corresponding to the NP1 VP à NP2 ↔ NP1 lui2 VP format are constructed with a causative.

(1) a. J'ai fait parvenir/arriver cette lettre à son amie.
       'I made that letter arrive to her friend.'
    b. Je lui ai fait parvenir/arriver cette lettre.
       'I (to her) made arrive that letter.'

(2) a. J'ai fait nuire/obéir/ressembler Oscar à ce général.
       'I made Oscar harm/obey/resemble that general.'
    b. *Je lui ai fait nuire/obéir/ressembler Oscar.
       'I made him harm/obey/resemble that general.'

These restrictions also apply to certain verbs selecting both a direct and indirect object (téléphoner, répondre) or two indirect objects (parler) when only the dative is expressed.
(3) a. J'ai fait/vu téléphoner/parler/répondre Oscar à son frère.
       'I made/saw Oscar call/talk to his brother.'
    b. *Je lui ai fait/vu téléphoner/parler/répondre Oscar.
       'I made/saw him call/talk to his brother.'

(4) a. J'ai fait/vu donner/conseiller/interdire ce livre à Luc par Max.
       'I made/saw give/recommend/refuse that book to Luc by Max.'
    b. Je lui ai fait/vu donner/conseiller/interdire ce livre par Max.
       'I made/saw him give/recommend/refuse that book by Max.'
In order to explain this observation, both relational grammar (Fauconnier 1983) and Chomskyan generative grammar (Burzio 1986) distinguish two classes among the verbs selecting both a subject and an indirect object. They claim that the superficial subject of 'inaccusative verbs' (RG) or 'ergative' verbs like parvenir, arriver (GG) actually is a direct object at the right side of the verb in deep structure. As such, these verbs cannot constitute an S, but necessarily form a VP. Hence, the anaphor of the dative selected by ergative verbs escapes the Opacity condition when attached to the causative. Both Burzio (1986) and Fauconnier (1983) choose to solve this problem by a double subcategorization of the causative for S and VP respectively. The possibility of sentences like (4b) is explained by the passive interpretation of the embedded infinitive, but this problem will not concern us here.2

Goodall (1987: 128-129) does not accept this double subcategorization scheme for causatives. His analysis mainly rests on a combination of the ergative hypothesis and Case theory. Goodall (1987) assumes that the causative cannot assign accusative case to Oscar in (2b) and (3b) because of the intervening trace of lui. Since Oscar is not adjacent to the complex verb constituted by the causative and the infinitive, Case cannot be assigned and (3b) is ruled out by the Case filter. Goodall (1987) then predicts that whenever the embedded subject does not need Case, the PP complement of the infinitive can freely cliticize on the causative. For Goodall (1987), this is the case in (4b) where the embedded verb need not assign Case to the subject position, since the verb is interpreted as a passive. This situation also occurs in (1b), since inaccusative/ergative verbs do not assign a thematic role and hence no Case to their subject position.

In the remainder of this article, I will critically examine both the analysis based on Case theory and the approaches that only make use of the ergative-inergative distinction. Moreover, I will try to show that a thematic condition on the cliticization of datives onto the causative construction is sufficient to account for the restrictions concerning both 'ergative' verbs (1)-(2) and ditransitive verbs (3)-(4).
2. PROBLEMS FOR CASE THEORY

Goodall's (1987) account of the restrictions on dative cliticization on the causative seems inadequate for both theoretical and empirical reasons. A first problem involves the explanation of (3a). The acceptability of this sentence is explained as a result of the extraposition of the dative complement in the following sentence.

(5) *J'ai fait/vu téléphoner/écrire/répondre à son frère Oscar.
    'I made call/write/answer to his brother Oscar.'

Goodall (1987: 181n9) assumes that this rule does not involve movement. Consequently, no trace is present to block Case assignment. However, if no movement is involved, extraposition must be viewed as a stylistic rule operating in Phonological Form. This rule should then be ordered before the application of the Case filter, otherwise the rule would have nothing to operate upon: (5) would be excluded by the Case filter because of the intervening PP complement. This ordering of filters and stylistic rules is certainly not a desirable result.

A further problem we want to point out in Goodall's (1987) analysis concerns the exclusion of (5) by virtue of the Case filter. How are acceptable sentences like the following to be explained?

(6) L'infirmière a fait téléphoner à leurs parents tous les enfants qui avaient pleuré pendant la nuit.
    'The nurse made call to their parents all the children who cried during the night.'

Apparently, sentences like (5) are perfectly acceptable when the subject is heavier. This pragmatic restriction of 'NP Heaviness' is well known for the stylistic postposition of embedded subjects in French (see Bailard 1981 for discussion).

(7) a. Il dit qu'ont été acceptés tous les candidats qui s'étaient présentés ce matin.
       'He says that have been accepted all the candidates who came this morning.'
    b. *Il dit qu'a été acceptée Violaine.
       'He says that has been accepted Violaine.'

Consequently, it seems much more adequate to analyze (5) and (6) along the lines of (7) as sentences where the infinitival subject has been postposed. In this way, the need for a theoretically awkward Dative extraposition rule disappears.
The analysis under discussion also makes some empirically inadequate predictions. Since the trace of the dative blocks Case assignment in (2b) and (3b), sentences where the dative is subject to Wh-movement should be equally unacceptable. However, this is not the case.

(8) a. Voilà l'homme à qui j'ai fait/vu téléphoner/répondre les enfants.
       'This is the man to whom I made/saw call/answer the children.'
    b. Voilà la femme à laquelle le sculpteur a fait ressembler sa statue.
       'This is the woman to whom the sculptor made resemble his statue.'

The restrictions noted in (1)-(3) only seem to involve cliticization, contrary to what is predicted by Goodall (1987). Moreover, Goodall's (1987) analysis predicts the unacceptability of sentences where PP complements other than datives are cliticized. Verbs like ressembler, échapper, survivre or répondre select an indirect object of the +/- animate type that can be cliticized as lui or y, respectively.

(9) Cunégonde y/lui ressemble/survit/échappe/répond.
    'Cunégonde (to it/him/her) resembles/survives/escapes/answers.'

When inserted in the causative construction, the dative cannot be cliticized on the causative, but the y clitic can. Compare (1)-(3) and the following:

(10) a. Le sculpteur y a fait ressembler sa statue (à l'idée du bonheur).
        'The sculptor (to it) made his statue resemble (to the idea of happiness).'
     b. Mon grand-père y a fait survivre ses trois enfants (à la seconde guerre mondiale).
        'My grandfather (to it) made live through his three children (the second World War).'
     c. J'y ai fait/entendu répondre mon frère avec grand aplomb (à cette question).
        'I (to it) made/saw answer my brother undisturbedly (to that question).'

Now Goodall's (1987) analysis predicted that so-called inergative verbs cannot cliticize PP complements onto the causative. The trace of the PP complement should make Case assignment impossible in both (2)-(3b) and (6). Nevertheless, the sentences in (6) are acceptable. For this problem, the only way to save Goodall's (1987) analysis would be to distinguish homonyms for the abovementioned verbs: an 'ergative' verb with y and an inergative verb with lui. This rather implausible solution brings us to another problem for all analyses outlined in the preceding paragraph: the definition of ergative verbs.
3. PROBLEMS FOR THE ERGATIVE HYPOTHESIS

All solutions sketched have two serious drawbacks. First, the formal definition of ergative verbs in French does not seem to apply to all verbs allowing for their dative to be cliticized on the causative. A second and more serious problem for the ergative hypothesis lies in the observation that so-called inergative verbs are not the only verbs for which dative cliticization onto the causative is excluded.

As far as the formal definition of ergative verbs is concerned, Tasmowski (1984) has pointed out that it is very difficult to define ergative verbs in French, since the formal tests that have been proposed cannot always be applied rigorously. This problem is worth analyzing in some detail. A quick glance at Gross's (1975) lists 5 and 7 shows that 35 verbs correspond to the format NP1 VP à NP2 ↔ NP1 lui2 VP. Of these verbs, one belongs to a literary register (agréer), and 16 do not enter the causative scheme because they are stative and have a nonagentive subject.3 Eight verbs enter the scheme NP1 lui Vcaus Vinf NP2, and thus would be ergative: revenir, profiter, incomber, échoir, bénéficier, apparaître, arriver, parvenir. Ten verbs do not enter the scheme, but their indirect object can be realized lexically at the right of the causative construction: céder, échapper, faire obstacle, mentir, obéir, résister, sourire, succéder, survivre, ressembler. Now, considering that most examples adduced in the literature on ergativity in French concern movement verbs, it seems hard to prove that profiter, incomber, échoir, bénéficier are ergative, while échapper, clearly a movement verb, is not. Ruwet (1988) argues that the property of taking être as an auxiliary in the perfect tenses is a sufficient condition for ergativity. According to this definition, échapper could be inergative, since its perfect tenses display avoir in the construction with a dative. However, the ergative status of bénéficier, incomber, and profiter cannot be defined in this way, since they also have avoir in the past tense. Nevertheless, these verbs satisfy some other tests for ergativity cited by Tasmowski (1985): bénéficier, profiter can display a partitive en originating in the 'subject' of the ergative verb, and they do not have an impersonal passive.

(11) a. Une partie en a bénéficié/profité aux rebelles.
        'Part of it profited to the rebels.'
     b. *Il a été bénéficié/profité aux rebelles.
        'There was profited to the rebels.'

However, Tasmowski (1985: 335-336) points out that these tests are inoperative for a number of reasons that will not concern us here. Finally, Ruwet (1988) notes that ergatives take the repetitive prefix re-. Échoir, bénéficier, profiter do not share this morphological characteristic. Moreover, it can be doubted whether this test applies to all ergatives, since the movement verb arriver, a candidate for ergativity (cf. supra), does not have it either. These problems for a clear definition of ergative verbs in French show that the ergative-inergative distinction in French is too imprecise a tool with which to handle restrictions on causatives. Moreover, it is quite unsatisfactory to note that a clear-cut difference in acceptability of dative cliticization depends on a very sloppy definition of the verbs allowing for this cliticization onto the causative.

The ergative hypothesis for causative constructions in French was designed to account for the noncliticization of certain lexical datives onto the causative. However, this hypothesis is at odds with some further restrictions on cliticized lexical datives. The insertion of verbs like donner, promettre, conseiller, interdire in the causative scheme NP1 lui Vcaus Vinf NP2 yields ambiguous sentences. The dative lui on the causative can function as the indirect object of the infinitive or as its interpretive subject (Agent). The dative functioning as the interpretive subject of the infinitive actually is a nonlexical4 dative originating in the causative (Milner 1982).

(12) Arnaud lui a fait/entendu donner/conseiller/promettre des livres.
     'Arnaud made/heard him give/recommend/promise books (to someone).'
     'Arnaud made/heard books be given/recommended/promised to him (by someone).'

When a complement introduced by par is inserted into these sentences, the dative can only be interpreted as the indirect object of the infinitive, since the par-complement absorbs the Agent role.

(13) Arnaud lui a fait/entendu donner/conseiller/promettre des livres par Paul.
     'Arnaud made/heard books be given/recommended/promised to him by Paul.'

However, verbs like emprunter, demander, opposer do not generate ambiguous sentences when inserted in the abovementioned construction: the dative lui can only function as the interpretive subject of the infinitive. In causative constructions with these verbs, the dative lui is always of the nonlexical type.
(14) a. Charles lui a fait/vu emprunter/demander/soustraire cette somme.
        'Charles made/saw him borrow/ask/withdraw that sum.'
     b. Le directeur leur a entendu opposer cet argument.
        'The director heard them oppose that argument.'
     c. Je lui ai vu préférer ce candidat.
        'I saw him prefer that candidate.'

This absence of ambiguity suggests that the lexical dative clitic of verbs such as emprunter, demander, opposer cannot be attached to the causative. This hypothesis is confirmed by the insertion of par NP, which yields unacceptable sentences.

(15) a. *Charles lui a fait/vu emprunter/demander/soustraire cette somme par cet escroc.
        'Charles made/saw that sum be borrowed/asked/withdrawn from him by that scoundrel.'
     b. *Le directeur leur a entendu opposer cet argument par son secrétaire.
        'The director heard that argument be opposed to them by his secretary.'
     c. *Je lui ai vu préférer ce candidat par le directeur.
        'I saw that candidate be preferred to him by the director.'

The lexical dative of verbs such as demander, emprunter, opposer can only be lexically present at the right of the causative construction to function as the indirect object of the infinitive.

(16) a. Charles a fait/vu emprunter/demander/soustraire une somme considérable à cet homme par cet escroc.
        'Charles made/saw a considerable sum be borrowed/asked/withdrawn from that man by that scoundrel.'
     b. Le directeur a entendu opposer cet argument au personnel par son secrétaire.
        'The director heard that argument be opposed to the personnel by his secretary.'
     c. J'ai vu préférer par le directeur ce candidat inconnu à son propre frère.
        'I saw that candidate be preferred by the director to his own brother.'

These data show that restrictions on dative cliticization also apply to ditransitive verbs which clearly have nothing to do with the ergative-inergative distinction. It will be clear by now that inergative verbs only constitute a subset of the verbs that do not allow for their lexical dative to be cliticized onto the causative. If the ergative hypothesis were maintained, further restrictions would be necessary in order to deal with these observations. This hypothesis clearly fails to account for all restrictions on the cliticization of lexical datives in the causative construction.
4. A THEMATIC RESTRICTION ON DATIVE CLITICIZATION

These restrictions can be accounted for straightforwardly when the thematic relations linking the infinitival arguments are taken into consideration. The so-called 'ergative verbs' have a semantic characteristic in common: their indirect object is the Goal argument of the subject-Theme. A parallel observation can be made for infinitives such as donner, conseiller, promettre (cf. (12)), where the direct object is a Theme and the indirect object a Goal argument. In the causative construction, the Theme arguments of both types of infinitive are redistributed around the causative construction as direct objects, and the indirect object Goal can be cliticized on the causative.

Whenever the indirect object of the infinitive is not a Goal argument, pronominalization on the causative is impossible. First, it can be noted that the thematic function of the indirect object selected by verbs like emprunter, réclamer, demander, soustraire can be identified as a Source. For the 'inergative' verbs mentioned (mentir, nuire, obéir, résister etc.), this thematic function cannot be clearly defined. However, for our purpose it is sufficient to say that only indirect objects with a Goal function can be cliticized on the causative. The contrast noted in the following sentence can also be explained along these lines.

(17) a. *Je lui fais téléphoner/avouer Mathilde.
        'I (to him) make call/confess Mathilde.'
     b. Je lui fais téléphoner/avouer cette histoire par Mathilde.
        'I (to him) make call/confess this story by Mathilde.'

Unlike thematic functions of the Agent-Patient type, thematic functions of the Source-Goal type can be thought of as essentially relational. We can say that a Source/Goal function is only fully realized in its link with a Theme. The Theme-Source or the Theme-Goal relation can be conceived of as a chain which bears the thematic function. An independent argument for this position can be found in the fact that the only argument of intransitive verbs can be Agent or Patient, but never Source or Goal.5 If this characterization of the Source/Goal relations is correct, the impossibility of (17a) and (3b) is due to the fact that the Goal function cannot be attributed to the indirect object. Since the Theme argument is left unexpressed, the thematic Goal chain does not obtain. Consequently, the indirect object clitic cannot be considered a Goal argument, and the sentence is unacceptable by virtue of the general restriction on cliticized Goal datives. On the contrary, (17b) is fully acceptable because the Theme argument is expressed.

Note however that the Goal restriction only applies to lexical datives. At first sight, certain nonlexical datives can be interpreted as Source arguments.

(18) Je lui ai fait/vu arracher/confisquer/rafler ce manteau par mon serviteur.
     'I made/saw that coat be taken away from him by my servant.'

If the Goal restriction is only to be applied to lexical datives, we should be able to give a formal definition of both lexical and nonlexical datives. In Rooryck (1987b) it is shown that lexical and nonlexical datives can be distinguished by two formal tests. Unlike the lexical dative, the nonlexical dative cannot appear in the passive construction, or in a construction with a clitic direct object and a lexical indirect object. These properties can be explained if the nonlexical dative is viewed as an essentially clitic element that can marginally be lexicalized (see note 4). Moreover, the absence of nonlexical datives in passive constructions shows that this type of dative has no argument status and is co-selected by the direct object function.

(19) a. *?Je l'ai arraché/confisqué/raflé à Martin.
        'I took it from Martin.'
     b. *?Ce manteau a été arraché/confisqué/raflé à Martin.
        'That coat was taken from Martin.'

(20) a. Je l'ai demandé/emprunté à Martin.
        'I asked/borrowed it from Martin.'
     b. Ce manteau a été demandé à Martin.
        'That coat was asked/borrowed from Martin.'

It can be observed that the thematic function of the nonlexical dative is not stable: in principle, a Benefactive/Malefactive reading obtains, depending on the interpretation of the sentence.

(21) Je lui ai pris ce livre.
     'I took that book for/from him.'

However, with certain types of NP (body parts, clothes) this thematic relation can denote a more precise inalienable possession.
(22) a. Je lui ai cassé le bras.
        'I broke his arm.'
     b. Je lui ai vu cette jupe.
        'I saw that skirt on her.'

As we noted above, this nonlexical dative can also function as the interpretive subject (Agent) of the infinitive (Rooryck 1988). We would like to maintain that the Source interpretation is imposed on the thematic instability of the nonlexical dative which normally has a Benefactive/Malefactive interpretation. The abovementioned restrictions on the cliticization of datives in the causative construction can be accounted for by the descriptive condition that only lexical datives with the thematic function of Goal can be pronominalized on the causative.

Although this restriction covers the cases hitherto mentioned, some exceptions can be found.

(23) a. Mme. Lafontaine leur a entendu reprocher ces erreurs par l'instituteur.
        'Mrs. Lafontaine heard these errors be reproached to them by the teacher.'
     b. Je lui ai vu pardonner sa tentative de meurtre par le Pape.
        'I saw his attempt to murder be forgiven to him by the Pope.'

The lexical dative of the verbs reprocher, pardonner cannot be analyzed as a Goal of the direct object Theme. Nevertheless, the sentences cited are fully acceptable and thus contradict the restriction on cliticized datives. Consequently, we will have to reformulate this descriptive condition if we want to account for these data. In order to achieve this goal, we want to reformulate the thematic relations of the Source/Goal type. For all verbs analyzed, the Theme-Goal relation can be viewed as a relation that obtains possibly (proposer, conseiller, promettre) or necessarily (arriver, parvenir/donner, téléphoner), or that is prevented (cacher, camoufler, interdire) at a time t1 after the time of action t0 of the verb itself. In Rooryck (1987a), the Theme-Goal relationship is analyzed as a relation of contact between an argument Y and an argument Z at a time t1. Likewise, the Theme-Source relation can be described as a contact between an argument Y and an argument Z at a time t-1 before the time of action t0 of the verb under analysis. A Theme-Goal relationship only makes sense when a contact between Theme and Goal is implied. Now, for judgment verbs such as reprocher, pardonner the semantic relation holding between the direct object and the indirect object can also be described in terms of contact.
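The descriptive Goal condition on cliticized lexical datives, and the judgment verbs that fall outside it, can be summarized in a small sketch. It is an illustration only: the class and function names are hypothetical, the thematic labels ("Goal", "Source", "undefined", "non-Goal") simply restate how the text classifies each verb, and the acceptability values restate examples (1b), (2b), (4b), (15a), (17) and (23).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CausativeCase:
    verb: str              # embedded infinitive
    dative_role: str       # thematic function of the lexical dative, as labelled in the text
    theme_expressed: bool  # is the Theme argument overtly expressed?
    observed_ok: bool      # acceptability of 'lui' on the causative, as reported


def goal_condition(case: CausativeCase) -> bool:
    """Descriptive condition: a lexical dative cliticizes on the causative only
    if it is a Goal, and the Goal chain requires an expressed Theme."""
    return case.dative_role == "Goal" and case.theme_expressed


CASES = [
    CausativeCase("parvenir", "Goal", True, True),         # (1b)
    CausativeCase("donner", "Goal", True, True),           # (4b)
    CausativeCase("nuire", "undefined", True, False),      # (2b)
    CausativeCase("emprunter", "Source", True, False),     # (15a)
    CausativeCase("téléphoner", "Goal", False, False),     # (17a): Theme unexpressed
    CausativeCase("téléphoner", "Goal", True, True),       # (17b)
    CausativeCase("reprocher", "non-Goal", True, True),    # (23a)
    CausativeCase("pardonner", "non-Goal", True, True),    # (23b)
]

# Verbs for which the descriptive condition and the reported judgment disagree.
exceptions = [c.verb for c in CASES if goal_condition(c) != c.observed_ok]
```

Under these labels the only mismatches are reprocher and pardonner, which is the gap that the reformulation in terms of contact is meant to close.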
(24) Je pardonne/reproche cette faute à Louis.
'I forgive/reproach Louis that error.'
From the point of view of the agentive subject, there is a relation of contact between these two arguments: Louis is responsible for the error. As argued in Rooryck (1987a), this relation is independent of the time of action t₀ of the verb. On the contrary, verbs of the type ressembler, nuire, obéir do not imply any contact between the lexical dative and the subject Theme. Verbs like demander, réclamer, emprunter however do imply a contact between the lexical dative and the direct object. Since this relation is between a Theme and a Source, it occurs at a time t₋₁ before the time of action t₀ of the verb.

It could be objected that the notion of contact is used metaphorically in the case of reprocher, pardonner, while it is not for verbs implying a 'real' Theme-Goal contact relation. Why do pardonner and reprocher imply contact and not e.g. nuire? However, the presence or absence of a 'contact' relation can be tested by using paraphrases of these relations as relevant inferences. Thematic relations of the Agent-Patient type involve relations of power exerted by someone or something on someone or something. The notion of 'contact' does not imply this type of relation. Rather, it must be expressed as a relation of 'having/being' or 'being responsible for'. A relation of power cannot be expressed in these terms: a Patient undergoes the power of the Agent. The relation of contact can be paraphrased by the verbs avoir, être, recevoir (have, be, receive), a relation of power by subir (undergo). The verbs we have analyzed as implying a 'contact' relation construct sentences to which a 'contact' paraphrase can be adjoined, but not a 'power' paraphrase.

(25)
a. Je lui ai reproché/pardonné son imprudence, donc, de mon point de vue, il a été imprudent/*il a subi l'imprudence.
'I reproach/forgive him his carelessness, so, from my point of view, he has been careless/*underwent carelessness.'
b. Ce message lui est parvenu/arrivé/échappé, donc il l'a eu/reçu/*subi.
'That message (to/from him) arrived/escaped, so he has had/*underwent it.'
c. Je lui ai donné/demandé ce livre, donc, de mon point de vue, il doit l'avoir/*le subir.
'I gave/asked him that book, so, from my point of view, he should have/*undergo it.'

Note that in all these cases the 'contact' paraphrase cannot be negated without obtaining a contradiction. This shows that the paraphrase can be viewed as a necessary implication of the preceding sentence.⁶ The verbs that do not allow for their dative to be cliticized on the causative do not imply 'contact' paraphrases.

(26)
a. *Elle lui ressemble/succède/ment, donc elle/il l'a eu/reçu/subi.
'She resembles him/follows him up/lies to him, so she/he has had/received/underwent her/him.'
b. Elle lui a obéi/résisté/cédé/survécu, donc elle a dû le subir/*l'avoir/*le recevoir.
'She obeyed/resisted/gave way/survived (to) him, so she has had to undergo/*have/*receive him.'

The contrast between bénéficier (contact, Theme-Goal) and nuire (Agent-Patient) is particularly revealing in this respect.

(27)
a. Ce comportement lui a bénéficié, donc il a dû en obtenir/*subir quelque chose.
'That behaviour benefited to him, so he got something out of it/*underwent it.'
b. Ce comportement lui a nui, donc il a dû en subir/*obtenir quelque chose.
'That behaviour harmed him, so he has had to undergo it/*did not get/got something out of it.'

The metaphorical use of the notion of contact is not only possible, it is even necessary in order to explain certain examples of dative cliticization on the causative.

(28)
a. Dieu leur a fait apparaître la Vierge.
'God made the Virgin appear to them.'
b. La Vierge est apparu aux enfants, donc ils l'ont aperçue/*subie.
'The Virgin appeared to the children, so they have seen/*underwent her.'

In addition to the 'psychological' (25a), 'physical' (25bc) or 'indirect' (27a) contact, the paraphrase of (28b) indicates that some sort of 'eye-contact' is necessarily⁷ established between the referents of the arguments of apparaître. For the verbs that do not allow for dative cliticization on the causative, no 'contact' paraphrase can be used as a necessary inference, although in some cases 'power' (Patient) paraphrases are possible (cfr. (26b)).

This rethinking of thematic relations of the Source/Goal type allows for a reformulation of the restriction on cliticized lexical datives in the causative construction. Only datives that can entertain a relation of contact with an expressed Theme argument at a moment t₁ after the time of action of the verb can be cliticized on the causative. Since the relation of contact between the direct and indirect object of judgment verbs like reprocher, pardonner is independent of the time t₀ of the verbal action, the restriction formulated also includes these verbs.
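The reformulated restriction can be pictured with a small sketch. The Python fragment below is only an illustration under stated assumptions: the `Contact` labels, the toy `LEXICON` and the function name `dative_cliticizes_on_causative` are hypothetical names introduced here, and the classifications merely transcribe what is said about these particular verbs in this section. Theme-Source verbs (contact at t₋₁) and verbs of prevented contact are deliberately left out.

```python
from enum import Enum


class Contact(Enum):
    """Kinds of Theme-dative relation distinguished in the text (labels are hypothetical)."""
    AFTER = "contact at t1, after the time of action t0"   # Theme-Goal verbs
    INDEPENDENT = "contact independent of t0"              # judgment verbs
    NONE = "no contact relation implied"                   # e.g. Agent-Patient verbs


# Toy lexicon: each entry follows the classification given in the running text.
LEXICON = {
    "donner": Contact.AFTER,
    "parvenir": Contact.AFTER,
    "beneficier": Contact.AFTER,
    "reprocher": Contact.INDEPENDENT,
    "pardonner": Contact.INDEPENDENT,
    "ressembler": Contact.NONE,
    "nuire": Contact.NONE,
    "obeir": Contact.NONE,
}


def dative_cliticizes_on_causative(verb: str) -> bool:
    """Descriptive condition: the lexical dative may cliticize on the causative
    only if it entertains a contact relation with the Theme that holds at t1
    or independently of t0."""
    return LEXICON[verb] in (Contact.AFTER, Contact.INDEPENDENT)


for verb in LEXICON:
    print(verb, dative_cliticizes_on_causative(verb))
```

The lookup makes explicit that the condition is stated over a semantic classification of verbs rather than over their syntactic subcategorization.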
5. CONCLUSION

I have tried to show that the ergative hypothesis is unable to provide a correct account for the restrictions on the cliticization of lexical datives in the causative construction. In order to give a correct description of these restrictions, I have proposed a descriptive semantic condition stipulating that only lexical datives of a certain thematic type may cliticize on the causative. In this way, a single subcategorization scheme can be maintained for the causative 'restructuring' construction of type (4).

A last question that I want to raise concerns the theoretical relevance of this analysis. How can the approach presented here be integrated in an existing theoretical framework? Only the pronominal approach presented in Blanche-Benveniste (1984) seems to offer an adequate framework in which to account for these restrictions. Since this approach distinguishes syntactic functions on the basis of their possibility to enter certain constructions, the distinction they already draw between lexical datives of the P2 and the P3 type can be used to formally define the restriction noted on the cliticization of lexical datives on the causative.⁸ On the semantic side, our approach of thematic relations clearly fits in a cognitive semantics framework along the lines of Langacker (1987) for the (prototypical) notion of contact between arguments. In this way, purely structuralist and anti-structuralist currents in respectively syntactic and semantic research seem to converge.

Research Assistant of the National Fund for Scientific Research
Department of Linguistics
K.U. Leuven
Blijde-Inkomststraat 21
B-3000 LEUVEN
Belgium

NOTES

* I would like to thank Beatrice Lamiroy, Ludo Melis, Karel Van den Eynde and two anonymous referees of the Journal of Semantics for constructive comments and extensive discussions on this subject, and Dirk Delabastita for improving my English. I would like to express my gratitude to the National Fund for Scientific Research (Belgium) for its financial support.
1. As pointed out by Tasmowski (1985: 232-239), Damourette and Pichon (1911-1940: par. 1059-2057) distinguish two infinitival constructions for French causatives. This analysis also shows up in Blanche-Benveniste et alii (1984: 186-188). In the first construction, the complements of the infinitival construction can be cliticized on the infinitive (a). This construction is currently analyzed as a sentential complement containing an infinitive with an overt subject. In the framework of Chomsky (1981, 1986), this construction can be analyzed along the lines of ECM verbs (believe) in English. For such an analysis in the barrier-framework, see D'Hulst and Rooryck (forthcoming). In the second construction, nothing can appear between the main verb and the infinitive which merge into a complex verb by a restructuring operation (b). This operation is introduced as a Thematic-Index Rewriting rule by Rouveret and Vergnaud, a rule of Union (Fauconnier 1983), or 'Faire-attraction' (Milner 1982). See Rooryck (1988) for a criticism of this type of analysis which was first advocated by Kayne (1977).
a. Je le fais/entends/vois/laisse leur en donner.
'I make/hear/see/have him give them of it.'
b. - Je leur en fais/entends/vois/laisse donner.
'I make/hear/see/have give them of it.'
- J'en fais/vois/entends/laisse donner par eux.
'I make/hear/see/have give them of it by them.'
- J'y fais/vois/entends/laisse partir/manger Theophraste.
'I make/hear/see/have Theophraste leave/eat there.'

2. This double subcategorization of the causative 'restructuring' construction clearly is in contradiction with the unified account of the causative 'restructuring' construction as a categorial idiom (Rooryck 1988).

3. These verbs are goûter, nuire, répugner, rester, satisfaire, aller, convenir, advenir, appartenir, déplaire, importer, manquer, [illegible], plaire, réussir, seoir. Nuire is an interesting case, since the verb seems to be acceptable in the causative construction with an animate subject, and unacceptable with an inanimate subject. This opposition is probably due to the strong correlation between agentivity and animacy.
J'ai fait/vu nuire *cette situation/?ce directeur aux intérêts du personnel.
'I made/saw this situation/this director harm the interests of the personnel.'

4. For the distinction between lexical and nonlexical datives, and for the restrictions on the lexicalisation of nonlexical datives, see Leclere (1976), Barnes (1980, 1985), Rooryck (1987b). See Rooryck (1988) for an analysis of the nonlexical dative of causative constructions as the Agent of the infinitive.

5. Moreover, for some ditransitive verbs, a correct thematic description requires that the relation between Theme and Goal cannot obtain: cacher, camoufler, interdire, refuser. Now a negated Goal is simply nonsense, but a negation of the link Theme-Goal by the agentive subject seems to provide for an adequate thematic description of these verbs. Note that they allow for the NP1 lui Vcaus Vinf NP2 construction:
Sa femme lui a fait cacher/interdire le vin par le médecin.
'His wife made wine be stowed away/prohibited for him by the doctor.'

6. However, some verbs imply negated contact relations, where a negated paraphrase is necessary (see note 5).
Je lui ai interdit l'alcool, donc de mon point de vue, il ne pourra plus en avoir/*le subir.
'I prohibited wine for him, so, from my point of view, he is not allowed to have/*undergo it anymore.'
In this case, affirmation of the paraphrase yields a contradiction.

7. This paraphrase cannot be negated without contradiction.

8. Karel Van den Eynde, personal communication.
REFERENCES

Bailard, Joelle 1981: A functional approach to subject inversion. Studies in Language 5, 1, 1-29.
Barnes, Betsy 1980: The notion of 'Dative' in linguistic theory and the grammar of French. Lingvisticae Investigationes 4: 245-292.
Barnes, Betsy 1985: A functional explanation of French nonlexical datives. Studies in Language 9, 2: 159-195.
Blanche-Benveniste, Claire, J. Deulofeu, J. Stefanini and K. Van den Eynde 1984: L'Approche Pronominale. SELAF, Paris.
Burzio, Luigi 1986: Italian Syntax: a Government-Binding Approach. Reidel, Dordrecht.
Chomsky, Noam 1981: Lectures on Government and Binding. Foris, Dordrecht.
Chomsky, Noam 1986: Barriers. MIT Press, Cambridge Mass.
Damourette, Jacques and E. Pichon 1911-1940: Des Mots à la Pensée. D'Artrey, Paris.
D'Hulst, Yves and J. Rooryck forthcoming: An ECM analysis of French perception and movement verbs.
Fauconnier, Gilles 1983: Generalized union. In: L. Tasmowski and D. Willems (eds.), Problems in Syntax. Communication and Cognition, 195-230.
Goodall, Grant 1987: Parallel Structures in Syntax. Cambridge University Press, Cambridge.
Gross, Maurice 1975: Méthodes en Syntaxe. Hermann, Paris.
Kayne, Richard 1977: Syntaxe du Français. Le Cycle Transformationnel. Le Seuil, Paris.
Langacker, Ronald 1987: Foundations of Cognitive Grammar Vol. 1. Stanford University Press, Stanford.
Leclere, Christian 1976: Datifs syntaxiques et datif éthique. In: J. Cl. Chevalier and M. Gross (eds.), Méthodes en Grammaire Française, Klincksieck, Paris, pp. 73-96.
Milner, Jean-Claude 1982: Ordres et Raisons de Langue. Le Seuil, Paris.
Rooryck, Johan 1987a: Les Verbes de Contrôle: une Analyse de l'Interprétation du Sujet Non Exprimé des Constructions Infinitives en Français. Doctoral dissertation, K.U. Leuven.
Rooryck, Johan 1987b: Critères formels pour le datif non lexical en français. To appear in: Studia Neophilologica.
Rooryck, Johan 1988: French causatives and the Dative problem. Preprint nr 115, Department of Linguistics, K.U. Leuven.
Rouveret, Alain and J.-R. Vergnaud 1980: Specifying reference to the subject: French causatives and conditions on representations. Linguistic Inquiry 11: 97-202.
Ruwet, Nicolas 1988: Les verbes météorologiques et l'hypothèse inaccusative. To appear in: Claire Blanche-Benveniste, André Chervel, and Maurice Gross (eds.), Mélanges à la Mémoire de Jean Stefanini.
Tasmowski, Liliane 1984: ?*'lui faire téléphoner quelqu'un d'autre': une stratégie. Lingvisticae Investigationes 8, 2, 403-427.
Tasmowski, Liliane 1985: Faire infinitif. In: L. Melis (ed.), Les Constructions de la Phrase Française. Communication and Cognition, 223-365.
Journal of Semantics 6: 57-93

TOOLS AND EXPLANATIONS OF COMPARISON - PART 1*

MANFRED BIERWISCH

ABSTRACT

In this paper, I will outline a theory of gradation¹ that builds upon quite a number of previous analyses, preserving as far as possible the concepts that have already been clarified, but modifying the structure of earlier proposals in crucial respects. The reason for adding a new theory to the ones already existing is twofold:

(a) The new theory accounts for a number of relevant facts that have systematically been ignored by earlier analyses.
(b) It relates these facts to those already analysed in a way which does not merely give a descriptive account, but rather an explanation in terms of a few underlying conditions from which the whole range of facts follow in a natural way.

A detailed discussion of the various analyses proposed so far would by far exceed the limits set for the present paper.² I will instead simply list, for the sake of preliminary orientation, the main points that the present theory shares with some or all of its predecessors, and those in which it differs from them. In accordance with other approaches, I will make the following assumptions:

(i) The Positive of relative adjectives must be analysed in close connection with the Comparative, the Equative, and a number of related constructions. More specifically, the constructions in question are all based on a single lexical representation of the adjectives involved.
(ii) The Positive of a relative adjective is interpreted with respect to a contextually determined class of comparison C. Within C, a standard, average, or norm N[C, A] is defined with respect to the property A specified by the adjective in question, so that, e.g., John is tall is interpreted roughly as 'John is taller than N[C, height]'. In the present paper, I will not be concerned with the question how C and N[C, A] are determined, but simply assume that N is available. (I will usually drop the index [C, A] of N.)
(iii) Relative adjectives assign to an individual x a degree d_A where d might be conceived as a class of individuals that are equivalent with respect to A. (This notion will be somewhat modified below.)

Differing from all other approaches, I make the following assumptions:

(iv) The lexical representation of a relational adjective is semantically a kind of three-place predicate that relates an individual x, a standard of comparison v, and a difference c. With respect to their semantic type, both v and c are degrees, and the degree assigned to x is composed of the values of v and c.³ One of the possible values of v is N.
(v) Comparative and Equative constructions are related to each other in roughly the following way: the complement clause of the Comparative specifies the value of v, while that of the Equative specifies the value of c.⁴
(vi) Relative adjectives belong to (at least) two classes, which I will call dimensional adjectives (tall, long, heavy etc.), and evaluative adjectives (clever, nice, good etc.). The degrees specified by D-adjectives are extents, the degrees specified by E-adjectives are grades.⁵
(vii) There is a small number of conditions on semantic representations that determine, among others, the value the standard of comparison v can assume in specified configurations.

To conclude this preliminary outline, I should emphasize that more important than the list of individual points relating the present theory to or distinguishing it from other proposals is the general structure of the theory, which is different from its predecessors. This will become clear as we proceed.

* Editorial note: Because of its unusual length this paper appears in two parts. Part 2 (i.e. Sections 5-9 of the paper) will be published in JS 6.2.

1. SOME RELEVANT FACTS

In this section, I will briefly discuss three groups of facts that motivate the need of a new theory, as they cannot reasonably be captured by any of the earlier theories. Before turning to the details, I will introduce two notions that are useful in this respect.

As is well known, relational adjectives, as a rule, come in pairs of antonyms, such as tall vs. short, high vs. low, good vs. bad, clean vs. dirty etc. This pairing exhibits a fair amount of lexical idiosyncrasies, and it is far more regular for D- than for E-adjectives. As this kind of antonymy is intrinsically related to the phenomena of gradation, an interesting theory of gradation should provide a systematic account of antonymy and the different role it plays for D- and E-adjectives. For the moment, I will simply call the two sets into which adjectives are to be grouped in this respect +Pol and -Pol adjectives, respectively. Typical examples are the following:

(1) D-Adjectives:
a. +Pol: tall, long, high, heavy, old
b. -Pol: short, low, light, new, young

(2) E-Adjectives:
a. +Pol: good, beautiful, pretty, clever, intelligent
b. -Pol: bad, ugly, plain, stupid

As will be seen below, the consequences of the +Pol/-Pol distinction are far more complicated than has been recognized so far.
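Assumptions (iv) and (v) of the outline can be pictured with a minimal sketch. It assumes, beyond what the outline states, that degrees are plain numbers, that 'composed of' means addition for +Pol adjectives and subtraction for -Pol adjectives, and that the difference c must be positive; the names `RelativeAdjective`, `difference`, `positive` and `comparative` are hypothetical. It is an illustration of the three-place idea only, not a rendering of the formal theory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RelativeAdjective:
    name: str
    polarity: int  # assumed encoding: +1 for +Pol, -1 for -Pol


def difference(adj: RelativeAdjective, x_degree: float, v: float) -> float:
    """Return the difference c relating x's degree to the standard v.

    Assumption: x's degree is v + c for +Pol and v - c for -Pol adjectives.
    """
    return adj.polarity * (x_degree - v)


def positive(adj: RelativeAdjective, x_degree: float, norm: float) -> bool:
    """Positive: the standard v takes the norm N as its value (norm-related)."""
    return difference(adj, x_degree, norm) > 0


def comparative(adj: RelativeAdjective, x_degree: float, y_degree: float,
                mp: Optional[float] = None) -> bool:
    """Comparative: the complement clause fixes v; a measure phrase, if any, fixes c."""
    c = difference(adj, x_degree, y_degree)
    return c > 0 if mp is None else c == mp


tall = RelativeAdjective("tall", +1)
short = RelativeAdjective("short", -1)

# Heights in inches; the norm of 68 is an arbitrary illustrative value.
john, bill, norm = 75, 70, 68
print(positive(tall, john, norm))         # 'John is tall'
print(comparative(tall, john, bill))      # 'John is taller than Bill is'
print(comparative(short, bill, john, 5))  # 'Bill is five inches shorter than John is'
print(positive(tall, bill, norm))         # Bill can be shorter than John and still tall
```

In this encoding the norm N enters only when it is chosen as the value of v, which is why the comparative check never consults it.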
It has been noted above that the interpretation of relational adjectives may involve a contextually determined norm N. I will call an expression norm-related (or for short NR) if its interpretation involves N. In this sense, the examples in (3) are norm-related, while those in (4) are not:

(3)
a. John is short.
b. Bill is tall.
c. I know that he is tall.

(4)
a. How tall is Bill?
b. I know how tall he is.
c. Bill is five feet tall.

Norm-relatedness is mostly looked upon as a phenomenon that is attached, one way or another, to the Positive of relational adjectives. Its actual distribution, however, is far more complicated, as will be seen immediately. As a matter of fact, it cannot be captured along the lines that have been proposed so far.

The first group of facts, to which I will now turn, concerns certain asymmetries between +Pol and -Pol adjectives. To begin with, the following contrast, though often recognized, has never been captured in a systematic analysis:

(5)
a. Bill is five feet tall. (-NR)
b. *Bill is five feet short. (+NR)

Notice that (5b) is deviant, but has nevertheless a definite interpretation that might be paraphrased by (6):

(6) Bill is five feet tall, and that is short.

To what extent sentences like (5b) and their re-interpretation (6) are admissible, although they are deviant, depends in part on the individual adjective. The difference between (5a) and (5b) is quite clearcut, though. Notice, that the two cases differ not only in acceptability, but also with respect to norm-relatedness. Hence an account that simply assigns an ungrammatical status to the combination of a measure phrase with a -Pol adjective will not do. This is also shown by the related contrasts in (7) and (8):

(7)
a. How tall is Bill? (-NR)
b. ?How short is Bill? (+NR)

(8)
a. I know how tall he is. (-NR)
b. I know how short he is. (+NR)
The status and interpretation of (7b) is somewhat dubious, while (8b) seems to be unproblematic, meaning something like (9):

(9) I know how far his height is below the average.

Consider next Comparative and Equative constructions:

(10)
a. John is taller than Bill is. (-NR)
b. Bill is shorter than John is. (-NR)

(11)
a. John is as tall as Mary is. (-NR)
b. John is as short as Mary is. (+NR)

Comparatives are not norm-related as is shown by the possibility to continue the examples in (10) in the following way:

(12)
a. John is taller than Bill, though both are fairly short.
b. Bill is shorter than John, but both are tall.

The Equative, however, shows the same asymmetry as the examples with measure phrases and Wh-words. Although these facts - together with the different pattern of norm-relatedness in E-adjectives, to which we will turn below - have been discussed in the context of presupposition,⁶ no systematic account has been provided by any of the theories of comparison so far. Similar phenomena are to be observed with respect to measure phrases (MPs):

(13)
a. John is five inches taller than Bill is.
b. Bill is five inches shorter than John is.

Thus, while only +Pol-adjectives allow for a regular MP in the Positive, both +Pol and -Pol-adjectives can take an MP in the Comparative. Notice, incidentally, that the MP specifies the difference in the Comparative, while it specifies the whole extension in the Positive. This fact has an obvious analogy in the following contrast:

(14)
a. John is three feet tall.
b. John is three feet too tall.
c. *John is three feet short.
d. John is three feet too short.

Formally, this difference is reflected in different wh-phrases:
(15)
a. I know how tall he is.
b. I know how much too tall he is.
c. I know how much taller he is than Bill.

MPs that go with the Equative are of a different kind, they do not specify units, but multiples of degrees. The crucial point is, that these MPs combine only with +Pol, not with -Pol D-adjectives:

(16)
a. John is twice as tall as Bill is.
b. Bill is half as tall as John is.

(17)
a. *John is half as short as Bill is.
b. *Bill is twice as short as John is.

Notice that sentences like (17) are not only deviant, they also lack a derivative interpretation in the sense observed for the Positive construction (5b). It is by no means clear in advance whether and in which way the facts discussed so far are related to each other and to other relevant phenomena. A reasonable theory should, of course, provide a principled answer to these questions.

The second group of facts concerns the asymmetry of D- and E-adjectives in regard to the phenomena discussed so far with respect to D-adjectives only. Notice first of all, that E-adjectives are not only less systematic with regard to the +Pol/-Pol antonymy. (Many of them, such as wise, tough, obscure, do not have a clear counterpart at all, or they derive it by morphological processes as in unlucky, inelegant, immobile, etc.) They are also characterized by a different nature of their norm or standard to which gradation might be related. Deferring this problem, I will first notice that for them norm-relatedness exhibits a rather different pattern:

(18)
a. She is clever. (+NR)
b. You know, how clever she is. (+NR)
c. She is cleverer than her sister. (+NR ?)
d. She is as clever as her mother. (+NR)

(19)
a. She is dull. (+NR)
b. You know, how dull she is. (+NR)
c. She is duller than her sister. (+NR)
d. She is as dull as her mother. (+NR)

The question mark in (18c) indicates that norm-relatedness for +Pol E-adjectives is sometimes dubious in the Comparative.⁷ In general, however, implicit reference to N seems to be indispensable for E-adjectives.

Consider next the admissibility of measure phrases. It seems to be a trivial fact that E-adjectives do not take MPs as they are not associated with scales for which units are defined. This is not the whole story, though. Suppose, there is a contest in which cleverness or beauty are scored. We might then have sentences like:

(20)
a. She is three points cleverer than her sister.
b. John is almost two points better than all other candidates.

Even under these rather peculiar conditions, sentences like (21) remain strange:

(21)
a. *She is five points clever.
b. *John is ten points good.

While these types of MPs are marginal, multiplicative MPs are completely natural for E-adjectives. The crucial point here is that they are not restricted to +Pol adjectives. Thus, while the sentences in (17) are deviant, both sentences in (22) are wellformed and have an unequivocal interpretation, which is, of course, not necessarily precise in an arithmetical sense:

(22)
a. John is at least three times as intelligent as Bill is.
b. Bob is only half as stupid as his brother is.

The third group of facts to be discussed concerns the adjectives that are admissible in the degree clauses of Comparative and Equative constructions. According to all theories of comparison, sentences like (23a) are related - either by deletion or by interpretive rules - to those like (23b):

(23)
a. John is taller than Bill (is).
b. John is taller than Bill is tall.

The correctness of this assumption becomes extremely dubious for the corresponding -Pol adjectives:

(24)
a. John is shorter than Bill (is).
b. *John is shorter than Bill is short.

Once the ungrammaticality of (24b) is recognized, the status of (23b) becomes dubious as well. Notice, moreover, that for semantic reasons (24a) could more plausibly be related to (25):

(25) John is shorter than Bill is tall.
This is not the right solution either, however: While (24b) is simply deviant, (25) is not synonymous with (24a), it has a different, though somewhat marginal interpretation that might be paraphrased in the following way:

(26) The degree to which John's height is below the norm is greater than the degree to which Bill's height is above the norm.

Similar problems arise with respect to Equatives:

(27)
a. ?John is as tall as Mary is tall.
b. ?John is as short as Mary is short.

While both of these sentences are dubious as a possible source for their counterparts without the second occurrence of the adjective, this time the -Pol case is at least semantically appropriate, in the following sense: Both (27b) and its reduced counterpart (11b) can be paraphrased by (28), in accordance with the +NR status of (11b).

(28) John's height is as much below the norm as Mary's.

For the +Pol-counterpart (27a) and the reduced form (11a) the corresponding paraphrase relation does not hold, as (11a) is not norm-related.

Before turning to further ramifications of the problem at hand, I will briefly touch the role of pitch accent that is involved here. Consider once more example (25), which has the alleged interpretation (26) if and only if tall has a pitch accent that brings into focus its antonymous relation to the matrix adjective short. Without the pitch accent, (25) does not have the meaning (26), nor is it synonymous with (24a), but simply ungrammatical. This observation sheds some light on one factor involved in the questionable status of (27) as well as (23b) and (24b). In these sentences the second adjective cannot receive pitch accent. But it cannot normally remain without it either, if we assume a condition like (29):

(29) Adjectives in a degree complement clause that are related to the matrix adjective must be assigned a pitch accent.

This condition is related to the semantic aspect in an obvious way: pitch accent means focus, and focus means new information. In other words, adjectives that simply repeat the property A of comparison already fixed by the matrix adjective cannot normally be realized in the surface. It goes without saying that (29) is rather ad hoc and must be reduced to more general principles that relate semantic interpretation to surface structure. It serves the purpose, however, of sorting out a factor that interacts with the semantic properties of adjectives in producing the phenomena under discussion.

Having separated the consequences of pitch accent and focus, which account for the corresponding phenomena in (30) to (31) based on E-adjectives, quite a number of purely semantic phenomena remain.

(30)
a. ?John is cleverer than Bill is clever.
b. ?Bill is more stupid than Mary is stupid.

(31)
a. ?He is as good as she is good.
b. ?He is as bad as she is bad.

Consider first the possibility of comparing two different properties or dimensions, as in the following cases:

(32)
a. The table is longer than the door is high.
b. The bridge is as long as the river is wide.

(33)
a. Mary is nicer than her brother is clever.
b. The plot is as dull as the music is ugly.

As all adjectives are contrastive, they have pitch accent in accordance with (29). Observe now the following asymmetry between D- and E-adjectives:

(34)
a. Mary is nicer than her brother is stupid.
b. The plot is as interesting as the music is boring.

(35)
a. *The closet is higher than the table is short.
b. ?The tree is as high as the road is narrow.

Once two E-adjectives are construed as commensurable, +Pol and -Pol items combine rather freely. This does not hold for D-adjectives: while high and short or narrow relate to one-dimensional spatial extension, thus being directly commensurable, the combination of +Pol and -Pol D-adjectives is highly restricted. More generally, -Pol D-adjectives cannot normally appear in a degree clause. This restriction seems to hold more strictly for Comparative than for Equative constructions.

The observation is related to a final point to be made. Sentences like (35b), which cannot be interpreted in the direct dimensional sense, can be re-interpreted in analogy to E-adjectives. Under this secondary interpretation, (35b) means something like (36):

(36) The grade to which the tree is high is the same as the grade to which the road is narrow.
65 (This secondary interpretation is similar, though not identical, to the way in which (25) receives the interpretation (26) .) Once D-adjectives are re interpreted as E-adjectives, their + Pol and - Pol elements combine freely, as is the case for £-adjectives in general. This secondary interpretation is more easily available for Equative than for Comparative constructions for systematic reasons. Hence (35b) is less deviant than (35a), whose secondary interpretation would be something like (37): (37)
The grade to which the closet is high is greater than the grade to which the table is short .
2. BASIC ASSUM PTIONS
The theory to be proposed is to be conceived within a modular view of knowledge structures in the sense discussed in Chomsky ( 1 980). The facts that are of interest in the present context are determined by the interaction of two major systems: the grammar G and the conceptual system C. Both of these systems are modular in their internal organization . As to C, a general conception of which is still lacking, I will merely assume that it con tains, among other elements, a subsystem C5, in terms of which the scalar interpretation of quantitative j udgement and comparison is organized. Although the theory of scales and measurement developed originally with respect to problems of psychophysics provides a technical framework in terms of which C5 might be made precise, I will rely on largely intuitive notions specifying types of scales and the conceptual structure of acts o f comparison. For the time being, i t i s su fficient t o assume that representa tions determined by the rules and principles of C5 provide the interpreta tion of linguistic structures expressing gradation. How this component o f conceptual organization i s interrelated with other conceptual domains, as
For obvious reasons, this group of facts has a wide range of ramifications, a full display of which would take us too far afield. As can be seen from the above examples, differences in acceptability are sometimes rather subtle, and judgements are in many cases controversial. On closer inspection, however, even these uncertainties turn out to be anything but mere chaos; they rather fall into a fairly robust pattern, which a reasonable theory should be able to account for.
There are a number of relevant generalizations to be extracted from the facts discussed in this section and from a number of related phenomena that will come up as we proceed. Instead of stating these generalizations explicitly, I will go on and develop the outlines of a theory from which these facts follow in a rather natural way.
well as with perceptual, motoric, and possibly further cognitive systems, must be left open in the present context. It is to be hoped, though, that systematic exploration of particular domains, such as gradation, will eventually contribute to a more comprehensive theory of the conceptual system C and the representations it has to provide.
As to the structure of G, I will accept without further discussion the basic assumptions of the Revised Extended Standard Theory (REST) as developed in Chomsky (1977; 1981), according to which the various components of G determine a system of representations that are related in the following way:
(38)
              D-Structure
                   |
              S-Structure
               /       \
    Phonetic Form     Logical Form
While PF-representations are ultimately related to articulatory and perceptual patterns, LF-representations must be related to conceptual structures. Whether this relation results from a direct mapping of LF into conceptual representations, or is mediated by intervening levels of representation, is not clear in advance. The first alternative is argued for e.g. by Jackendoff (1978; 1984), while an intervening level of semantic representation, the rules and principles of which are still part of the linguistic knowledge, i.e. of the grammar G, is assumed e.g. in Bierwisch (1981; 1982) and will be pursued in what follows. The ultimate justification for positing an intermediate level of semantic representation must be provided by relevant generalizations that cannot be stated without recourse to the level in question. The fact that a fairly wide range of phenomena related to gradation can be explained by means of conditions referring crucially to semantic representations is a case in point and hence of interest with respect to the status of the representations in question.
Let me briefly outline the basic characteristics of the intended component of Semantic Form SF. As any other component of G, the SF-component requires a specification of the general format of pertinent representations and of the rules and principles that relate these representations to other components of G. The general form of SF-representations is that of expressions in a lambda categorial language. For the sake of illustration, suppose that (40) is the SF-representation of (39):
(39) Who did John expect to visit Mary?
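(40) [Tree diagram not recoverable from the extracted text: the SF-representation of (39) as a labelled tree. Its terminal nodes carry the primes and variables Q, xi, PERSON, PAST, EXPECT, JOHN, VISIT, and MARY; its non-terminal nodes carry categories such as S, N, S/N, S/S, and ((S/S)/S)/N.]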
This example is based on more or less standard assumptions; it is not meant to make substantial claims with respect to particular details. What it illustrates might be summarized as follows:
(a) SF-representations are labelled trees made up of three types of elements: categories, such as S, N, S/S, etc., semantic primes, such as Q, PAST, PERSON, etc., and variables x, x1, x2, etc. Categories are attached to non-terminal nodes; they categorize the dominated subtrees, including the variables and primes attached to the terminal nodes.
(b) Semantic primes and variables are interpreted by appropriate structures of the conceptual system C, in much the same sense in which the primes of PF-representations are interpreted by perceptual and articulatory parameters. I will have to say more on this in the sequel with respect to the primes involved in gradation.
(c) Categories are either basic or complex, the latter dominating functors, so that a category a/b, where a and b are categories, dominates a functor which combines with an argument of category b yielding a complex expression of category a. The system of categories thus defines the syntax of possible SF-representations. It furthermore specifies the ontology that SF projects on the conceptual system C, insofar as the categories of SF determine the type of conceptual structures in terms of which the categorized expressions are interpreted.8
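The way complex categories combine, as described in (c), can be made concrete with a small sketch. The class names `Basic` and `Functor` and the function `combine` are illustrative assumptions, not part of the theory's own notation; the sketch only checks that a functor of category a/b applied to an argument of category b yields category a.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Basic:
    """A basic category such as S or N (hypothetical encoding)."""
    name: str


@dataclass(frozen=True)
class Functor:
    """A complex category a/b: takes an argument of category b, yields a."""
    result: "Category"
    argument: "Category"


Category = Union[Basic, Functor]

S = Basic("S")
N = Basic("N")


def combine(functor: Category, argument: Category) -> Category:
    """Apply a category a/b to a category b and return a."""
    if not isinstance(functor, Functor):
        raise TypeError("only a complex category a/b can take an argument")
    if functor.argument != argument:
        raise TypeError("argument category does not match")
    return functor.result


# S/N combined with N yields S.
one_place = Functor(S, N)
# ((S/S)/S)/N, the category that appears in the residue of tree (40).
q_category = Functor(Functor(Functor(S, S), S), N)
```

Under these assumptions, `combine(one_place, N)` returns `S`, and applying `q_category` to `N` leaves the category (S/S)/S, which still expects a further argument of category S.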
Turning next to the rules and principles that relate SF to other levels of representation, or more precisely, to LF, we must recognize two components: (a) the lexical rules specifying the SF-representation of lexical items, and (b) the combinatorial rules determining the compositional representation of syntactically complex expressions. The first component embodies
the full range of lexical idiosyncrasies, although it is based, of course, on general principles. The second component consists in a rather limited system of universal principles.
Before I illustrate some of the relevant properties of the two components, two general remarks are in order. Consider first the interaction of lexical information with rules and principles of G. With respect to the syntactic components, REST recognizes at least four types of relevant lexical properties:
(41) a. Syntactic subcategorization
b. Features determining intrinsic Case
c. Properties determining θ-roles
d. Properties determining lexical Control
A crucial condition governing the interaction of lexical information with syntactic rules is the 'projection principle' proposed in Chomsky (1981: 29):
(42) Representations at each syntactic level (i.e. LF, and D- and S-structure) are projected from the lexicon, in that they observe the subcategorization properties of lexical items.
More generally, the projection principle guarantees that lexical items assign thematic roles to their arguments invariably at all syntactic levels. The projection principle can be maintained thanks to the trace theory of movement rules, which allows traces left in the initial position of moved constituents to be interpreted as variables at LF. This in turn is directly relevant to the present discussion, as θ-roles and lexical Control are lexical properties that are substantially related to, or even emanating from, the semantic content of lexical items. Insofar as the projection principle guides the interaction of autonomous syntactic principles preserving the assignment of θ-roles (and possibly other lexical properties) at all syntactic levels, it paves the way, so to speak, along which aspects of meaning enter the computational structure of language.
This leads to the second remark. In order to interact with computational rules and principles, θ-roles must themselves be amenable to computational processes, i.e. they are to constitute a structural aspect of meaning that necessarily induces a corresponding structure into the semantic representation of lexical items. This notion can be pursued in various ways. Suppose that the semantic representation of lexical items exhibits a certain amount of internal structure, i.e., that it is made up from structural components that provide something like a skeleton for principles of conceptual interpretation to work with. Such components would have to be justified on the one hand by systematic relations within the lexical system of possible languages, on
the other hand by rules and principles providing a coherent interpretation in terms of conceptual representations. Suppose, furthermore, that the internal semantic structure of lexical items is organized along the lines outlined above for SF-representations in general, so that the ultimate primes of SF are the components from which lexical items are made up according to the syntax of SF. On the basis of these assumptions, θ-roles can be defined by means of primes of SF along the following lines:
(43)
The intuitive notion behind (43) is to define θ-roles in terms of configurations at SF, in a similar vein as grammatical functions are defined in terms of phrase structure configurations. Instead of developing a more technical account of this notion, I will now turn to the rules that assign SF-representations to syntactic structures.
Let me first illustrate the lexical component of these rules by means of a standard example. Suppose that, in accordance with traditional assumptions about semantic decomposition, the SF-skeleton of give can be represented in the following way:
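(45) [Tree diagram not recoverable from the extracted text: the SF-skeleton of give as a labelled tree. In bracket notation, consistent with (46) and (47): [CAUSE [DO x u] [CHANGE [NEG [HAVE y z]] [HAVE y z]]], with u of category S and x, y, and z of category N.]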
If a is an argument of Pi at SF, where Pi is a designated prime of SF, then Pi assigns to a a particular semantic role Si. A θ-role θk that a lexical item Ll assigns to a constituent C is the composite of the semantic roles Sj assigned by the SF-representation of Ll to the SF-representation of C.9
Under appropriate conceptual interpretation of the posited primes, (45) claims that x gives z to y means something like 'x's doing something u brings about a change from y's not having z to y's having z'. I do not want to justify the substantial details of this analysis.10 What is of interest here is the general format of lexical representations, with respect to which a number of comments will be made. First, as already mentioned, lexical SF-structures, which assign lexical items their representation at SF, are subject to the general conditions on SF-representations. That is clearly the case in (45). Secondly, the SF of lexical items, as any other kind of lexical information, should be redundancy free, insofar as general properties of the lexical system can be expressed by redundancy rules. A case in point is the fact that, in general, only the final state of the CHANGE-relation is lexically specified, while the initial state can be derived by the following redundancy rule:
(46) [CHANGE-TO x] → [CHANGE [NEG x] x]
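The redundancy rule (46) can be read as a rewriting step on bracketed structures. The following is a minimal sketch under the assumption that SF-expressions are encoded as nested Python lists whose first element is a prime; the function name `expand_change_to` is hypothetical and the encoding is not the paper's own formalism.

```python
def expand_change_to(term):
    """Rewrite every [CHANGE-TO x] as [CHANGE [NEG x] x] in a nested list."""
    if not isinstance(term, list):
        # Primes and variables are plain strings and stay unchanged.
        return term
    expanded = [expand_change_to(part) for part in term]
    if len(expanded) == 2 and expanded[0] == "CHANGE-TO":
        final_state = expanded[1]
        # The initial state is the negation of the final state.
        return ["CHANGE", ["NEG", final_state], final_state]
    return expanded


# The body of the entry for give, with only the final state specified.
give_body = ["CAUSE", ["DO", "x1", "x4"], ["CHANGE-TO", ["HAVE", "x2", "x3"]]]
full_body = expand_change_to(give_body)
```

Applied to `give_body`, the sketch restores the initial state, so that `full_body` has the shape of (45): a change from not having to having.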
(47) give; [-N, +V], [_ NP2 NP3], [_ NP3 to NP2]; x̂3 x̂2 [x̂1 [⋁x4 [CAUSE [DO x1 x4] [CHANGE-TO [HAVE x2 x3]]]]]
As (47) indicates, those variables that correspond to the proper arguments of give are bound by lambda-abstractors, which serve two interrelated purposes: they turn the open proposition (45) into a three-place predicate, and they play a crucial role in the combinatorial rules, to which we turn immediately. The remaining variable x4 (that is u in (45)), which cannot be related to a syntactic constituent at LF, is bound by an operator that has the standard properties of an existential quantifier at SF, but a somewhat different function with respect to conceptual interpretation, as it selects a contextually determined instance of the appropriate type at the level of conceptual representation. (I will return to this point below.)
Fourthly, the abstractor in a sense 'collects' the semantic functions assigned to the occurrences of the variable it binds inside the lexical representation, thereby creating the θ-role associated with the variable in question.
Finally, subcategorized constituents must be connected with a θ-role (although not every θ-role is connected to a lexically subcategorized constituent). This connection can formally be expressed in various ways. In the format exemplified by (47) it is represented by identical subscripts of variables and subcategorized constituents: x2 and x3 are connected to the indirect and direct object, respectively, while x1 provides the θ-role of the subject; it is therefore not connected to a subcategorized constituent.
It should be obvious from these remarks that abstractors play a crucial
Thirdly, as stated so far, (45) is an open proposition involving four unbound variables: u of category S, and x, y, and z of category N. The verb give, however, should come out as a three-place predicate, rather than as an open proposition, and it should, moreover, assign appropriate θ-roles on the basis of its SF-structure to the subject, direct, and indirect object, respectively. This will be achieved by binding the variables in (45) by two types of operators in the lexical entry for give:
role in the combinatorial rules, to which we turn now. The basic process is that of lambda-conversion according to the familiar equivalence (48), applying to lexically interpreted LF-representations.
(48) x̂ [... x ...] a ≡ [... a ...], where x and a are of the same category.
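Lambda-conversion as in (48) amounts to substituting the argument for the bound variable. The sketch below assumes a hypothetical encoding in which an abstractor is a tuple `("ABS", variable, body)` and bodies are nested lists of strings; it leaves out the category check required by (48) and does not rename variables, so it is only safe when the argument contains no variable that is bound inside the body.

```python
def substitute(term, var, arg):
    """Replace the free occurrences of `var` in `term` by `arg`."""
    if term == var:
        return arg
    if isinstance(term, tuple) and term[0] == "ABS":
        _, bound, body = term
        if bound == var:
            # `var` is bound again here, so nothing below is free.
            return term
        return ("ABS", bound, substitute(body, var, arg))
    if isinstance(term, list):
        return [substitute(part, var, arg) for part in term]
    return term


def convert(abstraction, arg):
    """Apply ("ABS", x, body) to `arg`, i.e. the step from left to right in (48)."""
    tag, var, body = abstraction
    if tag != "ABS":
        raise ValueError("not an abstraction")
    return substitute(body, var, arg)


# The entry for give from (47), without the operator binding x4.
give = ("ABS", "x3", ("ABS", "x2", ("ABS", "x1",
        ["CAUSE", ["DO", "x1", "x4"], ["CHANGE-TO", ["HAVE", "x2", "x3"]]])))
step = convert(convert(give, "xi"), "JOHN")
```

Here `step` still carries the abstractor for `x1`, with `x3` specified by `xi` and `x2` by `JOHN`, which mirrors the shape of the representation assigned to the passive complement later in the text.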
We will say that an argument a specifies x̂, if a replaces the variables bound by x̂ according to (48). The main combinatorial rule can now be formulated as follows:
(49)
The result of Rule (49) is an SF-representation φ' that results from φ by substituting all occurrences of x in φ by ψ. If (49) is applied to all constituents dominated by γ, we get an integrated SF-representation χ, which will be assigned to γ as its representation at SF. Notice that χ might still contain an abstractor, i.e. it might be of the form x̂ [...], if γ does not dominate a constituent whose SF-representation appropriately specifies x̂. In that case, x̂ is open to specification at the next level up in LF. Thus, starting with the SF-representations provided by lexical rules, the Argument Rule (49) recursively assigns integrated SF-representations to all constituents of a given LF.
A number of amendments are required, of which I will mention the following:
(50)
Unspecified Argument Rule: x̂ ... [φ] is replaced by ... ⋁x [φ], where ... does not contain any ŷ.
This rule turns an abstractor that is not specified by (49) into a referential quantifier of the type mentioned earlier, i.e., an operator that selects a contextually determined instance at the conceptual level. Both (49) and (50) are optional and unordered, as are other rules of G.
Instead of pursuing further technical details that must be clarified in order to make the combinatorial rules work appropriately, I will conclude this sketch of the SF-component by a simple example that illustrates the basic ideas. According to standard assumptions, (51a) is derived from the D-Structure (51b) by moving the object NP into the subject position, yielding the S-Structure (51c).
Argument Rule: If α and β are directly dominated by γ in LF, and x̂ [φ] is the SF-representation assigned to α, and ψ is the SF-representation assigned to β, and ψ is of the same category as x, then ψ specifies x̂.
(51) a. The book was given to John.
b. [S̄ [S [NP e] INFL [VP be [given [NP the book] [PP to John]]]]]
c. [S̄ [S [NPi the book] INFL [VP be [given [NPi e] [PP to John]]]]]
From (51c), the LF-representation is derived by rules of construal which, among others, interpret indices as variables. Following Higginbotham (1983), I will assume in particular that NP-specifiers are operators that bind appropriate variables, so that (52) will emerge as the LF-representation of (51):
(52) [S [NP the xi book xi] Past [VP be [given xi [PP to John]]]]
(53) a. book: x̂ [BOOK x], with BOOK of category S/N
b. John: JOHN, of category N
c. the: x̂ [DEF x], with DEF of category (S/S)/S
d. Past: PAST, of category S/S
e. be [+Passiv]: x̂ [x], with x of category S
The semantic constants appearing in (53) are all abbreviations, which need not be analyzed here. I will assume without further justification that PAST is a sentential operator, and DEF a restricted quantifier which turns a nominal into a sentential operator. The passive auxiliary is an operator that requires its complement to be of category S, which expresses the fact that a passive VP does not assign a θ-role to its subject.11 Given (47) and (53), we assign lexical SF-representations to (52), to which the combinatorial rules can apply. The Argument Rule derives the representations to be attached to the subject NP and to the complement of be in the obvious way:
(54) a. the book: [DEF xi [BOOK xi]], of category S/S
b. given xi to John: x̂1 [⋁x4 [CAUSE [DO x1 x4] [CHANGE-TO [HAVE JOHN xi]]]], of category S/N
As the passive be requires a complement of category S, rule (50) must apply to (54b), converting the abstractor x̂1 into a referential operator, which expresses the fact that passive constructions involve an unspecified actor. As be adds nothing to this representation, it will come out as the SF assigned to the whole VP of (52), providing the appropriate complement to both PAST and the subject NP. We thus derive (55) as the SF-representation assigned to the top node of (52):
Suppose now that along with (47), which assigns the SF-representation to give, we have the following lexical rules:
(55) [[DEF xi [BOOK xi]] PAST [⋁x1 [⋁x4 [CAUSE [DO x1 x4] [CHANGE-TO [HAVE JOHN xi]]]]]]
3. THE CONCEPTUAL STRUCTURE OF COMPARISON
The semantics of gradation, as that of any other domain, must determine (i) the structure of the pertinent semantic representations, and (ii) the way in which these representations are derived compositionally. In this section, I will be concerned with the first of these interrelated aspects. More specifically, I will motivate the content of the representations in question by means of the conceptual interpretation which they can receive.
To begin with, I take the mental act of comparison to be a basic capacity that organizes the relevant conceptual domain. Suppose that this capacity exhibits a fairly abstract, self-contained structure that determines its interaction with other conceptual components. The minimal requirements for this structure include two entities a and b, the specification of some aspect or dimension D, with respect to which a and b are to be compared, and a relation between a and b with respect to D. I will assume more specifically that comparison creates a relation between the values that are assigned to a and b with respect to D by some function f. Thus in its most elementary form, the act of comparison can be indicated by (56), where '⊃' is a conceptual prime.
(56) f(a, D) ⊃ f(b, D)
Intuitively, (56) is to be conceived of as determining that the D-value of a includes the D-value of b. Before I can make these intuitive notions more precise, I will somewhat enrich the structure of comparison.
I have arbitrarily chosen PAST to be within the scope of the subject-NP. With the present assumptions, the scope could equally be assigned the other way round. I will not go into these problems here.
To summarize, the lexical and combinatorial rules of the SF-component assign a compositional SF-representation to any LF-constituent, where the Argument Rule (49) spells out, so to speak, the semantic aspect of θ-role assignment, the syntactic aspect of which is taken care of by the projection principle.
As the combinatorial rules as well as the general format of lexical rules must be regarded as the universal framework, the semantic theory of gradation, to which I will turn now, will consist essentially in the specification of appropriate lexical representations for the basic expressions involved in gradation, including the conceptual interpretation of the relevant semantic components.
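Suppose that a and b are boards to be compared with respect to length. Let us assume that in fact the act of comparison consists in projecting a and b into an abstract scale in the way indicated in (58):
(57) [Figure: the two boards a and b.]
(58) [Figure: a and b projected onto a common scale, yielding the extents f(a, L) and f(b, L) and the difference c between them.]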
In other words, to ascertain that (56) holds for the situation represented by (57), with L as the dimension of comparison, a and b are assigned a common zero-point in L, which automatically provides the difference c, by which f(a, L) exceeds f(b, L). This, then, leads to a slightly more complex structure of comparison, which might be represented by (59), assuming standard interpretation for '=' and '+'. (59)
f(a, L) = f(b, L) + c
Continuing with this quasi-algebraic notation, (58) might also be represented by (60):
(60) f(b, L) = f(a, L) - c
Anticipating the intended interpretation, we want (59) and (60) to correspond to (61) and (62), respectively, in accordance with the fact that both correctly describe the situation indicated in (57), and (58), for that matter.
(61) a is longer than b.
(62)
b is shorter than a.
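The relation between (59) and (60) can be checked with a small numerical sketch. It assumes, purely for illustration, that extents on the length scale are represented by plain numbers measured from a common zero-point; the function names `difference` and `describe` are hypothetical.

```python
def difference(extent_a, extent_b):
    """The difference c by which extent_a exceeds extent_b."""
    return extent_a - extent_b


def describe(extent_a, extent_b):
    """Return the two descriptions of one situation in which a exceeds b."""
    c = difference(extent_a, extent_b)
    if c <= 0:
        raise ValueError("this sketch assumes that a exceeds b")
    # (59): b is the anchor-point and c is added.
    assert extent_a == extent_b + c
    # (60): a is the anchor-point and c is subtracted.
    assert extent_b == extent_a - c
    return ["a is longer than b", "b is shorter than a"]


sentences = describe(5, 3)
```

Both equations hold for the same pair of extents, which is the sense in which (61) and (62) describe one situation while choosing different anchor-points.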
Notice, furthermore, that (57) can also be described by (63) and possibly (64), while (65) and (66) induce an additional factor, insofar as they are norm-related in the sense discussed above.
(63)
b is not as long as a.
(64)
b is less long than a.
(65)
a is not as short as b.
(66)
a is less short than b.
Another point to be noted in the structure of (59)/(60) is the occurrence of two types of L-intervals: (a) Values assigned to the entities to be compared with respect to L; let us call these intervals L-extents; (b) Intervals that constitute differences between extents; let us call these intervals L-differences. Extents and differences differ with respect to two properties: Extents include (or start at) the zero-point, differences do not necessarily include 0. Extents have a fixed directionality 'from zero up', differences allow both directions: 'towards' and 'away from' zero. From these considerations, one might derive a preliminary characterization of the notion 'dimension' involved in the structure of comparison:
(67) A dimension D is a (potentially infinite) set of D-intervals which are either D-extents or D-differences.
It is implied by (67) that D has a designated zero-point.13 We will give a more precise characterization of the notion of dimension and scale below.
Proceeding still on the intuitive level, I will introduce two further concepts involved in comparison. The first is that of average or norm. In accordance with almost all other analyses, I consider Positive constructions as based on comparison, the standard of comparison being provided by a contextually determined norm.14 Let us represent the standard in question by N[C,D], where C represents the class of comparison on which the relevant norm
I will return to these problems presently. For the time being, we will merely note the well-known fact that the same situation can be conceptualized in different ways, which are, moreover, systematically related. With respect to (59) and (60), the difference in conceptualization comprises two interdependent factors: (i) the choice of either a or b as providing the standard or anchor-point for comparison, and (ii) the 'direction' in which c is to be passed through. There is an intuitive sense in which (59) represents the simpler or unmarked case: the direction of the operation determining the value of f(b, L) is preserved in adding c, while in (60) the direction of the operation determining f(a, L) must be reversed in subtracting c. This asymmetry is in fact the root of quite a number of the phenomena discussed in section 2 and will therefore be made precise in our theory. Notice, incidentally, that the elementary structure indicated in (56) does not allow one to express this asymmetry, as it invariably encompasses both (59) and (60).12
depends, and D the relevant dimension. (I will continue to omit the subscript and simply write N, where no confusion arises.) N will now be treated as a D-interval, or more precisely as a D-extent, whose particular value depends on the contextually determined class C. With these provisions, we get the following interpretations for positive adjectives:
(68) a. a is long.
b. f(a, L) = N + c
(69) a. a is short.
b. f(a, L) = N - c
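The norm-related readings in (68) and (69) can be illustrated in the same numerical style. The sketch assumes that both the extent and the contextually given norm N are plain numbers, which ignores the questions of vagueness raised in the text; the function name `positive_reading` is hypothetical.

```python
def positive_reading(extent, norm):
    """Classify an extent relative to a norm N, following (68) and (69)."""
    if extent > norm:
        # (68b): f(a, L) = N + c, with a non-empty difference c.
        return ("long", extent - norm)
    if extent < norm:
        # (69b): f(a, L) = N - c, with a non-empty difference c.
        return ("short", norm - extent)
    # The extent coincides with the norm, so neither adjective applies.
    return None


reading = positive_reading(7, 5)
```

With a norm of 5, an extent of 7 comes out as long with c = 2, and an extent of 3 as short with the same c.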
(70)
a is as short as b.
What (70) asserts is that a and b are of equal length, and it entails (or implicates, or presupposes) that both are short.
Schematically, the situation can be represented as in (71) with the conceptual representation (72), where the presupposition is included in angled brackets.
(71) [Figure: the extents f(a, L) and f(b, L), of equal length, fall short of the norm N by the difference c.]
(72) f(a, L) = f(b, L) ∧ ⟨f(b, L) = N - c⟩
That the presupposition refers to b rather than a in (70) is borne out by the
There are various well-known questions to be raised in this connection. The first is how to fix C and how to compute N, if C is fixed. I have nothing substantial to say in this respect. It seems reasonable, however, that basic conceptual capacities must be responsible for these operations, the computation of N possibly being a constituent part of the capacity of comparison itself. Another question is whether N is a specific interval, or rather a certain range of intervals. Related to this question is the specification of the difference c, as not just any non-empty interval will be sufficient for a to be long, or short. I take these questions to be captured by a general account of vagueness, specifying different demands of precision according to the contextual setting.15 I will therefore consider N as a specific, though context-dependent, D-interval, and make no particular assumptions about c in cases like (68) and (69), except that it is not empty.16
Consider now more intricate cases of norm-relatedness like
(73) a. a is three feet long.
b. f(a, L) = ft + ft + ft
Instead of (73b) we might simply have
(74) f(a, L) = 3 ft
Similarly for Comparatives:
(75) a. a is three feet longer than b.
b. f(a, L) = f(b, L) + 3 ft
(76) a. b is three feet shorter than a.
b. f(b, L) = f(a, L) - 3 ft
Finally, we have measurements without recourse to D-units:
(77) a. a is twice as long as b.
b. f(a, L) = 2 · f(b, L)
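The interpretations (73) to (77) can be worked through with exact arithmetic. The sketch assumes that the L-unit foot is represented by the rational number 1 and that extents are rationals on a metrical scale, as the surrounding discussion suggests; the variable names and the sample extent of b are illustrative assumptions.

```python
from fractions import Fraction

# The designated L-interval 'foot', used as the unit of the scale.
FT = Fraction(1)

# (73b) and (74): three feet as iterated addition of the unit.
three_feet = FT + FT + FT
assert three_feet == 3 * FT

# An arbitrary extent for b, chosen only for the example.
f_b = Fraction(5, 2) * FT

# (75b): a is three feet longer than b.
f_a = f_b + 3 * FT
# (76b): the same situation described from a's side.
assert f_b == f_a - 3 * FT

# (77b): measurement without recourse to a unit.
f_twice = 2 * f_b
```

Because `Fraction` arithmetic is exact, the equalities in (75b) and (76b) hold without rounding, and `f_twice` equals five feet for the chosen extent of b.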
negation a is not as short as b, which still implies that b is short, but makes no implication with respect to a.
I now turn to the second concept that deserves preliminary discussion, viz. measurement. This involves two factors: first the introduction of a metric into the set of intervals, and second the specification of units of measurement. As to the metric, this has in fact already been introduced by the use of '+' and '-' and their intended interpretation, which will be made precise below. In other words, I suppose that the structure of comparison is based on scales that are intrinsically metrical in the sense that intervals can be mapped into real numbers preserving standard operations on numbers. With appropriate provisos concerning vagueness, this seems to hold even for E-adjectives, to which we turn later: Although John is four times as lazy as Paul is fairly imprecise under normal conditions, it still makes sense, implying the possibility of multiplication. I would like, in fact, to conjecture that we are able to work conceptually with non-metrical scales, but only as a highly marked and derivative accomplishment, the unmarked case being ordered, metrical scales.17
Units of measurement presuppose metricality, but not vice versa. A unit of measurement can be conceived of as a representative of a designated equivalence class of D-intervals. (Equivalence of intervals will be defined below.) In practice, a unit of measurement is simply a designated D-interval. There may be several such units for one scale, foot, mile, meter, or second, day, month, year being obvious examples. Again, the usual provisos as to the degree of precision apply. Suppose, then, that ft represents the L-unit foot. We now get provisional interpretations of the following kind for constructions containing measure phrases:
(78) A D-scale is an ordered pair (D, ⊃), where
(i) D = {di : di = (ui, vi)} is an infinite set of D-intervals di, and
(ii) ⊃ is an antisymmetric, transitive, reflexive relation in D, called 'interval-inclusion'.
A D-interval d might be conceived as representing the result of scanning a pertinent object along a certain dimension. It starts at some initial point u and spans a certain stretch v. The identification of objects and their relevant dimensions must be determined by other systems of conceptual organization, such as spatial and temporal orientation, conceptualization of perception, emotion, etc. They must be analyzed independently and will be presupposed for the present purpose. (u, v) accounts for the intrinsic directionality of intervals, which is motivated on intuitive grounds and will be exploited in the definition of further concepts.
The relation of (improper) interval-inclusion makes intervals available for comparison. Intuitively, an interval d1 includes an interval d2 if and only if d1 includes all parts of d2. Thus, interval-inclusion can be fixed by the following axiom:
Interval-inclusion imposes a partial ordering on D. It allows defining a number of other relations, such as interval-exclusion, proper inclusion, overlapping, etc. We need not pursue these possibilities.
We will now fix a designated initial point in accordance with the assumption implicit in our informal discussion, that generally the scanning of an object assigns it a D-extent that begins at 0. We thus define a D0-scale as follows:
So far, I have given an intuitive account of the conceptual structure that I assume to provide the purport of expressions of gradation. I will now turn to a more precise formulation of the relevant concepts.
Starting with the act of comparison as the constitutive capacity underlying gradation, we arrived at the concept of D-scale as its relevant formal structure. The main task for a theory of the pertinent conceptual domain is therefore to develop an adequate theory of scales. As alleged above, the types of scale available at the conceptual level are of a rather restricted variety. If this is correct in principle, an adequate theory would have to specify the relevant restrictions as part of an explanatory theory of conceptual organization. Pursuing a much more moderate goal, I will simply introduce the required concepts by way of definition.
As a basic building block, I will define the notion of D-scale in the following way:
(80) A D0-scale is an ordered triple (D, D0, ⊃), where
(i) D and ⊃ are as in (78), and
(ii) D0 is a subset of D with the condition
(a) di ∈ D0 iff di = (0, vi)
(b) for any di ∈ D there is a dj ∈ D0 with dj ⊃ di
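The conditions in (80) can be illustrated with a toy model of intervals. The model is an assumption made for the example: an interval is a pair of an initial point and a stretch on a non-negative number line, and inclusion holds when every part of the second interval lies within the first. The names `Interval`, `includes`, `is_extent`, and `covering_extent` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    start: float  # the initial point u, assumed to be non-negative
    span: float   # the stretch v covered from u

    @property
    def end(self):
        return self.start + self.span


def includes(d1, d2):
    """Improper inclusion: d1 contains all parts of d2."""
    return d1.start <= d2.start and d2.end <= d1.end


def is_extent(d):
    """Condition (a): extents are exactly the intervals starting at 0."""
    return d.start == 0


def covering_extent(d):
    """Condition (b): an extent that includes the given interval."""
    return Interval(0, d.end)


d = Interval(2, 3)
e = covering_extent(d)
```

In this model `e` is the extent from 0 to 5, it includes `d`, and `d` does not include `e`, so inclusion behaves as a partial ordering with extents at the bottom of each chain.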
(81) Let a and b be elements of D0, with a ⊃ b, and c and d any elements of D. Then a = b + c =def ⋀d [c ⊃ d ↔ a ⊃ d ∧ ¬ b ⊃ d]
By this definition, a spans exactly the concatenation of b and c. What we need for the analysis of gradation is, however, a somewhat different operation, viz. one whose result is either exactly or at least the extent a. There is a fairly wide variety of constructions whose interpretation is open to these alternative interpretations. Compare for example:
(82) a. John is ten inches taller than Bill. In fact, he is ten inches and a half (?nine inches and a half) taller.
b. John is six feet tall. In fact he is six feet and two inches tall.
c. John is as tall as Bill. In fact he is somewhat taller.
Notice that we are not dealing here with vagueness, as the 'precisely'-reading alternates only with the 'at-least'-reading, that is, there is a clear directionality. A further point to be noted is the asymmetry between the two interpretations: the 'precisely'-interpretation is preferred; it will only be suspended if evidence to the contrary shows up. This asymmetry is reversed in cases like (83), whose normal interpretation is that John is shorter than Bill.
(83)
John is not as tall as Bill.
Although some of these facts have been noted in the literature, no system-
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Condition (b) guarantees that 0 is the initial point of the scale, i.e. that there are no intervals that start ' below' 0. The set D0 is the set of D-extents in the sense discussed above. I will assume that D0 is ordered with respect to inter val inclusion. This assumption is straightforward for linear dimensions, which constitute the vast majority, but it deserves special considerations if multidimensional comparison must be acknowledged. I will return to this question below. In order to account for the structure of comparison introduced above, we must define the operations ' + ' and ' - '. Consider first 'addition' . A straightforward way to define ' + ' in terms of the concepts introduced so far would be as follows :
80 atic account has yet been given. We will see, as we proceed, that they can in fact all be reduced to the same source, namely the definition of addition (and subtraction). The point to be noted is this: the 'exactly' -interpretation corresponds to the biconditional in (8 1 ), while the ' at-least ' -interpretation would be represented by a simple implication. These are not options of equal right, though. Rather the biconditional is preferred, except when there is evidence to the contrary. This is exactly the situation of a default interpretation. I will therefore replace (8 1 ) by the following definition:
(84) Let $a, b, c, d$ be as in (81). Then
    $a \supset b + c =_{def} \bigwedge d\,[c \supset d \rightarrow a \supset d \wedge \neg\, b \supset d] : M \bigwedge d\,[c \supset d \leftarrow a \supset d \wedge \neg\, b \supset d]$

In (84), the biconditional is split up into a (defining) implication and a default replication, the default being marked by the operator M as introduced in Reiter (1980). This will account, under the SF-representation to be developed in the next section, for the preferred 'exactly'-reading in (82). The reversed preference in cases like (83) will result from the fact that under negation the presupposition of the default is not met, hence the default does not apply. The relation '$\supset$' in the definiendum of (84), which replaces the '=' of (81), represents improper interval-inclusion with the preferred mutual inclusion determined by the default. Adding the operation just defined to the $D_0$-scale, we get the following extension:

(85) A $D_1$-scale is an ordered quadruple $(D, D_0, \supset, +)$ with $D$, $D_0$, and $\supset$ as in (80), and $+$ as defined in (84).

Notice that (84) can be turned into a recursive definition that accounts for iterative addition as in $a \supset b + c + d$, so that, furthermore, iterative addition of equivalent intervals can be defined as follows:

(86) $a \supset n \cdot b =_{def} a \supset b_1 + b_2 + \dots + b_n$, where $b, b_1, \dots, b_n$ are elements of an equivalence class as defined in (90) below.

We have thus derived multiplication of intervals as successive addition in the usual way. This kind of multiplication will not only be needed for the interpretation of measure phrases like three feet, ten minutes, etc., but also for sentences like (87):

(87) a. John is three times as tall as his brother.
     b. A is twice as much longer than B as C is longer than D.
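The contrast between the defining implication and the default in (84), and the n-fold addition in (86), can be illustrated with a small sketch. It rests on simplifying assumptions that are not part of the text: extents and differences are reduced to their lengths (exact rationals), the quantification over intervals is replaced by a comparison of lengths, and the default operator M is not modelled at all. The names `at_least`, `exactly`, `times`, and `INCH` are hypothetical.

```python
from fractions import Fraction

def at_least(a, b, c):
    """Defining implication only: extent a spans at least b extended by c."""
    return a >= b + c

def exactly(a, b, c):
    """Implication plus default condition: a spans b extended by c and no more."""
    return a == b + c

def times(n, unit):
    """n-fold addition of equivalent intervals, in the spirit of (86)."""
    total = Fraction(0)
    for _ in range(n):
        total += unit
    return total

INCH = Fraction(1)                 # hypothetical unit of measurement
bill = times(70, INCH)             # hypothetical height of Bill
john = bill + Fraction(21, 2)      # John is ten and a half inches taller

# (82a): "ten inches taller" survives the continuation "in fact, ten and a half"
print(at_least(john, bill, times(10, INCH)))   # True
print(exactly(john, bill, times(10, INCH)))    # False
```

On this simplified picture, the continuation in (82a) is acceptable because it cancels only the default condition, while "?nine inches and a half" would contradict the defining implication itself.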
We next enrich $D_1$-scales by adding interval subtraction. The essential point of this move is the introduction of a reversal in the orientation of intervals: While $d_1 + d_2$ extends an extent $d_1$ by a difference $d_2$, $d_1 - d_2$ first scans $d_1$ from 0 up, and then $d_2$ from the end of $d_1$ down. The formal account is given in (88):

(88) Let $a, b \in D_0$ with $b \supset c$, and $c, d \in D$. Then
    $a \subset b - c =_{def} \bigwedge d\,[c \supset d \rightarrow b \supset d \wedge \neg\, a \supset d] : M \bigwedge d\,[c \supset d \leftarrow b \supset d \wedge \neg\, a \supset d]$

Notice that '$\supset$' is an asymmetric relation and must hence be reversed in the definiendum of (88). We are now ready to define the type of scale underlying the interpretation of gradation in D-adjectives:

(89) A $D_2$-scale is an ordered quintuple $(D, D_0, \supset, +, -)$.

This is a fairly rich and specific algebraic structure, whose formal properties and empirical adequacy with respect to the intended purpose might be explored in various directions. There are a number of further concepts that can be defined with respect to this structure. We might e.g. want to define 'a + b' as the default-bound minimal interval that includes all and only the intervals included in a and b, and correspondingly 'a - b' as the default-bound maximal interval that includes all those intervals that are included in a, but not in b. The usual associativity of '+' and non-associativity of '-' will emerge. I will not state those definitions formally, but I will make use of various properties implicit in the structure of $D_2$-scales. One notion still to be defined explicitly is equivalence of intervals, one of the things needed for the definition of units of measurement.

(90) Two intervals $d_i, d_j \in D$ are v-equivalent iff $d_i = (u_i, v_i)$ and $d_j = (u_j, v_j)$ and $v_i = v_j$. $D_v$ is the set of v-equivalent intervals.

Now a unit of measurement in D (or, for short, a unit of D) is a representative of a designated $D_v \subset D$.¹⁸ As already mentioned, there may be more than one unit for a given D. We thus arrive at $D_2$-scales with one or several D-units. The last notion to be defined is that of the norm or average. Formally, $N_{[C,D]}$ is simply an element of $D_0$ whose specific value depends on a certain class C of objects. One way to think of this dependence might be as follows: The class C obviously selects a certain subset of extents, call it q. The norm $N_{[C,D]} = (0, n)$ might now be determined by some weighted middle over q. Although an empirical justification of this or some alternative approach is certainly necessary, all we need for the present purpose is the specification of N as an element of $D_0$ depending on the parameter C.

Let me conclude this section with some remarks on the nature of D. Although I would claim that gradation concerns only the invariant structure of D-scales, there must be different sets of intervals instantiating this structure. Compare the following examples:

(91) a. John is taller than his bed is long.
     b. John is taller than his car is fast.

While (a) has a straightforward interpretation, (b) has not, although there exists a derived, in fact fairly vague, interpretation of (b) which makes height and speed somehow commensurable.¹⁹ In any case, length, weight, speed, loudness, and many others yield different sets of intervals, which are, however, susceptible of the same type of scalar structure. I have illustrated this structure by means of length, where the linear ordering of $D_0$ and the metrical structure of the scale is fairly obvious. Although I suppose that these properties can indeed be generalized, at least as the unmarked case, to arbitrary scales of comparison, there are certain problems with cases that seem to involve multi-dimensional comparison. While some of those cases can be discarded as apparent counterexamples,²⁰ this does not seem to be appropriate with cases like big or large, which essentially involve multi-dimensional conditions. Consider the following situation:

(92) a. a is bigger than b.
     b. [Figure: two objects labelled a and b]

There seem to be four, partially incompatible, interpretations:

(i) big refers to all relevant extensions, so that in (92) the comparison is two-dimensional, and (92a) comes out undecidable.
(ii) big refers (in conflicting cases) only to the dominant extension, in which case it is one-dimensional, and (92a) comes out true (assuming vertical extent to be dominant).
(iii) big refers to the product of all relevant dimensions, i.e. to the square-measure in (92); now (92a) comes out false.
(iv) big refers to something like 'global impression', which is vague, but certainly one-dimensional; again (92a) comes out false.
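The first three readings of big distinguished for (92) can be contrasted in a short sketch. Since the figure for (92) is not recoverable, the two rectangles below are hypothetical: a is assumed to be taller but narrower than b and to have the smaller area, which is the constellation the judgements in (i) to (iii) presuppose. The class `Rect` and the function names are illustrative only, and reading (iv), 'global impression', is deliberately left unmodelled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Rect:
    width: float
    height: float

def bigger_all_dims(x: Rect, y: Rect) -> Optional[bool]:
    """Reading (i): compare every extension; None if the extensions disagree."""
    wider, taller = x.width > y.width, x.height > y.height
    if wider and taller:
        return True
    if not wider and not taller:
        return False
    return None  # conflicting dimensions: undecidable

def bigger_dominant(x: Rect, y: Rect) -> bool:
    """Reading (ii): only the dominant extension counts (assumed: vertical)."""
    return x.height > y.height

def bigger_product(x: Rect, y: Rect) -> bool:
    """Reading (iii): compare the product of the dimensions (square measure)."""
    return x.width * x.height > y.width * y.height

a = Rect(width=1, height=4)   # hypothetical: tall and narrow
b = Rect(width=3, height=2)   # hypothetical: lower, wider, larger area

print(bigger_all_dims(a, b))  # None
print(bigger_dominant(a, b))  # True
print(bigger_product(a, b))   # False
```

Only reading (i) needs a three-valued result; readings (ii) and (iii) each reduce the comparison to a single linear scale before comparing.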
Whether (iii) and (iv) must ultimately be reduced to the same conceptual condition is unclear. (i), (ii), and (iii), however, are clearly different, and they all seem to play a role under certain conditions.²¹ For reasons which cannot be explained here in detail, I consider (iii) or (iv) the basic interpretation, with (i) as an auxiliary option on demand. What is crucial, however, is the fact that the scales to be invoked are linear in all cases, with the proviso that (i) must be taken to involve more than one comparison at once, so that this interpretation boils down to the situation mentioned in note 20. I conclude from this incomplete discussion that $D_2$-scales serve in fact as the conceptual structure for the interpretation of gradation. I would like to emphasize that the structure of scales outlined here makes explicit the conditions the conceptual interpretation of gradation must meet. It neither specifies the rules that generate the relevant conceptual representations, nor does it make any claim as to the format of those representations. It might well turn out that conceptual representations are radically different in nature from the algebraic notation I have been using for expository reasons.

4. THE SEMANTIC FORM OF D-ADJECTIVES

Presupposing the framework outlined in Section 2, I will first fix certain relevant properties of adjectives in general. Syntactically, adjectives are of the category [+N, +V], and they are the head of adjective phrases AP as their maximal projection. APs occur as predicates, and as adnominal and adverbial modifiers. The relevant structures can be indicated as follows:

(93) a. [Tree: N' dominates AP and N'; the lower N' dominates N and Comp]
     b. [Tree: V' dominates V and AP; V dominates be]
     c. [Tree: V' dominates AP and V'; the lower V' dominates V and Comp]

In the sequel, I will largely ignore adverbial modification, as it involves various unsolved problems that must be clarified independently.²² Both N' and V' normally assign a θ-role, the latter to the subject (except for passive and raising cases), the former to whatever makes lexical NPs referential. In the present framework, their semantic form will therefore have the general structure $\hat{x}\,[P\,x]$, where $[P\,x]$ represents an arbitrarily complex proposition with x as the only free variable. The question now is: How do APs contribute to this structure, i.e., how is their semantic representation incorporated into P? This question turns on the much debated issue whether adjectives (or rather APs) are basically predicative or attributive with respect to SF. There are essentially two alternatives which, in the present framework, can be formulated as follows (I assume that N' is semantically of category S/N):

(i) AP is basically predicative, its semantic category is S/N, its SF-representation has the structure $\hat{x}\,[P\,x]$. Its attributive use requires a semantic rule that accounts for modification, yielding $\hat{x}\,[P\,x \wedge Q\,x]$, where $\hat{x}\,[Q\,x]$ would be the SF-representation of the N' to be modified.
(ii) AP is basically a modifier, its semantic category is (S/N)/(S/N), its SF-representation has the structure $\hat{Q}\,[\hat{x}\,[P\,x \wedge Q\,x]]$, where Q is a variable of category S/N, which is to be specified by the modified N'. Now its predicative use requires a semantic rule that turns it into an appropriate expression of category S/N.

Obviously, the choice between (i) and (ii) depends on general considerations about predication and modification; it includes the analysis of other N'-modifiers, such as restrictive relative clauses, PPs, etc., which cannot be discussed here. Notice, however, that in the present framework there is no need for a particular semantic rule if (ii) is adopted, as the Unspecified Argument Rule (50) automatically applies, provided the copula has the semantic form $\hat{x}\,[x]$, where x is of category S/N (see note 11). This can be seen as follows: as be on the present assumption requires an argument of category S/N, the unspecified abstractor $\hat{Q}$ of the AP must be turned into a referential quantifier, and the V' will come out as $\hat{x}\,[\exists Q\,[P\,x \wedge Q\,x]]$. This seems to be a plausible result which argues in favour of (ii), although it is not a very strong argument. There seems to be a somewhat more specific argument in favour of (ii) which directly relates to gradation.

(94) a. John is tall.
     b. John is a tall boy.

Clearly, these sentences are norm-related, and boy seems to provide the relevant class C of $N_{[C, height]}$ in (94b). Hence for D-adjectives the modified nominal has a quite specific function for the interpretation of the AP that must be captured in the SF-representation of D-adjectives. This then requires AP to be basically attributive. There are two objections, though. First, modified nouns restrict, but do not in general uniquely determine, the comparison class C. Consider cases like This is a big cube. Without further contextual information, cube does not provide a sufficiently restricted comparison class. Secondly, the norm, and hence the comparison class, becomes irrelevant (and will in fact disappear) in comparative constructions, even if a modified noun occurs:
(95) John is a taller boy than his brother, though both are not as tall as one would expect.

In the sequel, I will tentatively adopt the alternative (ii), but will largely drop the notational complications arising from the modificational structure, if predicative constructions are at issue. After this preliminary discussion of APs, I will turn to the properties of their heads. While adjectives in general may take complements, as for example proud of NP, this does not seem to be the case for D-adjectives.²³ The most characteristic property of adjectives is the rich structure of the degree constituent that realizes the specifier system of adjectives. Postponing the structure of the degree constituent to the next section, I will thus take (96) as the internal structure of AP:

(96) [AP (Deg) [A' A (Comp)]]

Although adjectives are not subcategorized for Deg, I will assume that they may assign to it a particular θ-role, which might be called 'Grade'.²⁴ In the present framework, this θ-role must be provided by a particular abstractor, which I will write '$\hat{c}$', where 'c' is a variable to be interpreted by D-intervals. On these assumptions, we get (97) and (98) as the structure of SF-representations for non-gradable and gradable adjectives, respectively, where P represents the 'proper content' of the adjective, and Q is to be specified by the modified N'. (If we drop the modificational aspect, we get the simplified (b)-versions for predicative adjectives.)

(97) a. $\hat{Q}\,[\hat{x}\,[P\,x \wedge Q\,x]]$
     b. $\hat{x}\,[P\,x]$

(98) a. $\hat{c}\,[\hat{Q}\,[\hat{x}\,[P\,x\,c \wedge Q\,x]]]$
     b. $\hat{c}\,[\hat{x}\,[P\,x\,c]]$

For gradable adjectives, P must be a relational expression of category S/N N, provided that c is of category N, i.e., that intervals are treated as individuals of some kind. First of all, D-adjectives identify the conditions that select a particular dimension. Suppose that VERT refers to the conditions associated with high, viz. those that select the vertical axis of a given object. Similarly MAX might specify the maximal axis, relevant for long, etc.²⁵ Let DIM be a meta-variable that covers the semantic primes (or configurations of them) specifying the different conditions of different D-adjectives. I will assume that DIM is a functor-variable that specifies a particular dimension for any appropriate object x. Thus 'DIM x' will be interpreted conceptually by one of the available dimensions of x. Let furthermore QUANT be a semantic prime that represents the scanning of the extent of x along DIM x. While DIM requires a particular instance for each different adjective, QUANT will be a characteristic constant of all D-adjectives. We thus get (99a) as a partial specification of the SF-representation of D-adjectives, where (99b) indicates the conceptual interpretation assigned to it.

(99) a. [QUANT [DIM x]], where x is of category N, DIM and QUANT are of category N/N, and [DIM x] and [QUANT [DIM x]] are of category N
     b. f(x, D)

Now the crucial point is, of course, that D-adjectives do not merely select the D-extent of x, but relate it to a particular interval that is specified in various ways. I repeat some of the relevant cases together with their intended conceptual interpretation (ignoring for the moment the asymmetrical relation '$\supset$'):

(100) a. a is long.                          f(a, L) = N + c
      b. a is 3 feet long.                   f(a, L) = 3 ft
      c. a is longer than b.                 f(a, L) = f(b, L) + c
      d. a is short.                         f(a, L) = N - c
      e. a is three feet shorter than b.     f(a, L) = f(b, L) - 3 ft

I will adopt three requirements that the SF-representations of D-adjectives should meet:

(i) D-adjectives are not semantically ambiguous between a norm-related and a purely dimensional interpretation, i.e. (100a) and (100b) should be based on the same lexical representation of long.
(ii) +Pol and -Pol D-adjectives are based on the same dimensional conditions, i.e. long and short are lexically identical in the relevant respect.
(iii) Specifications of Deg, particularly Comparative and Equative constructions, contribute in a strictly compositional way to the resulting constructions.
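The conceptual interpretations paired with the sentences in (100) all have one shape: an extent is related to a compared value plus or minus a difference. The following sketch checks that shape for hypothetical values. Several things in it are assumptions of the sketch rather than claims of the text: extents are reduced to lengths in feet, an unspecified difference c is taken to be any positive amount, the measure phrase in (100b) is treated as a difference added to an empty extent, and '=' is read in its 'exactly' sense. All names (`plus_reading`, `minus_reading`, `EMPTY`, `norm`) are hypothetical.

```python
from fractions import Fraction

EMPTY = Fraction(0)  # assumed stand-in for an empty compared extent

def plus_reading(extent, compared, diff=None):
    """extent = compared + c; if c is not given, some positive c must exist."""
    if diff is None:
        return extent > compared
    return extent == compared + diff

def minus_reading(extent, compared, diff=None):
    """extent = compared - c; if c is not given, some positive c must exist."""
    if diff is None:
        return extent < compared
    return extent == compared - diff

# Hypothetical values, in feet
norm = Fraction(2)   # N for the relevant comparison class
a = Fraction(3)      # f(a, L)
b = Fraction(6)      # f(b, L)

results = {
    "a is long": plus_reading(a, norm),
    "a is 3 feet long": plus_reading(a, EMPTY, Fraction(3)),
    "a is longer than b": plus_reading(a, b),
    "a is short": minus_reading(a, norm),
    "a is three feet shorter than b": minus_reading(a, b, Fraction(3)),
}
for sentence, value in results.items():
    print(sentence, "->", value)
```

The point of the uniform signature is that the same two functions cover the norm-related, the measured, and the comparative cases; only the compared value and the presence of an explicit difference vary.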
Usually (i) is met by taking the non-norm-related interpretation as basic and then stipulating rules that introduce norm-relatedness under particular conditions. These rules turn out to be the more ad hoc, the more completely they are specified. (Actually, most of the cases discussed in Section 1 have never been taken into account.) Condition (ii) has practically been ignored. Condition (iii) will be taken up in the next section. Suppose now that we postulate the following SF-representations for (100a,b,d):

(100') a. $\exists c\,[[\mathrm{QUANT\ MAX}\ a] = [N + c]]$
       b. $[[\mathrm{QUANT\ MAX}\ a] = [0 + 3\ \mathrm{FOOT}]]$
       d. $\exists c\,[[\mathrm{QUANT\ MAX}\ a] = [N - c]]$

The basic idea is that D-adjectives generally involve three D-intervals: the extent of the object in question, an extent to which it is compared, and a difference between the two. In order to achieve this uniformity, I have used 0 to specify the compared extent in case (b). In order to be consistent, 0 must be interpreted as the empty extent (0,0), not just the initial point of the scale. Although this yields the correct conceptual interpretation according to the definition of '+', so far it has only been a formal move. Its substantial motivation will become clear in the sequel. Notice that on this assumption D-adjectives are three-place predicates, which relate an object to a compared extent (which might be empty) and a difference. This constitutes the first specific assumption of the present theory. In order to reconcile this analysis with condition (i), a second assumption is necessary: lexical representations of D-adjectives do not specify the value of the compared extent; they rather contain a particular variable v, the value of which might be, among others, 0 or N. The choice of N or 0 will be subject to systematic conditions to be discussed in Section 7 below. I will now propose the following lexical SF-representation of D-adjectives:

(101) a. +Pol adjectives: $\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ DIM}\ x] = [v + c]]]$
      b. -Pol adjectives: $\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ DIM}\ x] = [v - c]]]$

The conceptual interpretation in terms of $D_2$-scales is obvious: The values of v and c are interpreted by D-intervals, '+' and '-' by the operators defined above, and '=' is a constant of category S/N N interpreted by '$\supset$' or '$\subset$' in the context of '+' and '-', respectively. (Here and in the sequel I use different letters for particular variables instead of $x_1$, $x_2$, etc., in order to make the representations more comprehensible.) The semantic category of these representations is (S/N)/N, i.e., they are two-place relations, associating an individual and an interval.²⁶ The crucial point is that v is an open variable not bound by any operator and hence not amenable to specification by syntactic constituents. (But see below.) It will be specified rather by special conditions on semantic representations. It is by this status of v that condition (i) is met: long is not ambiguous, but rather subject to different conditions according to its context. Condition (ii) is met in the obvious way, and condition (iii) will be shown to be met in the next section. The way in which representations like (101) yield SF-representations such as (100') should be obvious. Suppose that three feet is an expansion of Deg in (96) with the SF-representation 3 FOOT of category N. Then the Argument Rule (49) will substitute it for c, with subsequent specification of v by 0 by general conditions. If on the other hand Deg is empty, the Unspecified Argument Rule (50) will turn $\hat{c}$ into a referential quantifier, with subsequent specification of v by N, again by general conditions. In both cases, we derive an expression of category S/N whose θ-role may be assigned to the subject NP.

(continued in JS 6.2)

NOTES (for Part I, Sections 1-4)

1. A more elaborate version of this theory appears in Bierwisch (in press), which is the translation of Bierwisch (1987). The present paper develops the basic ideas of this theory, which has been modified in the extended version in a number of details, but not in essence. The most important change concerns the conditions discussed in Section 6, which could be simplified in interesting ways. By a systematic reformulation, C 1 and C 2 could be reduced to a single condition, while C 3 could be dispensed with altogether. The resulting three conditions provide a more perspicuous formulation of the relevant principles. Although these and a number of other modifications are of interest for possible future developments, they do not affect the general orientation of the analysis developed here. As the present paper, which has circulated for a while, presents a relatively self-contained version of the relevant proposals, I have left it unchanged. I want to thank my colleagues Reinhard Blutner, Hannes Dolling, Karin Goede, Ewald Lang, Anatoli Strigin, and Ilse Zimmermann for patient discussion of the various versions through which this analysis has passed. Special thanks are due to Arnim von Stechow. The extensive discussion we had about the facts to be dealt with below, and his own analysis of comparative constructions, provided the stimulus that eventually led to the theory proposed here. In a sense, the present analysis grew out of an attempt to overcome what I consider the shortcomings of his approach. Needless to say, he is not responsible for whatever faults there might be in my analysis.

2. For an early survey see Bartsch and Vennemann (1972); for a recent discussion see Von Stechow (1984). It might be noted, incidentally, that the ingredient notions from which the various analyses are built up would allow other combinations of the same descriptive adequacy. The crucial task is, therefore, to capture the relevant generalizations, i.e., to explain the pertinent facts in terms of underlying principles.

3. Previous theories analyse relational adjectives as two-place predicates relating an individual x to a degree d without an internal structure of d. This prevents an appropriate analysis of the Positive and its relation to other constructions. We will see below that nevertheless relational adjectives are two-place predicates, in a certain sense, insofar as v is not available as an argument in the same way as are x and c.

4. Hence the Comparative and the Equative are parallel in a rather different way than that assumed in other theories. The common core of various theories is the assumption that in both constructions the matrix clause as well as the complement clause specify degrees which are then related by means of parallel, though different, operations, such as '>' for the Comparative and '≥' for the Equative, or addition for the Comparative and multiplication for the Equative. Each of these accounts fails in crucial respects. We will see, however, that addition is in fact involved in all pertinent constructions, while multiplication is a completely separate issue.
5.
The failure to recognize the essential difference between D-adjectives and E-adjectives cor
responds to the tacit, but general assumption that the analysis of tall can directly be generalized to that of all relational adjectives.
6.
See e . g . Kiefer ( 1 978) for discussion. Whether + NR is in fact a property that is appro
A fairly plausible answer to that question follows from independently motivated assumptions of the theory proposed below.
7.
This is borne out b y the controversial status of diagnostic sentences like
(1)
She is cleverer than her sister, though both are fairly dull.
That sentences like (1) are not ruled out completely is due to the possibility to construe at least
some E-adjectives in a quasi-dimensional fashion. What that means will become clear in Sec
tion 7 .
8.
A s the reader might notice, (a) - (c) i s but a rough outline of notions that have been deve
loped in various forms, e.g., by Lewis ( 1 972), Cresswell ( 1 973), or Hellan ( 1 98 1 ), the latter being particularly close to the present approach, as he not only uses a similar system of seman tic representation in analysing comparative constructions, but also combines it with a REST type syntax, albeit in a different way than that expounded below. The present proposal differs, however, from most other versions in that it construes SF-representations as subject to in terpretation in terms of conceptual structures - which are mental representations - rather than model-theoretic constructs.
9.
There are various points to be clarified with respect to (43). First, if P1 has more than one
argument, then S must be relativized to the place of a. Suppose e.g. that there is a semantic J prime DO of type S/N S. We might then say that DO assigns the semantic role Agent to its first, and Action to its second argument. A less trivial pomt, which has a number of conse
quences, is the distinction between semantic roles and 0-roles. For one thing, it allows one to reconcile the 0-criterion proposed in Chomsky ( 1 98 1 ) , according to which each syntactic argu
ment has one and only one 0-role and each 0-role is assigned to one and only one argument,
with the observation of J ackendoff ( 1 972), that a syntactic argument might bear more than one role, as in (i), where - in one reading - John might be both Agent and Theme: (i)
John rolled down the hill.
According to the present approach, there is only one 0-role assigned to John, which is however the composite of two semantic roles in the intended reading. Another consequence of this dis tinction is the fact that there may well be semantic roles that do not enter into any 0-role as signed by a lexical item. The Action-argument of DO would be a case in point in most . occurrences of that prime. Finally, (43) ought 10 be extended 10 syntactic constituents in general, so that it encompasses also 0-roles that are assigned compositionally, e . g . by verbs and prepositions, etc. These points all deserve much more careful consideration than the present context allows.
10.
Thus one of the frequently discussed details is the fact that CAUSE must conceptually
be interpreted as something like direct causation, since, e . g . , the fact that John poisened Bill's father as a consequence of which Bill inherited a million dollar cannot possibly be described
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
priately analysed as part of the presupposition of the sentences in question, is a separate issue.
90
(i)
John is two inches taller than Bill.
(ii)
These two inches are too much.
In terms of the present analysis, two inchi!S in (i) is interpreted by a representative of a certain
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
by John gav� Bill a million dollar. Similarly, CHANGE must conceptually be restricted to a direct transition from one state to another. HAVE on the other hand is open to the interpreta tion by a large range of different (asymmetric) relations with possession as a kind of unmarked choice. Problems like these are to be dealt with in a systematic account of conceptual interpre tation. I I . On this account, be does not contribute substantially to the semantic interpretation. That is certainly an oversimplification, as the verbal head of the VP must presumably be sus ceptible of appropriate interaction with the tense constituent in determining the temporal in terpretation of the sentence. I will ignore this aspect throughout the present paper. Notice, however, that the passive be differs from the copula �. which participates in assigning a e-role to the subject. To express that difference, I will assume that the copula be has the SF representation � [x) with x of category S/N. Like the passive be, the copula does not contribute substantially to the semantic interpretation, except that it transmits the 9-role, which the pas sive � does not. 1 2 . Present theories of comparison can be classified as to whether they take (S6) or (S9)/(60) as the conceptual core of comparative constructions. Cresswell ( 1 976) and Bartsch/Venne mann ( 1 972) e.g. are based essentially on (56), while Hellan ( 1 98 1 ) and Von Stechow ( 1 984) are based on something like (59)/(60). We will see later on, that further consequences depend on this choice. 1 3 . It might be noted that D-intervals correspond in many respects to degrees as introduced by Cresswell ( 1976) and refined by Klein ( 1 980). The major difference is that Cresswell defines degrees as equivalence-classes of objects, whereas intervals emerge from the operation of com parison. These alternative conceptions differ in a similar way as the two possibilities to in troduce natural numbers, viz. 
as sets of sets of equal cardinality, or as constructed by the successor-operation. Notice, that comparison with respect to cardinality has numbers as degrees - or intervals, for that matter. 14. The major exception to this approach is Klein ( 1 980), who rejects not only the compara tive nature of positive D-adjectives, but also reference to degrees as a constitutive factor. Klein's arguments, however, seem to me to be misleading. See Von Stechow ( 1 984) for some discussion. The most important objection to Klein's approach is the fact, that it cannot ac count for rather elementary facts, such as the comparison of two dimensions of the same object as in Th� river is more wide than d�p. without disavowing the initial assumptions. I S . For revealing discussion of these problems, see Pinkal (1983). Notice that completely parallel questions are at issue in cases that have nothing to do with comparison and gradation, as in Austin's famous example France is hexagonal. 16. Nothing in particular hinges on these assumptions, though. They simply make the discus sion more perspicuous. It would merely be a task of formal adjustment to replace, e.g., N by a certain family of intervals and then to proceed in terms of maximal and minimal elements of this family. Nothing would be gained thereby. 1 7 . This claim must not be confused with the completely different issue of whether particular cognitive capacities, say perception of brightness or temperature, must be explained by theories using non-metrical scales. What is at issue is the conceptual structure of comparison, as a con sequence of which in the unmarked case even non-metrical phenomena would be concep tualized in a metrical fashion, as soon as it comes to comparison. 1 8 . This accounts for the peculiarities of measure phrases, which generally are not referen tial, although nouns designating units can be used referentially. Compare the following contrast:
D•, while the two inches in (ii) is interpreted by a particular interval that is a member of D•. I will not pursue here the formal details that would make this account more explicit.
19. This is actually not quite correct. The most (and probably only) plausible rendering of (91)(b) would have to be paraphrased by (i):
(i) The degree to which John is taller than the average exceeds the degree to which his car is faster than the average.
In other words, height and speed are not mapped into a common scale, but rather the relative amount compared to average. This is a fairly different type of reinterpretation, as we will see below.
20. A case in point is the clever discussion of clever in Klein (1980). Klein argues that clever might involve e.g. social or mathematical (or some other) capability, yielding incomparable degrees of cleverness. One can plausibly argue, however, that as soon as in fact different criteria or conditions are involved, the comparison must be appropriately relativized at the conceptual level, so that (i) comes out parallel to cases like (ii):
(i) John is as clever (mathematically) as Sue is (socially).
(ii) John is as intelligent as Sue is patient.
Somewhat different remarks apply to the multidimensionality of good discussed in Kaiser (1979). All these cases can eventually be reduced to linear and in fact metrical scales at the conceptual level in fairly plausible ways.
21. For an analysis from the ontogenetic point of view, see Goede (1983). She shows that (ii) seems to be characteristic for a particular transient phase in the development of German groß.
22. Although some of these problems are closely connected to phenomena of gradation, it seems to be a rational strategy to explore the basic structure of gradation first and to explain the more complicated cases by the interaction of this structure with other syntactic and semantic components.
23. It is unclear whether the bracketed phrases in (i) to (iii) should be analyzed as optional complements of adjectives:
(i) Tohru is tall [for a Japanese].
(ii) Sue is short [in comparison to her class mates].
(iii) Bill is clever [in solving mathematical problems].
Semantically, they contribute to the specification of C (in (i) and (ii)) or D (in (iii)) of the norm $N_{C,D}$. As there is, however, a fair number of unsolved problems here, I will make no substantial claim, leaving the question for future research.
24. Notice that there is a parallel with respect to the specifier of nouns and adjectives: the Determiner of an NP specifies the θ-role that makes a noun referential, the Degree of an AP realizes the θ-role by which an adjective identifies a grade. It might be an interesting question whether this parallel can be generalized to specifiers of other categories as well.
25. See Bierwisch (1967), and more recently Lang (1987), for a more detailed discussion of the different conditions involved in spatial D-adjectives.
26. For the sake of perspicuity, I have omitted the modificational aspect of adjectives; their inclusion would lead to the following representation:
$\hat{c}\,[\hat{Q}\,[\hat{x}\,[[Q\,x] \wedge [[\text{QUANT DIM}\ x] = [v + c]]]]]$
where Q is of category S/N.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
REFERENCES
Bartsch, R. & Vennemann, Th. 1972: Semantic Structures. Athenäum, Frankfurt/M.
Bierwisch, M. 1967: Some semantic universals of German adjectivals. Foundations of Language 3: 1-36.
Bierwisch, M. 1981: Basic issues in the development of word meaning. In: W. Deutsch (ed.), The Child's Construction of Language. Academic Press, London, New York. Pp. 341-387.
Bierwisch, M. 1982: Formal and lexical semantics. Linguistische Berichte 82: 3-17.
Bierwisch, M. 1987: Semantik der Graduierung. In: M. Bierwisch, E. Lang (eds.).
Bierwisch, M. (in press): The semantics of gradation. In: M. Bierwisch, E. Lang (eds.), Grammatical and Conceptual Aspects of Dimensional Adjectives. Springer Verlag, Berlin, Heidelberg, New York, Tokio.
Bierwisch, M. & Lang, E. 1987: Grammatische und konzeptuelle Aspekte von Dimensionsadjektiven. Akademie-Verlag, Berlin.
Bresnan, J. 1973: Syntax of the comparative clause construction in English. Linguistic Inquiry 4: 275-344.
Chomsky, N. 1977: Essays on Form and Interpretation. North-Holland, New York, Amsterdam.
Chomsky, N. 1977a: On Wh-movement. In: P. Culicover, T. Wasow & A. Akmajian (eds.), Formal Syntax. Academic Press, New York. Pp. 71-132.
Chomsky, N. 1980: Rules and Representations. Columbia University Press, New York.
Chomsky, N. 1981: Lectures on Government and Binding. Foris Publications, Dordrecht.
Cresswell, M. 1973: Logics and Languages. Methuen, London.
Cresswell, M. 1976: The semantics of degree. In: B. Partee (ed.), Montague Grammar. Academic Press, New York. Pp. 261-292.
Goede, K. 1983: Zum Zusammenhang zwischen den alterstypischen Antworten auf Fragen mit 'größer' und 'mehr'. Zeitschrift für Psychologie 191: 233-252.
Hellan, L. 1981: Towards an Integrated Analysis of Comparatives. Narr, Tübingen.
Higginbotham, J. 1983: Logical form, binding and nominals. Linguistic Inquiry 14: 337-394.
Hornstein, N. 1977: Towards a theory of tense. Linguistic Inquiry 8: 521-538.
Jackendoff, R. 1972: Semantic Interpretation in Generative Grammar. MIT Press, Cambridge, Mass.
Jackendoff, R. 1977: X̄ Syntax: A Study of Phrase Structure. MIT Press, Cambridge, Mass.
Jackendoff, R. 1978: Grammar as evidence for conceptual structure. In: M. Halle, J. Bresnan, G.A. Miller (eds.), Linguistic Theory and Psychological Reality. MIT Press, Cambridge, Mass. Pp. 201-228.
Jackendoff, R. 1984: Semantics and Cognition. MIT Press, Cambridge, Mass.
Kaiser, G. 1979: Hoch und gut - Überlegungen zur Semantik polarer Adjektive. Linguistische Berichte 59: 1-26.
Kiefer, F. 1978: Adjectives and presuppositions. Theoretical Linguistics 5: 135-174.
Klein, E. 1980: A semantics for positive and comparative adjectives. Linguistics and Philosophy 4: 1-45.
Lang, E. 1987: Semantik der Dimensionsauszeichnung. In: M. Bierwisch and E. Lang (eds.).
Leisi, E. 1953: Der Wortinhalt; Seine Struktur im Deutschen und Englischen. Winter, Heidelberg.
Lewis, D. 1972: General semantics. In: D. Davidson & G. Harman (eds.), Semantics of Natural Language. Reidel, Dordrecht.
May, R. 1977: The Grammar of Quantification. Dissertation, MIT.
Pinkal, M. 1983: On the limits of lexical meaning. In: R. Bäuerle, C. Schwarze, A. von Stechow (eds.), Meaning, Use, and Interpretation of Language. De Gruyter, Berlin, New York. Pp. 400-423.
Reiter, R. 1980: A logic for default reasoning. Artificial Intelligence 13: 81-132.
Von Stechow, A. 1984: Comparing semantic theories of comparison. Journal of Semantics 3.
Williams, E. 1977: Discourse and logical form. Linguistic Inquiry 8: 101-139.
Williams, E. 1980: Predication. Linguistic Inquiry 11: 203-238.
Williams, E. 1981: Argument structure and morphology. The Linguistic Review 1: 81-114.
Journal of Semantics 6: 95-98
BOOK REVIEW
Hiyan Alshawi, Memory and Context for Language Interpretation. Cambridge University Press, Cambridge, 1987. Pp. ix + 188. $25.00 (hardback).
BART GEURTS
Broadly construed, this book is an investigation into the ways in which world knowledge, lexical information, and context contribute to the interpretation of texts. It does not provide an in-depth study of a narrowly circumscribed problem area, but instead proposes a "relatively unified framework" (p. 1) for text interpretation. The author situates his work in the field of automatic, not human, language processing. It should become clear, however, that its relevance extends beyond the boundaries of that field.
The book falls into two parts. In the first part Alshawi presents his general framework for text interpretation, which comprises a memory model and a context mechanism as its main components. Within this framework solutions are sought to three specific problems: the resolution of referential expressions, word sense disambiguation, and what is called by the author "relationship interpretation", i.e. the problem of making explicit the semantic relations underlying, for instance, genitive noun phrases, prepositional phrases with with, and noun-noun compounds. The memory and context models are then related to other research in artificial intelligence.
The second part of the book describes and evaluates a computer program, called "Capture", which implements the ideas put forward in the first part. The Capture system performs a task that requires the ability to adequately interpret texts and, more specifically, to solve the interpretation problems mentioned above: the system creates a relational data base on the basis of natural-language input texts. Although in some respects it elucidates matters discussed in the preceding pages, this second part will mainly be of interest to readers specifically concerned with automatic language processing. This review therefore concentrates on the first part, which is of a more general interest.
In the memory model worked out by Alshawi, knowledge is encoded in a semantic network, i.e. an aggregate of nodes interconnected by labeled links, which together represents information about objects, their properties, and relationships. Contextual information is represented, at any given moment during text processing, by a collection of so-called context factors. Associated with a context factor are a scope, which is a set of memory entities, and a significance weight. When the context factor comes into existence this weight is set to a pre-determined numerical value, and subsequently it is decremented - by gradual decay or otherwise, depending on its type - in the course of the interpretation process. At any given point, the salience or - the term preferred by Alshawi - context activation of a memory entity is obtained simply by summing the significance weights of the context factors that have that entity in their scope.
Context activation is treated as an open-ended notion in that virtually all processes involved in text interpretation can create context factors. Thus, for instance, when the syntactic analyser recognizes a sentence as a passive, it can foreground its subject by creating a context factor for it, and when a particular entity is referred to, memory mechanisms can activate conceptually related entities. Twelve different types of context factors are discussed, seven of which play a role in the Capture system. The author makes clear, however, that, on the one hand, his theory is not strictly committed to this particular set, and that, on the other hand, the framework he proposes is expressly designed to be able to handle alternative and/or additional types.
The basic concept underlying Alshawi's treatment of memory processes for information retrieval and inference is what he refers to as "marker processing". (More common terms for the same thing are "marker passing" and "marker propagation"; the notion of "spreading activation" may be regarded as its psychological counterpart.) Marker processing is essentially a way to perform set operations on the entities stored in memory (i.e. nodes and links). For instance, in order to retrieve all female doctors it knows, a marker-processing system performs an intersection search. This is done by placing one marker, M1, on all instantiations of the memory entity "doctor", and another marker, M2, on all instantiations of the memory entity "female". Once these marking operations have been completed, a simple search procedure collects all entities marked by both M1 and M2, so as to obtain the set of all female doctors.
To illustrate how all this apparatus is put to work, I will briefly discuss Alshawi's method for the resolution of singular referential expressions, which forms part of one of the three interpretation problems mentioned above.
Alshawi's basic method for reference resolution is quite straightforward. The procedure starts off with the generation of constraint markers. These are obtained by applying to the syntactic analysis of the referential expressions in question a set of rules that require, for instance, that all the memory entities denoted by the head noun of a full noun phrase are to be marked. After all the appropriate marker-generation rules have been applied, an initial search is performed, which ignores all entities whose context activation remains below the focus threshold, which is a constant that indicates the amount of activation that an entity needs in order to be in focus. If this first search fails, the threshold condition is dropped, and the entity that satisfies the constraints and has the highest context activation is chosen.
On the face of it this procedure might appear overly simplistic, but Alshawi demonstrates that, in fact, it can produce fairly sophisticated results when embedded in the kind of model he proposes. The central component of this model - or rather, this class of models - is the context-factor system. This system can, to give just one example, activate entities that were not mentioned explicitly in the text, but are contextually implicated by the interpretation process. Thus it becomes possible to account for certain well-known cases of anaphoric coherence exemplified e.g. by a text that introduces a house, and immediately afterwards refers to the roof.
The scope of Alshawi's framework is restricted in a number of ways, as the author himself points out. First, the class of sentence types covered is not very broad, and, possibly as a result of this, the texts that were constructed to test the Capture system (which are all listed in an appendix) do not give a very natural impression. Secondly, since Alshawi has attempted "to maximize the 'performance/complexity' ratio" (p. 2) of his system, interpretation tasks that require complex inferences (e.g. because they involve hearer models) or non-deterministic procedures (which would allow the system to give up an interpretation previously chosen in favour of another one, because new input indicates that it is preferable after all) are left out of consideration. On the other hand, however, one could well argue that, given the state of the art in artificial intelligence as well as in its neighbour disciplines, these are exactly the right kinds of limitations to impose on this type of research.
Alshawi outlines a general framework for text processing within which models can be constructed with, judging from the model detailed by the author himself, several attractive properties. To begin with, the framework makes for simple and computationally tractable systems which, the remarks in the previous paragraph notwithstanding, cover a coherent and surprisingly wide range of phenomena. It furthermore assimilates some of the best work in artificial intelligence, either by directly incorporating it or by working out solutions that combine the advantages of alternative approaches. An example of the former is the memory mechanisms that Alshawi employs, which make use of the marker-processing techniques developed by Fahlman. The context-factor system exemplifies the latter, as it is shown to support, in effect, both top-down and bottom-up interpretation processes. Finally, and linking up to the example just given, it is worth mentioning that Alshawi's framework seems to be able to accommodate relatively robust processing models - which is interesting because robustness is an essential, but often ignored, characteristic of human cognitive skills in general, and language skills in particular.
The book's presentation is on the whole fairly adequate. Especially noteworthy in this context are the impartiality and explicitness with which the author assesses the strengths and weaknesses of his theory and of the model he implemented - generally not the most conspicuous features of artificial intelligence texts, or texts emanating from any other field, for that matter.
Department of Language and Literature
University of Brabant
P.O. Box 90153
5000 LE Tilburg
The Netherlands
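The three mechanisms summarised in the review above (context activation as a sum of significance weights, intersection search by marking, and reference resolution with a focus threshold and a fallback) can be illustrated with a minimal Python sketch. It is not Alshawi's Capture code. The names `ContextFactor`, `activation`, `intersection_search` and `resolve`, the linear decay, and the choice of the most highly activated candidate among those in focus are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ContextFactor:
    """A context factor: a scope of memory entities plus a significance weight."""
    scope: set[str]
    weight: float
    decay: float = 1.0  # assumption: simple linear decay per processing step

    def step(self) -> None:
        """Decrement the weight, never letting it drop below zero."""
        self.weight = max(0.0, self.weight - self.decay)


def activation(entity: str, factors: list[ContextFactor]) -> float:
    """Context activation: sum of the weights of all factors with the entity in scope."""
    return sum(f.weight for f in factors if entity in f.scope)


def intersection_search(memory: dict[str, set[str]], *concepts: str) -> set[str]:
    """Place one marker per concept on its instances; keep entities with every marker."""
    markers: dict[str, set[str]] = {}
    for concept in concepts:
        for entity in memory.get(concept, set()):
            markers.setdefault(entity, set()).add(concept)
    wanted = set(concepts)
    return {entity for entity, marks in markers.items() if marks == wanted}


def resolve(candidates: set[str], factors: list[ContextFactor],
            threshold: float) -> str | None:
    """First search only among entities in focus; if that fails, drop the threshold."""
    if not candidates:
        return None
    scored = {e: activation(e, factors) for e in candidates}
    in_focus = {e: a for e, a in scored.items() if a >= threshold}
    pool = in_focus or scored  # fallback: threshold condition dropped
    # Assumption: ties are broken alphabetically to keep the result deterministic.
    return max(sorted(pool), key=pool.get)


# Hypothetical memory contents and context factors.
memory = {"doctor": {"d1", "d2", "d3"}, "female": {"d2", "d3", "n1"}}
factors = [
    ContextFactor({"d2"}, 5.0),
    ContextFactor({"d2", "d3"}, 2.0),
    ContextFactor({"d1"}, 9.0),
]
female_doctors = intersection_search(memory, "doctor", "female")
referent = resolve(female_doctors, factors, threshold=4.0)
```

With these toy values the intersection search yields the two entities marked as both doctor and female, and the resolution step picks the one with the higher context activation. Raising the threshold above every candidate's activation exercises the fallback branch.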
EDITOR'S NOTE
The editors wish to express their gratitude to the following colleagues who are not consulting editors of this Journal but have been kind enough during the past year to referee papers that were submitted for publication:
D. Geeraerts, T. Hoekstra, E. Lang, K. Oatley, G. Redeker, H. Schotel, R. Schreuder, H. Stark, T. Mitchell
Journal of Semantics 6: 101-146
TOOLS AND EXPLANATIONS OF COMPARISON - PART 2*
MANFRED BIERWISCH
* Editorial note: Because of its unusual length this paper appears in two parts. Part I (i.e. Sections 1-4 of the paper) was published in JS 6.1.
5. COMPARATIVE AND EQUATIVE CONSTRUCTIONS
Measure phrases are but a special instance of Deg. We will now incorporate some of the more complicated Degree constructions. Although I will mainly be concerned with their semantic structure, a few syntactic prerequisites must be fixed. For expository reasons, I will follow Jackendoff (1977), without pursuing the full range of complexities discussed there.²⁷ Suppose, then, that (103) is the LF-structure underlying (102):
(102) John is two {feet taller than / times as tall as} Bill is.
(103) [Tree diagram of the LF-structure of (102); the layout is not recoverable from the source. Surviving node labels: S, NP (John), INFL (Pres), VP, V (be), AP, Deg (two foot/times), A', A, S', COMP (Wh x_i), and, in the embedded clause, Bill, Pres, V (be), AP, Deg (x_i), A (pro).]
I will briefly comment on a number of more or less crucial points. First, I have assumed that the complement clause is not extraposed, but rather directly generated in its final position. This raises various questions which need careful consideration, but cannot be pursued here. Secondly, I have placed than and as in complementizer position. This is an arbitrary (and rather dubious) decision, which has no substantial consequences in the present context. Than and as are formal markers of the complement constituent of Comparative and Equative constructions without any semantic content. What is crucial, however, is the appearance of a Wh-Deg operator that is moved to the COMP-position. It is phonetically empty, but it has an essential semantic function: the SF-representation of [Wh x] will be an abstractor $\hat{x}$, such that the SF-representation of the whole clause comes out as a property of intervals. (It is, in fact, a property of D-differences, hence I will represent the abstractor as $\hat{c}$, in accordance with the notational conventions used so far.) This assumption, which is basically due to Chomsky (1977), brings comparative complement clauses in close connection to relative clauses: both are formed by Wh-Movement, and both represent properties of individuals of certain types. These characteristics encompass both simple clauses like as Bill is and complex clauses such as than Sue thought that everybody expected Bill to be.
Another crucial point is the appearance of [A pro], which is phonetically empty, but semantically non-distinct from the adjective in the matrix clause. The notion of non-distinctness will be discussed below. An essential condition for its occurrence is the 'correspondence' to the matrix adjective. Although the requirement is intuitively obvious, it is anything but clear how it is to be made precise in general syntactic terms.
For the time being, I will simply rely on the intuitive plausibility of the condition, without introducing any ad hoc formalization. One reason for this is that the problem is probably related to a final point to be mentioned.
Complement clauses of Comparatives can be reduced in a systematic way so that, again in intuitive terms, constituents that are identical to counterparts in the matrix clause are usually not realized at the surface. Two questions must be clarified in this respect: first, whether the phonetically empty elements are deleted under identity, or whether they are not inserted in the first place and then supplied by rules of interpretation or construal. Secondly, deletion as well as interpretation requires some kind of across-the-board correspondence in the sense of Williams (1977), which raises a number of intricate problems: across-the-board correspondence is straightforward for parallel structures, such as coordinate clauses, but Comparative constructions require it for a matrix and an embedded clause, the latter being a constituent of the former. Whatever the correct solution to these problems will turn out to be, I will presuppose here that it provides an independently motivated LF-representation accessible to semantic interpretation without additional assumptions. Let me illustrate the presumed SF-representations of complement clauses by some slightly simplified examples:
(104) a. John is taller than Bill (is).
      b. John is taller [than Wh c Bill is c pro]
      c. $\hat{c}\,[[\text{QUANT VERT BILL}] = [v + c]]$
(105) a. The table is two feet longer than (it is) high.
      b. The table is two feet longer [than Wh c it is c high]
      c. $\hat{c}\,[[\text{QUANT VERT}\ y] = [v + c]]$
(106) a. John is taller than Bill expected (him to be).
      b. John is taller [than Wh c Bill expected he be c pro]
      c. $\hat{c}\,[\text{PAST}\,[\text{EXPECT BILL}\,[[\text{QUANT VERT}\ y] = [v + c]]]]$
The parenthesized parts in (a) need not be phonetically realized. The bracketed parts of the LF-representations (b) have the SF-representation (c), where 'y' in (105c) and (106c) is a variable that must be bound by (i.e., must be coreferential with) the subject of the matrix clause. We will see later that the value of v will be 0 in all these cases. Notice that the representations in (c) are of category S/N, viz. expressions for properties of D-intervals, as mentioned above. This holds also for the SF-representation of than Bill or than high in (104) and (105). For ease of reference, I will use W as a variable over SF-expressions of this kind. Hence [W c] represents a proposition that ascribes the property W to the interval c.
Turning now to the analysis of Comparative constructions, remember first what we want their conceptual interpretation to be:
(107) a. a is longer than b.
      b. $f(a, L) \supset f(b, L) + c$
Given the SF-representation of D-adjectives proposed in (101), the basic point is fairly obvious: the D-extent assigned to the object that provides the standard for comparison must become the interpretation of the variable v. Notice that this holds equally for +Pol and −Pol adjectives.
(108) a. a is shorter than b.
      b. $f(a, L) \subset f(b, L) - c$
In order to derive the corresponding SF-structures in a compositional way, three things have to be accomplished:
(i) The complement clause than Wh c b is c pro must be interpreted as than Wh c b is c long for both (107) and (108). Why this is so, and how it can be achieved, will be discussed below.
(ii) The interval property represented by the complement clause must be turned into an extent that actually has that property. Given the property W as just introduced, this is achieved by prefixing the iota-operator '$\iota c$' to the proposition [W c]. Formally, this amounts simply to substituting the abstractor $\hat{c}$ contained in W by the iota-operator, which now binds the variable bound by $\hat{c}$. Notice that the uniqueness condition associated with the iota-operator requires the default-condition discussed above to be in force, as otherwise c would not be uniquely determined. This has exactly the desired consequences for the interpretation of comparative constructions.
(iii) The expression $\iota c\,[W\,c]$ must eventually be substituted for the variable v in the matrix adjective.
These considerations lead to the following SF-structure for longer (than):
(109) longer: $\hat{c}\,[\hat{W}\,[\hat{x}\,[[\text{QUANT MAX}\ x] = [\iota c'\,[W\,c'] + c]]]]$
The lexical representation for shorter will be identical to (109), except that '+' is to be replaced by '−'. Assuming now (104c) as the value specifying the abstractor $\hat{W}$, we derive (111) as the SF-representation for (110):²⁸
(110) John is shorter than Bill.
(111) $\exists c\,[[\text{QUANT VERT JOHN}] = [\iota c'\,[[\text{QUANT VERT BILL}] = [v + c']] - c]]$
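The step from (109) and (104c) to (111) can be made concrete with a small model-theoretic sketch in Python. It is only an illustration under simplifying assumptions, not part of the analysis itself: extents and differences are modelled as positive integers, the heights in `HEIGHT` are invented, and the names `iota`, `than_clause`, `comparative` and `holds` are hypothetical helpers.

```python
from typing import Callable, Iterable

# Hypothetical toy model: heights in centimetres (not from the paper).
HEIGHT = {"JOHN": 170, "BILL": 180}

# Assumption: D-intervals are modelled simply as positive integer lengths.
INTERVALS = range(1, 301)

Property = Callable[[int], bool]  # a property of D-intervals (category S/N)


def iota(prop: Property, domain: Iterable[int]) -> int:
    """Iota-operator: the unique value in `domain` satisfying `prop`."""
    hits = [c for c in domain if prop(c)]
    if len(hits) != 1:
        raise ValueError(f"iota undefined: {len(hits)} candidates")
    return hits[0]


def than_clause(name: str, v: int = 0) -> Property:
    """Analogue of (104c): the property of c such that height(name) = v + c, v = 0."""
    return lambda c: HEIGHT[name] == v + c


def comparative(sign: int):
    """Analogue of (109): abstraction over c, W and x; sign=+1 for '+', -1 for '-'."""
    return lambda c: lambda W: lambda x: HEIGHT[x] == iota(W, INTERVALS) + sign * c


taller, shorter = comparative(+1), comparative(-1)


def holds(adjective, subject: str, standard: str) -> bool:
    """Existential closure over the difference c, as in (111)."""
    W = than_clause(standard)
    return any(adjective(c)(W)(subject) for c in INTERVALS)


john_shorter_than_bill = holds(shorter, "JOHN", "BILL")
john_taller_than_bill = holds(taller, "JOHN", "BILL")
```

In this model the complement clause first yields a unique extent via `iota`, and the adjective then adds or subtracts the difference c, which mirrors the claim that the additive or subtractive component comes from the adjective. Because `iota` insists on uniqueness, a property satisfied by several intervals raises an error instead of returning a value.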
The analysis proposed so far is similar in some respects to that of Hellan (1981) and Von Stechow (1984). It differs, however, in crucial respects. The main point is that $\iota c\,[W\,c]$ - and hence the reading of the complement clause - replaces the extent-variable v, which automatically accounts for the fact that Comparative constructions of D-adjectives can never be norm-related, as N can show up only as a specification of v, which is no longer available. (Notice that there is a variable v in (111) which originates, however, from the adjective in the complement clause and must be 0 by general conditions.) Another crucial consequence of this analysis is that the additive (or subtractive) ingredient of Comparatives is not a contribution of the comparative morpheme, but rather of the adjective. This provides the required parallelism for +Pol and −Pol adjectives, giving the 'converse' effect of the Comparative without further ado. In this way condition (iii), formulated in Section 4, is met: Positive and Comparative constructions are in fact derived in a strictly parallel way for all types of adjectives.
There are a number of further consequences that follow from the above analysis. I will mention three of them. First, since W is in the scope of an iota-operator, which requires c to be uniquely specified, the complement clause cannot contain a negation: the value for c in Bill is not c tall is simply not uniquely determined. Hence the ungrammaticality of sentences like John is taller than Bill is not need not be captured by a stipulated syntactic constraint, but can be explained in terms of SF-properties.
Secondly, being in the scope of $\iota c$ gives W the status of a presupposition (or conventional implicature) - very much like restrictive relative clauses. This does exempt the complement clause from negation. Hence the negation in (112a) does not affect the implicit proposition that Bill is c'-tall. As furthermore the iota-operator requires uniqueness of c', the default interpretation of '+' (and '−') remains in force for the complement clause, so that John is not taller than Bill can only mean that John is at most as tall as Bill. This can be seen by the equivalence of (112b) and (112c). (Here and in the sequel I will abbreviate QUANT DIM by D'.)
(112) a. John is not taller than Bill.
      b. $\neg[\exists c\,[[D'\ \text{JOHN}] = [\iota c'\,[[D'\ \text{BILL}] = [v + c']] + c]]]$
      c. $\exists c'\ (c'\ \text{unique})\ [[[D'\ \text{BILL}] = [v + c']] \wedge \neg\exists c\,[[D'\ \text{JOHN}] = [c' + c]]]$
The third point concerns the content of the complement clause. Here the relevant consequences follow only if further assumptions are introduced. They will be motivated below. What is at issue is the third class of problems discussed in Section 1, viz. the ungrammaticality of sentences like (24b), repeated here as (113):
(113) * John is shorter than Bill is short.
In terms of the conceptual interpretation, the complement clause of a Comparative has to specify a property of an extent, i.e., of an interval that starts at 0, because v must be interpreted by an extent. This corresponds to the intuitive notion that both taller than b and shorter than b refer to b's height, rather than b's shortness. This condition is met by specifying the v of the complement clause by 0, so that $\iota c'\,[W\,c']$ comes out as an extent. As an automatic consequence of this condition, −Pol adjectives are not admitted in the complement clause because they require c to be subtracted from v, which is impossible for v = 0. Hence c' pro must be interpreted as c' tall in the complement of both taller and shorter. That is in effect what (111) illustrates. We will return to this point in the next section.
As it stands, the analysis of the Comparative given in (109) is inadequate in two respects: first, it does not account for the analytic Comparative more, hence it is not sufficiently general. And, secondly, it does not fit into the syntactic structure indicated in (103).²⁹ What we need to do in order to complete the analysis is to strip out, so to speak, the effect of the comparative morpheme from (109). To that effect, the open variable v in the SF-structure of D-adjectives must be made available by means of a prefixed abstractor $\hat{v}$ for syntactically determined specification. Suppose, now, that A is a variable over SF-structures of D-adjectives as given in (109), i.e., A is of category (S/N)/N and contains a free variable v. Then $[\hat{v}\,[A]]\,[\iota c'\,[W\,c']]$ will have the desired result. This structure must, of course, be prefixed by abstractors binding A and W, which yields (114) as the SF-structure of more and -er (where -er must later be attached to the following adjective by the familiar process of affix hopping).
(114) more/-er: $\hat{A}\,[\hat{W}\,[[\hat{v}\,[A]]\,[\iota c'\,[W\,c']]]]$
If (114) is applied to SF-structures of adjectives like (101), it yields representations such as (109) by successive lambda conversion. One further adjustment is called for, though, in order to account for the measure phrases that may appear under Deg₁ in (103). These must first be absorbed by the Comparative morpheme, yielding the SF-structure of two feet more, etc. We thus end up with (115) as the SF-structure of more:
(115) more/-er: $\hat{c}''\,[\hat{A}\,[\hat{W}\,[[\hat{v}\,[A\ c'']]\,[\iota c'\,[W\,c']]]]]$
If this is first applied to the NP two feet, the abstractor $\hat{c}''$ is specified, giving the following SF-structure:
(116) two feet more/-er: $\hat{A}\,[\hat{W}\,[[\hat{v}\,[A\ \text{2 FOOT}]]\,[\iota c'\,[W\,c']]]]$
The expression 2 FOOT will eventually be absorbed by the abstractor $\hat{c}$ in A, such that the Degree constituent indeed specifies the c and hence receives its θ-role - although its SF-representation is by no means a simple constant of category N. If, on the other hand, no measure phrase appears under Deg, then $\hat{c}''$ is converted by rule (50) into a referential operator with narrowest scope that ultimately binds the grade-variable in A.³⁰ In order to achieve that result, (116) must apply to the SF-representation A and W of the adjective and the complement clause, in that order, which is in line with the LF-structure (103).
We will turn now to the Equative construction. The basic idea is this: while the Comparative specifies the extent variable v by means of the complement clause, the Equative in the same way specifies the difference variable c. As this account differs substantially from previous theories, let me first give some intuitive motivation for it. Obviously (116a) could approximately be construed as having the interpretation (116b). The situation is more complicated with (117), which means something like (118), involving the presupposed norm-relatedness indicated in (117b):
(116) a. a is as long as b.
      b. $f(a, L) = f(b, L)$
(117) a. a is as short as b.
      b. $f(a, L) = f(b, L) \wedge (f(b, L) \subset N - c)$
(118) a is as much below the norm as b.
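The contrast between (116b) and (117b) can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: extents are represented by their end points on a single scale, `NORM` is a hypothetical value for N, and the second conjunct of (117b) is read as saying that b's extent ends below the norm by some positive difference c. None of these names belongs to the formalism itself.

```python
NORM = 10  # hypothetical norm N on the scale of length (arbitrary units)

def equative(a_len, b_len, negative_pole=False):
    """Evaluate 'a is as long as b' (+Pol) or 'a is as short as b' (-Pol).

    Returns a pair (assertion, presupposition):
    - assertion: f(a, L) = f(b, L), the shared conjunct of (116b) and (117b)
    - presupposition: True for +Pol; for -Pol, True only if there is a
      positive c with b's extent reaching at most N - c, i.e. b is short.
    """
    assertion = a_len == b_len
    if not negative_pole:
        return assertion, True
    c = NORM - b_len          # largest difference compatible with b's extent
    return assertion, c > 0

print(equative(12, 12))                      # (True, True)
print(equative(12, 12, negative_pole=True))  # (True, False)
print(equative(7, 7, negative_pole=True))    # (True, True)
```

In the second call the assertion holds but the norm-related presupposition fails, which is the situation in which as short as is felt to be inappropriate.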
As (118) suggests, the Equative construction in fact involves an equality of differences rather than of extents directly. Furthermore, while -Pol D-adjectives are excluded from complement clauses of Comparatives, this is not the case for Equatives:
(119) a. John is as tall as Bill is short.
      b. The window is almost as low as it is narrow.
Hence complement clauses of Equatives are not subject to the restrictions that hold for extents. These, and a number of further relevant properties of Equatives, follow directly if we assume (120b) to be the SF-representation of (120a):
(120) a. John is as short as Bill (is).
      b. $[[\mathrm{QUANT\ VERT\ JOHN}] = [N - [\eta c'\,[[\mathrm{QUANT\ VERT\ BILL}] = [N - c']]]]]$
The eta-operator that binds c' in (120b) is identical with the iota-operator, except that it does not require uniqueness. This accounts for two interrelated facts: first, Equatives admit negative complement clauses, which Comparatives do not:
(121) a. John is as tall as nobody else.
      b. *John is taller than nobody else.
Secondly, the preferred interpretation of negated Equatives is not based on the uniqueness of c'. This can be seen as follows:
(122) a. John is not as short as Bill.
      b. $\neg\,[[D'\ \mathrm{JOHN}] = [N - [\eta c'\,[[D'\ \mathrm{BILL}] = [N - c']]]]]$
      c. $\exists c'\,[([D'\ \mathrm{BILL}] = [N - c']) \wedge \neg\,[[D'\ \mathrm{JOHN}] = [N - c']]]$
The preferred interpretation of (122a) is that of John is shorter than Bill, plus the presupposition that Bill is short. (122b) indicates the SF-structure of (122a), and it is by definition equivalent to (122c). The presupposition is maintained as the complement clause is not affected by the negation. The
matrix clause, however, is in the scope of negation, so that the default reading of ' - ' does not apply, and the lack of the uniqueness condition on c' automatically provides the preferred interpretation. Notice that I have assumed in (120b) that the extent variable v is specified by N for both the matrix adjective and the adjective of the complement clause. This assumption follows from the conditions to be discussed below. It provides, of course, the norm-relatedness involved in these constructions, and it gives it the status of a presupposition, as N occurs within the scope of the eta-operator. This holds only for -Pol adjectives, though. +Pol adjectives admit both N and 0 as the value of v in these constructions, hence norm-relatedness is not obligatory for as tall as, but only for as short as. Using the notational conventions introduced above, the SF-structure of the Degree-morpheme as can now be given as follows:
(123) as (of Deg): $\hat{A}\,[\hat{W}\,[A\ [\eta c'\,[W\ c']]]]$
If A and W are specified by appropriate readings, (123) yields SF-representations as indicated in (120). Notice that again the Degree constituent ultimately specifies the abstractor c of the D-adjectives to which it applies. There is one point to be added to (123), in order to account for constructions like twice as tall as S. Let π be a variable over SF-structures assigned to NPs like three times, twice, half etc., which might be called 'factor phrases'. π is a functor of category N/N, and represents multiplication of its argument, so that [π c] is interpreted as the product of c and the factor contained in π. Obviously, π must apply to the extent specified by the complement clause of Equatives. Furthermore, factor phrases must be optional in a different way from measure phrases like two feet: absence of measure phrases triggers the Unspecified Argument Rule, which provides a contextually specified value; absence of factor phrases is nothing but absence.³¹ I represent this optionality by including π and the abstractor by which it is bound into parentheses. We thus get (124) instead of (123):
(124) as (of Deg): $(\hat{\pi})\,[\hat{A}\,[\hat{W}\,[A\ [\eta c'\,[W\ [(\pi)\ c']]]]]]$
Notice that π must be precluded from -Pol D-adjectives in order to account for the ungrammaticality noted above:
(17) a. *John is half as short as Bill is.
This will be taken care of by the conditions to be discussed below. I will now briefly consider constructions with less (which I will call Subtractive constructions). Although less derives from the Comparative of little, Subtractive constructions are more closely related to Equatives than to
Comparatives: their complement clause specifies the difference variable c rather than the extent variable v. To see this, consider the intended conceptual interpretation:
(125) a. a is two feet less short than b.
      b. $[f(a, L) \supset f(b, L) + 2\ \mathrm{ft}] \wedge (f(b, L) \subset N - c')$
Just like Equatives, Subtractives preserve norm-relatedness for -Pol adjectives, and they are based on the comparison of differences rather than extents directly, as the paraphrase (126) suggests:
(126) The extent to which a is short is two feet smaller than the extent to which b is short.
On the other hand, Subtractives share with Comparatives the uniqueness condition on c', since they do not allow negated complement clauses. Hence, they are based on the iota-operator. I will thus assume the SF-representation (127b) for sentences like (127a):
(127) a. John is less short than Bill.
      b. $\exists c''\,[[D'\ \mathrm{JOHN}] = [N - [\iota c'\,[[D'\ \mathrm{BILL}] = [N - c']] - c'']]]$
Under standard assumptions, (127b) determines the appropriate conceptual interpretation. And it follows from the assumptions introduced so far, if we set up (128) as the SF-structure for less:
(128) less (of Deg): $\hat{c}''\,[\hat{A}\,[\hat{W}\,[A\ [\iota c'\,[W\ c'] - c'']]]]$
Notice that there are three occurrences of '-' in (127b). The first two of them derive from short (and the pro corresponding to it). They would be replaced by '+' if tall is substituted for short. The last one derives from less, indicating its relatedness to little (which I will not pursue here). So far, I have considered only constructions whose complement can reasonably be construed as (the residue of) a clause, including cases like than Bill, than tall, than he believes, etc. Notice, incidentally, that even completely missing complements are taken care of, because in those cases the Unspecified Argument Rule would turn W into a referential operator that picks an appropriate property from the context. Thus the but-clause of (129a) would have the SF-structure (129b), where the most likely interpretation of W is provided by the initial clause:
(129) a. Bill is tall, but John is taller.
      b. $\exists c''\,\exists W\,[[\mathrm{QUANT\ VERT\ JOHN}] = [\iota c'\,[W\ c'] + c'']]$
Consider now sentences like (130):
(130) a. John is taller than six feet.
      b. Sue is shorter than six feet.
Here the status of the measure phrase as a residue of a complement clause is rather dubious: there is no than six feet is. Intuitively, the measure phrase should directly specify the extent variable v, with (131) as the SF-structure of (130a):
(131) $\exists c''\,[[\mathrm{QUANT\ VERT\ JOHN}] = [6\ \mathrm{FOOT} + c'']]$
This would require, however, an ad hoc adjustment of the analysis of more/-er given in (115). Can we preserve the analysis given so far and still derive something like (131)? What would be needed to achieve this is some means that turns a measure phrase into a property of intervals. There are various possibilities to be considered here, depending on how measure phrases are to be treated for independent reasons. One way would be to consider expressions like '6 foot' as belonging alternatively to category N or S/N.³² With this proviso, measure phrases could be treated directly as a possible complement to Comparatives. The result would be structures like (131) with ιc' [6 FOOT c'] instead of '6 foot'. Another possibility would be to assign (132a) the SF-structure (132b):
(132) a. than six feet
      b. $\hat{c}\,[6\ \mathrm{FOOT} = c]$
This alternative is similar in spirit to a proposal made in Bresnan (1973). It can be elaborated in various ways. I will leave it at that, concluding merely that the analysis of sentences like (130) can be reconciled with the present account. Notice, incidentally, that along with (130), we have (133), where (a) is ultimately equivalent to (130a), while (b) is ungrammatical:
(133) a. John is more than six feet tall.
      b. *Sue is more than six feet short.
This follows directly from the present analysis, since the measure phrase in (130) specifies the extent variable v, while more than six feet is a degree constituent that specifies the difference variable c and is excluded from -Pol adjectives for the same reason as simple measure phrases.
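The step from a measure phrase to a property of intervals, as in (132b), and the role of the iota-operator in (131) can be pictured with a small sketch. It is only an illustration under assumptions of my own: extents are numbers on a finite grid called `DOMAIN`, and `measure_property`, `iota` and `taller_than` are hypothetical helper names, not part of the SF formalism.

```python
DOMAIN = [x / 2 for x in range(0, 21)]  # candidate extents 0.0 .. 10.0 (feet)

def measure_property(measure):
    """(132b): turn a measure such as 6 (feet) into a property of extents."""
    return lambda c: c == measure

def iota(prop, domain):
    """Return the unique element of domain satisfying prop.

    Raises ValueError when there is no satisfier or more than one,
    mirroring the uniqueness requirement of the iota-operator.
    """
    hits = [c for c in domain if prop(c)]
    if len(hits) != 1:
        raise ValueError("iota requires exactly one satisfier")
    return hits[0]

def taller_than(height, prop, domain=DOMAIN):
    """(131)-style check: is there a positive c'' with
    height == iota(prop) + c''?"""
    standard = iota(prop, domain)
    return height - standard > 0

print(taller_than(6.5, measure_property(6)))  # True
print(taller_than(5.5, measure_property(6)))  # False
```

A property that holds of several extents, for example `lambda c: c > 6`, makes `iota` fail, which is the behaviour that separates it from the eta-operator used for Equatives.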
The last remark brings me to a bunch of problems which I have deliberately avoided so far. I am referring to the analysis of much, many and their comparative more. In some sense, these items are expressions of gradation par excellence. In spite of this, they cannot be treated here in full detail, because they raise a large number of problems whose analysis would go far beyond the present limits. The main point is their oscillating syntactic status: they can occur as quantifiers, adjectives, adverbials, degree-modifiers, and it is by no means obvious whether these different occurrences can all be reduced to the same SF-structure, although there must be a common core in any event. In order to indicate how the present theory might eventually cover these items, I will give a few hints regarding their SF-representation. Obviously, many and few as well as much and little are antonyms in much the same way as +Pol and -Pol adjectives and should hence be analysed in parallel fashion. Their characteristic property is not the fixing of any particular dimension or condition for comparison. They represent, so to speak, gradation per se. As a first upshot, we might therefore assume that much/many differs from tall, long etc. in that it has a variable instead of one of the constants VERT, MAX, etc. We would thus have the following SF-representations, where Q is a variable of category S/N:
(134) much: $\hat{c}\,[\hat{Q}\,[\hat{x}\,[[\mathrm{QUANT}\ Q\ x] = [v + c]]]]$
(135) more: $\hat{c}''\,[\hat{Q}\,[\hat{W}\,[\hat{x}\,[[\mathrm{QUANT}\ Q\ x] = [\iota c'\,[W\ c'] + c'']]]]]$
Notice that (134) is of exactly the same category as the SF-structure of attributive adjectives indicated in (98) above. Thus (134) accounts for the adnominal much, and (135) for its lexically fixed Comparative. Here are some provisional illustrations, based on obvious assumptions and abbreviations:
(136) [N' much water]: $\hat{x}\,[\exists c\,[[\mathrm{QUANT\ WATER}\ x] = [N + c]]]$
(137) [N' two gallons more water than wine]: $\hat{x}\,[[\mathrm{QUANT\ WATER}\ x] = [\iota c'\,[\exists y\,[[\mathrm{QUANT\ WINE}\ y] = [0 + c']]] + 2\ \mathrm{GALLON}]]$
Interestingly, (135) can also be used as a modifier of adjectives as in (138), which is not to be confused with the Comparative more, nor with the Degree-modifier, to which I shall turn presently.
(138) John is more tall than slim.
In order to make the pertinent SF-structure more perspicuous, I shall introduce the following notational abbreviation, which will be useful for other purposes as well:
(139) $\hat{x}\,[\mathrm{TALL}\ x] =_{\mathrm{def}} \hat{x}\,[\exists c\,[[\mathrm{QUANT\ VERT}\ x] = [v + c]]]$
The definiens of (139) is the Positive reading of tall, the v of which will become N by general condition. Similar abbreviations can be defined for all D-adjectives. With this proviso, the SF-structure of (138) derived by means of (135) would be as follows:
(140) $\exists c''\,[[\mathrm{QUANT\ TALL\ JOHN}] = [\iota c'\,[[\mathrm{QUANT\ SLIM\ JOHN}] = [0 + c']] + c'']]$
This presupposes that the LF-structure of (138) is (141) with the usual assumption about pro, whose SF-structure will be that of much.
(141) John is more tall [[than Wh c] John is c pro slim]
Notice that according to (140), (138) does not express a comparison between two extents of John, but rather between the extents to which he is tall and slim, respectively. In effect, then, the comparison is shifted from the scale of physical extension to another, more complex scale. This becomes explicit if we spell out the defined structure of TALL (and SLIM, for that matter). We will return to that kind of shift in Section 7. So far, I have not introduced in the analysis of much and more any new concepts over and above those already used. There is one point, however, that needs some clarification. Let us return to the categorization and interpretation of QUANT and DIM. The intuitive notion that DIM maps its argument into a pertinent D-scale so that QUANT 'scans' the extent of the argument can be made precise in various ways. The notation used so far can reasonably be interpreted in the way of (142a), which might, however, be construed as an abbreviation for something like (142b):
(142) a. [category tree: QUANT (N/N) applied to DIM (N/N) applied to x (N), yielding N]
      b. [category tree over QUANT, DIM, x and c; the extraction preserves only the category labels (N/S)/N, S/N, N and S]
The crucial point is that the analysis of much and more, as it stands, cannot be reconciled with the assumptions embodied in either (142a) or (142b), as we have assumed that Q is of category S/N. What we need is a functor that
turns one-place predicates like WATER into relations (or functors) that associate the argument of the predicate with a scale of quantities. Suppose that AMOUNT represents such a functor, so that (143b) would be interpreted as mapping an amount of water x into a D-scale of amounts. (143a) assimilates this structure to that of (142a):
(143) a. [category tree: AMOUNT ((N/N)/(S/N)) applied to WATER (S/N), then to x (N)]
      b. [category tree over AMOUNT, WATER, x and c; only the labels S and N are recoverable]
In other words, [AMOUNT Q] turns x into a quantity of Q. This is straightforward for mass nouns like water, sand etc. In order to include adjectives like tall, slim into the values of Q, we must generalize its interpretation, so that [AMOUNT TALL] specifies the amount to which x is tall, i.e. above the norm of height. Notice that by the definition of tall this reduces ultimately to an interval on the D-scale of height, telescoping so to speak the norm-relatedness. With these considerations in mind, (134) is to be replaced by (144):
(144) much: $\hat{c}\,[\hat{Q}\,[\hat{x}\,[[\mathrm{QUANT\ AMOUNT}\ Q\ x] = [v + c]]]]$
A similar adjustment is to be made for more. Suppose now that we let AMOUNT apply also to sets, where it specifies their cardinality, and we represent the Plural, somewhat ad hoc, by a functor SET, which turns a common noun into a set of individuals that have the property represented by the noun. Then the SF-structure of many would be identical to that of much given in (144), assuming appropriate syntactic categorisation. Let us briefly look now at cases like more than six feet tall, i.e. at the Degree modifier. Obviously, more than six feet must determine an interval, just as the measure phrase six feet, that is, its SF-representation must be an expression of category N that eventually specifies the abstractor c of tall. In other words, the Degree modifiers more and much merely provide an interval amenable to further specification. The obvious way to incorporate these conditions into the present framework is to set up the following SF-structures:
(145) much (of Deg): $\hat{c}\,[\eta c'\,[c' = [v + c]]]$
(146) more (of Deg): $\hat{c}''\,[\hat{W}\,[\eta c\,[[c = [\iota c'\,[W\ c'] + c'']]]]]$
(147) less (of Deg): $\hat{c}''\,[\hat{W}\,[\eta c\,[[c = [\iota c'\,[W\ c'] - c'']]]]]$
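The idea behind AMOUNT in (143) and (144), one functor that yields a quantity for mass nouns and a cardinality for sets, can be pictured as follows. This is a rough sketch under assumptions of my own: a mass noun is modelled as a dictionary from portions to quantities, a plural noun as a Python set, and the names `amount`, `much`, `WATER` and `BOOKS` are purely illustrative.

```python
def amount(q):
    """Hypothetical stand-in for AMOUNT: turn Q into a measure function.

    - If q is a dict from portions to quantities (a toy mass noun such as
      WATER), the measure of a portion x is its recorded quantity.
    - If q is a set of individuals (a toy [SET noun]), the measure of a
      group x is the number of its members that belong to q.
    """
    if isinstance(q, (set, frozenset)):
        return lambda x: len(set(x) & q)
    return lambda x: q[x]

def much(q, x, v, c):
    """(144)-style condition: [QUANT AMOUNT Q x] = [v + c]."""
    return amount(q)(x) == v + c

WATER = {"glass": 1, "bucket": 40}          # quantities in arbitrary units
BOOKS = {"b1", "b2", "b3", "b4"}            # individuals that are books

print(amount(WATER)("bucket"))              # 40
print(amount(BOOKS)({"b1", "b2", "b3"}))    # 3
print(much(WATER, "bucket", v=10, c=30))    # True
```

The same `much` condition serves both cases, which is the sense in which many and much can share one SF-structure once AMOUNT is allowed to apply to sets.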
On the basis of previous assumptions, we would get, by means of (146), the following SF-structure for (133a):
(133) a. John is more than six feet tall.
(148) $[[\mathrm{QUANT\ VERT\ JOHN}] = [0 + [\eta c\,[\exists c''\,[c = [\iota c'\,[6\ \mathrm{FOOT} = c'] + c'']]]]]]$
I leave it to the reader to verify that (148) indeed results compositionally from the involved lexical representations,³³ and that it is conceptually equivalent to (131), in spite of its different structure. Although a large number of more complex structures, such as much less than six feet tall, much more than two feet shorter etc., are captured by the discussed representations, there are various problems that I have not touched. It should be sufficiently clear, however, that the basic assumptions of the present theory can be elaborated in natural ways to cover the complexities of more and its cognates. To complete this section, I will briefly look at some other Degree constructions. Consider first too and enough. What they have in common is that they take complement clauses describing a condition which is turned into an extent that fixes a target or debit. Let us represent this target function, somewhat ad hoc, by an operator TARG of category N/S which takes some proposition P and derives from it a D-extent that serves as the target in question. There is a lot more to say about the syntax and semantics of the pertinent complements, and about the nature of TARG.³⁴ The present stipulation is sufficient, however, to bring out the fact that too and enough are the target-analogues to Comparatives and Equatives, respectively. This is indicated by the following tentative SF-representations:
(149) too (of Deg): $\hat{c}''\,[\hat{A}\,[\hat{P}\,[[\hat{v}\,[A\ c'']]\ [\mathrm{TARG}\ P]]]]$
(150) enough (of Deg): $\hat{A}\,[\hat{P}\,[A\ [\mathrm{TARG}\ P]]]$
The correspondence between these structures and (115) and (123) is obvious. To illustrate the gist of this account, I give the following examples without further comment:
(151) John is two feet too short to be in the team.
(152) $[[\mathrm{QUANT\ VERT\ JOHN}] = [[\mathrm{TARG}\ [\mathrm{JOHN\ IS\ IN\ THE\ TEAM}]] - 2\ \mathrm{FOOT}]]$
(153) John is tall enough to be in the team.
(154) $[[\mathrm{QUANT\ VERT\ JOHN}] = [0 + [\mathrm{TARG}\ [\mathrm{JOHN\ IS\ IN\ THE\ TEAM}]]]]$
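Read at face value, (152) and (154) are simple equations over extents once TARG has delivered a target. The sketch below spells this out under assumptions of my own: `TEAM_MINIMUM_FT` is an invented target value, `targ` is a toy lookup rather than an analysis of TARG, and '=' is taken literally, so that an 'at least' reading of enough would need a weaker relation than the one coded here.

```python
TEAM_MINIMUM_FT = 6  # hypothetical value of TARG [JOHN IS IN THE TEAM]

def targ(proposition):
    """Toy TARG: look up the extent that a proposition sets as target."""
    targets = {"JOHN IS IN THE TEAM": TEAM_MINIMUM_FT}
    return targets[proposition]

def too_short(height_ft, shortfall_ft, proposition):
    """(152): [QUANT VERT JOHN] = [TARG P] - shortfall."""
    return height_ft == targ(proposition) - shortfall_ft

def enough(height_ft, proposition):
    """(154): [QUANT VERT JOHN] = 0 + [TARG P], with '=' taken literally."""
    return height_ft == 0 + targ(proposition)

print(too_short(4, 2, "JOHN IS IN THE TEAM"))  # True
print(enough(6, "JOHN IS IN THE TEAM"))        # True
print(enough(5, "JOHN IS IN THE TEAM"))        # False
```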
I will finally sketch the analysis of the Superlative. Following usual practice, I take it to be related to the Comparative construction in the sense indicated by the following paraphrase, where C refers to some (contextually determined) set, to which John belongs:
(155) a. John is the tallest (C).
      b. John is taller than any other C.
Ignoring various details, we may assume that (156) is the SF-structure of (155), where Q is a property that defines C:
(156) $\forall y\,\exists c\,[[Q\ y] \wedge [y \neq \mathrm{JOHN}] \rightarrow [[\mathrm{QUANT\ VERT\ JOHN}] = [\iota c'\,[[\mathrm{QUANT\ VERT}\ y] = [v + c']] + c]]]$
Treating most/-est as an element of Deg, we can assign to it the following SF-structure:
(157) most/-est (of Deg): $\hat{A}\,[\hat{Q}\,[\hat{x}\,[\forall y\,\exists c\,[[Q\ y] \wedge [y \neq x] \rightarrow [[\hat{v}\,[A\ x\ c]]\ [\iota c'\,[A\ y\ c']]]]]]]$
Thus the Superlative in fact resembles the Comparative, as it specifies the extent-variable v of its head, and it has the property of a specifier of adjectives, insofar as it ultimately specifies the difference abstractor c of its head. This outline completes the discussion of Comparatives and related constructions.

6. CONDITIONS ON EXTENT-VARIABLES

One of the cornerstones of the present theory is the extent-variable v, which has rather peculiar properties: it is not bound in the lexical SF-structures in which it occurs by any operator, thereby differing from other variables. It can be bound (and accordingly specified) by a compositionally introduced abstractor v in the Comparative and its cognates. If it is not so specified, it must be substituted by either 0 or N, the admissibility of one or the other of these values being subject to certain conditions. So far, I have made scattered remarks about these conditions, indicating only their effect by intuitively motivated choices of 0 or N. Before I discuss these conditions
more systematically, I will make a few general remarks about their nature. Based on considerations about the structure of comparison, the extent variable v was brought up in order to get a uniform lexical representation of both the norm-related and the norm-neutral reading of D-adjectives. In actual fact, however, its function is of a much more general character, as the difference in question is proliferated through a wide range of constructions. Hence v and the conditions on it in a sense extract general properties of these constructions, allowing for a systematic account of various interacting factors. In this way, v and the conditions in question play a crucial role in explaining the facts discussed in Section 1, and various other phenomena discussed in Sections 4 and 5. Notice that this approach is similar in spirit to the construction of explanatory principles in syntax by extracting general principles, such as the subjacency condition, the principles of Case theory, etc., in order to simplify the class of permissible rules. The conditions on v in much the same way reduce properties of particular SF-representations to generalized properties of possible SF-structures. I will briefly return to this aspect in Section 8. In order to state the conditions in question, let us systematize the relevant patterns of occurrence. Notice, first, that v originates in the configuration [... [v + c]] or [... [v - c]], where c is always bound by an operator. We thus can restrict our attention to the configuration [v § c], where § is a variable over + and -. The combinatorial process will now derive the following configurations:
(158) a. [v § c]                Positive constructions
      b. [[v1 § c1] § c2]       Comparative and its cognates
      c. [v1 § [v2 § c2]]       Equative constructions
In (b), the v of the head has been substituted by [v1 § c1], in (c) the c of the head has been substituted by [v2 § c2]. There is one interesting property of these combinatorial structures: while c (and c1, c2 for that matter) allow for further substitution, so that we get patterns like (159), of which we have seen various instances, there is no further recursion on v beyond (158b). This is a direct consequence of the fact that v is not bound by an abstractor, except secondarily in Comparatives.³⁵
(159) a. [v1 § [[v2 § c2] § c3]]    Subtractives (cf. (127))
      b. [[v1 § c1] § [c2 § c3]]    (e.g.: more than two feet taller than Bill)
We note, furthermore, that Equatives and Subtractives necessarily have two variables v1 and v2 as indicated in (158c) and (159a), respectively. We will say that in these configurations v1 commands v2. Suppose now that every v occurring in a final derived SF-structure is
arbitrarily substituted by either N or 0. We then have the following possible values eventually replacing a lexically induced v:
(160) a. ιc [W c] for some property of intervals W    (by lambda conversion)
      b. TARG P for some proposition P                 (by lambda conversion)
      c. 0, N by arbitrary choice
Let us call (a) and (b) compositional, and (c) free specifications of v. We shall now state the conditions in question as conditions on specifications of v. Their primary effect will be to sort out non-admissible free specifications of v, thereby determining which constructions can or must be norm-related, and which ones not. The status of the conditions is thus similar to that of filters or output conditions. The conditions will in part be of a rather indirect character, and they will in fact account for a wide range of phenomena beyond norm-relatedness. It seems to me that the actual content of the conditions can be expressed in a number of ways, pending further clarification of the nature of SF-representations and SF-rules. Hence the way along which I will proceed is exploratory, rather than definitive. Let me first introduce the following notation:
(161) spec(v, E) is the (compositional or free) specification of a variable v in a given SF-representation E.
(160) lists the types of possible cases of spec(v, E). There are two ways to make (161) applicable. (i) The definition applies to fully interpreted LF constituents, tracing so to speak the sequence of substitutions. (For free specifications, this condition is trivial, but for compositional specifications it involves systematic comparison of stages of derivation.) (ii) The 'tracing' is actually encoded in the resulting representations. This can again be done in at least two ways, both of which can be formulated as notational modifications of the rule of lambda-conversion:
(162) a. $\hat{x}\,[P\ x]\ a \rightarrow [P\ [_{x}\ a]]$
      b. $\hat{x}\,[P\ x]\ a \rightarrow [P\ x_i]\ a_i$
(a) is similar in spirit to the use of traces in syntax: it encodes a property of the input into the output of a rule, but it preserves the original operation of lambda-conversion. (b) provides the same information as original lambda-conversion, but generates radically different structures. It is similar in spirit to coindexing as generally used in syntax.³⁶ As nothing in particular depends on the choice between these three alternatives in the present context, I will adopt here the most conservative alternative (i), although (iib) might turn out, in the long run, to be more appropriate in many respects. Notice that the definition (161) can be extended to complex SF-structures without further assumptions. What we will need besides spec(v, E) is in particular spec([v § c], E). Another concept that is useful in formulating the conditions is defined in (163):
(163) Int(X) is the range of conceptual structures that are possible interpretations of the SF-expression X.
Thus Int(0), e.g., is the interval (0, 0) of arbitrary D-scales with a designated D0, and Int(§) are the operations + and - on D-scales. Similarly for complex X along the lines sketched so far. I shall now formulate and briefly comment on the conditions:
(164) C1: $\mathrm{Int}(\mathrm{spec}(v, E)) = D_0$ for arbitrary E.
C1 says that any specification of v must be interpreted by an extent, i.e., an interval that starts at 0. That includes, of course, the empty extent (0, 0). Hence C1 is not restrictive with respect to N and 0 directly, but it has restrictive consequences in an indirect way. Consider configurations like (158b), originating from comparatives, where ιc1 [... [v1 § c1]] is spec(v) of the matrix adjective. (I will henceforth drop E, where it is arbitrary.) Now Int(ιc1 [... [v1 § c1]]) is an extent only if spec(v1) = 0. Hence C1 applied to v requires v1 to be 0 in (158b). Thus C1 guarantees that c is always added to, or subtracted from, an extent. This captures the fact that John's height, not his difference with respect to N, is at issue in Bill is taller than John. C1 has various further consequences by interaction with other conditions. We shall see some of them as we proceed. Intuitively C1 says that only extents can serve as the standard of comparison. (That includes, among others, also Int(TARG P).)
(165) C2: $\mathrm{Int}(\mathrm{spec}([v\ \S\ c], E)) = D$ for arbitrary E.
According to C2, the interpretation of spec([v § c]) must always be an interval. Hence C2 excludes [0 - c] for arbitrary spec(c). Therefore v in this context must either be specified compositionally, i.e., by the Comparative, or it must be N. This has various consequences. C2 guarantees, among other things, that -Pol adjectives must be norm-related in the Positive, but not in the Comparative. Together with C1, it guarantees that no -Pol adjectives are admitted in the complement clause of Comparatives, as these must be interpreted by an extent. This, then, explains the third group of facts discussed in Section 1 above. It guarantees, in particular, that in Bill is shorter than John again John's height is at issue. C2 also requires -Pol adjectives to be norm-related in the Equative. This follows in two steps: first, complement clauses of Equatives specify c, not v of the matrix adjectives, hence they are not subject to C1 and therefore admit -Pol adjectives. Secondly, both the matrix and the complement adjective are subject to C2. Therefore both v1 and v2 in configurations like (158c) and (159a) must be N. The intuitive content of C2 is almost trivial: it simply requires interpretability, as there are no intervals outside (or 'below') D.
(166) C3: $\mathrm{spec}(v_1, E) = \mathrm{spec}(v_2, E)$ for $E = [\ldots v_1\ \S\ [\ldots v_2 \ldots]]$
C3 applies to Equatives and Subtractives, where v1 commands v2, and it requires that either both are 0 or both are N. It admits either 0 or N for +Pol adjectives, provided that both instances have the same specification. This corresponds to the fact that John is as tall as Bill is open to norm-related or to norm-neutral interpretation. Together with C2, it determines the norm-related interpretation of tall in cases like John is as tall as Bill is short.³⁷ Notice that there is some overlap with C2 here. As I have just shown, the result of C3 follows independently for -Pol adjectives. C3 cannot simply be dismissed, though, for the reasons just discussed. Intuitively, C3 expresses the fact that comparison of two intervals starts at the same point of the scale. The next condition will take care of measure and factor phrases, which have various properties in common. With respect to distribution, both are restrictive in their combination with -Pol adjectives: measure phrases are excluded from the Positive, factor phrases from the Equative of -Pol D-adjectives, although measure phrases are not restricted with respect to the Comparative. A second property they share is the exclusion of norm-relatedness, which means that they determine an interval that is either an extent, or a difference that does not start at N. We can formally express the common fate of measure and factor phrases by exploiting another property they have in common. I have introduced an SF-variable π of category N/N ranging over SF-structures of factor phrases. It seems reasonable to assume that measure phrases, which by definition represent multiples of D-units, also include an expression of that kind, so that their SF-structure is in effect of the form [π c], where Int(c) is some unit of a D-scale. This structure is identical to that created by factor phrases, except for the condition on Int(c). We are thus led to the following condition on v in the context of multiplication:
(167) C4: $\mathrm{spec}(v, E) \neq N$ for $E = [v\ \S\ [[\pi\ c] \ldots]]$
In the form I have stated C4 it applies also to cases like (133), ruling out Sue is more than six feet short, which requires a specification of c of the form [N - [[6 FOOT] + c']]; thereby it violates C4. That spec(v) in this case must be N follows from C2. More generally, multiplication in the context of the Positive or Equative of -Pol adjectives violates either C2 or C4. Hence C4 predicts not only the conflict arising in the interpretation of Mary is five feet short, but also the strategy of illegal escape: the adjective is interpreted in two ways simultaneously, once by ignoring the measure phrase, once by ignoring the '-'. These are the minimal adjustments leading to compliance with both C2 and C4. A more complicated escape strategy is predicted for cases like Mary is twice as short as Bill, since the Equative must furthermore meet C3, which cannot be met by this strategy. I will return to this problem in the next section. Intuitively, C4 says that measurement starts at 0 or at the end of some particular specified interval. C1 to C4 determine fixed values of v for all cases except for the Positive and the Equative of +Pol adjectives. As already mentioned, the ambivalence predicted for Equatives like John is as tall as you thought etc. seems to be correct. There is no ambivalence, however, for cases like John is tall. This requires a last condition:
(168) C5: $\mathrm{spec}(v, E) = N$ for $E = \exists c\,[\ldots v + c]$
For -Pol adjectives, the corresponding specification follows already from C2, which excludes [0 - c] for more general reasons. C5 applies to +Pol adjectives that are not subject to any other specification and would allow both 0 and N without C5. Thus C5 is intuitively a requirement that makes +Pol adjectives informative. Notice that it does not apply to cases like How tall is John?, where c is bound by the question operator, rather than the referential quantifier. It should be clear that C1 to C5 constitute a fairly complex network of conditions which, together with the basic assumptions about the pertinent SF-structures, predict a wide range of phenomena, not all of which have been spelled out in detail. Instead of going through further instances illustrating the various effects, we will briefly look at the conditions themselves. What springs to the eye is that they fall into two fairly different groups. Group I contains C1 and C2, Group II contains C3 to C5. They differ in two respects: (i) Group I has arbitrary E, while II has specific conditions. Hence Group I is strictly local, or context-independent, Group II is not. (The context belongs to SF; it may, but need not, involve syntactically determined context. Thus the context of C5 is intra-lexical, while that of C3 and C4 involves conditions that generally result from combinatorial processes.) (ii) Group I refers to conditions of Interpretation, i.e. to aspects of conceptual representations, while Group II is internal to SF. In formal terms: only Group I makes use of the function Int(X).
121
These observations raise a number of questions to which I will give no definite answers, although some suggestions seem to be warranted. First, one might wonder whether C1 and C2 could be restated without reference to Int(X), i.e., purely in terms of SF, e.g. by adding conditions on E. Although this might in fact be possible, it seems to lead to substantially more complex conditions, which, moreover, obscure the rationale behind these conditions. It might be useful in this respect to speculate about the nature of these conditions and the concepts they make use of. Notice, first of all, that C1 and C2 do not refer to particular elements or configurations of conceptual structure, but rather to general properties of conceptual domains. In this respect, reference to Int(X) in SF-conditions is comparable to reference to θ-roles in syntax: just as the θ-criterion refers to general conditions the particular content of which is spelled out in semantics, C1 and C2 refer to general conditions the particular content of which is specified in conceptual structure. This leads to a second observation. As mentioned earlier, the projection principle and the θ-criterion in a sense pave the way between syntax and semantics. In a similar vein, conditions like C1 and C2 pave the way between semantics and conceptual structure. In this sense, they are part of the more general rules and principles that relate grammar and conceptual organization, providing systematic constraints for the conceptual interpretation of linguistic expressions. It should finally be noted that it would also be inappropriate to take the opposite tack, trying to reformulate C1 to C5 uniformly as conditions on conceptual structure.
Besides the fact that this would miss the point just made, it would be wrong for empirical reasons: C2, C3, or C5, for example, crucially involve linguistic properties which cannot be part of conceptual structure per se (given reasonable assumptions about conceptual structure), and hence cannot be stated in purely conceptual terms.
Returning to less speculative considerations, it should be emphasized that the conditions as stated above are clearly in need of further clarification, both in detail and principle. Although they correctly account for the relevant facts, as far as I can see, they do not achieve maximal conceptual economy. I have already mentioned cases of overlap, which are likely to indicate the necessity of conceptual clarification, eventually leading to relevant improvements. In that sense, C1 to C5 formulate a field of exploration, rather than a satisfactory result. With respect to the general format of the conditions, I have made plausible, though somewhat arbitrary, assumptions. Further study must clarify their status within a formally explicit theory of semantic structure.
This leads to a final remark. Whatever the outcome of the clarifications indicated will be, it seems sufficiently clear that the substance of C1 to C5 must be part of the SF-component of Universal Grammar in the sense discussed in Chomsky (1981) and related work. I would, in fact, expect C1 to C5 (or rather some modified version of them) to be part of Core grammar, probably subject to some parametric variation with respect to the specification of E, as different languages exhibit different properties within their system of gradation, Russian being an interesting case in point.
There is still one problem to be clarified, viz. the SF-structure of pro occurring in complement clauses of Comparatives and Equatives. I have been assuming throughout that the SF-structure of pro is identical to that of the corresponding matrix adjective, except for the specification of '+' and '−'. We will now make these assumptions more explicit. Let A1 and A2 be corresponding adjective-positions in Comparative and Equative constructions, i.e. A1 is the head of AP, and A2 is its counterpart in the complement clause. I presuppose here that the notion of corresponding position is to be made precise in some way. The SF-structure of pro is then determined as follows:
(169) Let [A, a] be an adjective a with the lexical SF-structure SF(a) [ ... c = ...
In other words, pro has all the semantic properties of its corresponding head, but it does not specify the polarity. This is indicated by the variable u, which we will now include into the values over which § ranges. We now have to determine how u is specified under pertinent conditions. As u is an open variable (of category N/NN, incidentally), I will proceed in much the same way as for v, supposing that u is arbitrarily specified as + or − and subject to a condition that selects admissible specifications.
Before I formulate this condition, which decides the +Pol or −Pol character of pro, the following observation might be useful. We cannot simply say that pro has no value for u, i.e. that it represents the naked dimension, because pro decisively contributes its Polarity to the Equative construction. It is in fact the −Pol character of the pro rather than the matrix adjective in sentences like John is as short as Bill which (via C2) creates the obligatorily implied norm-relatedness. In Comparative constructions, on the other hand, pro must clearly have +Pol character, in order to get a possible output of C1. Hence what we need is the pattern of (170), given in obvious abbreviation:
(170)     matrix   pro
     a.     +       +     taller than pro
     b.     −       +     shorter than pro
     c.     +       +     as tall as pro
     d.     −       −     as short as pro
Compare these with the patterns resulting from lexically filled complement adjectives:
(171)     matrix   complement
     a.     +       +     the truck is longer than it is high
     b.     −       +     shorter than it is high
     c.  *  +       −     longer than it is low
     d.  *  −       −     shorter than it is low
(172)     matrix   complement
     a.     +       +     the truck is as long as it is high
     b.     −       +     as short as it is high
     c.     +       −     as long as it is low
     d.     −       −     as short as it is low
Although some of the cases in (171) and (172) are a bit clumsy, only (171c) and (171d) are ungrammatical. They are excluded by C1 plus C2, as discussed above. (There might be a secondary interpretation of these cases by means of an escape strategy to be discussed below. This re-interpretation is triggered, however, only because the regular interpretation is ruled out.) We will now state the condition that accounts for the pattern (170):
(173) C6: $\mathrm{spec}(u, E) = \S_1$ for $E = [\ldots v_1\ \S_1\ [\ldots v_2\ u\ c_2]]$
Let us see how C6 creates the pattern (170). It applies to the Equative cases (c) and (d), adjusting the pro to the matrix adjective, ruling out two of the four possible combinations. The two inappropriate combinations of the Comparative construction are already excluded by C1 and C2, in the same way as (171c) and (171d).
The fact that three of the four pro-cases in (170) interpret pro as +Pol, giving +Pol a greater share in the range of possibilities, might prompt a general consideration. It has frequently been remarked that +Pol adjectives are unmarked in an intuitive sense, although no explicit account of this intuition is yet available. We have in fact reduced the pertinent observations, such as the admissibility of measure phrases, the obligatory norm-relatedness of Equatives, etc., to various interacting factors. This suggests that the markedness of −Pol adjectives is not a simple, homogeneous phenomenon. It might be, however, that the condition on u in pro is related to the core point of the markedness asymmetry in D-adjectives. If this speculation is right, C6 might eventually turn out to be a fragment of a semantic theory of markedness, which is still missing.
I will conclude this section by pointing out that the interpretation of pro can easily be extended to attributive adjectives, given the remarks in the beginning of Section 4. To that effect, we must merely extend the notion of corresponding constituents to pairs of nominals N'1 and N'2, and we must
admit nominal pro of the type [N' pro]. With this adjustment, (169) will determine the extended interpretation of pro, which accounts for the deviancy of (174a) and the interpretation of Robin as a boy in (174b).
(174) a. *John is a much taller boy than Mary.
      b. John is a shorter boy than Robin.
So far, I have developed the theory of gradation exclusively on the basis of D-adjectives. It will now be shown that it encompasses the different phenomena of E-adjectives without any further assumptions, given an independently motivated analysis of their lexical SF-structure.
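The polarity pattern for pro summarized in (170), together with the judgements for lexically filled complement adjectives in (171) and (172), can be restated as a small lookup. The following is only an illustrative sketch: the function names and the string labels "comparative", "equative", "+" and "-" are assumptions introduced here, and the code merely encodes the observed pattern without deriving it from C1, C2 and C6.

```python
# Toy encoding of the pro pattern (170) and the judgements in (171)/(172).
# All names and labels are illustrative assumptions, not part of the theory.

POLARITIES = ("+", "-")


def pro_polarity(construction: str, matrix_polarity: str) -> str:
    """Return the polarity that pro receives in the complement clause."""
    if matrix_polarity not in POLARITIES:
        raise ValueError(f"unknown polarity: {matrix_polarity!r}")
    if construction == "comparative":
        # Only a +Pol pro gives a possible output of C1.
        return "+"
    if construction == "equative":
        # C6 adjusts pro to the polarity of the matrix adjective.
        return matrix_polarity
    raise ValueError(f"unknown construction: {construction!r}")


def filled_complement_ok(construction: str, complement_polarity: str) -> bool:
    """Judgement for a lexically filled complement adjective."""
    if complement_polarity not in POLARITIES:
        raise ValueError(f"unknown polarity: {complement_polarity!r}")
    if construction == "comparative":
        # Comparatives with a -Pol complement adjective are starred in (171).
        return complement_polarity == "+"
    if construction == "equative":
        # None of the Equatives in (172) is starred.
        return True
    raise ValueError(f"unknown construction: {construction!r}")


# The four pro-cases of (170), keyed by (construction, matrix polarity).
pattern = {
    (construction, matrix): pro_polarity(construction, matrix)
    for construction in ("comparative", "equative")
    for matrix in POLARITIES
}
```

In this encoding three of the four entries of `pattern` come out as "+", which is the asymmetry that the markedness remark above refers to.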
7. GRADATION OF E-ADJECTIVES
There are two questions to be answered in this section: What is the structure of E-adjectives, as opposed to D-adjectives, and what is the antonymy relation for E-adjectives? We will take them up in turn.
As mentioned in Section 1, the set of E-adjectives is less clearly delimited and less systematically structured than the set of D-adjectives. It may very well turn out that E-adjectives do not form a homogeneous set at all. I need not make any particular claim in this regard, as I am primarily interested in the question how non-D-adjectives behave with respect to gradation and how this behaviour can be explained. Following reasonable methodology, I will analyse sufficiently clear cases, the analysis of which is then to be generalized to more complex cases.
To begin with, there are two major observations about E-adjectives. First, although adjectives like stupid, wild, crucial, and so on allow for scalar projection realized by the typical range of Degree specifiers of adjectives, they do not provide a dimension in the first place. Whereas tall, heavy, warm etc. define a particular (more or less complex) aspect, along which objects are to be compared and measured, the adjectives we are now concerned with first of all fix a more or less complex condition which a pertinent object may or may not meet. That is what we in fact mean by saying that an adjective ascribes a property. As already mentioned, D-adjectives might be construed as assigning a property as well, viz. being above or below the norm in the relevant aspect. But this is an intrinsically gradual property, which the property assigned by wild, good, homogeneous etc. is not. I will capture this intuition by treating E-adjectives as standard one-place predicates fixing a certain condition or property P. As I am not interested here in the analysis of these conditions, which may actually be rather complex, I will simply consider P as a variable over SF-constants of category S/N.³⁸ Thus GOOD represents the relevant property of good, BAD represents the property of bad, etc. Now gradation comes in by an evaluation as to how distinctly the property in question is marked or realized in different instances. This might in effect be done by exploiting different aspects in different cases, some of which are systematic and obvious, while others may be highly idiosyncratic and context-dependent.³⁹ I will assume here that there is a conceptual operation that accomplishes this evaluation with respect to the degree to which a property is realized. It seems to be a plausible, but by no means necessary, conjecture that this operation is one of the conceptual interpretations of the SF-operator AMOUNT that we have introduced with respect to much and more.
This account of gradation in E-adjectives is corroborated by the second observation. In maximally simple terms: extents are never null, but degrees of realization might be. Thus even very short things have a certain length, but it does not make sense to ascribe to lazy people a certain diligence. This might be schematized by the range of extents that constitute the comparison class for D- and E-adjectives:
(175) [Figure: schematic scales showing the origin 0 and the comparison class C. a. D-adjectives; b. E-adjectives.]
The decisive point is not whether there is any clear delimitation of C on the scale, but only that C can include the empty extent for degrees of realization, but not for extensions along a dimension. This point will have consequences for the value of N: N can never be 0 for D-adjectives, but it may be close to 0 for E-adjectives. This is clear even on intuitive grounds: the standard or norm with respect to evil is best conceived of as indifference with respect to good and evil.
Let us make these preliminary considerations somewhat more precise. Suppose that the lexical representation of all except the D-adjectives is of the general form given in (97a), repeated here as (176):
(176) $\hat{Q}\,[\hat{x}\,[[P\ x] \wedge [Q\ x]]]$
I will continue to drop the modifier aspect represented by Q. Thus good and bad would be (176a) and (176b), respectively.
(176) a. $\hat{x}\,[\mathrm{GOOD}\ x]$
      b. $\hat{x}\,[\mathrm{BAD}\ x]$
Suppose, furthermore, that there is a general lexical redundancy rule which optionally expands (176) into (177) with QUANT and AMOUNT as discussed above:
(177) a. $\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ AMOUNT\ GOOD}\ x] = [v + c]]]$
      b. $\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ AMOUNT\ BAD}\ x] = [v + c]]]$
The rule that generates these expansions could be stated as (178):
(178) $\hat{x}\,[P\ x] \rightarrow \hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ AMOUNT}\ P\ x] = [v + c]]]$
Presupposing the general combinatorial rules of the SF-component, (178) can actually be replaced by the SF-structure (179), which absorbs the input of (178) by lambda-conversion and gives its output as the resulting SF-structure:
(179) $\hat{P}\,[\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ AMOUNT}\ P\ x] = [v + c]]]]$
Interestingly, this is almost exactly the lexical SF-structure of the attributive reading of much given in (144). If we assume this to be a systematic connection, we have reduced a special property of E-adjectives to a more general condition.
Application of (179) makes adjectives available for gradation. This should be possible, of course, only for gradable predicates. I do not take this to be a lexical, but rather a conceptual property. Thus, whether pregnant or prime are amenable to comparison seems to be a matter of their conceptual interpretation, not their SF-structure. Nothing in particular hinges on this assumption, though. What is relevant, however, is that the same condition holds for much and many: the former applies to mass nouns, which are gradable with respect to quantity, the latter to plurals, which are gradable with respect to cardinality of sets.⁴⁰ However this issue might be settled, the following equivalence holds for gradable adjectives:
(180) $\forall x\,[[P\ x] \leftrightarrow \exists c\,[[\mathrm{QUANT\ AMOUNT}\ P\ x] = [N + c]]]$
This follows directly from the assumptions made so far.
We thus have arrived at a general format of the structure of E-adjectives by assuming one-place predicates as their lexical specification and exploiting the SF-structure of much as a redundancy rule. The resulting representations have the form of +Pol D-adjectives and are thus available to all the rules and conditions applying to +Pol adjectives. One of the interesting consequences is that factor phrases are never excluded for E-adjectives by C4, so that we have three times as stupid along with three times as clever.
We now turn to the second problem, viz. the antonymy relation of E-adjectives. So far, we have treated all E-adjectives on a par, antonymy being at best a consequence of the adjectives in their unexpanded form. This correctly corresponds to the fact mentioned in Section 2 that antonymy in
E-adjectives is not only less systematic, but has also a different character from antonymy of D-adjectives.
There is, however, a certain correspondence between antonymy in both classes, in spite of the essential differences, which have largely been overlooked in previous studies, a notable exception being Kiefer (1978), who, however, does not give a systematic explanation. Suppose then that GOOD and BAD are antonymous in the sense to be explained. How is this relation to be captured? What we need is a way to bring together the two scales into which the two adjectives are projected separately. In order to clarify the resulting type of scale, we need to go back to the considerations about D-scales. In its simplest form, the new type of scale that we need can be schematized as follows:
(181) <------ BAD ------ 0 ------ GOOD ------>
It might be tempting to construe this type of scale along the lines of (positive and negative) real numbers, thereby creating a unified continuous set of intervals that is open-ended both ways. What we will do instead is have two scales, one being a projection of the other:
(182) A D3-scale is an ordered septuple (D, D0, D', ⊃, +, −, g) where (D, D0, ⊃, +, −) is a D2-scale as defined in (89), D' is a set of intervals, and g is a bi-unique mapping from D onto D'.
Hence we assume that for any d ∈ D there is a uniquely determined g(d) ∈ D'. In other words, g is a function from D onto D'. Thus g projects the structure of the D2-scale into D', determining, among other things, a subset of extents in D' corresponding to D0 of D. In particular, g preserves the ordering under ⊃ and the structure of the operations '+' and '−'. Thus we have:
(183) $g(a + b) = g(a) + g(b)$; $g(a - b) = g(a) - g(b)$
Returning to the antonymous adjectives, their relevant property can now be described as projecting the degree of a common property into the two different parts of a D3-scale. This will be done in the following way: let P and $\bar{P}$ be two contrary predicates with $[P\ x] \rightarrow \neg[\bar{P}\ x]$. We will say that P and $\bar{P}$ are antonymous if they meet the following condition with respect to a given D3-scale, where G is a semantic constant of category N/N whose conceptual interpretation is a function g:
(184) $[[\mathrm{QUANT\ AMOUNT}\ \bar{P}\ x] = [v + c]] \leftrightarrow [[\mathrm{QUANT\ AMOUNT}\ P\ x] = G\,[v + c]]$
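The definition of a D3-scale in (182) and the structure-preservation property in (183) can be made concrete in a toy model. The sketch below rests on assumptions that are not part of the text: extents are represented only by a non-negative size measured from the origin, a boolean flag (here called `mirrored`) marks membership in D' rather than D, the ordering on intervals is approximated by comparing sizes, and g simply copies the size into the other part of the scale. The names `Extent`, `mirrored`, `includes` and `g` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """An interval measured from the origin; `mirrored` marks membership in D'."""

    size: float
    mirrored: bool = False

    def _check_same_part(self, other: "Extent") -> None:
        # The operations are only defined within one part of the scale.
        if self.mirrored != other.mirrored:
            raise ValueError("operands must belong to the same part of the scale")

    def __add__(self, other: "Extent") -> "Extent":
        self._check_same_part(other)
        return Extent(self.size + other.size, self.mirrored)

    def __sub__(self, other: "Extent") -> "Extent":
        self._check_same_part(other)
        if other.size > self.size:
            raise ValueError("the difference would not be an extent")
        return Extent(self.size - other.size, self.mirrored)

    def includes(self, other: "Extent") -> bool:
        """Stand-in for the ordering on intervals: the larger extent includes the smaller."""
        self._check_same_part(other)
        return self.size >= other.size


def g(d: Extent) -> Extent:
    """Assumed bi-unique projection of D onto D': same size, other part of the scale."""
    if d.mirrored:
        raise ValueError("g is defined on D only")
    return Extent(d.size, mirrored=True)


# Two sample extents in D.
a, b = Extent(5.0), Extent(2.0)

# The two equations of (183).
sum_preserved = g(a + b) == g(a) + g(b)
difference_preserved = g(a - b) == g(a) - g(b)

# The ordering carries over from D to D'.
order_preserved = a.includes(b) and g(a).includes(g(b))
```

Keeping D and D' as two tagged parts, instead of collapsing them into signed numbers, follows the choice made in the text to have two scales, one being a projection of the other.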
Assuming that in fact BAD can be construed as $\overline{\mathrm{GOOD}}$, we replace (177b) by (185):
(185) bad: $\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ AMOUNT\ GOOD}\ x] = G\,[v + c]]]$
Let us call $\bar{P}$ the negative and P the positive term of a pair of antonyms. It is obvious that positive and negative E-adjectives are crucially different from +Pol and −Pol D-adjectives. As we will see immediately, however, the latter are in some respects but a special case of the former. The crucial difference is that, even with the explicit representation of the antonymy that we have introduced, both positive and negative antonyms are based on the additive operation. Hence with respect to C1 to C6 all E-adjectives behave like +Pol D-adjectives.
Although this gives the correct result in most respects, there are a number of differences still in need of explanation. Let me repeat some of the crucial facts:
(186) a. John is worse than Bill. (+NR)
      b. John is as bad as Bill. (+NR)
(187) a. John is better than Bill. (+NR?)
      b. John is as good as Bill. (−NR)
(188) a. John is more intelligent than Bill. (+NR)
      b. John is as intelligent as Bill. (+NR)
While the situation is sufficiently clear with bad, judgements with respect to good and intelligent are less unequivocal. Assuming that both good and intelligent are positive terms (a problem to which I will return below), we face the problem that even positive terms may behave differently with respect to norm-relatedness. Actually, good seems to be a somewhat special case, since typically E-adjectives tend to be norm-related throughout.
Let us first look at negative antonyms. We have to make sure that they are norm-related in all cases, including the Comparative. There seems to be only one way to derive this result, given the conditions C1 to C6: we must stipulate that for negative E-adjectives N is an extent not different from 0. That seems to be a reasonable assumption, given the structure of D3-scales, with 0 as an indifference-point, and the explanation of degree of realization discussed above: a property may be ascribed correctly if it is realized to some extent. On this account, the v of the complement adjective (or rather of pro) will be N and still pass C1, as now it does not prevent $\iota c\,[\ldots [N + c]]$ from being an extent. Similarly, Equatives will necessarily be norm-related, if 0 is N. In fact, all predictions come out correctly on this assumption, except for one point: C4 requires v to be different from N for cases like John is twice as stupid as Bill to pass through. This is only a matter of formulation, though. What C4 is to guarantee is that v must not be N if N ≠ 0. This can be achieved in various ways. One might e.g. modify C4 by adding the condition just mentioned. I will briefly outline another possibility that does not change C4 and allows, furthermore, to account for an additional fact.
Suppose we express the stipulation that N is 0 for E-adjectives simply by excluding N as a possible specification for all these cases. Suppose, furthermore, that norm-relatedness is captured not by N per se, but by the fact that N is parametrized for the comparison class C, which seems to be a plausible assumption. Suppose finally that for E-adjectives this parameter applies to 0 rather than N. We then derive norm-relatedness in the required cases without changing any other assumptions. Let us briefly reflect on the content of these assumptions.
The mere requirement that N be 0 for (negative) E-adjectives would not account for the fact that their interpretation is as much dependent on comparison classes as the interpretation of D-adjectives. By stipulating 0[C,D], this dependence is reintroduced, albeit in a different way, which correctly accounts for the first observation made above with respect to E-adjectives. The idea is simply this: while D-adjectives create the dependence on comparison classes by computing a particular extent N, i.e. by fixing a C-dependent value of D, E-adjectives create the C-dependence by adjusting the origin of D to C. In other words, while D-adjectives move around a value in D according to different C, E-adjectives so to speak move around the scale as a whole with respect to each given C. We will, of course, assume that C1 to C6 simply ignore the parameter indices on 0 or N, whose specification is a matter of conceptual interpretation anyway. In order to give it a formal status, we might include the stipulation just discussed as C0 into the set of conditions on variables:
(189) C0: a. $N = 0$ if $\mathrm{Int}(N) =$ D3-scale
          b. $\mathrm{spec}(v, E) \neq N$ for $\mathrm{Int}(\mathrm{spec}(v)) =$ D3-scale
(a) and (b) are alternative, though equally provisional, formulations, which I will not further evaluate. I am assuming here, as seems reasonable, given the previous discussion, that E-adjectives map their degrees generally into D3-scales, even if they do not have uniquely defined lexical antonyms. Intuitively, C0 says that gradation of properties requires a location of scales, rather than the specification of a standard interval.
Turning next to positive E-adjectives, nothing more has to be said with respect to cases like (188). We simply extend the assumptions about 0 to E-adjectives in general. Notice that this, together with the introduction of D3-scales, suggests a somewhat different schematization of the relation of
scales and comparison classes. We may indicate this relation by (190) instead of (175):
(190) [Figure: schematic scales showing the origin 0 and the comparison class C. a. D-adjectives; b. E-adjectives.]
We are now left with cases like (187), where positive E-adjectives, so to speak, range over both parts of the D3-scale. Before we turn to this problem, we will briefly consider how D-adjectives are construed as E-adjectives. The basic point is fairly simple and has already been established in connection with attributive much and more, viz. the application of QUANT AMOUNT to D-adjectives, which already contain the functor QUANT in their SF-structure. The resulting structures are of the type illustrated in (140) above. I will now drop the simplification achieved by definition (139), in order to explicitly mark the occurrence of two interlocked scales. By applying the redundancy rule (179) and the equivalence (185), we get (191) as a (secondary) representation of short, which construes it as a negative E-adjective that is antonymous to tall:
(191) $\hat{c}\,[\hat{x}\,[[\mathrm{QUANT\ AMOUNT}\ (\exists c'\,[[\mathrm{QUANT\ VERT}\ x] = [v' + c']])] = G\,[v + c]]]$
Here c' originates from the regular SF-structure of tall (the antonym of short), and c comes in through (179). v' must be N by condition C5, and v must be 0 by C0. (191) is now open to Comparative and Equative constructions, among others.⁴¹ Let us see how structures like (191) are to be interpreted and how they relate to the facts. As I have already said, there are two references to scales involved: QUANT VERT refers to a D2-scale; QUANT AMOUNT, applied to properties, refers to a D3-scale. It does so by mapping the degree to which a property is realized onto the negative or positive part of the D3-scale, depending on whether G is present or not. Now the degree to which x is tall or short is the interval c', by which x differs from N. Thus (191) maps, so to speak, the c'-value determined by x into a value for c on the D'-part of a D3-scale. (190) gives a rough schematization: (191) can be construed as representing a mapping from (190a) into (190b). We might also think of this mapping as simply reinterpreting the N of the D2-scale as the origin 0 of a D3-scale with one important modification: since the image scale D3 represents degrees of realization, it does not preserve units of measurement. Hence expressions like (191) do not allow for the usual measure phrases. (We will see the consequences of this fact presently.) More generally, the metric of the mirror scale is not identical with that of the original scale.
Turning to the relevant facts, there are two types of cases to be distinguished, in which a mapping from one scale to another shows up.
(192) John is more tall than he is slim.
(193) *John is taller than Bill is short.
Cases like (192) have already been discussed. Here the second scale comes in as the contribution of more by regular compositional processes, as illustrated in (140). Hence they are grammatical, even though tall and slim must be construed as (secondary) E-adjectives. Things are rather different with (193). Compositional processes interpret it as the regular Comparative of a D-adjective, which is ruled out by C1 and C2. But now it comes in by the backdoor, using (179) as a picklock. By this move, it gets the illicit reading that John is more above the norm in height than Bill is below it. Hence cases like (193) are predicted to have a predetermined interpretation of semi- or ungrammatical status, because they have a regular, but uninterpretable SF-structure and an interpretable, though ungrammatical one. Notice that the line of argument taken here is corroborated by the fact that the escape interpretation is automatically precluded if a measure phrase is added:
(194) *John is two feet taller than Bill is short.
This can no longer be interpreted by using the strategy just described, as measure phrases using units from a D2-scale are excluded from interpretation in a D3-scale. Hence (194) can only be interpreted by treating short as if it were a pro, i.e. by simply ignoring it. Consider next Equatives like:
(195) John is as tall as Bill is short.
They do not violate any of the conditions C0 to C6, hence their interpretation is direct and grammatical. Notice, however, that C3 requires both v's to be equal, and since the v of short must be N by C2, the tall in (195) must be norm-related, as is in fact the case. Compare this with:
(196) *John is twice as short as Bill.
Sentences like these have hardly any interpretation; intuitions about their possible meaning are rather confused. That is because they violate C4, which requires a shared starting point for measurement. The particularly strange status of these cases follows from the following considerations. The violation of C4 could be avoided by applying the strategy described for (193). But now a new problem arises: the D3-scale invoked in this strategy does not preserve in general the original metric, which is the only one that could reasonably be applied in the case at hand. Hence the next move would be to reinstate the metric of the original scale. That this interpretation can in fact be forced becomes clear if you think of a situation where it is important to become as short as possible. Then one might in fact introduce a D3-scale, measuring 'downward' from its origin and still using feet as units.
I have discussed these semigrammatical cases in order to show two things. First, their interpretation and their status follow from independently motivated assumptions without further ado. And secondly, the theory predicts not only the properties of regular cases, but also the structure and the status of borderline cases. In other words, it explains why and how semigrammatical constructions are interpretable. That is exactly what we want an adequate theory to do.
Two further points are to be added. First, there is a classical case of interlocked scales, viz. the semantics of very. What very tall means is, according to received analysis (see, e.g., Klein (1980), Von Stechow (1984)), tall with respect to a comparison class that is already tall. Another way to say the same thing is that the difference from N is above the norm for differences. This can be expressed as follows:
(197) very: $\hat{P}\,[\hat{x}\,[\exists c\,[[\mathrm{QUANT\ AMOUNT}\ P\ x] = [v + c]]]]$
Comparing this structure to that of attributive much given in (144), it can immediately be seen that very is identical in SF-structure with much, if the Unspecified Argument Rule applies to c in (144). That is as it should be, since very clever is something like the morphological realization of *much clever.⁴²
The second point concerns the fact that projection into a second scale as discussed above also accounts for the way in which adjectives that do not relate to the same D-scale are made comparable:
(198) a. John is less intelligent than Mary is nice.
      b. Bill is as tall as his ideas are worthless.
The 'problem' to be solved in interpreting such sentences is not so much to find a common scale, but rather to determine the respect in which different
properties are graded. Once this respect is identified, the pertinent degrees are mapped into a common D3-scale in an ordinary fashion.
I will finally turn to the problem of (187), i.e., the fact that some E-adjectives, of which good is the most prominent example, appear not to be generally norm-related. The relevant point is most clearly illustrated by the following examples:
(199) a. Bill is bad.
      b. John is better.
(200) a. Bill is good.
      b. John is worse.
While in (200), (b) is not a possible continuation of (a), this seems to be fairly natural for (199). Hence bad/better are related very much like short/taller, but good/worse are not related like tall/shorter. Moreover, the degree to which Bill is bad provides the extent to which the degree to which John is good (or rather better) is compared. In order to see how these observations can be captured within the present theory, we have to return once more to the structure of D3-scales. So far we have assumed without comment that D and D' are disjoint sets, i.e., that the two sub-scales meet just at their origin 0. Whether this is in fact the case depends completely on the particular specification of the function g. I have made no restrictive assumptions in this respect beyond the condition that the general structure of D2-scales is preserved under g. Suppose now that g is allowed to map D onto D' in an intersecting fashion. Schematically:
(201) [Figure: the scale D and its image g(D), drawn as intersecting.]
In other words, we may have cases such that g(d) ∈ D for values d ∈ D ∩ D'. The 'orientation' of d and g(d) with respect to + and − remains, of course, converse. The idea is that in interpreting G, the function g can be chosen in various ways, shifting up and down, so to speak, the image of D with respect to D itself, presumably with empty intersection as the unmarked option. Consider, from this point of view, the SF-structure of (199b). (In order to make the relevant point more salient, I will abbreviate QUANT AMOUNT GOOD by GOOD'.)
(201) $\exists c'\,\exists W\,[[\mathrm{GOOD'}\ \mathrm{JOHN}] = [\iota c\,[W\ c] + c']]$
Now in the present context, W must be construed as determined by (199a), whose SF-structure is (202):

(202) ⋁c [[GOOD' BILL] = G [0 + c]]
Taken together, these two representations require that the interpretation of ιc[(202)] be the extent to which the interpretation of c in (201) is to be added. This can be achieved by choosing g in such a way that D and D' intersect up to the value of c in (202). Intuitively speaking, the negative part of the scale is moved up, so that the origin of the positive part coincides with an explicitly specified value. In this way, the degree of the positive adjective covers a contextually specified part of the degree-scale of its negative antonym. This is not achieved by extending the positive part of the D3-scale, but rather by moving up its negative part. In this way, the norm-relatedness of good is not really abandoned, but rather the norm of good and bad is split into two different specifications 0[C, D] and 0[C, D'], where the former cannot preserve its original force, as it is located within the range of bad. The resulting situation is similar to a D2-scale, where 0[C, D] would be construed as 0 and 0[C, D'] as N. It is different, though, as there are intervals 'below' 0[C, D], but not below 0. In any case, the analysis shows in which way good can acquire certain properties of a +Pol D-adjective, without really being one.

There are three important consequences of this analysis. First, it predicts that the absorption of a part of the negative scale shows up only relative to the pertinent negative antonym. Otherwise, the positive E-adjective is just positive. This accounts for the (somewhat insecure) judgements concerning the norm-relatedness of cases like (187a). Secondly, it predicts a clear asymmetry in the relevant respect between good and bad (and positive and negative E-adjectives in general). It is the interpretation of G that provides the scale-shift in question, and would thus not give the same result if positive and negative terms are exchanged. This can be shown by various examples.
The easiest way is to consider the analysis of (200), which is identical to that of (199), except that G must be dropped from (202) and introduced after the '=' in (201). This difference is crucial, though, because now the interpretation of G cannot be chosen depending on the compared extent c. In other words, no pseudo-D2-scale could be created. Notice that now we have a clear condition which distinguishes positive and negative E-adjectives in formal terms. Without these considerations, it would be arbitrary whether BAD is construed as GOOD or GOOD as BAD. It can, in fact, be shown that the decision makes sense only with respect to adjectives that allow the interpretations of G to shift in the way indicated. This brings us to the third consequence: it is a lexical (or possibly conceptual) property of E-adjectives, whether they allow a variable or require a fixed interpretation of G determining the relation between positive and negative antonyms. Once the possibility of this option is countenanced, a number of puzzling problems fall into place. The good/bad case is but a straightforward example of the problem. Cases like sick vs. sane, e.g., are more tantalizing. I will not analyze the pertinent properties, though.

To summarize, I have analyzed E-adjectives by relying on the operator AMOUNT already introduced before, and the constant G, giving rise to D3-scales with alternative options of selecting g. By adding C0 to the conditions discussed before, all the peculiarities of E-adjectives, including the borderline-interpretations of D-adjectives and the semigrammaticality of various escape interpretations, follow from the theory of gradation developed on the basis of D-adjectives, which in a natural sense constitute the core domain of gradation.

8. EXTENDING THE THEORY OF GRADATION

In developing the proposed theory of gradation, I have explicitly analyzed the types of facts discussed in Section 1 as well as a fair number of related phenomena, showing how they are to be derived from the interaction of certain general assumptions. I have claimed initially that the theory will not only explain the facts not covered by previous analyses, but also the standard phenomena analyzed in alternative accounts. I cannot, of course, go through all the facts that would be relevant in this respect. In order to indicate that the claim seems nevertheless to be warranted, I will briefly look at some typical cases.

The first example concerns the analysis of recursion in the specifier constituent. Consider sentences like (203):

(203) A is as much taller than B as C is shorter than D.

In order to make the SF-structure perspicuous, I will abbreviate QUANT VERT by VERT'.

(204) [ηc [[VERT' A] = [ιc' [[VERT' B] = [0 + c']] + c]] = [v + [ηc [[VERT' C] = [ιc'' [[VERT' D] = [0 + c'']] - c]]]]]

I cannot discuss the complete compositional derivation of (204), because that would require certain assumptions about the S-structure of complex adjective specifier constituents which have not been discussed here. But I have
indicated in (204) the rough compositional structure resulting from the analysis of the component parts. The matrix frame [ηc W1 = [v + ηc W2]] results from the Equative construction as much ... as ... in the Deg constituent of tall. W1 and W2 are specified by the Comparative constructions taller than ... and shorter than ..., respectively. All of this is in line with standard assumptions, e.g., in Jackendoff (1977). The specification of the variables v in W1 and W2 as 0 follows from C1, C2, and C6. There seems to be, however, a problem with the v resulting from much, which I left unspecified in (204). According to C3, as it stands, the v in question must be identical to the v which it commands. This is in effect c''. We do not have at our disposal any means of specifying v according to that condition, and even if we were to introduce an adjustment to that effect, it would give the wrong result, because what (203) says is that the two differences are equal, which requires the v in (204) to be 0. Hence, the only appropriate way to deal with cases like (204) is to preserve the option 0 and N for otherwise unspecified v and to adjust C3 accordingly, giving 0 as spec(v) in this case. This could easily be done in some ad hoc manner. But as mentioned earlier, C3 seems to be in need of some conceptual clarification anyway, hence I will leave it at that. What the example shows, however, is that the basic assumptions of the present theory seem to be correct in principle, also if extended to considerably more complex structures.

Consider next the interaction of gradation with other semantic operators. I have already given some hints concerning sentential negation. I have shown, in particular, that the present analysis predicts the correct interpretation for sentences like those in (205) - see (112) for a detailed analysis.
(205) a. John is not taller than Bill.
      b. John is not as tall as Bill.

We have, furthermore, given a semantic account for the exclusion of sentential negation from Comparative complement clauses:

(206) a. *John is taller than you wouldn't believe.
      b. John is as tall as you wouldn't believe.

The contrast in interpretability between (a) and (b) was reduced to the uniqueness condition associated with the iota-, but not the eta-operator (see note 43). We can in fact show that not all negative complements of Equative constructions receive an equally comprehensible interpretation. Consider (207a), whose SF-structure is ultimately equivalent to (207b):

(207) a. ?John is as tall as Bill is not.
      b. ⋁c [[¬[[VERT' BILL] = [0 + c]]] ∧ [[VERT' JOHN] = [0 + c]]]

Clearly, the first conjunct of this representation does not provide an interpretable condition for the height of John. Hence, whether negative complements of Equatives are interpretable depends on their internal structure, giving the different interpretability of (206b) and (207a).

There is one interesting point to be added here. Notice that (208a) according to our assumptions has the SF-structure (208b):

(208) a. John is not tall.
      b. ¬[⋁c [[VERT' JOHN] = [N + c]]]
      c. ⋀c [¬[[VERT' JOHN] = [N + c]]]
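The relation between (208b) and (208c) is the familiar duality between a negated existential statement and a universally quantified negation. The following is only a minimal sketch of that duality, assuming a finite set of positive degree values and a plain numeric encoding of height; the names DEGREES, NORM, vert, exists_c and forall_c are hypothetical helpers introduced for illustration and are not part of the theory.

```python
# Illustrative sketch only: a finite, numeric stand-in for the degree scale.
# DEGREES, NORM, vert, exists_c and forall_c are hypothetical names chosen
# for this example; they do not come from the theory discussed in the text.

DEGREES = range(1, 51)   # assumed positive differences c (e.g. in cm)
NORM = 175               # assumed norm value N for the comparison class


def vert(height):
    """Stand-in for [VERT' x]: here simply the numeric height of x."""
    return height


def exists_c(condition):
    """Referential quantifier, read existentially over the finite domain."""
    return any(condition(c) for c in DEGREES)


def forall_c(condition):
    """Universal quantifier over the same finite domain."""
    return all(condition(c) for c in DEGREES)


def form_208b(height):
    """Negated existential: not [exists c [[VERT' JOHN] = [N + c]]]."""
    return not exists_c(lambda c: vert(height) == NORM + c)


def form_208c(height):
    """Universal negation: forall c [not [[VERT' JOHN] = [N + c]]]."""
    return forall_c(lambda c: not (vert(height) == NORM + c))


if __name__ == "__main__":
    for john in (160, 175, 190):
        # Both forms are evaluated side by side for each assumed height.
        print(john, form_208b(john), form_208c(john))
```

On this toy encoding the two forms agree for every height tested: "John is not tall" holds for a height at or below the assumed norm and fails for a height of 190, whichever form is evaluated.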
Now if (208b) is to be equivalent to (208c), as it in fact should be, then the referential quantifier introduced by the Unspecified Argument Rule must be subject to standard operations defined for the existential quantifier. Although I have not relied on this aspect of referential quantification in crucial respects, it seems to be reasonable to include it in the conditions determining the status of referential quantification. This is, however, a problem that must be dealt with in a more general way, as it is by no means restricted to quantifiers resulting from gradation.

Consider now the interaction of gradation and quantification. Without making specific claims with respect to quantification, I will simply assume that quantifier phrases are given widest scope in LF-structure, e.g. along the lines proposed in May (1977). We thus would get something like (209b) as the SF-representation of (209a):

(209) a. Everybody is taller than John.
      b. [[EVERY x] ⋁c [[VERT' x] = [ιc' [[VERT' JOHN] = [0 + c']] + c]]]

If we interpret ⋁c in the way just discussed, we get the desired result, except for one point: in order to avoid contradiction, everybody must be interpreted as [EVERY x [x ≠ JOHN]], that is, as everybody else. The condition in question was already required for the analysis of the Superlative. It might very well be that we are facing here a fact that has to be captured by more general conditions on quantification. I have nothing to say in this respect. With this proviso, the representation of (210) can be given as follows:

(210) a. John is taller than everybody else.
      b. [[EVERY x [x ≠ JOHN]] [⋁c [[VERT' JOHN] = [ιc' [[VERT' x] = [0 + c']] + c]]]]

To conclude this selection of problems related to gradation, let us briefly look at sentences like (211) which have two readings, one according to which John has contradictory beliefs, and one where he is in error about his height.

(211) John believes that he is taller than he is.
The complement clause of believe, according to the present theory, has the following SF-structure:

(212) ⋁c [[VERT' JOHN] = [ιc' [[VERT' JOHN] = [0 + c']] + c]]
                         x

There are various proposals in the literature for the treatment of opaque contexts like those in (211). Suppose we adopt the theory of opacity developed in Jackendoff (1984). Then (212) would be within the scope of an operator that marks it as representing a belief, i.e., a mental representation. The ambiguity is then represented by optionally introducing a transparency operator at the place marked by x which gives the complement clause an interpretation outside the mental image, so that (212) is no longer contradictory. This seems to me the most convincing account, although the present theory is compatible with standard analyses in terms of different scope of believe as well, provided that LF-structures are distinguished accordingly.

I conclude from these illustrations that the theory proposed here interacts in an appropriate way with other components of the grammar in order to account for pertinent facts.

9. FURTHER PERSPECTIVES

In this concluding section, I will look at some general properties of the theory we have arrived at. The overall framework is a grammar of the type developed in Chomsky (1981) and related work, into which an SF-component is incorporated, mapping LF- into SF-structures, which eventually determine the interpretation in terms of conceptual representations. The SF-component consists essentially of three parts:

(213) a. combinatorial rules
      b. lexical SF-structures
      c. conditions on SF-structures

The combinatorial rules are based on the principle of lambda-conversion (or some formal alternative, like coindexing), and referential quantification. The formal properties of these rules as well as the general format of lexical SF-structures are determined by the general nature of SF-representations,
i.e., the formal universals of the SF-component. The conditions on SF-structures presumably constitute a number of interacting subcomponents of the SF-component. I will turn to this point presently.

The theory of gradation consists now in the lexical representations participating in the pertinent constructions, and the system of conditions C0 to C6. As has been remarked above, the present formulation of these conditions is but a first approximation, which might be improved by further study. Before I briefly speculate on possible lines of development, I will comment on some assumptions that are implicit in the present approach.

First of all, SF-structures must allow for internally structured lexical representations, that is, for systematic lexical decomposition. Otherwise, the type of explanation pursued here would not be possible. The semantic components to be postulated to this effect must be systematically interpretable in terms of conceptual structure. Notice that this is not in general a matter of simple point-by-point interpretation; it rather depends on the inherent structure of conceptual representations, so that systematic SF-configurations rather than isolated constants are projected into organized conceptual structures. The properties of scales of comparison, which are involved in the interpretation of various constants and variables appearing in expressions of gradation, provide a straightforward example.

Secondly, the explanation of semantic properties requires representations and principles that are of a fairly abstract nature. Thus we have reduced a wide range of phenomena to conditions concerning interpretability, measurability, etc., which are independent of individual lexical items, or even particular types of construction. In other words, interesting generalizations emerge if one disentangles general principles from particular lexical items, although they must, of course, apply to, or interact with, the particular configurations originating in lexical items. It should be emphasized that this is by no means at variance with the compositional nature of semantic representations. I have in fact adopted a rather radical principle of compositionality, deriving SF-structures in strictly compositional fashion from independently motivated syntactic representations. The crucial point is that certain relevant generalizations must be stated in terms of properties of SF-structures, independently of their compositional origin. (This approach is incompatible with a type of rule-by-rule interpretation as adopted, e.g., in Montague grammar.)

A third assumption is closely related to the second one. In order to approach explanatory generalizations, one must focus on underlying principles rather than on the classification of empirical data. Thus we have seen that, e.g., norm-relatedness, which has been treated, insofar as it has been treated at all, as a uniform phenomenon to be described in terms of presupposition, is in fact the outcome of the interaction of lexical properties, combinatorial rules, and general conditions, which jointly produce a rather
flexible pattern of phenomena that would resist any direct classification in terms of presupposition, implication, or what not. In a similar vein, the antonymy relation has been reduced to two rather different structural configurations. Interestingly enough, the concern for abstract underlying principles leads to a systematic account of a far wider range of empirical facts, including, e.g., the nature of borderline cases and semi-grammatical structures, than any direct classificatory description would countenance.

It should be obvious that the general orientation inherent in these assumptions is cognate to research principles that have proved to be fruitful in other areas of linguistics, notably in the approach developed for syntax in Chomsky (1981) and much related work. This similarity applies to the breaking up of surface units into more elementary components, such as phonetic or syntactic features, as well as to the abstractness of underlying principles with respect to particular representations and to empirical data. We shall bear this in mind when looking at the perspectives for further development.

The theory proposed here makes specific assumptions as to the structure of D- and E-adjectives and of Comparative and Equative constructions. These assumptions differ systematically from other accounts in that they preserve a rather strict type of compositionality. The most essential distinction, however, is the reduction of a fairly wide range of apparently scattered phenomena to the conditions C0 to C6. These are in essence conditions on the specification of particular SF-variables, viz. v and u, which differ from other variables in that they are free in a particular sense. One may now ask whether these variables, and hence the conditions responsible for them, are a peculiarity of gradation, or whether they are part of the systematic structure of the SF-component. I have already conjectured that C6 might actually be part of a theory of markedness, and hence integrated into the more general theory of substantive universals of the SF-component. But even if this speculation goes in the right direction, we still must be concerned with the status of C0 to C5.

Now, if one looks at this question in a wider perspective, one might realize that v is not solitary, after all. A likely candidate for an account rather similar in spirit is (local) deixis, i.e., the interpretation of here, there, and related terms. As is well-known, they require the specification of a reference point to which deictic conditions are related in characteristic ways. There seems to be a limited range of options for the specification of the reference point. These options are determined by the conceptual system of spatial orientation and presumably subject to conditions similar to those governing v. This is shown in a more salient way if one takes into account the options available for the interpretation of verbs like come and go (in their deictic rendering).

Another domain that suggests itself for related considerations is the system of tense and temporal adverbs. Hornstein (1977) has developed a restrictive account of universally possible tenses which is crucially based on Reichenbach's concepts of S(peech)-, R(eference)-, and E(vent)-point. He shows that possible tenses are determined by systematically constrained arrangements of S, R, and E, ordering and association being the available relations. Hornstein goes on to explain a wide range of phenomena by showing how temporal adverbials are then related to S, R, or E. Without going into any of the interesting details, I will merely point out that Hornstein's largely informal analysis can plausibly be adapted to the present framework, e.g., along the following lines. Suppose that there is an SF-constant AT that relates time intervals to events or states-of-affairs, so that [AT t x] is a sentential operator which says that x takes place at t. Suppose, furthermore, that F is a relation on S, R, and E, restricted by Hornstein's constraints on ordering and association. As a first result, we would thus get something like (214) as the general format of possible tenses:

(214) x̂ [AT [ηE [F [S R E]]] x]

This would relate a proposition x to an event time E which is a function of S and R. Now S and R are variables whose specification (by time adverbials or free choice) is subject to general conditions motivated in part by the conceptual structure of temporal orientation which provides the interpretation of tense and time adverbials. Although the details of this conjecture are utterly ad hoc and not meant to anticipate systematic study, the similarity of the theory of tense and gradation should be apprehensible. The relatedness not only concerns, by the way, similarities in the structure of the theories in question, but might involve substantial aspects as well, since the formal structure of the conceptual domain interpreting tenses has, of course, much in common with sets of intervals involved in the interpretation of degrees. We might, therefore, explore the possibility of subsuming conditions like C1 and C2 under even more abstract principles relating SF-structure to general aspects of conceptual interpretation.

The picture of SF-structure that emerges from these considerations is this: compositionally derived SF-representations are subject to various conditions and principles constituting organized sub-theories of semantic structure. These sub-theories determine the abstract structure of various semantic domains and the way in which they eventually relate to conceptual representations. They impose, in particular, fairly abstract conditions on the way in which lexical entries can be structured and contribute to the compositional interpretation of complex expressions. If these considerations are correct in principle, interesting consequences will emerge.

(a) First of all, it becomes an empirical task to develop a formal theory of general properties of SF-rules and SF-representations. The assumptions outlined in Section 2 and applied throughout in the present paper are, I
hope, a plausible approximation. But even if that is right, they allow for a lot of arbitrariness that must be eliminated on empirical grounds.

(b) It should be possible not only to clarify and systematize the conditions discussed in this paper, but also to extend the type of analysis pursued here to further domains, such as local deixis, tense, quantification, and probably others, constituting what might be called the core structure of semantics in the sense of core grammar proposed in Chomsky (1981).

(c) By the same token, it becomes a meaningful task to determine the modular structure of the SF-component. This implies at least two types of problems. First, we may try to identify specific conditions characteristic for one or the other domain, and more general principles cutting across or coordinating the alleged subsystems. Secondly, it will be necessary to determine general properties of SF-structures in terms of which the form and content of conditions and principles is to be stated. The relation 'v1 commands v2' used in C3 might be a case in point.

(d) Finally, the status of semantics as an empirically motivated, genuine component of grammar which determines specific aspects of the computational structure of language by means of specific conditions and principles provides a general perspective, both for exploring semantic phenomena as part of the grammatical structure, and for construing the notion of grammar as containing a component of semantic form.

Consider, finally, the implications of the theory of gradation, put in this perspective, for the problem of language acquisition. Among the phenomena we have been considering, there are a large number of instances implying tacit knowledge that cannot be derived by means of inductive principles from actual experience. To take just one example: there are clear intuitions about the different status and interpretation of (215a) and (215b), although there is hardly any experience from which it might be inferred:

(215) a. John is twice as stupid as his brother.
      b. *John is twice as short as his brother.

Far more complicated cases can easily be adduced, particularly if secondary and semi-grammatical interpretations are included. Actual experience might in fact include instances that invite erroneous generalizations, which obviously are never made, however. Utterances of sentences like She is four feet short illustrate the point. Once we assume that abstract conditions like C0 to C6 are part of the initial endowment of the human mind, the right extrapolations follow from rather limited data that can reasonably be expected to be included in actual experience.

In Bierwisch (1981), I have suggested that word meanings are not acquired by establishing associative links between word forms and concepts, nor by the accretion of individual semantic features, but rather by projecting organized semantic structures into the domain of independently developing conceptual structures. Conditions of the type discussed here are reasonable candidates for organizing principles in the relevant respect. Thus the ungrammaticality of (215b) follows, without further experience, once it is established that short is a -Pol D-adjective, the general format of which might in turn be derived from principles of SF-representations by means of limited experience. In this sense, the theory developed in this paper directly bears on the logical problem of language acquisition. It provides in particular a tentative perspective along which the domain of semantics can be given some structure which would allow one to explore the problem in an interesting way.
Akademie der Wissenschaften der DDR
Zentralinstitut für Sprachwissenschaft
Prenzlauer Promenade 149-152
DDR-1100 Berlin

NOTES (for Part 2, Sections 5-9)

27. There are a fair number of controversial issues which I cannot take up in the present context. See Bresnan (1973), Chomsky (1977a), and Hellan (1981) for extensive discussion. The main tenets of the present theory could be reconciled, within certain limits, with various alternative proposals with regard to the syntactic aspect. I will not go into the pertinent details. It must be noted, however, that the fairly high degree of flexibility indicates an insufficient understanding of some of the underlying principles in the syntax as well as the semantics of Comparatives. In this respect, none of the previous theories fares any better than the present one.

28. Actually, short is the antonym of both long and tall. I will not pursue here the interesting question, whether this fact must be captured by a lexical ambiguity of short, specifying either VERT or MAX as the relevant condition, or whether short is simply unspecified for the verticality condition involved in tall.

29. It should be noted, however, that (109) would be appropriate for a radical lexicalist approach that takes morphological comparatives as the result of lexical rules. This would require first an adjustment of (103), such that long + er etc. could be inserted directly from the lexicon, and secondly a morphological rule that turns long into long + er. The semantic part of such a rule would in effect be identical to the representation for more to be given below.

30. This requires actually a generalization of the condition embodied in the Unspecified Argument Rule (50), which always assigns narrowest scope to the referential operator. Notice that (50), if applied to (115), places ⋁c'' before A which is then replaced by an expression that begins with two abstractors. We thus require a general principle that moves referential operators over all abstractors that happen to appear in their scope. This seems to be a reasonable assumption, which is quite in line with the intended nature of (50) and the operator it introduces.

31. I disagree in this respect with Von Stechow (1984), who interprets absence of factor phrases as multiplication by one. Von Stechow is forced to this move, since he reduces the difference between Comparative and Equative constructions to that between addition and multiplication. In the present theory, the multiplication by one is unnecessary; it would require an ad hoc stipulation for the interpretation of an unspecified n, and it is empirically inadequate anyway: a is one time as short as b is deviant, but on Von Stechow's account it would have the same SF-structure as a is as short as b.

32. In terms of the analysis of D-units discussed in Section 4, this would correspond to the possibility of interpreting measure phrases alternatively as equivalence classes or as their representatives.

33. Actually, a further generalization with respect to referential quantifier position is assumed here: referential quantifiers introduced by rule (50) are moved over eta- and iota-operators as well as over abstractors, if they immediately precede them.

34. Von Stechow (1984) argues convincingly that what I have encapsulated in TARG involves in effect a counterfactual conditional.

35. As far as I can see, there is only one apparent exception, viz. too-constructions like:

(i) John is ten inches too short to be taller than Bill.

where the complement clause contains itself a Comparative. But this Comparative is in the scope of TARG, which must not be included in the patterns that are relevant in the present context.

36. The effect of (162b) seems to be identical to the kind of coindexing proposed in Williams' (1980, 1981) theory of predication. It might also be appropriate to deal with problems originating from extraposed constituents. Hence further exploration might suggest to reformulate the Argument Rule (49) in accordance with (162b), which, incidentally, would assimilate the architecture of SF-representations more closely to that of LF.

37. This case is actually more complicated. It seems to me that here a scale-shift is involved in the sense mentioned with respect to more tall than slim. See the discussion of (140) above. I will return to this point in Section 7. C3 however remains in force.

38. Although I will restrict the attention to one-place predicates, all that is said here carries over directly to relational adjectives like related (to), proud (of) etc. which require a richer argument structure. The shared feature is obvious: n-place predicates fix a condition that n objects must meet for the predicate to be ascribed.

39. For D-adjectives, the obvious way to turn a property-interpretation back into a scalar one is to pick up the dimension already involved. The relevant interval will now be different, though: instead of the extent of x with respect to DIM, it is now the difference c with respect to N[C, DIM] that becomes relevant. We will return to this point later. Even this is not the whole story, though. It has occasionally been remarked, e.g. by Leisi (1943), that the norm N for D-adjectives may be determined in different ways: either by a comparison class C - that is what we have been assuming throughout - or by the proportion within a given object. Thus a long nose or a narrow window might be long or narrow as to its intrinsic proportion, rather than in comparison to the absolute length or width of noses or windows of certain relevant classes. This interpretation, it seems to me, can only be captured by treating D-adjectives as predicates along the lines mentioned above in connection with attributive more.

40. The intrinsic connection between E-adjectives and much does not seem to be an incident for yet another reason. I suppose that it ultimately explains why E-adjectives - as discussed in Section 2 - do not admit measure phrases in the positive, even if there are units used in the comparative: ten point better, but *five point good. This pattern is parallel to five gallons more water, but *two gallons much water. Bresnan (1973) and Jackendoff (1977) account for the latter asymmetry by the stipulation of much-deletion, which does not seem to me a satisfactory solution, because there is a wider range of connected phenomena. Consider how/six feet tall vs. how much/six feet taller. A vague hint regarding the background of these facts might be that differences and extents behave differently with respect to measure phrases. These speculations must be left for separate study, however.

41. There appears to be a formal problem with Comparatives, though. The Comparative morpheme has to pick up a variable v to be bound by v̂ for later substitution of the complement clause. But now there are two v's in (191). We need no precaution for this complication, however, letting the Comparative bind an arbitrarily selected variable. If the first v is chosen, the result is vacuous, since the second scale remains redundant. Hence only the choice of the second v is of interest. This will become clear as a result of the considerations which follow in the text. I leave it to the reader to figure it out in formal detail. No such question arises with
c and c' of ( 1 9 1 ) are bound by appropriate operators. very is an adjective modifier. It could equally well be analyzed as an ad
Equative constructions, as both 42.
On this account,
jective specifier . Its SF-structure would then be related to ( 1 45) as ( 1 97) is related to ( 1 44) . I
will not spell out this alternat ive, the decision being dependent on syntactic considerations anyway. 43.
The account given for negated complements of Comparatives is identical in spirit to that
o f Von Stechow ( 1 984), who does not have a contrast between (206a) and (206b), though, sim
ply because he uses the definiteness operator for both Comparative and Equative construc number of other problems involving the interaction with Comparative constructions carries
over to the present theory, because it is identical to his analysis in the relevant respects, in spite of crucial differences in others. I will not go through the various cases in detail.
REFERENCES
R. &
Bartsch,
Vennemann, Th . 1 972:
Mmantic Structures.
Athenlium, Frankfurt/M.
Bierwisch, M. 1 967 : Some semantic universals of german adjectivals.
guag�
Foundations of Lan
3: 1 - 3 6 .
Bierwisch, M . 1 98 1 : Basic issues in t h e development of word meaning. I n : W . Deutsch (ed .),
The Child's Construction of Languag�. Academic Press, London, New York. Pp. 34 1 - 387. Linguistische &richte 82: 3 - 1 7 .
Bierwisch, M . 1 982: Formal and lexical semantics.
Bierwisch, M. 1 987: Semantik der Graduierung. I n : M . Bierwisch, E. Lang (eds .). Bierwisch, M . (in press). The semantics of gradation. I n : M. Bierwisch, E . Lang (eds .),
matical and Conceptual As�ts of Dim�nsional A dj�ties.
Gram-
Springer Verlag, Berlin, Heidel
berg, New York, Tokio. Bierwisch, M.
j�ktiven.
& Lang, E.
1 987:
Grammatisch� und konzeptudl� As�kte von Dim�nsionsad
Akademie-Verlag, Berlin.
Bresnan, J . 1 97 3: Syntax of the comparative clause construction in English.
4:
Linguistic Inquiry
275 - 344.
Chomsky, N. 1 977:
Essays on Form and Int�rpretation,
North-Holland, New York, Am
sterdam . Chomsky, N. 1 977a: On Wh-movement. I n : P. Culicover, T. Wasow
Formal Syntax.
&
A. Akmajan (eds . ) ,
Academic Press, New York. 7 1 - 1 32 .
Rules and R�presentations. Columbia University Press, New York. Lectures on Governm�nt and Binding. Foris Publication , Dordrecht. 1 97 3 : Logics and Languages. Methuen, London. 1 976: The semantics of degree. In: B. Partee (ed .), Montagu� Grammar. Aca
Chomsky, N . 1 980: Chomsky, N. 1 98 1 : Cresswell , M. Cresswell, M.
demic Press, New York. 26 1 -292.
K. 1 983 : Zum Zusammenhang zwischen den alterstypischen Antworten auf Fragen mit Zeitschrift fiir Psychologi� 1 9 1 : 233 -252. Hellan, L . 1 98 1 : Towards an Integratffl A nalysis of Comparatives. Narr, Tilbingen. Higginbotham, J . 1 983: Logical form, binding and nominals. Linguistic Inquiry 1 4 : 337- 394. Hornstein, N. 1 977: Towards a theory of tense. Linguistic Inquiry 8 : 521 - 538. Jackendoff, R. 1 972: Mmantic Interpr�tation in G�nerativ� Grammar. M IT Press, Cam Goede,
'groBer' und 'mehr'.
bridge.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
tions. It migh be noted on this occasion, that the analysis that Von Stechow proposes for a
1 46 J ackendoff, R. 1977: X-Syntax: A Study ofPhr� Structur�. MIT Press, Cambridge, Mass. J ackendoff, R. 1 978: Grammar as evidence for conceptual structure. In: M. Halle, J. Bresnan, G.A. Miller (eds.), Linguistic Throry and Psychological R�lity. MIT Press, Cambridge, Mass. Pp. 201 -228. Jackendoff, R . 1984: &mantics and Cognition. MIT Press, Cambridge, MIUs. Kaiser, G . 1 979: Hoch und gut - Oberlegungen zur Semantik polarer Adjektive. Linguistisch�
Berichte 59: 1 - 26. Kiefer, F. 1 978: Adjectives and presuppositions . TMomical Linguistics S: 1 35- 1 74. Klein, E . 1980: A semantics for positive and comparative adjectives. Linguistics and
Philosophy 4: 1 -45. Lang, E. 1978: Semantik der Dimensionsauszeichnung. In: M . Bierwisch and E . Lang (ed5.)
Languag�. Reidel, Dordrecht. May, R. 1977: The Grammar of Quantification. Dissertation. MIT. Pinkal, M . 1983: On the limits of lexical meaning. In: R. BAuerle, C. Schwarze, A. von Stechow (eds .), M�ning, U:>t!, and Int�rpr�tation of Languag�. de Oruyter, Berlin, New York. Pp. 400 - 423.
Reiter, R . 1980: A logic for default reasoning. A rtificiai int�llig�n� 1 3 : 8 1 - 1 32.
Von Stechow, A. 1 984: Compari ng semantic theories of comparison. Journal of Semantics J. Williams, E. 1977: Discourse and logical form. Linguistic Inquiry 8: 1 0 1 - 1 39 . Williams, E. 1 980: Predication. Linguistic Inquiry I I : 203 -238.
Williams, E. 1 98 1 : Argument structure and morphology. Th� Linguistic R�i�w 1: 8 1 - 1 1 4 .
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Lei.5i, E . 1953: D�r Wortinhalt; &in� struktur im �tsch�n und Englisch�n. Winter, Heidelberg. Lewis, D. 1972: General semantics. In: D. Davidson & G. Harman (eds.), Semantics ofNatural
Journal of Semantics 6: 147-160
DISCOURSE MODELS AS INTERFACES BETWEEN LANGUAGE AND THE SPATIAL WORLD
SIMON C. GARROD and ANTHONY J. SANFORD
ABSTRACT
This paper outlines an argument that the meaning of spatial terms depends critically upon our mental models of space. We argue that such models capture the functional geometry of spatial scenes to represent various control relations between the objects in the scene. The discussion centres around two analyses: first, an analysis of the spatial descriptions taken from task-oriented dialogue, which seem to reflect a number of distinct mental models of the same visual scene; and secondly, an analysis of simple English spatial prepositions. We argue that these prepositions express control relations rather than simple spatial relations and depend for their interpretation on the model of space assumed by speaker and listener. This analysis suggests that mental models should be seen as interfaces between the language and the world of discourse rather than simply surrogates for that world.
In cognitive science it is now commonplace to read accounts of language understanding which treat discourse models or discourse representations as central to the comprehension process (Sanford & Garrod 1981; Johnson-Laird 1983; Kamp 1981). Such models have primarily been thought to function as surrogates, standing in for the real or fictional world portrayed in the discourse. Where the real world might contain referents, the model contains discourse referents; and where there might be real world relations, the model contains discourse relations. Such a processing metaphor is very appealing for understanding written discourse, which is usually divorced from its real world context, but what happens in the more down-to-earth case where we use language to talk about things in the immediate context, for instance, when we refer to some aspect of the visual scene in giving a route direction, or read a guide book to help us find our way about a town? Do discourse models still play a role here when there is no immediate need for a surrogate representation? In this paper we will consider an alternative teleology, which relies upon a somewhat different metaphor of discourse models as interfaces between the language and the world portrayed. Our basic contention is that language only relates to the world in a principled way through the mediation of mental models of that world. Hence utterances may only be given precise meanings in relation to models which can then be variously mapped onto the world to yield a variety of distinct interpretations.
The discussion will concentrate on a particular type of discourse commonly associated with a real physical context, namely discourse about space. We begin by considering some empirical evidence from the study of spatial dialogue which illustrates how spatial descriptions depend for their interpretation upon particular, locally established, spatial models of the scene being described. It will be argued that such models represent the functional geometry of scenes, and can properly be described as mental conceptual models rather than just the direct outcome of perception. The discussion then turns to the semantics of locative prepositions, which takes a lead from the work of Herskovits (1986) and will show how many of the apparent complexities in usage may be explained by the nature of the various spatial models which one can build of an object or scene. Finally we will consider the role of spatial models in explaining extensions of the meaning of spatial terms to non-spatial contexts.
SPATIAL MODELS AS THE BASIS FOR LOCATIVE DESCRIPTIONS IN DIALOGUE
The idea that mental models of space might act as a foundation for spatial discourse was initially suggested to us by an analysis of spatial dialogues carried out by Garrod & Anderson (1987). These dialogues arose from a specially designed computer maze game given to a large number of subjects. A feature of the dialogues was that the players would regularly describe to each other places in the maze which were known to the experimenter, and this made it possible to carry out a semantic analysis of a large corpus of locative descriptions (1396 in all). On the basis of the analysis, Garrod and Anderson noticed that pairs of conversants would develop distinct but consistent description schemes, which could naturally be classified into four basic types: path descriptions, line descriptions, co-ordinate descriptions and figural descriptions. For instance, the point marked 'y' in Figure 1a might be described by any pair in one of the following ways:
(1) Path description. "See the bottom left, go along one and up one, that's where I am"
(2) Line description. "I'm on the second level, second from the left"
(3) Co-ordinate description. "I'm at E two"
(4) Figural description. "See the rectangle at the bottom left, well I'm in the middle box on the bottom of it".
Each of these description schemes seems to depend upon rather different ways of conceptualising the overall structure of the maze. Thus path descriptions rely on treating the maze as a network of nodes connected by paths, line descriptions depend upon breaking down the overall configuration into a set of lines oriented in some way (e.g. horizontally, vertically or even diagonally), co-ordinate descriptions depend upon sets of intersecting horizontal and vertical lines, and figural descriptions rely on sets of identifiable figures or patterns of boxes in the maze. This led us to the proposal that each description scheme was built upon a particular mental model of the maze configuration, which in the case of effective communication would become "agreed upon" by the participants.
Fig. 1. Examples of different mazes used by Garrod & Anderson (1988). Panels: A, B.
Although mental models have been discussed in relation to a wide range of cognitive activities, from deductive reasoning to explanations of physical devices, there is quite broad agreement about the fundamental structure and function of such models in understanding. Basically, a mental model of a situation consists of two things: (1) some set of autonomous objects which map onto the situation and give it an 'ontology' (Greeno 1983), and (2) a tight-knit set of relations between the objects in the domain which capture the 'topology' of the situation and bear a strong structural correspondence to the actual functional relations between real objects in that situation (de Kleer & Brown 1983; Forbus 1983). Thus mental models of space have the effect of breaking down any scene into significant spatial entities - points, lines, regions or volumes of space - associated with the various objects in the scene, and then representing significant spatial relations between those entities (see Forbus, 1983). In other words, spatial models represent what we will call the functional geometry of scenes, that is, the spatial relations by virtue of which objects in the scene may interact with each other. In certain respects this corresponds to simple Euclidean geometry, but at the same time it must also capture the transformational possibilities: the way in which spatial relations may change or remain stable according to our concept of how the objects in the scene can interact with each other and of how we, as participant observers, can interact with the objects in the scene. In terms of the maze game description schemes, a particular model will therefore have the effect of imposing an 'ontology' on the scene by identifying components (e.g. points, lines, complex figures, paths or whatever) and putting them into such a functional geometric configuration.
Now if we assume that models underlie the description schemes used in the maze game dialogues, then they should impose strict constraints on any description, both in terms of what spatial entities can be talked about and how these spatial entities might be discriminated from each other within the description. To illustrate this, consider the contrast between a description based upon a path network model and one based on a line model. In the path network model the actual boxes in the maze are represented as nodes connected to each other according to the paths within the maze itself. Hence distance between boxes corresponds to the number of paths or boxes which have to be traversed in getting from one box to another, rather than their physical distance. Sometimes this makes it very difficult to give a simple location description. For instance, if we take point 'x' in Figure 1a, it is only related to the nearest prominent box (at the bottom left-hand corner) via a tortuous route going to the right and up and left and then up again, even though it is immediately above the corner; similarly for point 'z'. This means that speakers who have adopted such a model either have to produce a very long-winded description or construct bizarre descriptions such as the following (from Garrod & Anderson 1987):
(point x)
A: Right, see the bottom left hand corner.
B: The bottom left.
A: There's a box and then there's a gap.
B: Uh-huh.
A: And there's a box and then there's another box.
B: Uh-huh.
A: I'm right there.
(point y)
A: I'm one to the right then one up, then there's a gap right.
B: Uh-huh.
A: I'm just in the box above that gap.
No such constraints apply for the descriptions according to a line model, which captures vertical relations between rows and horizontal relations between points on a row irrespective of any actual path links. Thus typical descriptions for points 'x' and 'y' were the following:
(point x) "Third row up, on the left"
(point y) "Third row up, second on the left"
Garrod & Anderson (1987) were also able to demonstrate that dialogue pairs would develop quite specialised description subschemes based on a single type of model. For instance, we found evidence for at least three variants of the basic horizontal line type of model: one in which the lines or rows were strictly ordered from bottom to top (as with the floors of a building), another where they could be ordered in either way, and a third where there was essentially no ordering between the rows.
Thus when a pair of conversants adopt some particular model of the maze configuration this has the effect of constraining their location descriptions to the extent that the 'local' meaning of any expression can only be derived through the model itself. For instance, if we want to establish what a term like "row" or "column" actually means in a given dialogue we have to know whether it maps onto a horizontal or vertical line object in the model that has been adopted by the conversants. We might also need to know whether the line objects are already seriated within the model, to support descriptions like "the second row". This is even more important in understanding figural descriptions such as this description of point 'm' in Figure 1b: "I'm at the end of the middle right indicator", where the conversants will have to have implicitly agreed on an 'ontology' of figural objects within the maze.
So this analysis of the location descriptions which emerge when people play the maze game clearly implicates mental models of space in locative descriptions. Such models are not simply perceptual representations but reflect distinct functional geometries imposed on the maze, and come from the players' joint conception of what they are looking at and interacting with. However, it could be argued that such description schemes are in some sense forced on speakers by the special nature of the game and the relative abstractness of mazes as objects in space. In the next section we will argue that such models are equally important in understanding the use of locative prepositions in more everyday circumstances.
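The contrast drawn in this section between a path network model and a line model can be made concrete with a small sketch. The Python below is only an illustration under our own assumptions: the 3 x 3 maze, its links and the function names are hypothetical and are not taken from Garrod & Anderson's materials or from Figure 1.

```python
from collections import deque

# Hypothetical maze: boxes are (column, row) cells, row 0 at the bottom.
# PATHS lists which neighbouring boxes are actually linked. The layout is
# invented for illustration and is not the maze of Figure 1.
PATHS = {
    frozenset({(0, 0), (1, 0)}),
    frozenset({(1, 0), (2, 0)}),
    frozenset({(2, 0), (2, 1)}),
    frozenset({(2, 1), (1, 1)}),
    frozenset({(1, 1), (0, 1)}),
    frozenset({(0, 1), (0, 2)}),
}

def path_distance(start, goal):
    """Path network model: number of links traversed between two boxes."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        box, steps = queue.popleft()
        if box == goal:
            return steps
        for link in PATHS:
            if box in link:
                (other,) = link - {box}  # the box at the other end of the link
                if other not in seen:
                    seen.add(other)
                    queue.append((other, steps + 1))
    return None  # goal cannot be reached along any path

def line_description(box):
    """Line model: rows ordered bottom to top, boxes counted from the left."""
    column, row = box
    return f"row {row + 1} up, box {column + 1} from the left"
```

In this invented maze the box directly above the bottom-left corner has no direct link to it, so `path_distance((0, 0), (0, 1))` has to follow the whole detour, whereas `line_description((0, 1))` locates the same box in a single phrase, which is the asymmetry the dialogue extracts illustrate.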
LOCATIVE PREPOSITIONS AND MENTAL MODELS OF SPACE
Locative prepositions such as "in", "on", or "at" are among the shortest and simplest expressions in the language, yet one does not have to look hard to find circumstances of their usage which seem to confound any straightforward semantic analysis.
in
Take, for example, the apparently innocuous definite description "the pear in the bowl"¹ and consider the circumstances when we would use such a description and those when it seems quite inappropriate to do so. Figure 2 illustrates various configurations of pears and bowls, some of which can readily be described as situations where the 'pear is in the bowl' but others, representing identical geometric relations of pear and bowl, which could not normally be described in this way. In considering these examples, we will start with what Herskovits (1986) proposes as the ideal meaning of the locative preposition "in", which is:
in: inclusion of a geometric construct in a one-, two- or three-dimensional geometric construct.
Fig. 2. (a) The pear is in the bowl. (b) The pear is in the bowl. (c) The pear is under the bowl. (d) The pear is over the bowl. (e) The light-bulb is in the socket. (f) Which black ball is in the cup?
To exemplify the various dimensions one can take "a person in a queue" as one-dimensional and "a word in the margin" as two-dimensional, but for the present we will consider the three-dimensional cases, as in our example of "a pear in the bowl". Given Herskovits's characterisation of the ideal meaning of "in", the problem is to determine a geometric construct of the bowl such that in all the appropriate cases it would include the pear but exclude it for the others. Clearly one has to rule out a canonical view of the bowl as including only the space which it displaces, although this may be required for other descriptions like "the crack in the bowl". In the pear case, the geometric construct might be taken to contain a notional space bounded by the bottom and sides of the bowl and some imaginary lid placed over its top. This would work for example (a) and possibly (b) as well. But how can one handle cases such as (c) or (d), where the pear is within such a space but not naturally described as "in the bowl"? For example (c), it might be argued that the notional space depends upon a canonical orientation for the bowl (i.e. upright) and does not apply in situations where the bowl is inverted. However, case (d) cannot be accounted for by such a manoeuvre. It would also complicate similar uses of "in" for such things as light-bulbs "in" but not "under" sockets (see (e)). At this point, it is tempting to give up and accept that there is no principled general account of locative relations like "in". The word simply has many different uses even within the spatial domain and it must be presumed that all such uses have to be learned and stored independently. But such a sceptical view does not do justice to the intuition that "in" expresses some straightforward concept of containment which is either present or absent in these various examples. The problem is how to define containment in geometric terms; we would suggest that what is required is a clearer notion of the functional geometry of such situations.
If one reconsiders the various examples in Figure 2 in this light, in all the cases where one would normally use "in", the bowl is seen to functionally contain the pear. One way of describing the functional component of this relation is in terms of the way the bowl (the container) controls the location of the pear (the contained object). The rule seems to be that the container can be moved in almost any way and this will move what is contained in it; hence the container controls the location of what it contains. As a corollary to this, the movement of the contained object may also be constrained by the presence of the container; hence it is often necessary to "open" a container, and so cancel the control relation, in order to remove something from it. Thus in cases (a) and (b) most people would assume that if the bowl is moved, under normal circumstances the pear would move along with it (i.e. so long as the bowl is not actually inverted). But for cases (c) and (d) this could not be assumed. Similarly in case (e) the light-bulb can be described as "in the socket" since the control/containment relation holds in this case irrespective of orientation.
The question therefore arises of how to describe the meanings of the prepositions so as to account for the wide range of applications yet retain the intuitively appealing idea that it depends upon a relatively simple relation within an abstract conceptual model of the spatial scene. Our contention will be that the most parsimonious account is one which takes such spatial terms as denoting relations within mental models of space which capture the functional geometry of the scenes being described. Hence any adequate characterisation of the use of the terms will depend upon a clear account of the variety of mental models which can be constructed of any scene and the way in which the spatial terms can be directly related to entities or relations within such models.
According to this account, the effective use of a locative will depend upon two things: first, the particular model that is imposed on the scene, and secondly, the appropriateness of the functional geometric relation expressed by the preposition. Although these two factors are quite distinct, they are by no means independent in practice. Thus if we say "the pear is in the bowl" we are highlighting a particular relationship between the pear and the bowl whereby one is said to contain the other, but this may itself suggest a spatial model for the pear and bowl containing just such a relationship. Thus one of the main advantages of putting models between the language and the world of discourse is to allow for the multiplicity of interpretations or 'perspectives' that may be imposed on the same scene. Even though it may be odd to describe example (d) above as a case of a pear in a bowl, there will be circumstances which might justify just such a construal of the situation. (For instance, as part of a game where you have to try and get the pear into the bowl by manipulating the stand which supports it.) No doubt there are comparable situations which would justify the use of "in" for situations like that in (c) as well. What is important is not to have to treat such uses as representing peculiar 'meanings' of "in" but rather as reflecting the imposition of particular models on the scene, which in turn capture the functionally significant spatial relations according to different perspectives. However, before we consider the various ways in which speakers and listeners may impose different models on the same scene, we have to look in more detail at the functional geometric relations which we believe underlie the simple locative prepositions "in", "on" and "at" in English.
Any functional geometric relation can be broken down into two components, a purely spatial component and a more general interactional one. As we have already suggested, the relation "in" seems to have a prototypical spatial realisation where one object is included within the space associated with another object, and signifies an interactional state whereby the space is controlled in some sense by the container. Similar arguments can be made in relation to one- and two-dimensional physical spaces: again, it is easy to argue for a functional component in the choice of English spatial prepositions. Thus one would say of a person standing in a line (accidentally) that the person is "in a line" with the others. But if that line is a queue, and the person is not queueing, one would not say that they are in the queue. It is the control relation between the queue and the individual which is important. Thus, if the queue moved and the person who was actually waiting in the queue failed to move, they could still, sensibly, say that they were in the queue. Thus it is neither necessary nor sufficient to be physically in any particular line to be said to be in a queue. The important thing is a functional relation which orders the elements one-dimensionally.
A typical two-dimensional case might be a note in the margin of a page. Here it is not quite obvious how the containing space controls the location of its contents. However, we would suggest that there is still the functional component of meaning being expressed in this case. If for some reason the margin is moved (e.g. in a computer word processing system) one would expect it to retain its original contents. In other words, to put something in the margin is to impose a certain restriction on its location. We might now consider modifying Herskovits's formulation of "in" in the following way:
in: inclusion of a geometric construct in a one-, two- or three-dimensional functionally controlling space.
Of course, functional control must come from a mental model of the situation being described. The preposition "in" is seen as asserting a particular kind of direct control of some controller over appropriate entities in its control space, with the idea of control space being entirely dependent on a mental model of a situation. Thus it is not the meaning of "in" which varies over situations, but the notion of functional control.
We believe this notion of control to be quite central to the nature of prepositions. Consider some rather more extended uses of spatial prepositions:
John is in a bad mood.
Paddy is in good health.
These two cases are consistent with our argument, since they both depict states which directly control what a person can do. Moods control people, and so when we assert that someone is in a mood or an emotional state we are asserting that they are in the (conceptual) control space of the mood or state. A general expression which captures this is when we say something like:
John is in the sphere of influence of X.
This is taken to be a fairly direct assertion that X controls John over some domain that is signalled by the content of the expression X. An identical argument can be made for being in good health. This suggests that we could further modify Herskovits's ideal meaning by substituting the term "conceptualisation" for "geometric construct", and by liberalising the dimensionality specification, perhaps even eliminating it.
If this line of argument is correct, then a situation where establishing the presence or absence of the appropriate control relations leads to an uncertain outcome should also lead to uncertainty over whether it is appropriate to use "in". An attempt to find such a case is illustrated in Figure 2f. Here we have a cup lying on its side with the balls spilling out of it. Now consider whether each of the two black balls is "in the cup". Our intuitions are that people would be more inclined to accept the lower of the two balls as being in the cup than the upper one; we would expect this because the cup is more directly controlling the lower ball, while the upper one is more likely to move, and so is less under the control of the cup.
on
Herskovits (1986) proposes the ideal meaning of "on" to be the following:
on: For a geometric construct X to be contiguous with a line or surface Y. If Y is the surface of an object O_Y, and X is the space occupied by another object O_X, for O_Y to support O_X.
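The modified definition of "in" and Herskovits's definition of "on" share a shape: each pairs a spatial condition with a condition on how one object constrains the other. The Python sketch below is one hedged way of making that shape explicit. The `Judgement` record, its field names and the `licensed` function are hypothetical and ours, introduced only for illustration. The truth values are simply stipulated here, whereas on the present account they would have to come from a mental model of the scene.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Judgement:
    """One stipulated judgement about a located object and a reference object."""
    preposition: str  # "in" or "on"
    spatial: bool     # inclusion for "in", contiguity for "on"
    functional: bool  # control of location for "in", support for "on"

def licensed(judgement: Judgement) -> bool:
    """The preposition is licensed only if both components hold."""
    return judgement.spatial and judgement.functional

# The first two cases follow Figure 2 (a) and (c); the last two are invented.
pear_in_upright_bowl = Judgement("in", spatial=True, functional=True)
pear_under_inverted_bowl = Judgement("in", spatial=True, functional=False)
lamp_fixed_to_ceiling = Judgement("on", spatial=True, functional=True)
ball_pressed_to_wall_by_hand = Judgement("on", spatial=True, functional=False)
```

The point of keeping the two fields apart is that the spatial condition can hold while the functional one fails, which is exactly the situation in which the purely geometric definitions overgenerate.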
In this definition, we can discern both the purely spatial component - contiguity - and the interactional component - support. Like containment, support can be characterised as a control relation whereby one object controls the location of the other by opposing the force of gravity. That is, the supported object does not drop because of the supporting object. Again it is easy to find contrasting examples to show that contiguity alone is not a sufficient condition for the normal use of "on" (see Figure 3). These examples also illustrate that it is not necessary for the support to be below the object supported, just so long as it is roughly contiguous and supporting (compare examples (b) and (c) in the figure).
Fig. 3. (a) The painting is on the wall. (b) The light is on the ceiling. (c) The light is on the floor.
The idea of support occurs in extended usages also. Consider the following:
John is on social security.
Louise is on the bottle.
Being "on social security" means being supported financially by the social security system. Being "on the bottle" means that one's behavioural demeanour is being supported by an alcoholic binge. We can perhaps illustrate the weaker control relation of support from the strong relation associated with "in" by comparing "Louise is on the bottle" with "Louise is in an alcoholic state". The former focusses on the fact that alcohol is supporting her likely behaviour, but that the state in which she is may perhaps be modifiable through her own efforts. The latter draws our attention to something (an alcoholic state) which we construe as having absolute control over some aspects of her likely behaviour. We do not construe her as having control over it at all. It is this tight control relationship which seems to be the characteristic of "in", rather than the weaker support control that goes with "on". For unextended uses, we do not need to modify Herskovits's proposed ideal meaning; we simply need to note that the idea of "support" is situation-relative.
at
Finally we come to the preposition "at", for which Herskovits gives the following ideal meaning:
at: for a point to coincide with another
This basic definition seems to capture the purely spatial relation expressed by "at" rather well. However, the basic definition seems to us to underspecify the conditions for choosing to use "at" rather than some other expressions, and it does this by ignoring functional geometry. Take for example the description "the pupil at the desk". Clearly, more than simple coincidence of points is being expressed here. If the pupil were to be seated pointed away from the desk with his back leaning against the side, it would be misleading to describe him as "at the desk", and one would use some other form of spatial description to indicate coincidence, like "the pupil near the desk or touching the desk". What is significant is the functional coincidence of pupil and desk. To be "at something" is to be in a position to interact with it according to the normal scheme of things. So to be "at a desk" is to be in a position to use it, to be "at a supermarket" implies you are there to shop, to be "at your office" suggests that you are there in some professional capacity rather than just visiting, and so on. Hence the relation is one of useful coincidence.
Again, this feature of useful coincidence can be seen in various extended uses. For instance, it is acceptable English conversational slang in many idiolects to ask of someone "What is he at?", meaning "What is he doing / up to?". We may translate this into "With what is he having a functional interaction?". The feature of control seems to take on a polarisation which is opposite to that of "in". If X is at Y, then X is the controller or the potential controller of significant activities associated with Y in some situation-specific mental model.
To summarise, we have suggested that these three basic locative prepositions signify functional geometric relations between their subject and object. "In" signifies containment, "on" support, and "at" functional coincidence. If one is to capture such conceptual relations in a semantic account of such prepositions then one needs to be able to specify the mental model that the speaker and listener impose on the scene, the same kind of model which would serve as a working representation for reasoning about and comprehending the physical world.
DISCUSSION
Our analysis is but a sketch of the kind of psychological-semantic analysis which it is desirable to perform on locatives. The major point is that purely geometric meanings of the terms considered are inadequate. Such descriptions badly underspecify the conditions when a particular preposition can and cannot be used. One possible solution, explored here, depends upon a notion of situation-specific functional geometries. The assumption is that in any given situation where a spatial preposition is used there is a corresponding mental model to which the utterance refers. The mental models will be organised along lines which embody representations of useful notions of control, and it is through these that the significance of a spatial preposition arises. From a lexical semantic point of view, they are functions linking assertions to aspects of mental models which capture functional control relations.
Our analysis is limited in a number of ways, and we hope simply to illustrate lines which we might follow in attempting to provide a psychologically motivated semantics of locative prepositions. One obvious limitation is that we have only speculated about English, and that the notions of reference to control function which we are trying to explore may not have a simple one-to-one mapping onto lexical items in English. We are aware that in other European languages, lexicalisation will interact with case. Indeed, this fact alone gives us more hope that we are on the right lines, since case has much to do with control relations. We hope to investigate this in the near future, as well as extend our considerations to other spatial terms.
The idea of a psychologically-motivated semantics of locatives rests on the question: what kinds of assertions do we want to make about the spatial relations between things in the world? Once it is recognised that assertions about relations will be for some purpose, it is not a great step to see that any underlying geometry will necessarily be functional, and will accordingly be a situation-specific mental model. More abstract uses will be derivatives.
Department of Psychology
University of Glasgow
Glasgow G12 8RT
REFERENCES
De Kleer, J. & Brown, J. S. 1983: Assumptions and ambiguities in mechanistic mental models. In: D. Gentner and A. L. Stevens (eds.), Mental Models, Lawrence Erlbaum Associates, Hillsdale, NJ.
Forbus, K. D. 1983: Qualitative reasoning about space and motion. In: D. Gentner and A. L. Stevens (eds.), Mental Models, Lawrence Erlbaum Associates, Hillsdale, NJ.
Garrod, S. C. & Anderson, Anthony 1987: Saying what you mean in dialogue: A study in conceptual and semantic coordination. Cognition 27: 181-218.
Greeno, J. G. 1983: Conceptual entities. In: D. Gentner and A. L. Stevens (eds.), Mental Models, Lawrence Erlbaum Associates, Hillsdale, NJ.
Herskovits, A. 1986: Language and Spatial Cognition, Cambridge University Press, Cambridge.
Johnson-Laird, P. N. 1983: Mental Models, Cambridge University Press, Cambridge.
Kamp, H. 1981: A theory of truth and semantic representation. In: J. Groenendijk, T. Janssen, and M. Stokhof (eds.), Formal methods in the study of language. Mathematical Centre Tracts, Amsterdam.
Sanford, A. J. & Garrod, S. C. 1981: Understanding Written Language, John Wiley & Sons, Chichester.
Journal of Semantics 6: 161-168
BOOK REVIEW
Gerhard Heyer; Generische Kennzeichnungen. Zur Logik und Ontologie generischer Bedeutungen. München, Wien: Philosophia Verlag, 1987. Pp. 289. DM 138.
MANFRED KRIFKA
Genericity is a topic we don't know much about. There are some interesting observations, some preliminary classifications and some rudimentary theories for certain generic constructions in certain languages. But many problems of the syntax and especially the semantics of generic sentences remain unsolved up to now. As other disciplines increasingly show interest in this topic, most prominently Artificial Intelligence in its research on default reasoning, it can be safely assumed that genericity will become an important subject in the next years. Heyer's work is one of the few book-length treatments of this subject; his basic ideas, together with some further developments, are accessible for the English reader in Heyer (1985) as well.
1. THE HISTORICAL PERSPECTIVE
In the first part of his book, Heyer sketches in a concise way the treatment of genericity in the history of logic. He concentrates on some representative authors, namely Aristotle, the medieval philosophers Petrus Hispanus, William of Shyreswood and William of Ockham, and finally Gottlob Frege, with an outlook on Russell and Quine. As these authors hardly ever come close to discussing genericity explicitly, the only way to identify their views of this phenomenon is to look at their treatment of example sentences like homo est species which we today would classify as generic.
According to this historical overview, generic sentences were taken into consideration in the logic of Aristotle (especially in his Topics) and in the Realist scholastic "supposition theory", but were gradually neglected with the rise of Nominalism and its heir, modern logic since Frege. The basic assumption of supposition theory was that a noun has two modes of signification, called suppositio simplex and suppositio personalis, that is, reference to a universal and reference to an individual representing a universal, respectively. In the Realist tradition, the suppositio simplex was assumed to be basic, whereas in Ockham's system, a noun in the suppositio simplex does not refer at all and is inferior to a noun in the suppositio personalis, which truly constitutes a bridge between language and reality. This clearly foreshadows the attempts at the regimentation of natural language so characteristic for large parts of modern language philosophy. Without reading too much into the Realist scholastic texts, Heyer claims that their authors took natural language more seriously than the writers in the nominalist tradition. In a way, then, Heyer's own treatment, as well as similar approaches like the ones of Carlson (1978) or Chierchia (1982), constitutes a turn back to the Realist tradition. Heyer deserves recognition for making this broad historical perspective explicit.
2. THE CLASSIFICATION OF DEFINITE GENERIC SENTENCES
Heyer's treatment of genericity is essentially restricted to singular NPs in German with a count noun as head and a definite article. That is, he excludes from the discussion plural NPs as in Dogs bark, mass nouns as in Gold is expensive, NPs with indefinite article as in A bird flies, and other languages, as e.g. languages which lack a definite article. The reasons for these restrictions are, in my opinion, not well founded and seem to be just a matter of convenience. In this, Heyer follows a tradition in the literature of concentrating on one's favourite type of generics, but it can be easily imagined that this attitude prevents a thorough analysis of the phenomenon of genericity. I think the time has come for an "onomasiologic" approach, which looks at how genericity manifests itself in language, to complement the usual "semasiologic" approach, which examines the use of a specific construction, as e.g. NPs with a definite article.
Heyer first tries to establish which definite NP should count as generic. He uses two criteria, a sufficient one and a necessary one:
(A) (sufficient:) If a definite singular NP x in a given sentence can be replaced salva veritate (with no change of meaning) by the corresponding definite plural NP, then x can be interpreted as generic.
(B) (necessary:) A definite singular NP x can be interpreted as generic only if the head noun of x cannot be replaced salva veritate by its next superordinated noun.
(A) clearly can be applied only to certain languages; e.g., it is of no use for languages which lack plural marking. Even in German, there is a slight register shift between singular definite NPs and plural definite NPs with generic meaning. For example, (1a) belongs to the style of biological textbooks, whereas (1b) rather is an example of everyday speech.
(1) a. Die Dronte ist ausgestorben. 'The dodo is extinct.'
    b. Die Dronten sind ausgestorben. 'The dodos are extinct.'
Criterion (B), however, seems to be universally applicable. The idea behind it is that the reference to an object by a definite description containing a certain nominal predicate is somewhat arbitrary, as the speaker could choose another predicate, e.g. a more general or more specific one, to do this job. For example, I can refer to the apple in front of me with the Granny Smith apple, the apple or the fruit, at least in a context where only one fruit is present. Definite NPs used generically should lack this arbitrariness, because they are not definite descriptions, but proper names of kinds (a point argued for in detail by Heyer). For example, the following sentences differ in a context-independent way in their meaning if read generically (but of course not necessarily in their truth value):
(2) a. The house mouse reached Australia in 1770.
    b. The mouse reached Australia in 1770.
(3) a. The lion has whiskers.
    b. The cat has whiskers.
Heyer goes on to distinguish four types of sentences with definite generic NPs. First, he differentiates between sentences with "personal meaning" (4a,b) and sentences with "absolute meaning" (4c,d) according to the criterion whether the singular definite NP can be replaced salva veritate by a singular indefinite NP or not. Then both types are differentiated further according to the criterion whether they are necessarily true or false (4a,c) or contingent (4b,d):
(4) a. Der Mensch ist ein vernunftbegabtes Lebewesen. 'Man is a rational animal.' (= A man is a rational animal.)
    b. Der Schotte trinkt Whisky. 'The Scotsman drinks whisky.' (= A Scotsman drinks whisky.)
    c. Der Mensch ist eine Spezies. 'Man is a species.' (≠ A man is a species.)
    d. Der Mensch betrat 1969 den Mond. 'Man set foot on the Moon in 1969.' (≠ A man set foot on the Moon in 1969.)
I think that the careful argumentation for a strict distinction between the personal and the absolute interpretation is one of the major contributions of Heyer's work. There is, however, a slight problem with Heyer's test, as a special subclass of verbal predicates, namely collective predicates, can be combined with definite generic NPs (cf. 5a), but not with singular indefinite generic NPs (cf. 5b), a phenomenon which was discussed in Gerstner (1979). But clearly, (5a) should be considered as a personal generic sentence as well.
(5) a. The antelope gathers near waterholes.
    b. *An antelope gathers near waterholes.
I will not go into this minor problem here, but would like to make a more general point. If the personal interpretation is identified via the replaceability of the definite generic NP by an indefinite (generic) NP, then the concept of personal genericity should cover sentences with indefinite generic NPs as well. Note that indefinite generic sentences and definite generic sentences in the personal interpretation have more in common; for example, their verbal predicate must be stative, and they can contain adverbs like usually. At this point, Heyer's confinement to definite generic NPs becomes particularly restrictive, because his test implies that sentences with indefinite generic NPs and sentences with definite generic NPs in personal interpretation form a natural class and should be treated together. Actually, the concept of personal generic sentences understood in the more general way resembles the notion of I-genericity (I for "indefinite"), and the concept of absolute generic sentences constitutes a subconcept of D-genericity (D for "definite"), as developed in Krifka (1987).
To distinguish necessary from contingent sentences, Heyer employs two tests. First, contingent sentences have different temporal properties. The discussion of this point is rather unclear; for example, Heyer maintains that (4d) does not change its meaning when put into present tense, and that (4b) loses its generic meaning when put into past tense, which is clearly false, as shown by examples like die Germanen tranken Met 'the Teutons drank mead'. The main point seems to be that if the necessary sentences (4a,c) are put into past tense, then it is implied that the kind denoted by der Mensch does not exist anymore, whereas this implication is lacking with the contingent sentences (4b,d). The second test basically says that necessary sentences contain a copula and a superordinate predicate like Lebewesen or Spezies. I cannot agree with this criterion either. For example, if (4a) is necessarily true, then the sentence Der Mensch ist vernünftig 'Man is rational' should be necessarily true as well, but fails to have the proposed syntactic form for necessary statements.
In my opinion, the distinction between necessary absolute and contingent absolute generic sentences is not as important as another one which Heyer describes less prominently, namely the distinction between absolute generic sentences with kind predicates like be a species, be extinct and absolute generic sentences with predicates applicable to ordinary individuals as well, as e.g. set foot on the Moon. This latter class is a proper subclass of the contingent absolute sentences. It contains sentences in which a property of an object is "projected" to the kind the object belongs to. For example, in (4d) the property of setting foot on the Moon is projected from Neil Armstrong to the kind homo sapiens.
It would have been interesting to learn more about the conditions under which this projection is possible. One condition surely is that a predicate which applies to some vanguard representative of a kind can be projected to the kind itself, which I called the "avantgarde reading" in Krifka (1987). But note that the property must be felt to be important enough; therefore examples like Man jumped over 8.90 m in 1968 are less acceptable than (4d). And there are more conditions which allow for the projection of an object property to the kind, as e.g. in
(6) a. In Alaska, we photographed the grizzly and the moose.
    b. The Frenchman eats horsemeat. (only few do!)
    c. The German customer bought 80,000 BMWs last year.
As a final comment on Heyer's classification, I think that the distinction among personal generic sentences between necessary ones and contingent ones is spurious from a linguistic viewpoint, as there are no clear linguistic criteria which distinguish them. According to Heyer, those two types differ in their inferential properties. For example, from (4a) it follows that every man is rational, whereas from (4b) it does not follow that every Scotsman drinks whisky, but only that every typical Scotsman drinks whisky. The difference can be captured as well with different quantificational adverbs; (4a) can be paraphrased as Man is always a rational animal, whereas (4b) can be paraphrased as The Scotsman usually drinks whisky. But this does not prove that (4a) and (4b) differed in the first place in their semantic representation. It is more attractive, I think, to have a notion of personal genericity which simply leaves unspecified whether a generalization holds for every object belonging to a kind or only for some typical objects (or equivalently, whether it holds necessarily or allows for exceptions). In my opinion, Heyer's treatment of this problem is as questionable as a treatment of the noun child as ambiguous, because it can be paraphrased as boy or girl in different contexts.
3. THE SEMANTICS OF SENTENCES WITH DEFINITE GENERIC NPs
In the final chapter of his book, Heyer develops a formal semantics for definite generic sentences. This should be of considerable interest, because
genericity clearly is a problem case for truth-conditional semantics.
Heyer assumes a model structure which contains ordinary objects and "generic individuals". Generic individuals are related to objects by two relations, "is a representative of" and "is a typical representative of". There are three sorts of predicates: kind predicates, stage predicates and disposition predicates. A kind predicate like be extinct can only be applied to a generic individual and yields an absolute generic sentence. A stage predicate like enter, set foot on the moon can be applied either to objects or to generic individuals, and in the latter case yields a contingent absolute generic sentence. Heyer, unfortunately, says nothing about the truth conditions of these sentences, for example whether they can be formulated in terms of the representatives of the generic individual or not. All he says in this respect is that kind predicates cannot be applied to objects because this would result in a category mistake.
With disposition predicates like barks, drinks whisky, which yield personal generic sentences, Heyer is more explicit. As already discussed above, he distinguishes between necessary and contingent sentences. In his reconstruction, a necessary dispositional predicate applied to a generic individual can be projected down to every representative of the kind individual, whereas a contingent dispositional predicate can only be projected down to its typical representatives. Thus, the distinction between necessary and contingent predication is shifted to the distinction between representatives in general and typical representatives.
I think there are severe problems with this move. First of all, it does not help in any way to solve the problem of generalizations which allow for exceptions. This problem raises its head again as the problem of determining the typical representatives of a kind. But more important is a second point, namely that a representative object may be typical with respect to one predicate and untypical with respect to another. In Heyer's theory, such an object must belong to the untypical representatives of the kind, and so we cannot even infer that the first predicate applies to it. This becomes disastrous if we consider the possibility that a kind may have no representative which is typical in every respect. For example, it would be very difficult to determine a typical representative of man in Heyer's sense, because hopefully all of us have at least some property in which we deviate from typicality. To make this point clearer, consider the following sentences. They should be true given an adequate semantics, yet there is no single duck which both has colourful feathers (only the male ones have them) and lays speckled eggs (only the females do).
(7) a. The duck has colourful feathers.
    b. The duck lays speckled eggs.
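The duck argument can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions, not Heyer's formalism or the reviewer's: the toy population, the function names, the table TYPICAL_SEX and the requirement that a set of typical representatives be non-empty are all hypothetical. It first searches for a single fixed set of typical representatives that verifies both duck sentences, and then evaluates the same sentences with typicality made relative to the verbal predicate.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Duck:
    """A toy representative of the kind 'duck' (hypothetical data)."""
    name: str
    sex: str  # "male" or "female"


# Hypothetical toy population of representatives of the kind.
DUCKS = [Duck("d1", "male"), Duck("d2", "female"),
         Duck("d3", "male"), Duck("d4", "female")]


def has_colourful_feathers(d: Duck) -> bool:
    return d.sex == "male"


def lays_speckled_eggs(d: Duck) -> bool:
    return d.sex == "female"


def holds_of_all(pred, reps) -> bool:
    """True iff reps is non-empty and pred holds of every member.

    Assumption: an empty set of typical representatives does not
    count, so vacuous truth is excluded.
    """
    return len(reps) > 0 and all(pred(d) for d in reps)


def some_fixed_typical_set_works(reps, preds) -> bool:
    """Try every non-empty subset of reps as THE set of typical
    representatives and check whether all preds hold of all of it."""
    for size in range(1, len(reps) + 1):
        for typical in combinations(reps, size):
            if all(holds_of_all(p, typical) for p in preds):
                return True
    return False


# Assumption: typicality is stipulated per predicate (toy stand-in
# for a generic operator with scope over the verbal predicate).
TYPICAL_SEX = {has_colourful_feathers: "male",
               lays_speckled_eggs: "female"}


def holds_predicate_relative(pred, reps) -> bool:
    """Evaluate pred over the representatives typical for pred."""
    typical = [d for d in reps if d.sex == TYPICAL_SEX[pred]]
    return holds_of_all(pred, typical)


fixed_ok = some_fixed_typical_set_works(
    DUCKS, [has_colourful_feathers, lays_speckled_eggs])
relative_ok = (holds_predicate_relative(has_colourful_feathers, DUCKS)
               and holds_predicate_relative(lays_speckled_eggs, DUCKS))
print(fixed_ok, relative_ok)  # prints: False True
```

No non-empty fixed set of typical ducks makes both sentences true in this toy model, whereas both come out true once the set of typical representatives is allowed to vary with the predicate. The per-predicate table is of course a stipulation; it only shows where typicality has to be located, not how it is determined.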
To my mind, this problem forbids any treatment of personal generic predication in terms of typical representatives. The meaning component of typicality cannot be put completely into the semantic representation of the generic NP, but must somehow be related to the verbal predicate. This is possible if we assume a generic operator having scope over the verbal predicate, as reflected e.g. in adverbs like typically. Furthermore, I have remarked above that contingent and necessary predications should not be separated in the first place, but should be captured by a common semantic operator.
There is another feature in Heyer's model structure which I have not commented upon yet. Heyer tries to model taxonomic hierarchies in terms of different tiers of individuals. According to this theory, ordinary objects constitute the lowest tier, and there is an unspecified number of individual tiers which contain generic individuals of ascending hierarchical levels. The subspecies relation, e.g. the relation between the cat and the lion, is modelled with the representative relation: a generic individual x is a subspecies of a generic individual y if x is a representative of y and belongs to a lower tier than y.
Heyer's stratificational ontology strikes me as rather unnatural, a real Procrustean bed for the variability we find with real taxonomic hierarchies. To be sure, it has been argued that we can identify one level of individuals across different taxonomic hierarchies in natural languages which has the characteristic property that its nodes are denoted by morphologically simple nouns, like apple and quince in the hierarchy of fruits, and bed and table in the hierarchy of furniture. This point is discussed in the ethno-linguistic literature, e.g. in Berlin, Breedlove and Raven (1973). But Heyer does not cite evidence like this, and surely it does not enforce a stratification of all generic individuals. A further problem is that Heyer distinguishes the different tiers by numerical indices. He does not comment upon these indices, but as a progressive refinement of a taxonomical classification should be possible at any point of the hierarchy, these indices cannot be integers, but have to be reals! I think all those complications and ontological commitments are unnecessary if we simply assume that generic individuals are structured by partial orderings, as proposed e.g. by Kay (1971).
A final point is worth mentioning. In Heyer's model structure, there is a concept corresponding to every generic individual (namely, the concept comprising the set of representatives of the generic individual). On the other hand, Heyer does not claim that there should be a generic individual corresponding to every concept. In this point he differs from related approaches, as e.g. Chierchia (1982) and Turner (1983). As is well known, this second claim leads into foundational problems which can be overcome in principle, e.g. by assuming Scott Domains as model structures. Heyer does not touch upon this interesting question, but I think that there is an argument for his position. It is well known (and cited by Heyer) that not every nominal predicate can yield a definite generic NP; for example, Carlson notes that the Coke bottle is a perfect generic NP, whereas the green bottle is not. It seems that generic individuals cannot be constructed from scratch, but have to be well-established in the previous knowledge of speaker and hearer. If we assume generic individuals only for the semantics of definite generic NPs, then this would be an argument against the complex model structures of Chierchia and Turner, and in favour of the simpler model structures of Heyer.
To sum up, I think that Heyer's work is most valuable in its historic parts and in the recognition of different types of definite generic NPs, which behave rather differently in many respects. However, his contribution suffers from the rather arbitrary limitations he imposes on his research. And the semantics Heyer gives for the different types of definite generic NPs is either not well developed or developed, in my opinion, in the wrong direction. It should be mentioned that Heyer's exposition is extraordinarily clear if compared to many other works devoted to the subject of genericity. This may be one reason why the problems of his approach turn up so clearly as well.
Seminar für natürlich-sprachliche Systeme
Universität Tübingen
Biesingerstrasse 10
D-7400 Tübingen
Germany
REFERENCES
Berlin, Brent, Dennis E. Breedlove & Peter H. Raven 1973: General Principles of Classification and Nomenclature in Folk Biology. American Anthropologist 75: 214-242.
Carlson, Greg N. 1978: Reference to Kinds in English. Ph.D. diss., University of Massachusetts at Amherst. Published 1980, New York, Garland.
Chierchia, Gennaro 1982: Nominalizations and Montague Grammar. Linguistics and Philosophy 5: 303-354.
Gerstner, Claudia 1979: Über Generizität. Unpublished master thesis, University of Munich.
Heyer, Gerhard 1985: Generic Descriptions, Default Reasoning, and Typicality. Theoretical Linguistics 10: 33-72.
Kay, Paul 1971: Taxonomy and Semantic Contrast. Language 47: 866-887.
Krifka, Manfred 1987: An Outline of Genericity. Forschungsbericht des Seminars für natürlich-sprachliche Systeme, Universität Tübingen.
Turner, Ray 1983: Montague Grammar, Nominalizations and Scott's Domains. Linguistics and Philosophy 6: 259-288.
Journal of Semantics 6: 169-174
BOOK REVIEW
Collins Cobuild English Language Dictionary (Collins Birmingham University International Language Database). Collins, London-Glasgow, 1987. xxiv + 1703 pp. £7.95 (paperback).
PIETER A. M. SEUREN
This dictionary embodies an attempt at making (monolingual) dictionaries more systematic, more user-friendly, and more accurately descriptive of the living language ("helping learners with real English" it exclaims on the cover). There can be no doubt that the aims set by the makers of this dictionary, largely the English Language Department of Birmingham University, are timely and well-chosen. There certainly is a need for more serious lexicology to go into the lexicography of dictionary making. One might even say that a modern dictionary should also be organized in such a way that rapid computer-searches can be made on a sufficiently large number of practically useful parameters. Clearly, judicious use of up-to-date theoretical linguistic knowledge may well contribute to success in achieving the aims set. The question now is: to what extent has Collins Cobuild (henceforth CC) been successful in doing what it set out to do?
One clearly detects a desire to provide lively and readily usable meaning descriptions for the entries. The language used for the descriptions is simple, non-pompous, at times even colloquial. The dictionary clearly addresses itself also to users, English speaking or foreign, with limited experience in dictionary use. For this category of users the book is certainly more "user-friendly" than other more established dictionaries. Whether it also has greater "user-friendliness" for more advanced or even professional dictionary users is a different matter. These may detect a mixture of condescension and chumminess in this dictionary, which they would gladly exchange for greater efficiency and accuracy. The informal "you" is all over the pages: "If your thoughts are fuzzy or what you are thinking about is fuzzy, you are confused and cannot see an idea clearly or make a decision". I'm not sure that all readers will appreciate being talked to in this manner. What is most helpful and enlightening, anyway, is a generous sprinkling of example sentences in the meaning description sections. For practical purposes of meaning description, few things are more helpful than well-chosen examples.
A great deal of thought and care has obviously gone into the actual meaning descriptions, which are generally clear and well laid out. Yet there is plenty of room for improvement on this score. To give just a few examples, for the verb flog, the first meaning given is: "If you flog something, you sell it; an informal word used in British English. EG We thought we might be able to flog it to someone." As a second meaning we find: "If you flog someone, you hit them very hard with a whip or stick as a punishment for
something which they have done. " First, such descriptions are unnecessar ily verbose: the part "if youflog someone" may be left out without any loss of information or user-friendliness . And although the element " punish ment" does seem to be either prototypical or perhaps even constitutive o f the meaning of this verb, the addition of " for something which they have done" is both tedious and misleading (it may also be for something they have not done, or for something that the flogger thinks, or pretends, or what not, that they did or failed to do). But secondly, the fact that the meaning " sell" is given first makes one suspect that the makers of CC have pandered to popular taste perhaps a b it more than they should have. It is the task of a dictionary maker not only to provide information but also to be an authori ty on what is considered to be accu rate use of the words of the language in all sociolinguistic registers, not primarily, or preferably, the "informal" ones. This tendency to pander to popular taste is, unfortunately, present all through CC. It might be added that the description of the informal use of flog is misleading because o f incompleteness: when one " flogs" something it is because one wants to get rid of it, the thing being considered of inferior quality and not worth keepi ng; one is then also prepared to accept a reduced price. A naive user of CC might think it appropriate to let a millionaire say, in an informal talk, that he will " flog" his Rembrandt painting to the Brit ish Museum. Or, to take just one other example, the meaning description of the adjec tive bald shows a great deal of care . It is specified for persons (having little or no hair on the top of the head), for objects: ' ' Something that is bald does not have the natural covering which you might expect it to have, for exam ple fur or grass, EG . . . a bald granite o utcrop" , and then in particular for tyres, and for statements, questions, accounts, etc. 
Yet, although this analysis shows that quite some thought has gone into it, it is also obvious that more is needed. For example, it is not made clear why a tree without its leaves cannot be called "bald", but must be called "bare", even when it is an evergreen, whose "natural covering" lasts the year round. The same goes for a bare hill, a bare piece of rock. Note that under bare the example "a hilly patch of bare red rock" is given. A user may well wonder why a granite outcrop is called "bald" but a patch of red rock should be called "bare", or whether either would do in both cases. The point is that there is a certain division of labour between bald, bare, nude, naked, and perhaps a few others, which differs considerably from corresponding word families in other languages (e.g. German kahl, bloss, nackt). It would be useful to know to what extent there are unifying principles behind such divisions of labour, and to what extent the combinations are idiosyncratic. In the case of bald, for example, one might wonder whether there is any point in postulating that roundness of the object plays a part in the appropriateness conditions for this word (its etymology appears to be "balled"). We speak
of bald tyres, bald heads, and in some dialects (in America perhaps more than in Britain) of a bald mountain. In any case, combinations like a bald statement are conventionalized and idiosyncratic: they do not follow from any unifying principle of the lexical meaning of bald. Cp. the likewise idiosyncratic collocations bare detail, bare fact, bare amount, etc. Unfortunately, CC does not indicate, for these cases, that they are conventionalized rather than rule-governed.

It must be said in CC's defence that any commercially viable dictionary must strike a balance between investment in research and size on the one hand, and completeness and quality on the other. An ideal dictionary may just require too large an investment for a publishing firm to afford, even in conjunction with a university department. Such a comment is no doubt fair. Yet the question remains whether CC could perhaps not have profited more from already available insights in these matters, or whether the quality could not have been improved considerably with just a little more effort. It has, for example, already been pointed out that CC could easily have crammed more useful information into the same space, due to the repetitiveness and verbosity of the descriptions and the comments. It is clear, in any case, that much more attention should be paid in linguistics, as it is practiced in the universities, to lexicology, i.e. the theoretical study of lexical items, and in particular of lexical meanings. Lexicology should be recognised as a separate subdiscipline of linguistics, along with phonology, syntax, etc. The makers of commercial dictionaries could then draw on the insights and theories developed in theoretical lexicology.

Even so, CC could have done better, in particular as regards systematic lexical categorizations. There is no systematic distinction in CC between negative and non-negative words.
Yet it is easily argued that of the following pairs the first member is positive while the second is negative: assert / deny, increase / decrease, just / hardly, many / few, old / young, big / small, expensive / cheap. This is not just of theoretical importance, because there is a further distinction in the positive category: some positive adjectives are used "neutrally" when accompanied by a measure phrase, such as "how _?": a question like "How old is the baby?" does not imply that the baby in question is old; it is simply a neutral question about the baby's age. Not so, however, in: "How expensive is that coat?". Here there clearly is an implication that the coat is expensive. It must be noted that such phenomena are not predictable just on grounds of general semantics, because there are interlinguistic differences. The German "Wie teuer ist dieser Mantel?", for example, does not imply that the coat is expensive, even though the nearest translation of expensive is teuer. Likewise for English old and French vieux: the French vieux cannot be used to ask a neutral question about age, because it always implies advanced age, unlike English old. Now although CC does specify for old that it can be used neutrally to specify age ("If you
say that someone or something is a particular number of years, months, etc. old, you mean that they have lived or existed for that length of time"), whereas no such use is specified for, e.g. expensive, it would help the user if a category indicator were applied here, such as "positive neutralizing" for old etc., and "positive non-neutralizing" for cases such as expensive. This is just an example, but many more could be provided.

Thus, although there is an extensive literature on negative and positive polarity items, CC has in common with all other existing dictionaries that no mention is made of this lexically important distinction. For example, the verb bat, in conjunction with an eye(lid), requires a negation or a (semi)negative adverb (such as hardly) for a main clause in which it occurs to be grammatical. CC lists this negatively polar verb as number 5 under the entry bat (which, unilluminatingly, has the meaning "cricket bat" as number 1, and our nightly flyer as number 3) in the following way: "If you say that someone did not bat an eyelid, you mean that they showed no sign of surprise or concern". How can the user infer that if one does show a sign of surprise or concern, the expression bat an eyelid cannot be used? Conversely, for a positive polarity item like bristle (with) it is not specified that a so-called "echo" effect occurs when this verb is used in a negative sentence: "The place did not bristle with policemen".

Factivity of verbs or adjectives is not mentioned in any way, even though this notion has been around in semantics for about twenty years now. A verb or an adjective is factive when it takes a that-clause in subject or object position and presupposes the truth of that clause. Thus the verb realize is factive: a sentence like "Henry realized that he had been fined" carries the logical consequence that Henry had been fined.
Likewise for predicates such as know, have forgotten, regret, be surprised, be a pity, be regrettable, be advantageous, etc. Factivity distinguishes between verbs such as the factive know that ... and the non-factive be convinced that ...

All sorts of lexical processes, which are regular and productive, yet hardly ever fully predictable (if they were fully predictable they would be part of the grammar), are, though often duly reflected in the descriptions, never explicitly identified. Adjectives, for example, are often derived from nouns in such a way that their meaning amounts to a preposition phrase (usually but not always with the preposition of) over the noun in question. Thus one has, in English, monumental sculptor, which means, apart from "sculptor of monumental proportions", rather "sculptor of monuments". Likewise nocturnal prowl ("prowl during the night"), attitudinal change ("change of attitudes"), etc. Such derived adjectives are invariably limited to attributive use. CC does mention the exclusively attributive use, but not the noun-derived character.

Many adjectives can be used causatively. Thus, a sad story is not a story that is in a state of despondency, but rather a story that makes one sad.
Likewise for, e.g., happy ending, proud victory, etc. Not, however, for *glad story, *nervous event (but e.g. the Modern Greek for "nervous" does allow for causative use). Again, CC does, on the whole, mention the possible uses (and does so rather better than most other dictionaries), but it does not, unfortunately, mention the categories.

It is a pity that CC has not paid more attention to specific lexical categories of meaning description (instead of the unnecessarily verbose elements in the descriptions). CC has taken a first step in this direction by putting all information of a more technical nature in a separate column to the right of the meaning descriptions, which are given in ordinary prose. This "technical" column could, and in my opinion should, have been a great deal more precise and more complete.

This column contains, for one thing, the information on what is usually called subcategorization frames for predicates. These frames specify the number and kind of terms that can go with a verb or adjective. The subcategorization frame specifications given in CC are, however, very often incomplete and sometimes even sloppy. In this respect CC compares badly with, e.g., the Oxford Learners Dictionary.

The main advantage of good and systematic categorial information in a dictionary is that such information makes for a vast increase in possibilities for computer application. A computer-stored lexical database with such information allows for rapid and efficient search procedures, which can be of great value, for example, in computer-assisted translation. It will enable a translator who must turn, say, the Modern Greek equivalent for "nervous light", i.e. in the causative sense of "nervous", into English: a simple query will tell him instantly that English nervous does not allow for causative use. Adequate lexical categorization will enhance the speed and accuracy of computer-assisted translation.
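The kind of "simple query" meant here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names (causative, factive, polarity, neutral_measure) and the function lookup are hypothetical and are not CC's format. The values entered are just those stated in this review, and None stands for "not recorded".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical lexical record; None means 'not recorded'."""
    lemma: str
    causative: Optional[bool] = None        # "a sad story" = a story that makes one sad
    factive: Optional[bool] = None          # presupposes the truth of its that-clause
    polarity: Optional[str] = None          # "negative" or "positive" polarity item
    neutral_measure: Optional[bool] = None  # neutral with a measure phrase ("how old?")


# Values taken from the examples discussed in the review, nothing more.
ENTRIES = [
    Entry("sad", causative=True),
    Entry("happy", causative=True),
    Entry("proud", causative=True),
    Entry("glad", causative=False),
    Entry("nervous", causative=False),
    Entry("realize", factive=True),
    Entry("know", factive=True),
    Entry("be convinced", factive=False),
    Entry("bat an eyelid", polarity="negative"),
    Entry("bristle with", polarity="positive"),
    Entry("old", neutral_measure=True),
    Entry("expensive", neutral_measure=False),
]

LEXICON = {entry.lemma: entry for entry in ENTRIES}


def lookup(lemma: str, category: str):
    """Return the recorded value of a category, or None if nothing is recorded."""
    entry = LEXICON.get(lemma)
    if entry is None:
        return None
    return getattr(entry, category)


if __name__ == "__main__":
    # The translator's question: may English "nervous" be used causatively?
    print("nervous, causative use:", lookup("nervous", "causative"))
    print("sad, causative use:", lookup("sad", "causative"))
```

A category indicator of this sort is a single stored value per entry, which is why the query is instantaneous; the hard part, as argued above, is the lexicological work of establishing the categories.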
It will enable a translator to know instantly that, e.g., the Dutch equivalent for "a glad face" must be rendered in English as a happy face, or that the English phrase a funny man is to be rendered in Italian as un uomo spiritoso only if it means "a man with a good sense of humour", but that this same phrase in the sense of "a slightly strange man" will properly correspond to un uomo un po' strano, or un uomo ridicolo. Clearly, lexicology has not developed far enough to provide sufficiently adequate categorization systems for all such cases, but some progress has been made, and one has the feeling that CC could have made better use of this progress.

It is anyway important to mention the practical advantages of adequate lexicological categories for writers and translators. A translator who is fully competent in the two languages involved will not or hardly need such help. But such translators are rare and expensive. Given the dramatic increase in the amount of translation work to be done, especially in the context of the European Community, research into ways of facilitating
translating procedures should have a high priority. We know now that fully automatic translation will remain a pipedream for quite some time. But that does not mean that translation has to remain the way it is, entirely done "by hand". The appearance of CC is a welcome opportunity for stressing that it is realistic to work towards the development of computational techniques that will assist translators and generally writers in a foreign language.

As with all dictionaries, one can spot omissions in CC. Not so much in the area of "obscene" terms, notoriously neglected in English-language dictionaries. Such terms are, on the whole, given, though with shamefully deficient meaning descriptions. There still are, however, glaring omissions, such as the expressions Dutch treat, Dutch uncle, Dutch wife. One also misses, under the entry real, the meaning "just like but not in reality", as in "John is a real diplomat / actor / ...", implying that John is not really a diplomat, or an actor, or what have you.

On the whole, CC is an interesting dictionary, which may well be quite successful. It is an attempt at opening windows in lexicography, windows that have remained shut for too long. In doing so it opens new and enticing perspectives for dictionary making. If it falls short of what a devoted lexicologist would hope for it is easily forgiven, mainly because it underscores the importance and the feasibility of significant advances in lexicography, based on advances in lexicology.

Nijmegen University
Philosophy Institute
P.O. Box 9108
6500 HK Nijmegen
The Netherlands
Journal of Semantics 6: 175-226
PRESUPPOSITION AND NEGATION
PIETER A.M. SEUREN
ABSTRACT

This paper is an attempt to show that given the available observations on the behaviour of negation and presuppositions there is no simpler explanation than to assume that natural language has two distinct negation operators, the minimal negation which preserves presuppositions and the radical negation which does not. The three-valued logic emerging from this distinction, and especially its model-theory, are discussed in detail. It is, however, stressed that the logic itself is only epiphenomenal on the structures and processes involved in the interpretation of sentences.

Horn (1985) brings new observations to bear, related to metalinguistic uses of negation, and proposes a "pragmatic" ambiguity in negation to the effect that in descriptive (or "straight") use negation is the classical bivalent operator, whereas in metalinguistic use it is non-truthfunctional but only pragmatic. Van der Sandt (to appear) accepts Horn's observations but proposes a different solution: he proposes an ambiguity in the argument clause of the negation operator (which, for him, too, is classical and bivalent), according to whether the negation takes only the strictly asserted proposition or covers also the presuppositions, the (scalar) implicatures and other implications (in particular of style and register) of the sentence expressing that proposition. These theories are discussed at some length.

The three-valued analysis is defended on the basis of partly new observations, which do not seem to fit either Horn's or Van der Sandt's solution. It is then placed in the context of incremental discourse semantics, where both negations are seen to do the job of keeping increments out of the discourse domain, though each does so in its own specific way. The metalinguistic character of the radical negation is accounted for in terms of the incremental apparatus. The metalinguistic use of negation in denials of implicatures or implications of style and register is regarded as a particular form of minimal negation, where the negation denies not the proposition itself but the appropriateness of the use of an expression in it. This appropriateness negation is truth-functional and not pragmatic, but it applies to a particular, independently motivated, analysis of the argument clause.

The ambiguity of negation in natural language is different from the ordinary type of ambiguity found in the lexicon. Normally, lexical ambiguities are idiosyncratic, highly contingent, and unpredictable from language to language. In the case of negation, however, the two meanings are closely related, both truth-conditionally and incrementally. Moreover, the mechanism of discourse incrementation automatically selects the right meaning. These properties are taken to provide a sufficient basis for discarding the, otherwise valid, objection that negation is unlikely to be ambiguous because no known language makes a lexical distinction between the two readings.

0. INTRODUCTION

Ever since Strawson proposed (1950, 1952, 1954) to regard the negation
operator in natural language as presupposition-preserving there has been unclarity, in most of the literature, about the logical consequences of such a proposal. There was, and is, moreover, great unclarity regarding the relation between the semantics and the logic of this operator, and pragmatics, mainly of Gricean stock, has been frequently draughted to help out on dilemmas in the logical or semantic analyses of presupposition and negation. It is the purpose of this paper to provide greater clarity in the problems concerning the logic and the semantics of presupposition and negation in natural language, and to spell out what solutions are viable, given the empirical facts on the one hand and the constraints of logic on the other.

For a good understanding of what is attempted in this paper, a few methodological preliminaries will be useful. First, some comment is in order about the relation between logic and semantics. A distinction is made, here and elsewhere (e.g. Seuren 1985), between the logic and the semantics of natural language sentences in the following way. Semantics is understood as the empirical theory of how humans understand and interpret their uttered sentences the way they do. Semantics is thus part both of linguistics and of cognitive psychology. Logic is regarded as epiphenomenal upon the psycholinguistic structures and processes at work during the semantic construction and comprehension of linguistic utterances. In other words, no separate "logic component" is postulated. Whatever logic adheres to language emerges as a result of what goes on during semantic processing. This is not meant to imply that logic should not be taken seriously. On the contrary, logical soundness is one of the important independent constraining forces on any sound semantic theory. But logical properties do not necessarily count.
Only if expressed in logical forms must logical, and hence truth-conditional, distinctions be somehow reflected in the semantic analyses of sentences. It is important to realize that to specify the logic of language is to engage in some form of applied logic.

Another powerful constraining force for semantic theory is truth-conditional soundness. But again, this does not mean that the adjectives semantic and truth-conditional should be taken to be co-extensive, as is standardly done in model-theoretic semantics, which takes the mathematical model-theory of logical calculi as the prototype of any semantic theory. There is no need to belittle the merits of the model-theoretic approach, which has greatly contributed to present-day semantics. Yet ("possible world") model-theory is hardly a plausible contender for the cognitive and linguistic problems ahead.

In our view, semantics for natural language covers, besides the straightforward truth-conditional aspects, also the speech act properties of sentences as well as everything to do with the processes of discourse incrementation, including the presuppositions in their different forms. The reason for this delimitation is that these are the properties of linguistic expressions
that, together, produce the systematic contribution made by language towards the comprehension of sentences. (The remainder is contributed by what we usually call background knowledge.) This paper thus contains linguistic as well as logical and cognitive analyses.
1. THE EMPIRICAL ASPECT AND THE LOGICAL PROBLEM

1.1. Russell, Frege, Strawson

The presupposition problem has its origin in two observations, made by philosophers and logicians of all ages. Both observations cause trouble for the age-old cherished Aristotelian Principle of the Excluded Third (PET). Classical (Aristotelian) logic is based on two basic principles, the Principle of Contradiction, which says that no sentence can be both true and not true, and PET. The Principle of Contradiction is basic to any form of logic and cannot be challenged. PET, however, is negotiable. Aristotle himself restricted PET to statements about the past and the present, excepting statements about the future, truth and falsity not being, in his view, well-defined for the future. (Past and present truths or falsehoods are forever fixed and beyond the reach of human intervention: the future, however, is not fixed and thus escapes the strict metaphysical necessity of past and present facts.) We will, in this paper, negotiate PET quite a bit further than has been deemed necessary or desirable in the mainstream of logical tradition.

The observations that cause trouble for PET are, in principle:
(a) One can speak intelligibly while assigning a property, by the use of a predicate, without there being a really existing object for the property to be assigned to.
(b) Negation in language has the tendency to preserve presuppositions, although negation as standardly defined in logic cancels all entailments of its argument proposition, except, of course, logical truths.

Both observations are demonstrated in Russell's (1905) famous (or perhaps one should say: hackneyed) pair of examples:

(1) a. The present king of France is bald.
    b. The present king of France is not bald.

In (1a) the property of baldness appears to be predicated, yet there is nothing for it to be predicated of. Sentence (1b) appears to imply that the present king of France's pate is hairy, and hence that there is something properly called "the present king of France" in the world to which it applies. Since (1a) implies the same thing, it would appear that this existential implication
is preserved under negation.

Russell (1905) wished to preserve PET. His answer was, as is well known, twofold. First, he maintained that the definite description the present king of France has no status in logic but should be analyzed in terms of the existential quantifier (plus a uniqueness clause), so that (1a) will be analyzed as follows:

(2) a. $\exists x\,(\mathrm{KoF}(x) \land \mathrm{Bald}(x) \land \forall y\,(\mathrm{KoF}(y) \rightarrow y = x))$

This analysis makes (1a) unambiguously false in the present world, since there is no entity in this world that can be truthfully said to be, at present, king of France. Russell claimed, furthermore, that negation in language is like negation in standard logic, a truth-function that converts truth into falsity and vice versa. If (2a) is logically negated, i.e. with the negation operator prefixed to it ("external negation"), the resulting sentence will be true:

(2) b. $\neg\exists x\,(\mathrm{KoF}(x) \land \mathrm{Bald}(x) \land \forall y\,(\mathrm{KoF}(y) \rightarrow y = x))$

It is, however, also possible to insert the negation operator in other positions in (2a), giving rise to so-called "internal negation". For reasons best known to themselves, human speakers prefer to read a sentence like (1b) with the negation stuck in just before the predicate "Bald", as in:

(2) c. $\exists x\,(\mathrm{KoF}(x) \land \neg\mathrm{Bald}(x) \land \forall y\,(\mathrm{KoF}(y) \rightarrow y = x))$

This analysis saves PET, and keeps the logic standard, which is what Russell, and with him the great majority of logicians, wanted. It thus became the commonly accepted view regarding presuppositions and negation, and much of the effort spent in even quite recent publications on this subject is directly due to the wish to keep it so, despite the attacks that have been levelled against it.

The first big attack came from Strawson (1950, 1952, 1954). Strawson rejected both of Russell's answers. In his view definite terms in sentences correspond exactly to (referring) terms in logical analysis, and negation is not classical but preserves presuppositional entailments. Moreover, a sentence whose presuppositions are not all true (fulfilled) lacks a truth-value, and thus plays no part in the logical truth-functions. This was also the dominant view, if not proposed then certainly considered, in Frege (1892). In Frege's view, the extension ("Bedeutung" or designation) of a sentence is its truth-value. He moreover wanted logical analysis to be such that the extension (truth-value) of a sentence can be computed by functional calculus from the extensions of its constituents. It follows that if a sentence contains a constituent that lacks an extension and yet is needed for the calculus, no value will
result. Since definite terms are needed for the intended calculus, they must have an extension, i.e. a reference value in the world with respect to which the truth-value is computed. The lack of such an extension must result in the lack of a truth-value. A sentence thus "presupposes" the real existence of the reference values of its definite terms, no matter whether it is negated or not. Both (1a) and (1b) thus entail, in virtue of this presupposition, the real existence of the present king of France.

The Frege-Strawson view is a direct attack on PET, given the possibility of sentences without a truth-value. It may be useful to mention that PET can be split up into two separate principles:

Principle of the Excluded Third
a. Full Valuation Principle: Every sentence of the language under analysis has a truth-value.
b. Bivalence Principle: There are exactly two truth-values: True and False.

It will be clear that the Frege-Strawson analysis denies the Full Valuation Principle, but not the Bivalence Principle.

Due, mainly, to a glaring lack of explicitness in both Frege's and Strawson's texts, considerable confusion arose in the subsequent literature about the precise consequences for logic if the Frege-Strawson analysis is accepted. That is, given that a presupposes b, i.e. $a \gg b$, and thus, in this view, $a \vDash b$ and $\neg a \vDash b$ and a lacks a truth-value when b is false (or without a truth-value), and b is not a logical or necessary truth, what consequences will this have for logic as we know it? The most helpful way of presenting this question was developed by Van Fraassen in a number of publications, especially (1971). Regrettably, the great value of Van Fraassen's way of analysing this question has not been recognized by the majority of those who have, over the past twenty or so years, contributed to the literature dealing with questions of presupposition and negation.

Van Fraassen works with valuations rather than with "possible worlds". A valuation is a complete set of truth-value assignments to the sentences of a language L, in particular the atomic sentences of L (i.e. those sentences that are formed without any of the truth-functional operators of negation, conjunction and/or disjunction). A valuation $v_n$ thus defines a "world", or rather a state of affairs, to the extent that L is able to specify it. If there are two truth-values and no gaps are allowed, and if, moreover, all the atomic sentences of L are logically independent (there are no entailment relations among them), then, if L has m logically independent atomic sentences, the number of possible valuations is $2^m$. For example, if L has just three logically independent atomic sentences, a, b and c (symbolised as L(a, b, c)), the number of possible valuations is $2^3 = 8$ (we use "1" for truth and "2" for falsehood).
$U$        $a$   $b$   $c$      $\neg a$   $\neg b$   $\neg c$      $a \land b$    ... etc.
$v_1$:      1     1     1          2          2          2              1
$v_2$:      2     1     1          1          2          2              2
$v_3$:      1     2     1          2          1          2              2
$v_4$:      2     2     1          1          1          2              2
$v_5$:      1     1     2          2          2          1              1
$v_6$:      2     1     2          1          2          1              2
$v_7$:      1     2     2          2          1          1              2
$v_8$:      2     2     2          1          1          1              2

Fig. 1: Classical valuation space for L(a, b, c).

Obviously, truth-functional compositions follow automatically. Thus, given that, for example, $v_3(b) = 2$, it follows that $v_3(\neg b) = 1$, or, given that $v_3(a) = 1$ and $v_3(b) = 2$, it follows that $v_3(a \land b) = 2$.

It is clearly unrealistic to assume that the atomic sentences of any natural language are logically independent in the sense that for no atomic sentences a and b of some natural language, where $a \neq b$, $a \vDash b$. For example, English has two atomic sentences John has been murdered and John is dead, where the former entails the latter, and it seems hardly possible to imagine a language without such pairs of atomic sentences. Entailment relations, whether lexical or logical, reduce the number of admissible valuations. Thus, supposing that in L(a, b, c) $a \vDash b$, $v_3$ and $v_7$ are inadmissible because no world can exist where a is true and b is false. This entailment relation thus reduces the number of admissible valuations by two.

Now suppose that a presupposes b (i.e. $a \gg b$). In the Frege-Strawson analysis this means that for any $v_n$, if $v_n(a) = 1$ or 2, $v_n(b) = 1$, and if, for any $v_n$, $v_n(b) \neq 1$, a has no value for that $v_n$, as is shown in Fig. 2:

$U$        $a$   $b$   $c$    ... etc.
$v_1$:      1     1     1
$v_2$:      2     1     1
$v_3$:      -     2     1
$v_4$:      1     1     2
$v_5$:      2     1     2
$v_6$:      -     2     2

Fig. 2: Valuation space for L(a, b, c), where $a \gg b$, in the Frege-Strawson analysis.
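The construction of the valuation spaces in Figs. 1 and 2 can be sketched in Python. This is a minimal illustration, not part of Van Fraassen's formalism: the function names are hypothetical, a gap is represented by None, and the paper's convention of 1 for truth and 2 for falsehood is kept. Rows are generated with the first atom alternating fastest, which is the row order of Fig. 1.

```python
from itertools import product

TRUE, FALSE, GAP = 1, 2, None  # the paper writes 1 for truth, 2 for falsehood


def classical_valuations(atoms):
    """All 2**m gapless valuations, first atom alternating fastest (as in Fig. 1)."""
    atoms = tuple(atoms)
    slowest_first = atoms[::-1]
    return [dict(zip(slowest_first, values))
            for values in product((TRUE, FALSE), repeat=len(atoms))]


def entailment_space(atoms, antecedent, consequent):
    """Drop every valuation in which the antecedent is true and the consequent false."""
    return [v for v in classical_valuations(atoms)
            if not (v[antecedent] == TRUE and v[consequent] == FALSE)]


def frege_strawson_space(atoms, carrier, presupposition):
    """The carrier sentence gets no value wherever its presupposition is not true."""
    space = []
    for v in classical_valuations(atoms):
        if v[presupposition] != TRUE:
            v = dict(v, **{carrier: GAP})
        if v not in space:  # two classical rows collapse into one row with a gap
            space.append(v)
    return space


def show(space, atoms):
    """Print a valuation space, writing '-' for a gap."""
    for i, v in enumerate(space, start=1):
        cells = " ".join("-" if v[x] is GAP else str(v[x]) for x in atoms)
        print(f"v{i}: {cells}")


if __name__ == "__main__":
    atoms = ("a", "b", "c")
    show(classical_valuations(atoms), atoms)       # the space of Fig. 1
    show(frege_strawson_space(atoms, "a", "b"), atoms)  # the space of Fig. 2
```

Under these assumptions an entailment from a to b removes two of the eight classical valuations, whereas a presupposition of b by a, in the Frege-Strawson sense, merges the four valuations with b false into two valuations with a gap for a.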
The absence of a truth-value for a in $v_3$ and $v_6$ is an infringement of PET, in particular the Full Valuation Principle. The common expression is that we have here a bivalent logic with gaps. (Such gaps are an unwelcome complication for any standard Boolean semantics for the logical calculus, as we shall see in Section 2.1.) Note that the valuations $v_3$, $v_4$, $v_7$ and $v_8$ in Fig. 1 are now inadmissible. Instead, Fig. 2 contains the new valuations, with gaps, $v_3$ and $v_6$. What is important here is that the truth-functional operators will fail to yield values for any input which contains an unvalued sentence. This gives the following truth-tables for negation, conjunction and disjunction:
$p$   $\neg p$        $p \land q$   $q$:  1   2   -        $p \lor q$   $q$:  1   2   -
 1       2            $p$: 1              1   2   -        $p$: 1             1   1   -
 2       1                 2              2   2   -             2             1   2   -
 -       -                 -              -   -   -             -             -   -   -

Fig. 3: Truth-tables for bivalent calculus with gaps.
Clearly, these are simply the classical tables, with the extra provision that no value results when the input is not fully valued. This extra provision, however, is nothing new: it follows from the definition of the notion of function in set-theory. We shall see in Section 2 that the introduction of gaps in the field of valuations of a language L is something quite different from the introduction of a third truth-value.

Apart from the fact that the occurrence of gaps causes certain complications of a general nature in the logic (complications, however, which can be dealt with), there is nothing logically wrong or even all that remarkable about a Strawsonian bivalent propositional calculus with gaps. The reason for rejecting it does not lie in its logical properties but in its empirical shortcomings. Strawson's criticism of Russell was inspired by worries about the empirical inadequacy of the Russell analysis of definite descriptions. These worries have proved to be well-founded (see in particular Seuren 1985: 214-7), and, as a result, the Russell analysis is now clearly out of favour. It is a little ironical to see that the same empirical axe that Strawson wielded has now hit him.
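The tables of Fig. 3 can be written out as partial truth-functions. The following Python sketch is an illustration only: the names neg, conj and disj are hypothetical, and the gap is represented by None, which stands for the absence of a value and not for a third truth-value. The paper's convention of 1 for truth and 2 for falsehood is kept.

```python
TRUE, FALSE, GAP = 1, 2, None  # 1 for truth, 2 for falsehood; None marks "no value"


def neg(p):
    """Classical negation; yields no value for an unvalued input."""
    if p is GAP:
        return GAP
    return FALSE if p == TRUE else TRUE


def conj(p, q):
    """Classical conjunction; yields no value if either input is unvalued."""
    if p is GAP or q is GAP:
        return GAP
    return TRUE if (p == TRUE and q == TRUE) else FALSE


def disj(p, q):
    """Classical disjunction; yields no value if either input is unvalued."""
    if p is GAP or q is GAP:
        return GAP
    return TRUE if (p == TRUE or q == TRUE) else FALSE


def cell(value):
    """Render a value as in Fig. 3, with '-' for a gap."""
    return "-" if value is GAP else str(value)


if __name__ == "__main__":
    inputs = (TRUE, FALSE, GAP)
    for p in inputs:
        conj_row = " ".join(cell(conj(p, q)) for q in inputs)
        disj_row = " ".join(cell(disj(p, q)) for q in inputs)
        print(f"p={cell(p)}  neg: {cell(neg(p))}  conj: {conj_row}  disj: {disj_row}")
```

Note that on this construal even a disjunction with one true member has no value when the other member is unvalued, which is exactly the "extra provision" described above.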
1.2. The entailment analysis What is at issue is, in part at least, very simple: it isn't true, at least not generally, that, if a > > b, a is without a truth-value when b fails to be true. We can say in perfect truth , e.g . .
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
'P
182 (3)
The king of France is NOT bald: there is no king of France.
1.3. Linguistic complications
It is true that normally speaking, let us say as a default, negative sentences are understood as preserving their presuppositions: they invite the inference that the presuppositions still hold, but, normally speaking, that inference can be overruled. This observation was brought to the fore around 1975 by a variety of authors (e.g. Boer & Lycan 1976; Wilson 1975). These authors proceeded on the assumption that this presupposition-cancelling use of the negation operator, albeit marked by special intonation and possibly other features as well, is general in the sense that all presuppositions can be cancelled that way. They thus proposed that, from a strictly logical point of view, presuppositions are just entailments and are, therefore, cancelled under negation. What we identify as presuppositions are, in their view, only pragmatic phenomena, to do with reasonable expectations in speaker-hearer interaction. Gricean pragmatics was thus invoked to account for presuppositional phenomena, in particular the invited inference character of presupposition under negation and other entailment-cancelling operators. It is fair to say, however, that this pragmatic approach has not been successful (see in particular Van der Sandt 1982: 50-88; 1988: 50-86). What these authors proposed was a reversal to the classical bivalent system without gaps. If a >> b, then classical valuation spaces, as in fig. 1, will do, with all valuations v_n such that v_n(a) = 1 and v_n(b) = 2 being declared inadmissible. That is, a >> b is treated as though from a logico-semantic point of view it were simply a ⊨ b. This analysis is often referred to as the entailment analysis of presupposition. This solution would have been satisfactory if pragmatic theory had been up to the task assigned to it, and if, moreover, there had not been empirical obstacles.
As regards the latter, it has been observed (Seuren 1985: 229-34) (a) that in certain constructions negation cannot cancel presuppositions but has to preserve them, and (b) that in other constructions the only negation possible is the presupposition-cancelling one. This calls for some illustration.
Although no claim of completeness can be made, and further cases will almost certainly come to light as research proceeds, we can say that in the following classes of cases negation is per se presupposition-preserving:
A. Morphologically incorporated negations (except when incorporated into a quantifier, as in nobody, never, neither). Thus, negative prefixes like un-, in-, dis-, a-, cannot fulfil the cancelling role of not as in (3) above. Thus, (4a) is strongly felt to be incompatible, whereas (4b) easily allows for the
cancelling interpretation. The exclamation mark indicates incompatibility (i.e. contrariness):
(4) a. !Tim is UNrealistic about the risk: he doesn't know there to be one.
    b. √Tim is NOT realistic about the risk: he doesn't know there to be one.
B. Negations in non-canonical positions. By "canonical position" is meant, for negation in English at least, the position in construction with the finite verb form. The remarkable thing is that negations in any other position than the canonical one are necessarily presupposition-preserving, even when they are logically speaking the highest operator and thus function as sentence negation:
(5) a. !NOT all doors were locked: there were no doors.
    b. √All doors were NOT locked: there were no doors.
Both sentences are to be understood with the negation as the highest operator, followed by the universal quantifier. (5a) poses no problem for classical logic: since the first sentence of (5a) entails the existence of doors, it is incompatible with the second sentence. But (5b), which should have the same analysis, does pose a problem, precisely because it is not a contrary pair. In our analysis, an account of the logical properties of presuppositions will put this right.
C. Negations in non-assertive clauses. Typically, negations in non-assertive subordinate or main clauses cannot cancel presuppositions:
(6) a. !Tim seems NOT to be back: he hasn't been away at all.
    b. √Tim does NOT seem to be back: he hasn't been away at all.
(7) !Do NOT go back to your wife: you haven't even left her.
D. Negations with certain quantifiers. As was demonstrated in (5b) above, the negation with the quantifier all can be used to cancel presuppositions. Not, however, it seems, with, e.g., each (of the), or both (of the):
(8) !Each of the children was NOT given a sweet: there were no children.
(9) !Both of his children are NOT spoiled: he has no children.
E. Non-extraposed factive subject clauses. Negation over a factive main verb does not affect the factive presupposition, though it may affect other presuppositions in the same sentence:
(10) a. !That Tom speaks French does NOT irritate Joanna: he doesn't speak French.
     b. √It does NOT irritate Joanna that Tom speaks French: he doesn't.
Note, however, that the existential presupposition that goes with the object of a factive verb like irritate can easily be cancelled, even with a non-extraposed factive clause:
(11) √That Tom speaks French does NOT irritate the king of France: there is no king of France.
Note, moreover, that when the factive clause is pronominalized by means of that, the factive presupposition still has to remain under negation, as is shown in (12a). But when the negation is reinforced with epistemic possibility and comes out as cannot, the factive presupposition can be cancelled, as in (12b):
(12) a. !That does NOT irritate Joanna: he doesn't speak French. (cp. (10a)).
     b. √That CANNOT irritate Joanna: he doesn't speak French.
F. Cleft and pseudocleft constructions. As is well-known, these have a specific existential presupposition associated with the clefted (i.e. the WH-) constituent: if in the non-cleft version of the sentence this constituent requires a really existing object for the sentence to be true, so does the clefted constituent, whether in cleft or in pseudo-cleft constructions. This presupposition is uncancellable by negation:
(13) !What he said was NOT "Damn!": he said nothing at all.
But other presuppositions, not directly associated with the clefted constituent, are fully cancellable:
(14) √It is NOT Mr. Hamilton who wrote the letter: Mr. Hamilton doesn't exist.
G. Contrastive accents. This is an exact parallel to the (pseudo)cleft constructions. In sentences with contrastive accent the accented constituent
serves as a predicate establishing the identity of the entity mentioned in the non-accented part. This latter entity is presupposed to exist in all cases where it is in the corresponding sentence without contrastive accent. This presupposition cannot be cancelled under negation:
(15) !The WAITER didn't start the argument: nobody did.
Again, however, other presuppositions, such as those associated with the accented part, are freely cancellable:
(16) √The WAITER did NOT start the argument: there was no waiter.
H. Negations with Negative Polarity Items. As is well-known, every language has a, usually large, number of so-called "Negative Polarity Items" (NPI). These are words, constructions or expressions which, for no known reason, require a negation or, for some at least, a negative word when used in simple declarative sentences. (Their behaviour in other clause-types differs in ways that have as yet never been systematically studied.) Some, but certainly not all, NPIs allow for emphatic auxiliaries (do-support when there is no auxiliary) as a form of negativity. In the examples below the NPIs are italicised. (17a) gives a standard case. In (17b,c) one has NPIs with negative words (hardly, difficult). In (17d) one has a case of emphatic do-support:
(17) a. She couldn't possibly have known that.
     b. She could hardly breathe any more.
     c. It was difficult for him to go on any longer. (cp. *... easy ...)
     d. It DOES matter that her boss is an alcoholic.
The point is that the negation which is required in simple assertive clauses with NPIs (if there is no other negative word and no auxiliary emphasis) is per se presupposition-preserving, for all presuppositions in the sentence. Thus, the sentences of (18) are all strongly felt to be incompatible (contrary):
(18) a. !It does NOT matter that her boss is an alcoholic: the man isn't. (factive)
     b. !Mr Jones does NOT live in Paris any more: he doesn't exist. (existential)
     c. !He did NOT at all acknowledge my presence: I wasn't there. (factive)
Interestingly, NPIs have a counterpart in so-called "Positive Polarity
Items" (PPI). When a PPI stands directly under negation, the sentence loses its default property of inviting presuppositional inferences and acquires what is usually called an "echo effect": it sounds as if the same sentence but without the negation has been uttered (or strongly suggested) in immediately preceding discourse, preferably by a different speaker. Take, for example, the PPI still, which induces the presupposition that what is said in the rest of the sentence was true at least till just before the moment of utterance, and the sentence as a whole, if in the present tense, asserts that that situation continues to obtain. Contrast this with the NPI any more, which induces the same presupposition but lets the sentence, with the obligatory negation, assert that that situation has ceased to obtain. Thus, given a sentence with the PPI still, its natural negation will not be that sentence with the default-cancelling and "echoing" not but rather that sentence with still replaced by not ... any more, as in the following pair:
(19) a. Harold still lives in Paris.
     b. Harold doesn't live in Paris any more.
The test is now that the presuppositions of (19a) are no longer default inferences when simple not is inserted, whereas those of (19b) are not cancellable, as we saw in (18b):
(20) a. √Harold does NOT still live in Paris: he has never set foot in France.
     b. !Harold doesn't live in Paris any more: he has never set foot in France.
Examples of English PPIs are (see also Seuren 1985: 233): rather, far from, hardly, terrific, daunting, ravenous, staunch, as fit as a fiddle, at most, at least, perhaps, already, certainly, surely, awful, even, each, both, most, some, several, few, not. (Note that the negation word not itself is a PPI: a succession of two or more nots has the effect of cancelling all presuppositions and creating an echo. But if there is no stark succession of two nots, as in (21) below, they can both be presupposition-preserving.)
Thus, generally, when a PPI stands in the immediate scope of not it cancels the presuppositions of the sentence, not leaving even a default inference. It then also produces an echo-effect. Baker has observed (1970) that, interestingly, this is not so when there is double negation (other than stark succession of nots):
(21) There is nobody here who hasn't already had his breakfast.
Baker's observations are tantalizing, but still unexplained.
A further unexplained complication is that some, not all, PPIs allow themselves to stand under an unaccented not when an explicit or implicit comparison is made.
(22) a. You are not still building (as we are).
     b. She hadn't already finished (as you had).
Such sentences have a (slight) echo-effect, but preserve presuppositions. However, and this is crucial to us, heavy accent on not is excluded for such cases.
PPIs are generally excluded in the scope of implicitly negative operators (or, if one prefers, operators with underlying negation), such as the comparative than. This is shown in (23a). In (23b) the some is outside the scope of than; this sentence is interpretable as "there are some of her colleagues who she is richer than":
(23) a. *She is richer than you already/still are.
     b. √She is richer than some of her colleagues.
This agrees with the observation, made in A above, that morphologically incorporated negations are necessarily presupposition-preserving and cannot (pace Baker) take PPIs in their immediate scope.
It is not known what system or mechanism is responsible for the emergence of polarity items, whether positive or negative, and their behaviour. Nor is much known about the question what factors lie behind the fact that often the negation word cancels presuppositions as entailments but leaves them as default inferences, while in certain classes of cases it preserves some or all of the presuppositions in the sentence at hand, and in other classes of cases it eliminates even the default inference of the presuppositions. It would seem that a theory of topic-comment modulation (as being elaborated by Van Kuppevelt in Nijmegen) might lay bare the grounds of the necessary preservation of presuppositions in the categories E, F and G above (i.e. non-extraposed factive clauses, (pseudo)clefts and contrastive accents): sentences that fall under these categories have a grammatically fixed topic-comment structure built into them in such a way that the topic coincides with the presupposition, and presupposition-cancelling can probably be shown to be incompatible with topic-hood. Yet on the whole, our theoretical insights still fall short of an explanation of the facts concerned.
Even so, however, the answer cannot be simply that the negation operator in language is just the simple bivalent truth-functional operator known from classical logic, somehow modified by pragmatic factors, as the entailment analysis has it. This type of analysis is in principle unable to cope with the clear-cut difference in entailment types between the cases where presuppositions are necessarily preserved and those where they are necessarily cancelled. The minimal conclusion to be drawn is that there are at least three systematically differing ways of using the negation: (1) with the presuppositions necessarily preserved, (2) with the presuppositions reduced to default inferences, and (3) with even the default inferences removed.
The question is now: what theory has the best chance of coming to grips with the facts observed above? A Gricean pragmatic theory may be considered for certain peripheral parts of the job, but it does not seem the first choice for the central problems, given the known failure, so far, of such theories in those areas. The observed facts are anyway too linguistically structural to be a natural object for pragmatics, whose typical hunting ground is the non-linguistic interactional aspects of communication.
1.4. Theoretical alternatives
1.4.1. Argument-split theories
As far as can be judged at present, there seem to be, in principle, two viable alternatives. First, one can try and keep negation classical and bivalent, as in the entailment analysis, but seek a, preferably non-pragmatic, solution for the problems raised. Such a solution would have to consist, not in the assumption of different negation operators, but in the assumption of different argument-values for one single negation operator. If this is the classical bivalent operator without gaps, one solution might be to show that there are good empirical grounds for assuming that under certain conditions the classical negation operator takes as argument a proposition of type α, which leaves the presuppositions as default inferences, while under different conditions it takes as argument a proposition of type β, which necessarily preserves presuppositions, and under again different conditions it takes as argument a proposition of type γ, which has not even the default inferences left to it. Alternatively, one may assume that the one and only negation operator of natural language is the presupposition-preserving Strawsonian negation, which is bivalent but with gaps. In that case it must be shown that it pays to let this negation operate, under certain conditions, only on propositions of type β (with presuppositions), and under other conditions only on propositions of type γ (which have lost their presuppositions). If neither set of conditions is met, the sentence will be ambiguous between a presupposition-preserving and a presupposition-cancelling reading, and some, perhaps pragmatic, theory will have to explain why the former is the preferred one and the latter is realized only under marked conditions. Let us call theories of this type argument-split (or AS) theories. No such theory has been put on the market yet in anything like a sufficiently elaborated version.
Even so, however, it must be said that there are
as yet no decisive a priori grounds for ruling out theories of this kind.
An obvious thought is to split up a sentence into a logical conjunction of its presuppositions and its assertion proper. The "argument split" would then amount to a simple difference in scope for the negation operator. Let a_p stand for the sentence a with its presuppositions p. "Not-a" would then, in formally unmarked cases at least, be ambiguous between p ∧ ¬a on the one hand, and ¬(p ∧ a) on the other, with, presumably, a preference for the former. This analysis (let us call it the conjunction analysis) is usually implicitly invoked when authors speak of "internal", i.e. presupposition-preserving, versus "external", i.e. presupposition-cancelling, negation. It has, however, little going for it, mainly because of structural reasons. One problem with this conjunction analysis is that the set of presuppositions p functions structurally and logically simply as a conjunct or set of conjuncts. For this to be possible, p has to be "unpacked" from the carrier sentence a in the sense that p must be fleshed out as a full, syntactically and semantically correct sentence or set of sentences. Since none of the existing theories of syntax or semantics is remotely capable of carrying out such a task, this analysis must, for grammatical reasons, be deemed to be unrealistic. A further problem for the conjunction analysis consists in the fact that it cannot explain why (24a) is felt to be contradictory whereas (24b) is not:
(24) a. !There was a car. The car stopped, and the car did not stop.
     b. There was a car, and the car stopped. And there was a car and the car did not stop.
If, as this theory would seem to require, every occurrence of a definite description brings along a spelling out of the associated existential presupposition, the only possible, contradictory reading of (24a) should be impossible, and the reading associated with (24b) would then be the only possible one.¹
In work that is available but not yet published at the time of writing, Van der Sandt (to appear) proposes an AS-analysis with the classical bivalent negation operator. In this analysis, all sentences consist of a strictly propositional part plus an extra bag of additional information which contains the presuppositions, the (scalar) implicatures, and properties of style and register (more will be said about this in a moment). The negation operator standardly applies only to the strict proposition and will thus preserve the presuppositions as well as the other "extras". But it may also take as argument the strict proposition under an assumed "echo" operator which is not phonologically realized other than by special intonational features, and which ensures that the strict proposition plus its additional information are bundled together. In those cases the negation operator will extend not only over the implicatures and other non-truth-conditional inferences but also
over the presuppositions, with the result that these are cancelled, and are no longer logical consequences. The echo operator is meant to account for all cases where a sentence functions as a correction on a previously uttered sentence. That is, the non-negated part of the first sentences of (25) are all considered to stand under the echo operator:
(25) a. It is NOT sad that she died so young: she is still very much alive.
     b. He doesn't hate SOME of his pupils: he hates them ALL.
     c. No Johnny, aunt Bessie isn't "SPLITTING" tomorrow, she is LEAVING.
In (25a), a factive presupposition is denied. In (25b), a scalar implicature is corrected. (25c) corrects an expression deemed inappropriate for stylistic or sociolinguistic reasons (cp. Horn 1985 for an ample discussion). According to this analysis, ordinary classical negation will do, but there is a systematic ambiguity in the argument proposition depending on whether or not it stands under the echo operator. At least as far as presuppositions are concerned, this theory is logically equivalent with the conjunction analysis. Yet structurally it is different, in that the extra information need not be "unpacked" but is part of the semantics and pragmatics of the utterance that expresses the strict proposition. Moreover, such an analysis will not have to cope with problems such as (24) above, since neither (24a) nor (24b) satisfy the conditions for the use of the echo operator, and the theory would, if suitably extended, make the right predictions. This variety of AS-theory will, therefore, have a considerable edge over the conjunction analysis, though it still needs to be shown how, on account of what structural principles, utterances carry the semantic "extras". A split in predicate conditions, as in (30) below, may work for presuppositions, but how would one account for the implicatures and the style or register implications?
A central aspect of this analysis is the attempt to subsume all cases of utterance correction, i.e. all echo-cases, under one category, semantically characterized by the echo operator. Underlying this attempt is the claim that all negations that extend over presuppositions, (scalar) implicatures, or register choices, as exemplified in (25a-c) above, are utterance appropriateness denials (hence the echo), and form a natural class. It remains to be seen whether this claim will withstand scrutiny.
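The logical point shared by the conjunction analysis and, as far as presuppositions are concerned, the echo-operator analysis can be checked mechanically. The following Python fragment is only an illustrative sketch, not part of either analysis as published: it assumes that the presupposition p and the assertion proper a can be treated as two independent classical propositional variables, and all function names are hypothetical. It enumerates the four classical valuations and tests whether each reading of the negated sentence entails p.

```python
from itertools import product

# Assumption for illustration: p (presupposition) and a (assertion proper)
# are independent classical propositional variables.
VALUATIONS = list(product([True, False], repeat=2))  # pairs (p, a)


def internal_negation(p: bool, a: bool) -> bool:
    """Presupposition-preserving reading: p AND NOT a."""
    return p and not a


def external_negation(p: bool, a: bool) -> bool:
    """Presupposition-cancelling (echo) reading: NOT (p AND a)."""
    return not (p and a)


def verifying_valuations(reading):
    """Return every valuation (p, a) under which the reading is true."""
    return [(p, a) for p, a in VALUATIONS if reading(p, a)]


def entails_presupposition(reading) -> bool:
    """True if p holds in every valuation that makes the reading true."""
    return all(p for p, _ in verifying_valuations(reading))


if __name__ == "__main__":
    for name, reading in [("internal", internal_negation),
                          ("external", external_negation)]:
        print(name, verifying_valuations(reading),
              entails_presupposition(reading))
```

Under these assumptions the internal reading is true only when p is true and a is false, so it entails p, whereas the external reading is also true in valuations where p is false, so p is no longer a logical consequence. The sketch says nothing about the default-inference status of p, which is precisely what the purely truth-functional treatment leaves unexplained.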
One specific difficulty for this theory lies in the fact that the negation that cancels presuppositions cannot occur in any other position in the sentence than what has been called the canonical position, i.e. with the finite verb form (see category B above: negations in non-canonical positions necessarily preserve presuppositions). In this respect the presupposition-cancelling negation distinguishes itself from negations in, let us say, the Horn cases, i.e. negations that cancel (scalar) implicatures or correct inappropriate lexical, phonological or grammatical choices. Negations in Horn cases do not have to be in the canonical position, witness, e.g.:
(26) a. Not several but all guests left after the row.
     b. Not Lizzy, if you please, but Her Majesty the Queen was wearing a red hat (cp. Horn 1985: 133).
In (26a,b) the negation precedes the surface subject, and is hence not in the canonical position. One notes, moreover, that the quantifier several, which is, as we have seen, a PPI, does not function as a PPI here, apparently because the word used is not several but its quoted counterpart "several". If it had been a PPI here, the negation would have had to occupy the canonical position, and would have been presupposition-cancelling. This difference is quite real, and considerably weakens the thesis that all cases of utterance correction form one "natural" class, which must, therefore, be accounted for uniformly. Note, for example, the difference between (27a), which is an acceptable case of presupposition-cancelling, and (27b,c), which are not, because of their being incompatible:
(27) a. √He did NOT only lose his arm. He only lost his little finger.
     b. !Not only did he lose his arm. He only lost his little finger.
     c. !He not only lost his arm. He only lost his little finger.
A theory like Van der Sandt's, with the echo operator, will have to explain why (27b,c) do not work, while (26a,b) do.
This is not just a grammatical problem (though, if it were, it would be serious enough), it is also a semantic problem. For, contrary to what this theory predicts, the negation over the Horn cases does not cancel presuppositions. Take, for example, (25b,c) above, and try replacing the second sentence, i.e. the correction, by a presupposition denial. The result is unacceptable:
(28) a. !He doesn't hate SOME of his pupils: he doesn't exist.
     b. !No Johnny, aunt Bessie isn't "SPLITTING" tomorrow: there is no aunt Bessie.
It does not seem likely, therefore, that presupposition denials form one single natural class with the Horn cases, i.e. implicature denials and style or register corrections. On the contrary, it is quite thinkable that the Horn cases, as in (25b,c) above, are special instances of contrastive accent (or clefting), and will thus necessarily preserve presuppositions (cp. F and G above). They will then require an analysis along something of the following lines (for (25b) and (25c), respectively):
(29) a. the proper word * in: "he hates * of his pupils" is not "some" but "all".
     b. the proper word * in: "aunt Bessie is * tomorrow" is not "splitting" but "leaving".
The transformational rule of Predicate Lowering (Seuren 1985: 308-11) would then lower the predicates of (29a,b) onto the position marked by the asterisk, thus giving (25b) and (25c). Note that (25a) cannot be generated along these principles. In any case, since little is known, as yet, about the grammatical aspects of sentences containing quoted forms (the "grammar of quotes" still has to be written), we will let this particular topic rest, and proceed to a discussion of possible alternative theories.
1.4.2. NEG-split theories
The other class of possible theories is characterized by the assumption that it is not the argument of the negation which is somehow ambiguous, but the negation itself. Although this possibility is mentioned in virtually all publications on the issue, it is not often pursued in detail. Let us speak of the class of possible NEG-split theories. The main recent proponents of this approach, often with an admixture of AS, are Blau (1978), Karttunen & Peters (1979), Horn (1985) and Seuren (1985). There is also Bochvar (1938), which is, however, too much lacking, understandably, in linguistic sophistication to be taken into serious account here. I shall discuss Karttunen & Peters, Horn and my own proposal now, and Blau in section 2.2.
In the first of these, Karttunen & Peters (1979), the authors propose that language has two negations, one of which preserves, and one of which cancels conventional implicatures. As far as their logic is concerned, however, the difference lies only in the composition of the argument proposition, which is sometimes just the proposition, and sometimes the proposition plus its conventional implicatures. This is like the Russellian conjunction analysis if the presuppositions are replaced with conventional implicatures. Truth-functionally, therefore, there is only one negation, i.e. the classical operator, with some form of argument-split. The difference between their two negations lies in what they are meant to do with the implicatures. And here the analysis runs into trouble.
Karttunen & Peters associate with each expression X in the language a double translation, one being the extension (or e-)expression (X)^e, and the other the implicature (or i-)expression (X)^i. The e-expression takes the semantic value of X as in standard formal semantics. The i-expression denotes the Gricean conventional implicatures, which, in Karttunen & Peters' view, include at least most of the presuppositions.
They distinguish two distinct negations, the ordinary negation not, which is presupposition-
preserving, and the presupposition-cancelling contradictory negation NOT. If "A_P" stands for "A with the implicature(s) P", (not-A_P)^e (the e-expression) = ¬A, while the corresponding i-expression (not-A_P)^i = P. Then, (NOT-A_P)^e = ¬(A ∧ P), and (NOT-A_P)^i = P ∨ ¬P. Thus, as far as the e-expressions go there is no difference with the conjunction analysis, except that the implicatures take the place of the presuppositions. So far, the negation operator is in no way different from the classical bivalent operator. Only the i-expressions make a difference.
It must be appreciated that this analysis is an attempt at providing a solid basis for the pragmatic distinction that these authors propose underlies the different uses of negation. Yet it runs into trouble, as we will now see. The i-expression for not-A_P is, as we have seen, just P (more precisely: (P)^e). That is, A_P carries the set of conventional implicatures P, which remains outside the scope of ordinary not. Contradictory NOT, on the other hand, is neutral with respect to P, since the i-expression is the tautological P ∨ ¬P. Moreover, P falls under the scope of NOT. Given that, in this analysis, (A_P or B_Q)^i = (P ∨ B) ∧ (A ∨ Q), it follows that (NOT(A_P or B_Q))^e = ¬((A ∨ B) ∧ (P ∨ B) ∧ (A ∨ Q)). Now let, e.g., A be true and P, B and Q be false (which is possible because conventional implicatures are not necessarily entailed by their carrier sentences). Then (A_P or B_Q)^e is true, given the truth of (A)^e. But (NOT(A_P or B_Q))^e is also true, since ((A ∨ B) ∧ (P ∨ B) ∧ (A ∨ Q)) is false, given the falsity of the second conjunct. Under this analysis, therefore, the basic Aristotelian Principle of Contradiction seems to be violated, with the result that the logic collapses. No attempts have been made, to my knowledge, to repair this.
Horn (1985) advocates a position which implies that "negation is indeed ambiguous, contra Atlas, Kempson, Gazdar, et al. But contra Russell, Karttunen & Peters, and the three-valued logicians, it is not SEMANTICALLY ambiguous. Rather, we are dealing with a PRAGMATIC ambiguity, a built-in duality of use." (Horn 1985: 132). In this important paper, which is rich in observations though perhaps a little indulgent on theory, Horn follows the course, also, as we have seen, taken by Van der Sandt, of subsuming all negated echo-cases under one category, which he calls metalinguistic negation: "I seek to encompass all these examples under the rubric of metalinguistic negation: they all involve the same extended use of negation as a way for speakers to announce their unwillingness to assert something in a given way, or to accept another's assertion of it in that way. Given the behavioral resemblances just cited, as well as the prevailing Occamist considerations, there is no obvious reason NOT to collapse the presupposition-cancelling negation ... with the negation attaching to conversational implicature ..., to pronunciation ..., to morphology or syntax ..., to register or speech level ..., and to perspective or point of view". (Horn 1985: 135) This analysis clearly has much in common with Van der Sandt's
analysis discussed above. The main difference is that Van der Sandt does not split up the negation operator, making it "pragmatically ambiguous", as Horn does, but, instead, splits up the argument proposition of the negation using the echo operator.
The notion of "pragmatic ambiguity" is relatively new in linguistic theory. Horn (1985: 135) attributes it to Donnellan (1966). Although there is some unclarity as to what it stands for, it implies anyway the possible use of the expression in question in a non-truth-conditional way. Horn comes closest to a definition on p. 136: "What I am claiming for negation, then, is a use distinction: it can be a descriptive truth-functional operator, taking a proposition p into a proposition not-p, or a metalinguistic operator which can be glossed 'I object to u', where u is crucially a linguistic utterance rather than an abstract proposition." This implies that not is ambiguous between the classical truth-functional operator on the one hand and a speech act operator on the other. Leaving aside the question of whether speech act theory is pragmatics and not semantics, the tenability of Horn's position depends in part on his contention that the metalinguistic uses are not truth-conditional (and hence not truth-functional).
Here, it would seem, there are problems. If Horn's metalinguistic negation is defined as "I object to u", where u is any linguistic utterance, then this operator runs the risk of overgenerating possible uses of not. For example, take a situation where two speakers, A and B, discuss the quality of the treatment of mental patients in a particular hospital. Now speaker A says: "One flew over the cuckoo's nest", wistfully reminding B of that great movie and clearly implying a similarity between what is shown in the movie and what is done in the hospital under discussion. Speaker B, however, violently disagrees, and certainly wishes to "object to u".
If Horn's characterization of the non-truth-conditional metalinguistic negation operator is to be taken literally, B should be able to say: "One did NOT fly over the cuckoo's nest", thereby objecting to A's utterance. Clearly, however, he cannot. Interestingly, most or all languages do have conventionalized ways of objecting to utterances in cases like this. In English, for example, B might say: "The hell/my foot one flew over the cuckoo's nest". But the standard negation word not cannot be used.
Horn will, therefore, have to delimit the class of cases where his metalinguistic negation can be used. It is not clear, however, that this can be done in a non-arbitrary way, as long as this negation is kept non-truth-conditional and thus purely pragmatic. Horn himself is somewhat vague on this issue. He rests his case largely on Grice's (1967) thesis that "either truth or assertability can be affected by negation", and that when assertability is at issue the use of negation is not truth-functional (1985: 137). He then extends this latter, "pragmatic" use to non-linguistic performances. On p. 136 he presents the amusing example of a piano lesson: "Piano student plays passage
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
in manner μ. Teacher: 'It's not [plays passage in manner μ] - It's [plays same passage in manner μ'].'" Horn then comments: "The teacher's use of not is clearly not assimilable to anything remotely resembling truth-functional propositional negation." This, however, is open to doubt. Surely, the teacher is making an assertion about the proper way of playing the passage, which, he says, is not μ but μ' (cp. (29) above). In general, to say that assertability is affected by negation can hardly be taken to amount to anything other than to say that a certain expression is not assertable, and there is nothing non-truthconditional or even non-truthfunctional about that use of the negation.

The picture that emerges is roughly this. As a sentence operator (i.e. with highest scope), not can be used only when an assertion is truth-functionally denied, though the denial (negation) may affect an assertion of proper lexical choice or any other kind of performance. The negation word not cannot be used in cases that are really pragmatic, i.e. cases where no assertion is made but an allusion is made or some other non-truth-conditional effect is intended by the making of an utterance (for example, if I quote a line from Shakespeare only to let it be known how well-read I am). The language may have other (often not terribly polite) expressions to object to utterances pragmatically, but not does not seem to be one of them. If this analysis is in principle correct, Horn's metalinguistic negation is fully truth-functional, and there is no "pragmatic" ambiguity. If there is any ambiguity, it must be logico-semantic.

A specific difficulty arises for Horn's position with regard to the word true. On p. 125 he agrees with the "fleshing out" of (3) above (The king of France is NOT bald: there is no king of France!) as "It is not true that the king of France is bald". Horn notes (p. 146) that this is not always possible with metalinguistic negation.
Thus, he rightly considers It is not true that the dog SHAT on the carpet - he DEFECATED on it unacceptable. And he will probably say the same of a sentence like It isn't true that this wine is GOOD - it's EXCELLENT. (I would argue that this is one more indication that we do not have a "natural" class here.) In any case, the possibility of the periphrastic it is not true that in some cases would seem to speak against the thesis that such uses are non-truthconditional.

In order to save himself from this predicament Horn follows Grice again in saying that true often does not mean "true" but "assertable". This is, in turn, justified by examples like It's not true that they had a baby and got married - they got married and had a baby, where it is said that the difference is not truth-conditional. But this is begging the question of how adequate standard propositional calculus is for the expression of events in temporal succession. It clearly is not. Given "the prevailing Occamist considerations" invoked by Horn himself (p. 135), there does not, therefore, seem to be sufficient reason to cut up the word true in the way proposed by Grice and Horn, or indeed to open up a non-truthconditional domain for not. (We shall come back to this in Section 3, where it will be shown that in an incremental semantics the truth-conditional difference between p and q and q and p is naturally expressible.)

The strength of Horn's position depends also on the motivation for classifying, essentially, the three categories of utterance correction exemplified in (25a-c) above as one "natural" class. Here, too, there is room for doubt, given cases like those presented in (26) and (27). Apart from the difficulty pointed out above in connection with the word true, for this analysis to be viable it will have to be explained why the presupposition-cancelling negation has to stand in the canonical position of the sentence, as illustrated by the examples (5) and (27) above in connection with Van der Sandt's work, while the other metalinguistic negations, as in (26a,b), also occur in other positions. Horn's analysis, like Van der Sandt's, has to cope with the non-uniform behaviour of not in the cases that are meant to form a natural class.

Then, there is the serious problem why English, and with it all known languages, do not systematically distinguish between the two functions reserved for it.² This is a problem that plagues all theories of ambiguous not.

In my theory, as presented in Seuren (1985), the negation word not is also considered ambiguous, at least as far as its logical properties are concerned. More precisely, in my analysis not is LOGICALLY, and hence TRUTH-CONDITIONALLY, and hence SEMANTICALLY, ambiguous. Yet the ambiguity is not arbitrary. The two nots share the property of banning their argument clause from the world picture presented in the running discourse. As far as the logic is concerned, a distinction is made between a minimal negation, symbolised as "~", which preserves presuppositions, and a radical negation, written "≃", which cancels them. Corresponding to the two negations there are three truth-values: true (written: 1), minimally false (written: 2), and radically false (written: 3). A sentence is true just in case all its truth-conditions are fulfilled. It is minimally false when the presuppositional conditions are fulfilled, but not the assertion conditions. It is radically false when not even the presuppositional conditions are fulfilled.

The presuppositions are all derived from the lexical conditions associated with the highest predicate of the sentence. These are divided into so-called preconditions and satisfaction conditions. The preconditions generate the presuppositions and the satisfaction conditions generate the "ordinary" entailments. Generally, let an n-termed predicate (over entities) Pⁿ have the extension σ(Pⁿ), in any given "world" W. Then σ(Pⁿ) is the set of all n-tuples t of entities in W such that t fulfils, first, the preconditions of Pⁿ and, next, the satisfaction conditions of Pⁿ. Formally expressed, this looks like the following:
(30) σ(Pⁿ) = { ⟨r_1, ..., r_n⟩ : ... (preconditions) ... | ... (satisfaction conditions) ... }
     (where "r_i" stands for a term referent)
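The division of lexical conditions into preconditions and satisfaction conditions can be illustrated with a minimal sketch in Python. The function name `evaluate`, the toy world and the predicate names are hypothetical and serve only as an illustration; they are not part of the analysis in Seuren (1985).

```python
# Truth-values as used in the text: 1 = true, 2 = minimally false, 3 = radically false.
TRUE, MINIMALLY_FALSE, RADICALLY_FALSE = 1, 2, 3


def evaluate(terms, preconditions, satisfaction):
    """Three-valued evaluation of a predicate applied to a tuple of term referents."""
    if not all(cond(terms) for cond in preconditions):
        return RADICALLY_FALSE   # some presuppositional condition is not fulfilled
    if not all(cond(terms) for cond in satisfaction):
        return MINIMALLY_FALSE   # presuppositions hold, assertion conditions do not
    return TRUE


# Hypothetical toy world (names invented for illustration only).
EXISTING = {"yul", "ringo"}
BALD = {"yul"}


def exists(terms):
    """Precondition of the toy predicate 'bald': the term referent exists."""
    return terms[0] in EXISTING


def is_bald(terms):
    """Satisfaction condition of the toy predicate 'bald'."""
    return terms[0] in BALD


results = {
    name: evaluate((name,), [exists], [is_bald])
    for name in ("yul", "ringo", "king_of_france")
}
```

In this toy world the sentence about the non-existent king of France fails a precondition and so receives the value 3, whereas a sentence about an existing but non-bald individual receives the value 2.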
p    ~p (minimal)    ≃p (radical)    ¬p (classical)
1    2               2               2
2    1               2               1
3    3               1               1

Fig. 4: Truth-table for minimal, radical, and classical negation (Seuren 1985: 239).
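The table in fig. 4 lends itself to a mechanical check. The following minimal Python sketch encodes the three negations and tests whether the classical negation coincides with the disjunction of the minimal and the radical negation. It assumes that trivalent disjunction selects the lower ("truer") of two values; that assumption, like the variable names, belongs to the sketch and is not taken from the table itself.

```python
# Truth-values: 1 = true, 2 = minimally false, 3 = radically false.
MINIMAL = {1: 2, 2: 1, 3: 3}      # preserves presuppositions
RADICAL = {1: 2, 2: 2, 3: 1}      # cancels presuppositions
CLASSICAL = {1: 2, 2: 1, 3: 1}    # false when the argument is true, true otherwise


def disjunction(x, y):
    """Assumed trivalent disjunction: the lower (i.e. 'truer') value wins."""
    return min(x, y)


# Does the classical negation equal the disjunction of the two specific negations?
classical_is_disjunction = all(
    CLASSICAL[p] == disjunction(MINIMAL[p], RADICAL[p]) for p in (1, 2, 3)
)

# Minimal negation applied twice gives back the original value.
minimal_is_involution = all(MINIMAL[MINIMAL[p]] == p for p in (1, 2, 3))

# A radically negated sentence is never itself radically false.
radical_never_radically_false = all(RADICAL[p] != 3 for p in (1, 2, 3))
```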
It is claimed in Seuren (1985) that all presuppositions are derived from lexical preconditions, even those that seem to be associated with, or induced by, grammatical constructions such as (pseudo)cleft, or phonological features such as contrastive accent. Clearly, to uphold this view non-trivial grammatical analyses are required. This analysis is thus heavily dependent on sound grammatical theory.

Of more direct interest here is the question of the logical properties of the two negations, and of the relation between the logic and the semantics of natural language negation. In Seuren (1985: 239) the truth-table of fig. 4 is given for the two negations, with the classical bivalent operator thrown in for good measure, though I would claim that this classical operator does not occur in natural language.

It is stated, moreover, that any system P of propositional calculus defined by the truth-functional operators ¬, ∧, ∨, with ¬ defined as false when its argument is true, and true otherwise, ∧ as selecting any value v > 1 over 1, and ∨ as selecting 1 over any value v > 1, is logically equivalent to the classical bivalent system, no matter how many truth-values P has. In other words, the number of truth-values is irrelevant for the classical calculus, although with the three operators as defined any truth-value other than "true" or "false" is otiose. On the other hand, any n-valued logic allows for n-1 different specific negations, whereby the classical negation (i.e. ¬) is equivalent to the disjunction of the other negations. Thus, in the three-valued system proposed here, ¬p = (~p ∨ ≃p), where ~ is the minimal and ≃ the radical negation. See for details and proofs the Appendix to Seuren (1985) by A. Weijters.

In Seuren (1985), no account is taken of other cases of "marked" negation, such as those discussed by Horn (1985) and exemplified in (25) and (26) above. It is Horn's merit to have drawn attention to these cases. Yet, as has been made clear above, I am not convinced that the Horn cases form one (natural) category with the cases of presupposition denial. In my view, presupposition denials form a separate category, distinct from the Horn cases. These I take to involve a specific form of linguistic quotation in (pseudo)cleft or contrastive accent constructions. They therefore involve the minimal, not the radical, negation.

Then, in Seuren (1985) I provide no clear analysis of the relation between the logical and semantic properties of negation on the one hand, and its incremental effect on the other. No clear account is given why negation can be said to be ambiguous and yet sufficiently unified for there to be one single negation word. As long as this form of ambiguity is not clearly analysed and argued for, this analysis is open to Gazdar's criticism, also applicable to Horn's analysis: why, if negation is ambiguous, is there no language in the world, as far as we know, which disambiguates between the two senses? In Section 3 more will be said about this. But first we shall have a closer look at the logical and model-theoretic aspects of the issue.

2. THE MODEL-THEORY OF THREE-VALUED CALCULI

In this section we will investigate the logical and model-theoretic properties of the trivalent system proposed. This seems useful because surprisingly little work has been done in this area, perhaps due to a deeply rooted mistrust, in logical circles, with regard to such calculi. It is hoped that the straightforward simplicity of the model-theoretic aspects of three- (and multi-)valued calculi will help to take away this distrust or lack of interest.

We will concentrate on the model-theory of the calculi. Note, however, that what we call "model-theory" must be distinguished clearly from what we call "semantics for natural language". Most brands of formal semantics for natural language are based on the assumption that linguistic semantics is a variety of the kind of model-theoretic semantics developed in logic around the middle of this century. This assumption is radically dismissed here. We speak of "model-theory" when referring to "semantic" methods developed in logic. "Semantics", for us, is the study of the cognitive and linguistic processes that occur when sentences are understood. The incremental unity of negation, as opposed to its semantic and logical ambiguity, will be adumbrated in this section, but not made explicit until Section 3. We will first take a look at the standard model-theory of the classical bivalent calculus.

2.1. Standard Boolean model-theory for bivalent propositional calculus

Let us revert to fig. 1 above, which gives the classical valuation space for a language L(a, b, c) with no entailment relations among the atomic sentences a, b, c. In the standard conception, a sentence p is said to "express a proposition". Let us use the notation "/p/" for the proposition expressed
by p. /p/ is, in the standard conception, a characteristic function from valuations to truth-values or, in other words, the set of valuations in which p is true: { v_n | v_n(p) = 1 }. Thus, in the system presented in fig. 1, /a/ is the function { ⟨v_1,1⟩, ⟨v_2,2⟩, ⟨v_3,1⟩, ⟨v_4,2⟩, ⟨v_5,1⟩, ⟨v_6,2⟩, ⟨v_7,1⟩, ⟨v_8,2⟩ }, or, alternatively, the set { v_1, v_3, v_5, v_7 }, and a expresses this function, or this set. In other words, an interpretable sentence is associated with a set of valuations, or, if it is n ways ambiguous, with n sets of valuations, in the way sketched in fig. 5 for the non-ambiguous sentence a:

Fig. 5: /a/ as a subset of U.
This has the advantage that the truth-functional operators can be interpreted as simple set-theoretic operations on the valuations in the field of valuations U. The classical bivalent negation is now interpreted as follows: for any sentence p, /¬p/ = U - /p/, or, in other words, the complement of /p/ in U. Thus, /¬a/ = U - /a/ = { v_2, v_4, v_6, v_8 }. Likewise for conjunction and disjunction: /p ∧ q/ = /p/ ∩ /q/, and /p ∨ q/ = /p/ ∪ /q/. To say that p is true in some v_n now means: v_n ∈ /p/, and, of course, to say that p is false in v_n means: v_n ∉ /p/, which is equivalent to saying that v_n ∈ U - /p/, or ¬p is true in v_n. Again, to say that ¬p is false in v_n is: v_n ∈ U - /¬p/, i.e., v_n ∈ U - (U - /p/), and therefore v_n ∈ /p/.

On this basis we can give a general definition of the notion "truth-value", independently of the specific properties of the propositional calculus in question, as follows:

(31) In a propositional calculus P for a language L in a universe (set of valuations) U, there is a truth-value α just in case the propositions of the sentences of L structure U in such a way that for every p ∈ L and every v_n in U there is precisely one set of valuations H ⊆ U such that v_n(p) = α iff v_n ∈ H.
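The set-theoretic reading of the bivalent operators on which definition (31) builds can be made concrete in a few lines of Python. What follows is a minimal sketch under the assumption of three logically independent atoms; valuations are identified by an index, the order of the valuations is arbitrary and need not match fig. 1, and all function names are chosen for the sketch only.

```python
from itertools import product

ATOMS = ("a", "b", "c")

# Each valuation assigns every atom the value 1 (true) or 2 (false).
U = [dict(zip(ATOMS, values)) for values in product((1, 2), repeat=len(ATOMS))]
ALL = frozenset(range(len(U)))  # the universe, as a set of valuation indices


def prop(atom):
    """/atom/: the set of valuations in which the atom is true."""
    return frozenset(i for i, v in enumerate(U) if v[atom] == 1)


def neg(p):
    """/not-p/ = U - /p/."""
    return ALL - p


def conj(p, q):
    """/p and q/ = intersection of /p/ and /q/."""
    return p & q


def disj(p, q):
    """/p or q/ = union of /p/ and /q/."""
    return p | q


def value(p, i):
    """Truth-value of a proposition at valuation i: 1 if i is in it, else 2."""
    return 1 if i in p else 2


a, b = prop("a"), prop("b")
double_negation_holds = neg(neg(a)) == a
de_morgan_holds = neg(conj(a, b)) == disj(neg(a), neg(b))
```

Since every operator is a plain set operation on U, the classical laws follow from Boolean algebra without any separate stipulation, which is the advantage claimed for this conception.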
Note that this definition allows for gaps in U only if a provision is introduced for a variable U. Take, for example, fig. 2, where a is unvalued in v_3 and v_6. Here, /a/ = { v_1, v_4 }. Now the complement of /a/ in U, i.e. U - /a/, would be { v_2, v_3, v_5, v_6 }, and no gaps would be possible. We can, however, make a provision to let U vary with each sentence p ∈ L in such a way that U_p is precisely the set of valuations for which p is valued, i.e. the intersection of all /q/ of q ∈ L such that for all v_n ∈ U, v_n ∈ /q/ if v_n(p) = 1 or 2. Then, for a in fig. 2, U_a - /a/ = { v_2, v_5 }, and definition (31) will apply, with U relativised with respect to any given p ∈ L.

This gives the truth-table for negation in any U and any L without gaps and with one single complement-taking negation, i.e. classical bivalent negation:

p    ¬p
1    2
2    1

Fig. 6: Truth-table for the classical bivalent negation
The truth-tables of the standard binary truth-functional operators ∧ and ∨ are conveniently constructed as follows. Let /a/ and /b/ be represented as in fig. 7. The standard definition of /a ∧ b/ is /a/ ∩ /b/. Hence, any v_n ∈ /a/ ∩ /b/ will yield the value "1" for a ∧ b, and all v_n ∉ /a/ ∩ /b/ will yield "2".

Fig. 7: Truth-table construction for classical ∧.
Likewise for disjunction. Given that /a ∨ b/ = /a/ ∪ /b/, we have:

Fig. 8: Truth-table construction for classical ∨.

Since a and b are represented in figs. 7 and 8 as being logically independent, the tables thus constructed can be generalized to any p, q ∈ L.

2.2. Presuppositional Boolean model-theory for trivalent propositional calculus

We have seen (Section 1.2) that the Frege-Strawson analysis of presuppositional facts is untenable on empirical grounds: it is normally possible for a sentence to be true under negation even though its presuppositions are not all fulfilled. We have also seen (Sections 1.3; 1.4), again on empirical grounds, that a return to standard bivalent logic is not a viable alternative. Our conclusion was that a choice had to be made between argument-split theories or NEG-split theories (or a combination of both). We will now have a look at the model-theoretic aspects of NEG-split theories.

Instead of restricting U for each sentence p ∈ L to U_p and thus creating gaps, we will now keep U constant again, avoiding gaps, but keep U_p and re-interpret it as the subuniverse of p, still defined as the intersection of all /q/ such that for all v_n ∈ U, v_n ∈ /q/ if v_n(p) = 1 or 2. U_p is to be interpreted as the set of valuations ("possible worlds") in U expressed by the conjunction of all presuppositions of p. If p has no presuppositions in the linguistic sense (as in, e.g., There are trees), we still let p "presuppose" all logical truths, and say that, in such a case, U_p = U. (This, as we shall see in a moment, makes it impossible for a presuppositionless p to be valued "3".)

Thus, for any sentence p, /p/ ⊆ U_p ⊆ U, as is demonstrated in fig. 9 for the sentence a, which we take to presuppose b and not to have any further presuppositions.
Fig. 9: /a/ as a subset of U_a, and U_a as a subset of U.

We have now created two disjoint complements for any /p/, besides the old classical complement, which is the union of the two others. We shall speak of:

   the inner complement of p: U_p - /p/;
   the outer complement of p: U - U_p;
   the total complement of p: U - /p/ (= the classical complement of p).

For L(a, b, c), with a >> b, and the valuation space as in fig. 2 above, this means that /a/ = { v_1, v_4 }, U_a - /a/ = { v_2, v_5 }, U - U_a = { v_3, v_6 }, and, of course, U - /a/ = { v_2, v_3, v_5, v_6 }, or the union of the inner and the outer complements of /a/. The effect of this is more structure in U, and, notably, the emergence of three truth-values, which we shall call "true" ("1"), "minimally false" ("2"), and "radically false" ("3"), defined as follows:

(32) For any v_n in U and any p ∈ L:
     v_n(p) = 1 iff v_n ∈ /p/
     v_n(p) = 2 iff v_n ∈ U_p - /p/
     v_n(p) = 3 iff v_n ∈ U - U_p.

Clearly now, if p has no presuppositions, and thus U_p = U, v_n(p) = 3 will be impossible because v_n ∈ U - U_p = ∅ is impossible. Note also that there is no room left now for a fourth value corresponding to the total complement, since, given the definitions of the three values in (32), it is not true that for any v_n and any p ∈ L, v_n(p) = 4 iff v_n ∈ U - /p/. (In fact, as the reader may care to ascertain, the assumption of such a fourth value will take away the truth-functionality of the truth-functional operators, and thus destroy the logic.) This enables us to formulate the logical property of presuppositions:

(33) If a sentence p has the set P_p as its presuppositions (p >> P_p), then for all valuations v_n ∈ U, v_n(p) = 3 iff there is at least one q ∈ P_p such that v_n(q) ≠ 1.

Equivalently, if p >> q, then for all v_n ∈ U, if v_n(p) = 1 or 2 then v_n(q) = 1. It is important to realize, however, that the notion of presupposition plays no role in the logic proper once that has been set up. Everything in the logic will be done in terms of three truth-values and, as we will now see, the truth-functions, including the two negations. The point is that presupposition is itself not a logical but a linguistic-semantic notion. The logic is only tailored to fit the presuppositional phenomena.

On the basis of this we now define two negation operators, the minimal negation (~), and the radical negation (≃), as follows:

(34) For any p ∈ L:
     /~p/ = U_p - /p/
     /≃p/ = U - U_p
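Definitions (32) and (34) can be tried out on a small valuation space. The Python sketch below assumes a universe of six valuations in which /a/ = {v_1, v_4} and U_a = {v_1, v_2, v_4, v_5}, in the manner of the fig. 2 example; it further assumes, as the text goes on to argue, that the subuniverse of a minimally negated sentence is that of its argument and that the subuniverse of a radically negated sentence is the whole of U. The function names are hypothetical.

```python
def value(v, prop, sub, universe):
    """Truth-value of a sentence at valuation v, following (32).

    prop is /p/, sub is the subuniverse U_p, universe is U.
    """
    assert prop.issubset(sub) and sub.issubset(universe)
    if v in prop:
        return 1          # v is in /p/
    if v in sub:
        return 2          # v is in U_p - /p/, the inner complement
    return 3              # v is in U - U_p, the outer complement


def minimal_neg(prop, sub, universe):
    """(34): the proposition of the minimal negation is U_p - /p/; its subuniverse stays U_p."""
    return sub - prop, sub


def radical_neg(prop, sub, universe):
    """(34): the proposition of the radical negation is U - U_p; its subuniverse is U."""
    return universe - sub, universe


# Assumed toy space: six valuations, with the sentence a presupposing b.
U = frozenset(range(1, 7))
a = frozenset({1, 4})
U_a = frozenset({1, 2, 4, 5})

# Map each value of a to the resulting values of its minimal and radical negation.
table = {}
for v in sorted(U):
    p_val = value(v, a, U_a, U)
    min_val = value(v, *minimal_neg(a, U_a, U), U)
    rad_val = value(v, *radical_neg(a, U_a, U), U)
    table[p_val] = (min_val, rad_val)
```

The resulting `table` pairs each value of the argument with the values of its two negations, so that it can be compared row by row with fig. 4.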
It now follows that if for some v_n, v_n(p) = 1 and hence v_n ∈ /p/ or, equivalently, v_n ∈ U_p - (U_p - /p/), v_n(~p) = 2 (and vice versa), since /~p/ = U_p - /p/, according to (34), and thus, according to (32), v_n(~p) = 2. Likewise, it follows that if for some v_n, v_n(p) = 2, and hence v_n ∈ U_p - /p/, v_n(~p) = 1 (and vice versa), according, again, to (32) and (34). This gives us part of the truth-table for minimal negation (cp. fig. 4), i.e. from 1 to 2 and vice versa. It does not give us yet the function value 3 from 3. This we get when we realize that under the definitions given so far U_p = U_~p for any p, since U_~p is still the intersection of all /q/ such that for all v_n ∈ U, v_n ∈ /q/ if v_n(p) = 1 or 2. With this knowledge we can now say that if v_n(p) = 3, and hence v_n ∈ U - U_p, or, equivalently, v_n ∈ U - U_~p, v_n(~p) = 3 (and vice versa), according to (32). This completes the construction of the truth-table for minimal negation, as given in fig. 4 above.

In similar fashion we derive the truth-table for radical negation. If, for some v_n and for some p, v_n(p) = 1, and thus v_n ∈ /p/ or, equivalently, v_n ∈ U - (U - /p/), then v_n ∈ U - (U - U_p), since /p/ ⊆ U_p. Now, U = U_≃p, since ≃p can have no presuppositions in the linguistic sense. Therefore, v_n(≃p) = 3 is impossible, as we have seen. (Since U_≃p is the intersection of all /q/ such that for all v_n ∈ U, v_n ∈ /q/ if v_n(≃p) = 1 or 2, it follows that all v_n ∈ U must be a member of any /q/ intersecting with U_≃p, i.e. only logical truths can intersect to form U_≃p, and again, U_≃p = U.) This means that if v_n(p) = 1, v_n ∈ U_≃p - (U - U_p). According to (34), U - U_p = /≃p/. Hence, if v_n(p) = 1, v_n ∈ U_≃p - /≃p/, which, according to (32), amounts to saying that v_n(≃p) = 2. Likewise, if v_n(p) = 2, v_n ∈ U_p - /p/, and hence v_n ∈ U_p, or, equivalently, v_n ∈ U - (U - U_p), and thus v_n ∈ U_≃p - (U - U_p), so that if v_n(p) = 2, v_n(≃p) = 2. (It also follows that if v_n(≃p) = 2, v_n(p) = 1 or 2.) If, on the other hand, v_n(p) = 3, then, by (32), v_n ∈ U - U_p, and thus, according to (34), v_n ∈ /≃p/, and hence v_n(≃p) = 1, and vice versa. This establishes the truth-table for radical negation as given in fig. 4 above.

As regards the binary truth-functional operators, in particular conjunction and disjunction, it is quite possible to provide Boolean underpinnings for their truth-tables in a three-valued calculus constrained by minimal and
radical negation. This can be done in a variety of ways, all of which will preserve their classical bivalent properties. For conjunction this means that, anyway, /p ∧ q/ = /p/ ∩ /q/, as in the bivalent logic. The question is now how to define U_p∧q in such a way that justice is done to the logical property of presuppositions as defined in (33) above. One way is to define it as U_p ∩ U_q. The resulting truth-table, which is the table given in Seuren (1985: 239), is constructed in fig. 10, based on arbitrary, logically independent a and b:

Fig. 10: Truth-table construction for trivalent ∧ with U_a∧b = U_a ∩ U_b.

It is also possible, however, to define U_p∧q as U_p ∪ U_q. This results in the table constructed in fig. 11:

Fig. 11: Truth-table construction for trivalent ∧ with U_a∧b = U_a ∪ U_b.

For disjunction a similar distinction does not work. We can define U_p∨q as U_p ∪ U_q, which gives the truth-table presented in Seuren (1985: 239), and constructed in fig. 12. But if U_p∨q is defined as U_p ∩ U_q, as in fig. 13, no coherent interpretation results since, given the definitions in (32), a disjunction will be defined as both true and radically false when one disjunct is true and the other radically false. Fig. 13, therefore, does not represent a possible analysis.

Fig. 12: Truth-table construction for trivalent ∨ with U_a∨b = U_a ∪ U_b.

Fig. 13: Truth-table construction for trivalent ∨ with U_a∨b = U_a ∩ U_b.

In Seuren (1985) the table for conjunction corresponds to fig. 10, as has been said. We now see that this table is preferable to the one constructed in fig. 11. According to fig. 10, a sentence of the form a_c ∧ b_d (i.e. a presupposing c, and b presupposing d) generally presupposes both c and d, since for any v_n ∈ /a ∧ b/, v_n ∈ U_a and v_n ∈ U_b. In a general way, this is correct, since a sentence like:

(35) Angus feeds his horse and Paddy feeds his donkey.

presupposes both that Angus has a horse and that Paddy has a donkey. The table constructed in fig. 11 does the same, but it also makes (35) presuppose that Angus has a horse or that Paddy has a donkey, which is linguistically incorrect, as appears from the infelicity of a discourse like (see Section 3 below):
(36) ! Angus has a horse or Paddy has a donkey, and Angus feeds his horse and Paddy feeds his donkey.

We will see below, anyway, that it is not useful to let conjunctions carry presuppositions at all. And must be taken to block any projection of presuppositions, because, if not, a sentence of the form b and a_b will both assert and presuppose b, which is rightly considered repugnant in all of the relevant literature. So it really does not matter much whether we take fig. 10 or fig. 11 as representing the correct table. Both preserve classical logic equally.

As regards disjunction, Seuren (1985: 239) gives the table represented in fig. 12. Thus defined, a_c ∨ b_d >> c ∨ d, but a_c ∨ b_d does not presuppose either c or d singly, because for some v_n ∈ /a ∨ b/, v_n ∉ U_a or v_n ∉ U_b. Moreover, in fig. 12 a disjunction preserves all presuppositions that are shared by both disjuncts, or: a_c ∨ b_c >> c. That this is so is easily seen when one realizes that if a >> c, U_a cannot be larger than /c/ but it may be smaller. Likewise for U_b if b >> c. Clearly, then, for any v_n ∈ /a ∨ b/, v_n ∈ U_a ∪ U_b, and hence, v_n ∈ /c/. This is intuitively correct, witness the felicity of:

(37) Angus has a horse, and he either feeds it or he starves it.

Fig. 12 thus seems to be empirically correct.

Blau (1978: 75) has trivalent truth-tables for both the presupposition-preserving and the presupposition-cancelling negation. The former is identical to our minimal negation ~, the latter, however, is the classical negation ¬, i.e. with truth converted into falsity, and all other values converted into truth. We have seen that this negation does not define a truth-value in any logic with more than two truth-values, since, given the minimal negation and its concomitant inner complement, the classical negation, with its total complement, no longer satisfies definition (31). Once an inner complement is defined, the only other negation that defines a truth-value is the one associated with the outer complement, i.e. the radical negation. In other words, in any system with more than two truth-values and more than one negation, the classical negation is dysfunctional as a separate operator. Blau's trivalent conjunction and disjunction operators are defined (1978: 87) as follows:
 p ∧ q | q=1  q=2  q=3        p ∨ q | q=1  q=2  q=3
 p=1   |  1    2    3         p=1   |  1    1    1
 p=2   |  2    2    2         p=2   |  1    2    3
 p=3   |  3    2    3         p=3   |  1    3    3

Fig. 14: Truth-tables for trivalent conjunction and disjunction as in Blau (1978: 87).
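The Boolean constructions of figs. 10 to 13 can be replayed mechanically. In the Python sketch below a valuation is characterised only by the values it gives to a and b; it belongs to /a/ iff a has value 1, and to U_a iff a has value 1 or 2. The helper `derive` and the table names are hypothetical, and the encoding of Blau's conjunction is a reading of fig. 14 that should be treated as an assumption.

```python
from itertools import product

VALUES = (1, 2, 3)


def derive(prop_op, sub_op):
    """Truth-table for a binary operator from its Boolean construction.

    prop_op combines membership in /a/ and /b/; sub_op combines membership
    in U_a and U_b. A cell is None when the construction is incoherent,
    i.e. the compound would be both true and radically false.
    """
    table = {}
    for x, y in product(VALUES, repeat=2):
        in_prop = prop_op(x == 1, y == 1)
        in_sub = sub_op(x != 3, y != 3)
        if in_prop and not in_sub:
            table[(x, y)] = None
        else:
            table[(x, y)] = 1 if in_prop else 2 if in_sub else 3
    return table


def both(s, t):
    """Intersection, expressed on membership flags."""
    return s and t


def either(s, t):
    """Union, expressed on membership flags."""
    return s or t


conj_fig10 = derive(both, both)      # U_a∧b = U_a ∩ U_b
conj_fig11 = derive(both, either)    # U_a∧b = U_a ∪ U_b
disj_fig12 = derive(either, either)  # U_a∨b = U_a ∪ U_b
disj_fig13 = derive(either, both)    # U_a∨b = U_a ∩ U_b

# Blau's conjunction as read from fig. 14 (rows: p = 1, 2, 3; columns: q = 1, 2, 3).
_rows = [(1, 2, 3), (2, 2, 2), (3, 2, 3)]
BLAU_CONJ = {(p, q): _rows[p - 1][q - 1] for p, q in product(VALUES, repeat=2)}

differs_from_fig10 = sorted(k for k in BLAU_CONJ if BLAU_CONJ[k] != conj_fig10[k])
differs_from_fig11 = sorted(k for k in BLAU_CONJ if BLAU_CONJ[k] != conj_fig11[k])
incoherent_fig13 = sorted(k for k, val in disj_fig13.items() if val is None)
```

On this encoding the construction of fig. 10 amounts to taking the higher of the two values and that of fig. 12 to taking the lower, while the cells collected in `incoherent_fig13` are exactly those where one disjunct is true and the other radically false.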
The conjunction operator thus defined is inconsistent with Boolean semantics. This table is generated by U_a∧b = U_a ∪ U_b (i.e. as in fig. 11), except that the combination of truth and radical falsity yields radical falsity, which fits into an analysis as in fig. 10, with U_a∧b = U_a ∩ U_b. It should, in the conception of fig. 11, yield minimal falsity for the combination of truth and radical falsity, since any valuation which has this combination of values will be a member of U_a ∪ U_b. Blau's disjunction operator conforms to fig. 13, which, as we have seen, is incoherent in a model-theoretic (Boolean) interpretation. Blau's truth-tables for conjunction and disjunction must thus be rejected on grounds of model-theoretic interpretability.

We will see in a moment (Section 3) that the logical presuppositional properties of the conjunction operator do not matter at all for natural language semantics, since, as has already been said, presuppositions do not project through and, this being nothing but a concatenator of subsequent discourse increments. The logical properties of presupposition can only be taken to be epiphenomenal upon an underlying cognitive interpretative mechanism. The language is free to decide when presuppositions appear, these being a semantic property. However, before going into questions of this nature we will have a look at an alternative way of setting up the model-theory of a propositional language.

2.3. Constructive model-theory for bivalent non-presuppositional languages

We have seen (Section 2.1) that the standard way of setting up a model-theory for propositional languages is by associating each sentence of the language with the set of valuations, or, in a different terminology, possible worlds, in which it is true. That set is said to be the "proposition" expressed by that sentence. There is, however, another way of constructing the model-theory. Take again the classical valuation space for L(a, b, c) as represented in fig. 1, which is repeated here for convenience:
U       a   b   c   ¬a  ¬b  ¬c  a ∧ b   ... etc.
v_1:    1   1   1   2   2   2   1
v_2:    2   1   1   1   2   2   2
v_3:    1   2   1   2   1   2   2
v_4:    2   2   1   1   1   2   2
v_5:    1   1   2   2   2   1   1
v_6:    2   1   2   1   2   1   2
v_7:    1   2   2   2   1   1   2
v_8:    2   2   2   1   1   1   2

Fig. 1: Classical valuation space for L(a, b, c).

Now, instead of letting a sentence express its proposition, which is a set of valuations, we go the other way around. We drop the notion of proposition as defined in standard model-theory, and consider a valuation v_n to be a function from sentences of L to truth-values, that is, as the set of sentences of L for which v_n has the value 1. Thus, e.g., v_2 = { b, c, ¬a, ... } in fig. 1. It is easily seen that for a language L all of whose sentences are atomic [...]

(38) For any v_n ∈ U and any p ∈ L, v_n(p) = 1 iff p ∈ v_n, and: v_n(p) = 2 iff p ∈ L - v_n,

and defining, in addition, the classical negation operator as follows:

(39) p ∈ v_n iff ¬p ∈ L - v_n, and hence p ∈ L - v_n iff ¬p ∈ v_n,

we can establish the classical truth-table for ¬ in the following way:

(40) v_n(¬p) = 1 iff ¬p ∈ v_n iff p ∈ L - v_n iff v_n(p) = 2
     v_n(¬p) = 2 iff ¬p ∈ L - v_n iff p ∈ v_n iff v_n(p) = 1.

In similar fashion we can, trivially, establish the truth-tables for ∧ and ∨, given the following definitions:

(41) For any v_n ∈ U and any p, q ∈ L, p ∧ q ∈ v_n iff p ∈ v_n and q ∈ v_n, and: p ∨ q ∈ v_n iff p ∈ v_n or q ∈ v_n.
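The view of a valuation as a set of sentences, with membership governed by (38), (39) and (41), can be sketched in Python as follows. The sketch assumes that a valuation is fully determined by the atomic sentences it contains and that complex sentences are written as nested tuples; both the representation and the function names are choices made for the illustration.

```python
def member(sentence, true_atoms):
    """Is the sentence a member of the valuation generated by true_atoms?

    Atoms are strings; complex sentences are tuples such as ("not", p),
    ("and", p, q) or ("or", p, q).
    """
    if isinstance(sentence, str):
        return sentence in true_atoms
    op, *args = sentence
    if op == "not":                       # (39): not-p is in v_n iff p is in L - v_n
        return not member(args[0], true_atoms)
    if op == "and":                       # (41): both conjuncts are in v_n
        return member(args[0], true_atoms) and member(args[1], true_atoms)
    if op == "or":                        # (41): at least one disjunct is in v_n
        return member(args[0], true_atoms) or member(args[1], true_atoms)
    raise ValueError(f"unknown operator: {op}")


def value(sentence, true_atoms):
    """(38): value 1 iff the sentence is in the valuation, otherwise 2."""
    return 1 if member(sentence, true_atoms) else 2


# The valuation v_2 of fig. 1: b and c are in, a is out, hence not-a is in.
v2 = {"b", "c"}

negation_table = {
    value(p, v2): value(("not", p), v2) for p in ("a", "b")
}
```

Building `v2` atom by atom and reading off the values of complex sentences is the constructive procedure in miniature: nothing beyond the decision about each atomic sentence is needed.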
This in itself is not very interesting, and there is no reason to prefer this treatment to the standard Boolean way of presenting the model-theory of propositional calculi, as long as one stays within the realm of mathematical logic. From a non-mathematical, linguistic point of view, however, a model-theoretic perspective in which valuations are defined as sets of sentences has definite advantages. These begin to show when one realizes that this perspective enables one to construct a valuation v_n by taking the atomic sentences of L one by one and deciding whether or not they are to be members of v_n. Whenever a sentence p is not in v_n, then, obviously, it is in L - v_n, that is, then ¬p ∈ v_n, as we have seen. This property is relevant for the study of language, since it mirrors what speakers do when they build up a discourse domain: adding sentences to a discourse domain D can be seen as building up a valuation. The analog in Boolean model-theory is to construct the set of worlds that makes up a proposition. That perspective, however, does not seem to be useful in the understanding of linguistic processes (though, as long as we know as little as we do, we must remain careful in making such statements).

We shall call the perspective adopted here, in which valuations are defined as sets of sentences, constructive model-theory. In Section 3 we shall see how constructive model-theory can be put to use in a more fully fledged linguistic discourse semantics. First, however, we will have a look at the construction and definition of minimal and radical negation in this perspective.

2.4. Constructive model-theory for trivalent propositional languages

2.4.1. Fully constructed valuations

Still assuming fully constructed valuations (i.e. with values for all sentences of L), we must again, as in the Boolean case, construct two disjoint complements, each to be designated by a separate negation operator, in such a way that the logical property of presuppositions is expressed. To this end we need to define, for each valuation v_n, the presuppositional sublanguage for v_n, or L_vn. Intuitively, L_vn is L minus those sentences of L that cannot be valued "1" or "2" on account of one or more of their presuppositions not being valued "1". To express this formally, we first define the notion of presuppositional sublanguage for q, or PL_q, where q is any sentence of L, in the following way:

(42) For any q ∈ L, PL_q is the set of sentences p ∈ L such that for all v_n ∈ U, if v_n(q) ≠ 1 then v_n(p) ≠ 1 or 2.

The next step is to define the notion of nonlanguage for v_n, or NL_vn:
210 (43)
For any vn E ;e l .
U, NLvn is the
union of all PL of q E L such that vn (q) q
This enables us to define the presuppositional sublanguage, or L vn • for any v n E U as follows: (44)
For any vn
E
U , Lvn = L - NL vn ·
Now we can define three truth-values in the following way:

(45) For any $v_n \in U$ and any $p \in L$,
     $v_n(p) = 1$ iff $p \in v_n$
     $v_n(p) = 2$ iff $p \in L_{v_n} - v_n$
     $v_n(p) = 3$ iff $p \in L - L_{v_n}$.
The minimal and radical negations now allow for the following definitions:

(46) For any $p \in L$,
     $\sim p \in v_n$ iff $p \in L_{v_n} - v_n$, and $\sim p \in L_{v_n} - v_n$ iff $p \in v_n$;
     $\simeq p \in v_n$ iff $p \in L - L_{v_n}$.
Every $v_n \in U$ thus has two complements, the inner complement or $L_{v_n} - v_n$, and the outer complement or $L - L_{v_n}$, besides, of course, the total complement, which is the union of the previous two. The inner complement is designated by the minimal negation $\sim$ (i.e., all $p \in L_{v_n} - v_n$ are made true by the prefixing of $\sim$), whereas the outer complement is designated by the radical negation $\simeq$ (all $p \in L - L_{v_n}$ are made true by $\simeq$), as is demonstrated in fig. 15:

Fig. 15: Valuation $v_n$ with the inner complement $L_{v_n} - v_n$ and the outer complement $L - L_{v_n}$.
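Definitions (45) and (46) can be read as a small computation over sets. The following Python sketch is illustrative only: it assumes a finite language and represents a valuation by two sets, `v` and `L_v` (names chosen here for the example, not taken from the text). The value "2" assigned to a radically negated sentence whose argument is not in the outer complement anticipates the argument given further on in this subsection.

```python
# A minimal sketch, assuming a finite language L given as a Python set.
# v   : the sentences valued 1 (the valuation itself)
# L_v : the presuppositional sublanguage of v (v is a subset of L_v)

def value(p, v, L_v):
    """Truth-value of p following (45)."""
    if p in v:
        return 1      # p is a member of the valuation
    if p in L_v:
        return 2      # p is in the inner complement L_v - v
    return 3          # p is in the outer complement L - L_v

def minimal_negation(x):
    """Value of the minimal negation of p, given the value x of p (46)."""
    return {1: 2, 2: 1, 3: 3}[x]

def radical_negation(x):
    """Value of the radical negation of p: 1 only for p in the outer complement."""
    return 1 if x == 3 else 2

# Hypothetical three-sentence language: a is true, b minimally false,
# c radically false.
L = {"a", "b", "c"}
v = {"a"}
L_v = {"a", "b"}

table = {
    p: (value(p, v, L_v),
        minimal_negation(value(p, v, L_v)),
        radical_negation(value(p, v, L_v)))
    for p in sorted(L)
}
```

Each entry of `table` holds the value of a sentence followed by the values of its minimal and its radical negation, so the three rows together reproduce the two negation tables.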
An important aspect of this, which will be further elaborated below, is the fact that both negation operators have in common the banishment of their argument sentence from the vn in question. The minimal negation
"ejects" the argument sentence only into the inner complement of $v_n$, whereas the radical negation rockets it all the way into the outer complement, on account of its being incongruous with one or more other sentences that are members of $v_n$. Looked at from this angle, the two negations share the discourse-semantic property of banning their argument sentence from the valuation at hand, the difference being mainly one of intensity. Yet what is a difference in intensity has direct logical and truth-conditional consequences. The situation is comparable to that of a word like theft. In English law before 1967, theft was either a simple misdemeanour, to be tried in a minor court, or a serious felony, involving a heavier form of trial. In either case, however, the basis for prosecution was the same. In such a system there are thus two distinct forms of theft, yet they break the same rules of the penal code. Analogously, language has two forms of negation, but both involve a removal from the discourse representation. We shall come back to this later.

We are now, anyway, in a position to construct the truth-tables for minimal and radical negation. By definitions (45) and (46), $v_n(\sim p) = 1$ iff $\sim p \in v_n$ iff $p \in L_{v_n} - v_n$ iff $v_n(p) = 2$. Then, $v_n(\sim p) = 2$ iff $\sim p \in L_{v_n} - v_n$ iff $p \in v_n$ iff $v_n(p) = 1$. From this it follows that $p \in L_{v_n}$ iff $\sim p \in L_{v_n}$. Hence, $p \in L - L_{v_n}$ iff $\sim p \in L - L_{v_n}$, so that, by (45), $v_n(p) = 3$ iff $v_n(\sim p) = 3$. This gives the truth-table for the minimal negation.

For the radical negation we proceed likewise. $v_n(\simeq p) = 1$ iff $\simeq p \in v_n$ iff $p \in L - L_{v_n}$ iff $v_n(p) = 3$. We are, as we have seen, setting up the logic in such a way that a sentence under the radical negation operator has no linguistic presuppositions, since what this operator does is precisely cancel the linguistic presuppositions of its argument sentence. For the logic this means that any sentence, even one without linguistic presuppositions, still "presupposes" all logical truths, since whenever a presuppositionless sentence has the value "1" or "2", the logical truths will still be valued 1. Given the definitions (42)-(44), it follows that, for any $v_n$ and for any p, $\simeq p \in L_{v_n}$. This is so because if a sentence r has no presuppositions, then at any $v_n$ there is no q such that $v_n(q) \neq 1$ and, for all $v_m \in U$, if $v_m(q) \neq 1$ then $v_m(r) \neq 1$ or 2. Hence r cannot be a sentence in any $PL_q$ of any $q \in L$, and can thus not be a sentence in any $NL_{v_n}$. Therefore, r must, at any $v_n$, be a sentence of $L - NL_{v_n}$, and hence of $L_{v_n}$. This being so, we conclude that any sentence of the form $\simeq p$ can, at any $v_n$, only be either a sentence of $v_n$ or of $L_{v_n} - v_n$. (46) tells us that $\simeq p \in v_n$ only when $p \in L - L_{v_n}$. In all other cases, therefore, $\simeq p \in L_{v_n} - v_n$, and, consequently, $v_n(\simeq p) = 2$. This gives the table for the radical negation.

Now for $\wedge$ and $\vee$. We keep definition (41), according to which for any $v_n \in U$ and any $p, q \in L$, $p \wedge q \in v_n$ iff $p \in v_n$ and $q \in v_n$, and $p \vee q \in v_n$ iff $p \in v_n$ or $q \in v_n$. We still need to specify, however, under what conditions $p \wedge q$ and $p \vee q$ belong to either $L_{v_n} - v_n$ or $L - L_{v_n}$, and will thus be valued "2" or "3", respectively. The correct truth-tables are generated,
as is easily seen, by stipulating that $p \wedge q \in L - L_{v_n}$ (and is hence valued "3") iff $p \in L - L_{v_n}$ or $q \in L - L_{v_n}$, and that $p \vee q \in L - L_{v_n}$ (and is hence valued "3") iff $p \in L - L_{v_n}$ and $q \in L - L_{v_n}$. In all cases where $p \wedge q$ or $p \vee q$ are neither in $v_n$ nor in $L - L_{v_n}$, they are in $L_{v_n} - v_n$, and, consequently, valued "2".

There is one interesting corollary that follows immediately from this set-up. It has to do with the presuppositional status of logical truths, or sentences that are always valued "1" in any valuation. Logical truths pose a descriptive problem for any purely logical definition of presupposition. Such a definition will say that p presupposes q if and only if, for all $v_n \in U$, if $v_n(q) \neq 1$ then $v_n(p) \neq 1$ or 2. As is well-known, definitions of this kind imply that when q is a necessary or logical truth it is not only simply entailed but also presupposed by every sentence in the language. This is an undesirable consequence for reasons of descriptive adequacy: it is nonsense to say that an arbitrary sentence like My uncle is far too fond of carrots presupposes any arbitrary logical truth like Nothing is both alive and not alive. Any sensible theory aiming at analysing and explaining presuppositional phenomena would founder on such cases. This is our main reason for not making "presupposition" a relation in the logic and for speaking only of the "logical property of presuppositions", defined in (33) above. What we see now is that, under the definition for $NL_{v_n}$ as given in (43), no $NL_{v_n}$ can contain any $PL_t$, where t is a logical truth, for the simple reason that the condition that $v_n(t) \neq 1$ cannot be met. (Note also that for any logical truth t, $PL_t = L$, according to (42).) This means that logical truths will never have an influence on the delimitation of any presuppositional sublanguage $L_{v_n}$. The constructive approach thus automatically neutralizes this undesirable side-effect of a purely logical definition of presupposition.

2.4.2. Valuations under construction

The notions developed in subsection 2.4.1 can be used without alteration for valuations that are not fully constructed, i.e. valuations with values given for some but not all the sentences of L. For example, let L have any large number of atomic sentences, and an infinite number of complex sentences, recursively constructed by means of sentential connectives. Let $v_n$ be valued for the sentences a, b, c, d and e, and let the following presupposition relations hold: $d \gg h$; $f \gg a$; $g \gg b$; $j \gg a$; $j \gg b$. Suppose $v_n$, as constructed so far, looks as follows:

Fig. 16: Valuation $v_n$ under construction.

Since $d \gg h$, $d \in PL_h$, according to (42), and given the value "2" for d, it follows that $v_n(h) = 1$. Given that $f \gg a$ (and f has no other presuppositions), $v_n(f) = 1$ or 2. However, since $v_n(b) = 2$ and $g \gg b$, $v_n(g) = 3$. Likewise, $v_n(j) = 3$ on account of one of j's presuppositions, b, being valued "2". Given the construction of $v_n$ as in fig. 16, and given the presupposition relations as specified, we conclude that the presuppositional sublanguage of $v_n$, $L_{v_n}$, contains f and h, but not g or j. As the construction of $v_n$ proceeds through the valuation of more and more sentences of L, $L_{v_n}$ will get more and more restricted.
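The reasoning about the valuation under construction can be traced mechanically. The sketch below is a hedged illustration, not the author's procedure: the values b = 2 and d = 2 are stated in the text, a = 1 is inferred from the remark that f may still be valued 1 or 2, and the values for c and e are placeholders assumed here because the figure is not available. The function names are likewise hypothetical.

```python
# Presupposition relations from the text: each key presupposes the listed sentences.
PRESUPPOSES = {"d": {"h"}, "f": {"a"}, "g": {"b"}, "j": {"a", "b"}}

# Values fixed so far. b and d are given in the text; a is inferred;
# c and e are assumed placeholders (the figure itself is not available).
valued = {"a": 1, "b": 2, "c": 1, "d": 2, "e": 2}

def nonlanguage(valued, presupposes):
    """Sentences with at least one presupposition already valued other than 1."""
    return {
        p for p, qs in presupposes.items()
        if any(q in valued and valued[q] != 1 for q in qs)
    }

def forced_true(valued, presupposes):
    """Unvalued presuppositions that must be 1 because a sentence
    presupposing them has already been valued 1 or 2."""
    return {
        q for p, qs in presupposes.items() if valued.get(p) in (1, 2)
        for q in qs if q not in valued
    }

NL = nonlanguage(valued, PRESUPPOSES)      # g and j are forced to the value 3
must_be_1 = forced_true(valued, PRESUPPOSES)  # h must be valued 1
still_open = {"f", "h"} - NL                # f and h remain in the sublanguage
```

As more sentences are valued, `nonlanguage` can only grow, which mirrors the observation that the presuppositional sublanguage gets more and more restricted.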
3. DISCOURSE INCREMENTATION

3.1. Some principles of discourse domain construction

The construction of a valuation serves as a model for the construction of a discourse domain by a speaker. In stringing together successive utterances of sentences, a speaker may be regarded as building up a picture of a partial world, which amounts to saying that he is constructing a valuation. However, since what a speaker does is a cognitive activity, one may expect this activity to exhibit certain features that do not figure in a strictly logical modelling of it. This is indeed what we observe. The building up of a discourse domain is subject to certain constraints not found in the strictly logical account of valuation construction as given in the previous section. In this section we will discuss some of these constraints.

When a speaker is building up a discourse domain D he can be taken to decide what value to assign to specific (non-negative) sentences. If he decides that a sentence p is to be regarded as being true, then p is incremented in D in certain specific ways. In section 3.2 more will be said about what incrementation amounts to. Here it suffices to say that, in principle, when p is incremented in D, the main predicate P of p is assigned to the discourse entities ("addresses") representing the arguments of P. The increment-value of non-negative p, i(p), is the specific way in which the information conveyed by p is stored in D. This process is based on the semantic analysis of p, but not fully determined by it: background and default knowledge play a part as well in determining how p is stored in D. The building up of a discourse domain is thus more than simply assigning truth-values to sentences. It also, and centrally, involves a cognitively backed storage procedure, called incrementation. In this respect discourse construction is seen to be essentially richer than valuation construction.

If the speaker decides to present p as being true, i(p) is incremented in
D. But if he decides to present p as (minimally) false, i(p), though a possible increment, is not incremented in D but, let us say, "quasi-incremented". By this is meant that i(p) is excluded from D and stored in a "counter-domain" D'. The negation word not triggers this (quasi-)incrementation in D'. D' is under embargo in the sense that its contents are excluded from D. The counterpart of D' in the constructive model-theory is $L_{v_n} - v_n$. Yet it will be noted that whereas $L_{v_n} - v_n$ is a set of sentences, D' is a store of (quasi-)increments.

The notions defined in (42)-(44) carry over identically to the construction of discourse domains. The notion of $PL_q$ can stand unaltered. So can the notions of $NL_{v_n}$ and $L_{v_n}$, which can be renamed $NL_D$ and $L_D$, respectively. But what we have called the "inner complement", if applied to some D, is not a set of sentences but a store of possible but rejected increments, D'.

The first constraint that is relevant here says that only non-negative sentences are valued by the speaker. If he decides to present a sentence p as false, he does not value not-p as "true". Instead, the sentence which forms the argument of the (highest) negation operator, i.e. p, is kept out of D and relegated to the inner or to the outer complement, depending on the speaker's decision. As has been said, the negation operator is the linguistic element that triggers the argument sentence's exclusion from D. The increment function i applied to a negative sentence $\sim p$ increments p in D'. What the radical negation does, we shall see in a moment. Clearly, this constraint does not apply to sentences with a negation operator somewhere further down in their analysis, only to those sentences that have negation as the highest operator.
This constraint throws some light on the fact that in natural language stark successions of the negation word, such as English not, do not occur other than with an echo-effect, in which case the argument sentence of the highest negation is quoted and the highest negation is radical. (We saw in section 1.3 that not is a Positive Polarity Item.)

A second constraint, let us say the sublanguage constraint, says that for the construction of a discourse domain D only those sentences can be considered and processed that belong to the presuppositional sublanguage of D, or $L_D$, to the extent that $L_D$ has been defined given the stage of construction of D. This means that sentences that must be radically false given that one or more of their presuppositions have been excluded from D can receive no increment value: the function i is undefined for such sentences (though, as we shall see, it is defined for their quoted forms). These are the sentences that belong to $NL_D$. In the light of this constraint the notion of presuppositional sublanguage gains extra significance.

There is more, however. The sentences of $NL_D$ are not totally forgotten. It is, apparently, possible for a speaker to correct or modify a D post hoc. He will do that either because he has second thoughts, or, more likely, because he wishes to correct a D constructed by an interlocutor. A speaker can thus, in hindsight, include an increment that had been banned, or exclude an increment that had been included. Our observations and analyses now suggest that in the latter case, when a speaker decides to exclude an increment that had already been included in D, he can do one of two things. Either he relegates the increment in question to D', by prefixing the minimal negation (making it clear through some recognizable means that he is carrying out what psychologists call a "repair"). Or he can declare the increment null and void and say that the underlying sentence p belongs to $NL_D$. In the latter case he prefixes the radical negation, thereby relegating the sentence p, not the corresponding increment i(p), to the outer complement of D, i.e. to $NL_D$. When a speaker does that he had better also identify the specific presupposition or presuppositions of p whose increment has to be removed from D along with i(p) itself, thereby causing p to belong to $NL_D$. For this to be possible we must assume that it is possible to speak about linguistic elements such as sentences. In other words, we must assume that quoted linguistic elements can figure in sentences, and that the incrementation procedure for such sentences involves the setting up of separate addresses to cater for expressions and not their discourse denotata. This assumption, however, is perfectly reasonable, both from a naive observational point of view and from the point of view of linguistic theory.

The sublanguage constraint, together with the possibility of post hoc correction, is thus responsible for the echo-effect observed in radically negated sentences.
A sentence of the form not-p can only be minimally negated, with the incremental effect that i(p) is added to D'. But the negation in a sentence of the form not-"p" is metalinguistic and is used when (the increment of) some presupposition of p is removed from D so that, from then on, $p \in NL_D$. Now an address is set up for the sentence p, representing the real world object p. Under this address a predication is stored to the effect that the sentence called "p" falls outside $L_D$.³ In the following section 3.2, we shall be more explicit on this aspect.

This analysis smacks a little of both Van der Sandt's Argument-Split solution, with the echo-operator, and Horn's NEG-split theory with metalinguistic negation. And indeed, it combines elements of both theories, excluding, however, the element that they have in common. Both Horn and Van der Sandt take presuppositions, (scalar) implicatures and implications of style and register appropriateness together and let the metalinguistic negation (which, in Van der Sandt's analysis, is the classical negation over the quote operator) extend over these as well as over the sentence proper. In our view this is too gross a measure, since, as we saw in section 1.4.1, denials of implicatures or style or register appropriateness implications are presupposition-preserving, whereas presupposition denials leave the implicatures and the style or register implications intact. We therefore set the presupposition denials off, as a separate category, from the other denials. On the other hand we accept the distinction between "straight" negation and metalinguistic negation, and it would seem that, for the latter, the assumption of a quote operator of some kind would be indispensable.

The metalinguistic negation that affects (scalar) implicatures and/or implications of appropriate selections of style and register operates on sentences with a semantic analysis involving clefting, as illustrated in (29) above. The quoted linguistic element is the predicate of the cleft construction, and the assertion is about proper linguistic selection, as in (47):

(47) a. She doesn't LIKE him. She LOVES him.
     b. $\sim$[the proper expression • in "she • him" is "likes"]. The proper expression • in "she • him" is "loves".
This presupposes that there is a proper expression to fill the gap in "she • him", and since "proper" implies "true", all presuppositional entailments of "she loves him" are preserved. It is difficult for this use of metalinguistic minimal negation to put quotes around a complete sentence. The only possible choice is for surface constituents to be quoted and placed in predicate position. Sentences of the form not-"p" have no choice but to instantiate radical negation. This fact is no doubt in part responsible for the fact, observed in Section 1.3, under B, that the radical negation can occur only in the canonical position of its sentence. The radical negation is sentence negation, but not in the ordinary way, whereby, in terms of Boolean model theory, the semantic value of the argument sentence is converted into its complement, but in a special reserved sense in which the argument sentence is taken as its own name, and the negation assigns it the property of belonging to $NL_D$. This implies a two-dimensional matrix for the possible uses of negation, as in fig. 17:

                       minimal [+i]     radical [-i]
  "straight" [+i]           +
  metalinguistic            +                +

Fig. 17: The different uses of negation.
The feature [±i] indicates whether incrementation under negation results in a "straight" (quasi-)increment or in a quoted sentence being treated as a discourse object. The presupposition-denying radical negation must result in the quoted sentence treatment due to the sublanguage constraint for discourse domains. The presupposition-preserving minimal negation, however, results in "straight" increments, relegating them to D'. Moreover, "straight", non-metalinguistic negation also results in "straight" increments, and not in the quoted sentence treatment, because if it did it would be metalinguistic by definition. As a result the box for "straight" radical negation must remain empty, since the sublanguage constraint does not allow incremental results for sentences of $NL_D$, other than by predicating this property of "p".

Approaches like those of Horn or Van der Sandt have the great methodological merit of applying Occam's razor to the process of theory building. This razor cuts growth wherever that is practicable in the light of the available factual data. It must, however, refrain from cutting when the data stand in the way. (It's all right to cut the stubble, but the face must be spared.) My contention is that, as so often, especially in the human sciences, careful observation of the material forces us to enrich the theory, leaving less scope for Occam's razor to do its cutting.

There are, of course, other constraints for the construction of discourse domains. A prominent one among them is the presupposition first constraint, which dictates that for any sentence q which is also an incrementation unit, and for any sentence $p \in PL_q$, i(q) must precede i(p) in the construction of any D. This means that a partially constructed valuation like the one presented in fig. 16 cannot correspond to a real D under construction, because d has already been valued ("2"), but its presupposition h is still up for valuation. In any real D, i(h) would have been slipped in before i(d).

The presupposition first constraint applies to incrementation units, which do not necessarily coincide with sentences. Conjunctions, in particular, seem not to function as single incrementation units, though we must consider them, of course, to form single but complex sentences. The central presuppositional problem with conjunctions is that sentences of the form p and $q_p$ (i.e. p conjoined with a q presupposing p, as in "He is a crook, and you know it") both assert and presuppose p if we treat them as single incrementation units. This is counterintuitive and counterproductive, since it requires that we have i(p) first, to be followed by i(p and q), which results anyway in i(p) followed by i(q). The result would be a pointless repetition of i(p). For that reason we stipulate that the conjuncts of a conjunction form separate incrementation units. Now p and $q_p$ results simply in i(p) followed by i(q). (For more discussion see Seuren 1985: 280-284.)

It is, however, not our business here to develop a complete theory of D construction, and we shall now pass on to the question of the actual incrementation result of negative sentences.
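The presupposition first constraint lends itself to a simple ordering check. The following is a minimal sketch under stated assumptions: a discourse is reduced to the order in which incrementation units were processed, presupposition relations are supplied as a dictionary, and the function name `violations` is invented for the example.

```python
def violations(order, presupposes):
    """Return (p, q) pairs where p presupposes q but i(q) was not
    incremented before i(p) in the given order of incrementation."""
    position = {sentence: k for k, sentence in enumerate(order)}
    bad = []
    for p, qs in presupposes.items():
        if p not in position:
            continue                      # p has not been incremented at all
        for q in sorted(qs):
            if q not in position or position[q] > position[p]:
                bad.append((p, q))
    return bad

# d presupposes h, as in the discussion of fig. 16.
presupposes = {"d": {"h"}}

fig16_like = violations(["a", "b", "d"], presupposes)   # d valued, h still open
repaired = violations(["a", "b", "h", "d"], presupposes)  # i(h) slipped in first
```

The first call flags the pair (d, h), which is the reason the text gives for saying that fig. 16 cannot correspond to a real D under construction. The second call returns no violations.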
3.2. The incrementation of negation

If a discourse domain D is to be an adequate storage for the information conveyed by successive sentences in a discourse, it must possess a fairly complex array of storage methods. Even if we assume, as we do, that the information stored in a D is purely structural and thus excludes lexical semantic analyses or encyclopaedic knowledge about the things talked about, it still requires quite complex management procedures, which it cannot be our purpose here to specify in full. What is required in any case is a set of addresses accumulating the information provided about the individuals that figure in the discourse. There will be singular and plural addresses, and other kinds as well. (Thus, for example, addresses must be made available for the facts that are being talked about.) Moreover, D must provide the means for containing instructions relating to the further development of D. These instructions are expressed linguistically through "technical" operators such as not, and, or, or if, the quantifiers, or any predicate, such as believe, that introduces an intensional subdomain. What interests us here is the way not determines the incrementation result of its argument. To illustrate this, let us take the case of a simple sentence like (48a), with its minimal negative counterpart (48b):

(48) a. The car hit the curb.
     b. The car did not hit the curb.

For these sentences to be processed in some D, D must contain addresses for the terms the car and the curb, with at least the following information:

  a1:  car(a)
       -------
  a2:  curb(a)
       -------

(The horizontal line underneath the predications "car(a)" and "curb(a)" indicates "address closure", an operation required for referential correctness (see Seuren 1985: 317-319). We will not expand on this aspect of the theory here. Yet the closure operation will be mentioned wherever appropriate.) The incrementation of (48a) now results in the addition of the predication "hit(a1, a2)" to both addresses (whereby the non-subscripted "a" stands for the address in question):

  a1:  car(a)
       -------
       hit(a, a2)
  a2:  curb(a)
       -------
       hit(a1, a)
(No account is taken of tense or aspect phenomena. Speech act properties are also disregarded, as well as other possible factors that should be taken into account in a fully developed incrementation procedure.)

The minimally negative (48b) is incremented in much the same way as (48a), but with the extra condition that the increment is to be stored not in D but in the counter-domain D'. This must be indicated formally, for example by asterisking the negated predications:

  a1:  car(a)
       -------
       *hit(a, a2)
  a2:  curb(a)
       -------
       *hit(a1, a)

D now contains the instruction that the predication "hit(a1, a2)" is ostracised as an increment for the duration of D.

An analogous procedure is followed when the setting up of a new address is banned and relegated to D'. A new address is set up in one of two ways. The canonical way is through the occurrence of an existential quantifier at the top of the semantic analysis of the sentence to be incremented, as in (49a) with the semantic analysis (49b):

(49) a. A car drove past.
     b. $\exists_1[\hat{x}(\text{drive past}(x)), \hat{x}(\text{car}(x))]$

In (49b) the existential quantifier "$\exists_1$" functions as a technical higher order binary predicate over pairs of sets, the first set consisting of the individuals that "drive past", the second set being the set of cars. The satisfaction condition associated with this predicate is, simply, that the two sets have an intersection of at least one individual. The existential predicate is "technical" because its increment value is non-standard. The resulting increment consists, in fact, in the setting up of a new singular address:

  a3:  car(a)
       drive past(a)

(A new address set up in this way is not closed by the horizontal line: it will be as soon as a subsequent definite term denotes it. "Open" addresses are satisfied by anything in the model (world) that answers the description stored in the address. Open addresses thus have a truth-value: they can be true or false (or truth valueless if no verification domain is specified). Closed addresses are cognitively fixed onto a specific individual or set of individuals.) The (minimal) negation of (49a) is (50a), with the analysis (50b):

(50) a. No car drove past.
     b. $\sim\exists_1[\hat{x}(\text{drive past}(x)), \hat{x}(\text{car}(x))]$

The resulting increment is like that of (49b), except for the asterisk which indicates that this increment, though otherwise in order, is relegated by the speaker to D':

  *a3: car(a)
       drive past(a)
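The address notation used for (48)-(50) can be mimicked with a small data structure. This is only a sketch of one possible encoding, not the incrementation procedure itself: the class and attribute names (`Address`, `Domain`, `banned`, `closed`) are assumptions made for the example, and the asterisk is modelled as a separate list per address.

```python
from dataclasses import dataclass, field

@dataclass
class Address:
    predications: list = field(default_factory=list)  # increments accepted in D
    banned: list = field(default_factory=list)        # asterisked increments (D')
    closed: bool = False           # True once a definite term denotes the address
    banned_address: bool = False   # True when the address itself is asterisked

@dataclass
class Domain:
    addresses: dict = field(default_factory=dict)

    def new_address(self, name, description, closed=False, banned=False):
        """Set up an address; banned=True corresponds to an asterisked address."""
        self.addresses[name] = Address(list(description), [], closed, banned)

    def increment(self, predication, names, negated=False):
        """Store a predication under every address it mentions.
        Under minimal negation it is stored as an asterisked increment."""
        for name in names:
            address = self.addresses[name]
            target = address.banned if negated else address.predications
            target.append(predication)

# (48b): the car did not hit the curb.
D = Domain()
D.new_address("a1", ["car(a)"], closed=True)
D.new_address("a2", ["curb(a)"], closed=True)
D.increment("hit(a1, a2)", ["a1", "a2"], negated=True)

# (49a) opens a new address; (50a) relegates the same address to D'.
E = Domain()
E.new_address("a3", ["car(a)", "drive past(a)"])
F = Domain()
F.new_address("a3", ["car(a)", "drive past(a)"], banned=True)
```

Keeping the banned predications on the address, rather than discarding them, reflects the point that D retains the instruction that the increment is ostracised for the duration of D.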
The second way of setting up a new address is by means of so-called "post hoc suppletion" (also called "backward suppletion" or "accommodation"). This takes place when the text contains no explicit existential statement, but only a definite description without, as yet, a corresponding domain address. When available background knowledge supports the setting up, post hoc, of a proper address for the definite term to "land at", that is what happens. Thus, when I begin a story uttering (48a), without first properly introducing the car I am speaking about (for example by saying something like As I was taking my morning stroll, a car drove past), post hoc suppletion quickly slips in a car-address, as though I had actually uttered an existential statement. Addresses set up as a result of post hoc suppletion are immediately closed, simply because they are immediately denoted by a definite term.

This apparatus suffices, in principle, for the minimal negation. It caters for the Horn cases if provisions are made for the accommodation of quoted elements. Suppose someone says (51a) and I say, correcting the previous speaker, (51b) with the semantic analysis (51c):

(51) a. She took some of it.
     b. No, she didn't take SOME of it. She took ALL of it.
     c. $\sim$[be "some" (the proper expression • in "She took • of it")]; be "all" (the proper expression • in "She took • of it")

When (51b) is up for incrementation, a new address is slipped in, by backward suppletion, for the definite term the proper expression • in "She took • of it", and this address is immediately closed. Then the predication (be) "some" is added with an asterisk and the predication (be) "all" is added without asterisk:

  a15: proper expression • in "She took • of it"(a)
       -------
       *"some"(a)
       "all"(a)
There is thus no question of a sentence like (51b) being in any way an incompatible conjunction of contrary conjuncts (assuming, reasonably, that the it stands for something real, otherwise the conclusion, in standard predicate calculus, is that she took nothing). The sentence is not, or anyway not directly, about what, if anything, she took, but about what expression describes the situation most adequately. That the semantic analysis takes an unusual form, as in (51c), seems best attributed to the, still largely unknown, peculiarities of the grammar of quotes, and not to an alleged ambiguity of the word not, whether this ambiguity is called pragmatic (as Horn does) or semantic, or logical, or what not.

Now suppose (48b) is to be incremented not with the minimal but with the radical negation:

(52) The car did NOT hit the curb. There was no curb!

For such a pair of sentences to be incremented it is required that D already contains the increment of (48a), The car hit the curb. By way of "repair" the speaker wishes to undo both that increment and the increment consisting in the setting up of address a2, the curb-address. This means that a2 is asterisked. But now i(the car hit the curb) is undefined, since one of the presuppositions of that sentence has been banned from D and relegated to D'. Therefore, the normal method of keeping increments away from D, by relegating them to D', cannot be followed. The solution adopted by the human linguistic faculty is, apparently, to quote the sentence in question, instead of taking the increment, and assigning to that sentence the property of belonging to $NL_D$. The sentence thus remains without its standard increment value, but it is incremented as an address in D, denoted by its quoted form (its name). The result looks something like the following:⁴

  a36: "the car hit the curb"(a)
       $NL_D$(a)

This analysis clearly implies a truth-conditional difference between minimal and radical not, and hence a semantic ambiguity. A minimally negative sentence not-p is true just in case the preconditions of the highest predicate P of p are fulfilled, but not the satisfaction conditions of p. The radically negative not-"p" is true just in case the sentence p does not belong to the presuppositional sublanguage of the D in question, i.e. there is presupposition failure resulting from non-fulfilment of the preconditions of P. As we have seen, this truth-conditional difference is neatly expressed in terms of a three-valued logic. Such a logic, however, is nothing but a statement of the logical properties of the system at hand. It is not by itself a description of the system.
This ambiguity, however, is quite unlike any arbitrary lexical ambiguity that may occur in a language, such as the ambiguity of the English noun plant, which is ambiguous between a botanical object and a complex of buildings and constructions intended for industrial production. As ambiguities go, the ambiguity of not is highly idiosyncratic, in that it is not haphazard but manifests different methods of banning increments from D. We may say that the increment function i contains a negation algorithm, roughly and incompletely characterized as follows:

(53) For any sentence with not as the highest operator, not takes the "straight" increment i(p) of its argument p or "p". If the argument is p, not assigns i(p) to D'. If the argument is "p", not assigns the predication "NL0(a)" to the address a_n set up (post hoc) as the denotation of the name "p". In that case, there must be some i(q) in D, such that p >> q, and not-q must be processed.

As has been said above, it is like theft being either a felony or a misdemeanour: in either case there is illegal taking possession of goods. Here we have minimal and radical negation. In either case there is failure of predicate conditions, and in either case there is banning of an increment from D. Moreover, the increment function i is organized in such a way that, given some appropriate D, the question of whether not takes p or "p" as argument is automatically settled. If p belongs to L0, then not takes unquoted p as its argument, resulting in i(p) becoming part of D'. But if the speaker decides to revise D so as to make p belong to NL0, not has no choice but to take "p" as its argument, with the quotes. Thus used, not assumes different truth-conditions.

It is not possible to say that this explains why natural languages tend not to have separate words for minimal and radical negation, since, clearly, we have as yet no general theory that will predict for special cases such as negation whether or not overt disambiguation will take place. All we can do at this stage is say, in hindsight, that, given the close relationship between the two meanings of not, given the automatic selection, in any D, of the proper reading, and given the strongly marked character of radical not, there is, apparently, no need for disambiguation.5 Gazdar's objection (quoted in note 2 above), that negation is unlikely to be ambiguous because languages tend not to disambiguate it, is reasonable and must be answered. The answer we can give now is that, given the analyses presented above, one is justified in saying that this objection need not cause too much concern. We now know that negation is sufficiently different from ordinary cases of lexical ambiguity for this objection to lose most if not all of its force. Later research will hopefully show whether negation is unique in this respect, or whether it does, after all, fit into a more general pattern.
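The negation algorithm in (53) can be illustrated with a toy model. This is only a sketch under stated assumptions: D and D' are modelled as plain Python sets of sentence strings, presuppositions (p >> q) come from a hypothetical lookup table, the choice of radical negation is passed in as a flag standing for the speaker's decision to revise D, and every presupposition of p found in D is negated (a simplification). The names `Domain`, `negate` and `rejected_names` are invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Toy discourse domain; all field names are illustrative assumptions."""
    accepted: set = field(default_factory=set)        # increments that are part of D
    banned: set = field(default_factory=set)          # increments assigned to D'
    rejected_names: set = field(default_factory=set)  # quoted "p" carrying the predication NL0


def negate(domain, p, presupposes, radical=False):
    """Process 'not p' and return which kind of negation was applied.

    presupposes maps a sentence p to the sentences q it presupposes (p >> q).
    """
    if not radical:
        # Minimal negation: unquoted p is the argument and i(p) is assigned to D'.
        domain.accepted.discard(p)
        domain.banned.add(p)
        return "minimal"
    # Radical negation: the quoted name "p" is the argument.
    in_d = [q for q in presupposes.get(p, ()) if q in domain.accepted]
    if not in_d:
        raise ValueError("radical negation needs some presupposition of p in D")
    domain.rejected_names.add(p)          # predication NL0 assigned to the name "p"
    for q in in_d:
        negate(domain, q, presupposes)    # not-q must be processed
    return "radical"


presupposes = {"the king of France is bald": ["there is a king of France"]}
d = Domain(accepted={"there is a king of France"})
kind = negate(d, "the king of France is bald", presupposes, radical=True)
print(kind, sorted(d.banned))
```

In this toy run the radically negated sentence is marked as rejected and its presupposition is moved out of D, which is the revision of D that the text describes.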
3.3. The empirical value of logical analysis
We can now be more specific about the relation between the logic that goes with the incremental system and the system itself, and in particular about the empirical relevance of logic for an adequate semantic analysis. Let us consider a few examples not directly related to negation. It has been said above (section 1.4.2) that truth-conditional differences may be involved in the ordering of conjuncts. This is so when the conjuncts express successive events (they have aorist aspect). In those cases there is a clear difference between p and q on the one hand, and q and p on the other:

(54) a. He made a fortune and went to Spain.
     b. He went to Spain and made a fortune.

It is not too difficult to see that, in principle, a discourse incrementation system will be able to take care of this. Discourse semantics will have to provide procedures for the proper incrementation of time-bound sentences, i.e. for the expression of tense and aspect. It will be part of such procedures to provide some indexing method for predications under addresses with regard to their ordering in time. The relative indexings of the conjuncts will then express this difference, and the truth-conditions for the conjunction as a whole will differ accordingly. There is thus no doubt that the difference between (aorist) p and q and q and p is fully truth-conditional.

One may still wonder whether, in that case, natural language and, in conjunctions under aorist tense, is not truth-functional. But this conclusion is not warranted, since the time indexings of each conjunct are part of their truth-conditions, so that, with the proper indexings, p ∧ q is equivalent to q ∧ p, where p and q are truth-conditionally correct logical analyses, and ∧ is the logical conjunction operator. Even so, however, the fact remains that p and q, where p and q are sentences, is not equivalent to q and p. If the order of the sentences is inverted, different propositions come about, as a result of the now different time indexings. In other words, once logical p ∧ q has been expressed as p and q, logical q ∧ p can no longer be expressed linguistically. This is a semantic property of natural language conjunctions, which is not reflected in standard propositional calculus, with truth-functional and symmetrical and. It is perhaps possible to develop a propositional calculus that behaves in a way that is strictly parallel to what is found in language, but that logic would have no other purpose than to reflect what happens in language. Useful though such an exercise is for a better understanding of language, no intrinsic logical purpose is served by it.
It would be an exercise in applied logic, not in pure logic.

A similar pattern emerges when the other traditional truth-functional operators of standard propositional calculus are considered. The increment value of and, as we have seen, consists in the incrementation of its conjuncts in the order in which they occur in the sentence (whereby each conjunct is a separate unit of incrementation). The disjunction operator or has the effect of splitting up D into as many subdomains as there are disjuncts, this splitting being marked as a commitment, on the part of the speaker, that at least one of the disjuncts/subdomains is to be added as part of D. Implications of the form if p then q have the incremental effect of stipulating (as a domain instruction) that i(p) is not to be excluded from D but that it is allowable only if conjoined with i(q).

Interestingly, natural language grammars provide no way to express the normal, non-contrastive, non-radical negation of conjunctions, disjunctions and implications other than by the prefixing of "it is not the case that" or some similarly artificial periphrastic, involving separate incrementation procedures under predicates like "be the case that". Apparently, not is defined only for increments, whether new or already processed and now reprocessed. And apparently, it is not only conjuncts that form separate incrementation units but also disjuncts and the clauses of implications. These are aspects, however, that we cannot investigate more fully here.

What, then, is the empirical value of logical systems that aim at incorporating, as much as possible, the quirks of natural linguistic interpretation processes? In at least one respect this value is clear. Any logical system requires the setting up of well-defined logical analyses that function as units in the logic machinery.
To the extent that a logic mirrors natural language processes more faithfully, one is more justified in claiming that the logical analyses figuring in it embody structural and semantic constraints on the semantic analyses to be provided by the grammar for the sentences of the language. In this way, the logical system shows up the logical conditions that must be fulfilled by any semantic theory, besides the other conditions that must be met. But there can be no question of the logical system itself being part of the semantic machinery of cognitive processing of uttered sentences. All one can demand is that the structural and logical properties of the analyses that occur in the logic be somehow incorporated into the semantic analyses and definitions of sentences and words occurring in them, so that the semantic theory is kept logically sound.

This shows again that, in language, logic is epiphenomenal on the structures and processes that occur in the semantic and cognitive processing of uttered sentences. Existing logical analyses only do partial justice to the reality of language, and in some cases, such as disjunction and implication, they actually distort it. Obviously, attempts at developing sound logics that do fuller justice to language are to be welcomed and appreciated, if only because they attempt to show the logical soundness of language. Yet, no matter how well they fit the facts of language, any claim to the effect that the logical properties of the expressions figuring in them (such as monotonicity, to take just an example) are relevant to the empirical study of meaning phenomena will have to be argued for separately and independently.

This, then, is the status claimed for the three-valued logic with its two negations described above: it appears to provide a logical account of the presuppositional differences of minimal and radical negation. If the logical analysis is correct and the observations on which it is based reflect linguistic reality, semantic analyses of sentences are constrained by this logic, and minimal and radical negation must both be assumed to occur in negative sentences.
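The incremental effects of and and or discussed in this section can be made concrete in a small sketch. It is a toy model, not the incrementation system itself: the class `Domain`, the functions `increment_and` and `increment_or`, and the use of a running counter as the "time indexing" are all assumptions made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Toy discourse domain D (names and structure are illustrative assumptions)."""
    increments: list = field(default_factory=list)    # (time index, sentence) pairs
    alternatives: list = field(default_factory=list)  # one tuple of subdomains per 'or'
    clock: int = 0

    def add(self, sentence):
        # Each increment receives the next time index.
        self.clock += 1
        self.increments.append((self.clock, sentence))


def increment_and(domain, *conjuncts):
    # Each conjunct is a separate unit, incremented in sentence order.
    for conjunct in conjuncts:
        domain.add(conjunct)


def increment_or(domain, *disjuncts):
    # D is split into one subdomain per disjunct; at least one is to be added later.
    domain.alternatives.append(tuple(disjuncts))


a, b = Domain(), Domain()
increment_and(a, "he made a fortune", "he went to Spain")
increment_and(b, "he went to Spain", "he made a fortune")
same_content = {s for _, s in a.increments} == {s for _, s in b.increments}
same_indexing = a.increments == b.increments
print(same_content, same_indexing)  # True False
```

The two domains contain the same sentences but differ in their time indexings, which is the sense in which p and q and q and p come out as different propositions.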
NOTES

1. For further details, see Seuren (1985: 217).
2. Cp. Gazdar 1979: 65-6: "But no language, to the best of my knowledge, has two or more different types of negation such that the appropriate translation of (11) [John doesn't regret having failed] could be automatically 'disambiguated' by the choice of one rather than the other."
3. Natural language happily mixes object and metalanguage. The Liar paradox and its kin are obviated by other means than the strict separation of object and metalanguage (cp. Seuren to appear).
4. This way of incrementing radically negated sentences differs from what is proposed in Seuren (1985: 331), where the double asterisk is used to mark radical negation. The proposal made here is more explanatory and based on a more careful analysis, prompted to a large extent by the works of Horn and Van der Sandt.
5. One may well wonder what happens in languages, such as Turkish, with morphologically incorporated negation. According to A in section 1.3, morphologically incorporated negations are necessarily presupposition-preserving. One would, therefore, expect such languages to have a separate negation word to be used when radical negation is called for. Turkish has, besides the bound morpheme mV (i.e. with a vowel that follows vowel harmony), also the word değil. It remains to be seen whether değil is required for radical negation, besides its other functions in the language.
REFERENCES

Baker, C.L., 1970: Double negatives. Linguistic Inquiry 1.2: 169-86.
Blau, U., 1978: Die dreiwertige Logik der Sprache. Ihre Syntax, Semantik und Anwendung in der Sprachanalyse. De Gruyter, Berlin-New York.
Bochvar, D.A., 1939: On a three-valued logical calculus and its application to the analysis of the paradoxes of the extended functional calculus (Russian). Matematičeskij Sbornik, N.S. 4: 287-308.
Boer, S. and Lycan, W., 1976: The myth of semantic presupposition. Indiana University Linguistics Club.
Donnellan, K., 1966: Reference and definite descriptions. Philosophical Review 75: 281-304.
Frege, G., 1892: Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100: 25-50.
Gazdar, G., 1979: Pragmatics: Implicature, Presupposition, and Logical Form. Academic Press, New York-San Francisco-London.
Grice, H.P., 1967: Logic and conversation. The William James Lectures. Harvard University. Unpublished.
Horn, L.R., 1985: Metalinguistic negation and pragmatic ambiguity. Language 61.1: 121-174.
Karttunen, L. and Peters, S., 1979: Conventional implicature. In: Ch.-K. Oh & D.A. Dinneen (eds.), Presupposition (Syntax and Semantics 11), Academic Press, New York-San Francisco-London: 1-56.
Russell, B., 1905: On denoting. Mind 14: 479-93.
Seuren, P.A.M., 1985: Discourse Semantics. Blackwell, Oxford.
Seuren, P.A.M., to appear: Les paradoxes et le langage. Logique et Analyse.
Strawson, P.F., 1950: On referring. Mind 59: 320-44.
Strawson, P.F., 1952: Introduction to Logical Theory. Methuen, London.
Strawson, P.F., 1954: A reply to Mr Sellars. Philosophical Review 63.2: 216-31.
Van der Sandt, R.A., 1982: Kontekst en Presuppositie: Een studie van het projektieprobleem en de presuppositionele eigenschappen van de logische konnektieven. PhD-thesis. Nijmegen.
Van der Sandt, R.A., 1988: Context and Presupposition. Croom Helm, London-New York-Sydney.
Van der Sandt, R.A., to appear: Discourse systems and echo-quotation.
Van Fraassen, B., 1971: Formal Semantics and Logic. Macmillan, New York-London.
Weijters, A., 1985: Appendix: Presuppositional propositional calculi. In: Seuren 1985: 483-525.
Wilson, D., 1975: Presuppositions and Non-Truth-Conditional Semantics. Academic Press, London-New York-San Francisco.

Nijmegen University
Philosophy Institute
P.O. Box 9108
6500 HK NIJMEGEN - Holland
Journal of Semantics 6: 227-269
ILLOCUTION, MOOD AND MODALITY IN A FUNCTIONAL GRAMMAR OF SPANISH
KEES HENGEVELD
ABSTRACT

In order to be able to account for the alternating and non-alternating uses of mood in Spanish, this paper explores the field of illocution and modality and argues for two elaborations of the Functional Grammar framework: (i) a representation of main clauses which distinguishes between several layers, each representing a different subact of the speech act, and (ii) a representation of noun clauses which distinguishes between non-factive, factive, and semi-factive complements.

0. INTRODUCTION 1

Many papers have been devoted to the treatment of the Spanish indicative and subjunctive. And many of the studies contained in these papers were aimed at arriving at one definition of the meaning of the subjunctive in all its different uses. In this paper, yet another attempt to provide a satisfactory description of the Spanish mood system, a different line of research is followed, based on the following assumptions:

(i) Only in those contexts in which both indicative and subjunctive may appear under identical conditions may they be said to add to the meaning of the sentence in which they occur. This does not imply that their unique occurrence in other contexts is purely at random.
(ii) There is no reason to assume a priori that the subjunctive is the marked member of the pair.
(iii) The meaning of the subjunctive or indicative (in those cases in which they may be said to have a meaning of their own, cf. (i)) should be determined from context to context, although the possibly different meanings of either of them in these different contexts may have certain elements in common.

The different problems related to the treatment of the mood system of Spanish will be addressed within the framework of Functional Grammar (FG). Some basic principles of this theory are presented in section 1. In section 2
I discuss some general issues related to the treatment of mood. In particular, I go into the notions illocutionary force and modality and argue that these notions pertain to different levels of the speech act, and that mood inflection may fulfil a distinguishing function at both levels. In 2.1 illocutionary force and its representation in FG as proposed by Dik (forthcoming) is discussed separately. In 2.2 I address the question what kinds of modal distinctions a language may be expected to make. I propose to distinguish three different types of modality and discuss the different modal distinctions to be made within each of these types. To conclude section 2 I present an adaptation of the clause model proposed by Dik (ibid.) and locate the three types of modality within this model. In section 3 I turn my attention to the Spanish data. The use of mood inflection in main clauses and embedded predications in Spanish is discussed in relation to illocutionary force (3.1), modal contexts (3.2), and non-modal contexts (3.3). The use of mood in adverbial and relative clauses will not be touched upon. In section 4 I go into the relations between the different uses of the subjunctive and indicative.

1. SOME BASIC PRINCIPLES OF FUNCTIONAL GRAMMAR

In FG, linguistic expressions are represented in underlying predications, in which the semantic, syntactic and pragmatic functions indicating the different relations holding between the participants in a State of Affairs (SoA) are represented. To form such an underlying predication, a predicate frame is selected from the lexicon. A predicate frame contains, among other things, a predicate, which may be either basic or derived, and a number of argument positions, each provided with a semantic function specifying the role the arguments fulfil in the SoA's designated by the predications built on the basis of this particular predicate frame. An example is:

In the argument positions of such a predicate frame terms are inserted. Terms are referring expressions, which may have a complex structure. Term insertion in the argument positions of (1) leads to, for instance:

No attention is given here to the internal structure of terms. Pragmatic functions may be assigned to the arguments to specify their informational status and syntactic functions to specify the perspective from which a SoA is presented. In the final expression, through expression rules, of underlying representations like (2), operators fulfil an important function. They should be regarded as abstract elements, representing semantic distinctions coded in a language through grammatical means. Different types of operators are distinguished: term operators and predicate operators. Term operators take care of e.g. definiteness and number. Predicate operators take care of grammatical distinctions which are coded on or near the predicate, such as Tense, Mood, Aspect and Polarity. The term 'Mood' is thus restricted to modality expressed through grammatical means. Specification of operators in (2) leads to, for instance:
which will be expressed as:

(4) The boys were building a shed

According to FG, language should be regarded in the first place as an instrument for social interaction. Its aim is therefore to provide the means to explain specific linguistic phenomena, where possible, "in terms of their functionality with respect to the ways they are used and to the ultimate purposes of these uses" and to devise a theory of the language system "in such a way that it can most easily and realistically be incorporated into a wider pragmatic theory of verbal interaction" (Dik 1978: 2). Such a theory may be expected to be able to handle the linguistic means through which communicative intention and speaker's judgements are coded in a linguistic system. I hope to show that an adequate treatment of the Spanish mood system requires the incorporation of both levels.

2. ILLOCUTIONARY FORCE AND MODALITY

2.0. Introduction

Executing a speech act has been analyzed since Austin (1962) and Searle (1969) as requiring the execution of a number of subacts on the part of the speaker (S). Among these subacts are the illocutionary act and the propositional act. In uttering a sentence S not only offers a proposition to the Addressee (A), but also transmits his communicative intention. S has a number of linguistic means at his disposal to code the content or intention he wishes A to recognize in his utterance: proposition indicating elements at the level of the propositional act, and illocutionary force indicating devices at the level of the illocutionary act. The total of proposition indicating elements expresses the propositional content of an utterance. The total of illocutionary force indicating devices expresses the illocutionary force of an utterance. Every utterance may thus be analyzed according to the following scheme (cf. Dik forthcoming):
(5) ILL(predication)

For the time being I use the term 'predication' instead of 'proposition' in line with the FG terminology introduced earlier. The term 'clause' is used for any combination of a predication with an illocution, as represented in (5).

Mood inflection can be used both as an illocutionary force indicating device and as a proposition indicating element, as is shown for Spanish in section 3 (cf. also Bolkestein 1977, 1980). In this chapter I first discuss both levels and their relation to mood inflection in general terms and then I go on to present a clause model in which both levels are represented.

2.1. Mood and illocutionary force

Illocutionary force indicating devices are those linguistic means through which S transmits his communicative intention. They may be subdivided into lexical and grammatical means. Performative verbs belong to the first category, whereas sentence order and mood belong to the second category. So, anticipating the data to be presented in section 3, an assertion in Spanish is executed most directly by using a performative verb like asegurar 'assure', the declarative sentence type and the indicative (I) mood:

(6) Te aseguro que no es (I) culpa mía.
    'I assure you that it's not my fault.'

Performative verbs are mainly used to produce special effects (see Weijdema et al. 1982). Sentences like (7) are more common:

(7) No es (I) culpa mía.
    'It's not my fault.'

The only difference between (6) and (7) is that (7) lacks the performative verb present in (6). And often S will intend (7) to be interpreted, like (6), as an assertion. Yet it is not possible to establish a direct relationship between illocutionary act and sentence type or mood. The use of these illocutionary force indicating devices in (7) does not necessarily imply that S intends his utterance to be interpreted as an assertion. A sentence like:

(8) It's very cold in here,
may for instance be uttered to produce the effect that A closes the window or turns up the heating, depending on the particular speech situation. To account for this fact, Weijdema et al. (1982) distinguish between:

(9) (i) The illocution-for-the-speaker (ILL_S): the illocution as intended by S.
    (ii) The illocution-of-the-expression (ILL_E): the illocution as coded in the linguistic expression.
    (iii) The illocution-for-the-addressee (ILL_A): the illocution as interpreted by the addressee.

These three categories can be seen as related in the way indicated in (10):

(10) Relations between ILL_S, ILL_E, and ILL_A
     [Figure: diagram relating ILL_S, ILL_E and ILL_A; recoverable labels: "Anticipation of Perlocutionary Effects", "Reconstruction of Communicative Intention"]

A speaker with a given communicative intention has to select "... those linguistic devices which he thinks optimally serve the purpose of eliciting from the hearer a positive reaction to his speech act". Haverkate (1979: 11) called this the allocutionary act. The successfulness of the strategy chosen by S depends on the recognition by A of the intention of S in the utterance in a given communicative situation.

According to Dik (forthcoming) ILL_E is generally expressed by means of a number of sentence types, which he considers to be 'grammaticalised carriers of basic illocutions of linguistic expressions' (see also Lyons 1977: ch. 16 and Levinson 1983). He proposes to assign an operator representing ILL_E to predications. Such an operator triggers the expression rules which account for the formal realization of the different sentence types while at the same time providing the means to link ILL_E to ILL_S. In his view the communicative intention of S has relevance for linguistic description only
in so far as linguistic means are used to code this intention in an expression. The fact that, for instance, declarative sentences may be used to make a request should then be explained within a wider pragmatic theory. Searle's (1969) 'felicity conditions' might be a point of departure for this wider pragmatic theory, as they may shed some light on the question how A reconstructs S's intention, if that intention is not coded explicitly in the expression. The 'preparatory condition' for the act of making an assertion is, for instance, that the information contained in the assertion is not known by A. Suppose now S utters the following sentence:

(11) The window is open,

in a situation in which A knows

(i) that the window is open;
(ii) that S knows that he knows that the window is open;
(iii) that S knows that he knows that S knows that the window is open.

In this situation it will be clear to A, as a consequence of S's violation of the condition, that S has another intention than making an assertion and he will try to reconstruct an alternative ILL_S. The fact that A has to apply a reconstruction model in which the primary steps concern the systematic check of the conditions associated with the speech act type most directly expressed in a declarative sentence stresses the fact that there is a conventional relationship between sentence type and speech act type. The following representations may now be used to represent direct and indirect speech acts respectively:

(12) Direct speech act:    ILL_S = ILL_E
     Indirect speech act:  ILL_S ≠ ILL_E

The approach discussed here makes it possible to relate the use of a certain sentence type to the operators representing ILL_E. In 2.3 I will argue that the representation of sentence types in the form of 'illocutionary frames' might be more appropriate. This alternative approach does not affect the present discussion.

Most languages have at least the declarative, interrogative and imperative sentence type. The operators and their paraphrases proposed by Dik (forthcoming) are:

DECL: S wishes A to add the content of the linguistic expression to his pragmatic information.
INT: S wishes A to provide him with the verbal information as requested in the linguistic expression.
IMP: S wishes A to perform the action as specified in the linguistic expression.
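The three operators and the distinction in (12) between direct and indirect speech acts can be written down as a minimal sketch. The enumeration `Illocution` and the function `speech_act_type` are hypothetical names introduced here, and describing the speaker's intention with the same three labels as the coded illocution is a simplifying assumption rather than part of the proposal discussed above.

```python
from enum import Enum


class Illocution(Enum):
    """Basic illocutionary operators, with the paraphrases given in the text."""
    DECL = "S wishes A to add the content of the linguistic expression to his pragmatic information"
    INT = "S wishes A to provide him with the verbal information as requested in the linguistic expression"
    IMP = "S wishes A to perform the action as specified in the linguistic expression"


def speech_act_type(ill_speaker, ill_expression):
    """Return 'direct' if ILL_S equals ILL_E, and 'indirect' otherwise."""
    return "direct" if ill_speaker is ill_expression else "indirect"


# 'The window is open' is coded as DECL; assume S actually intends a request.
print(speech_act_type(Illocution.IMP, Illocution.DECL))
# 'No es culpa mia' coded as DECL and intended as an assertion.
print(speech_act_type(Illocution.DECL, Illocution.DECL))
```

Only the coded illocution (the second argument) would be represented by an operator in the grammar; the intended illocution is what the addressee has to reconstruct.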
These operators can be assigned to independent predications and to predications governed by a speech act verb. In the latter case the subject of the speech act verb is the one who has performed the speech act in the embedded predication. Only in those cases in which S and the subject of the speech act verb are identical and the speech act verb is marked for present tense is the utterance performative. In all other cases an utterance is reported or repeated. Dik further argues that languages may have a number of grammatical means to convert the basic illocution as expressed by a particular sentence type into a derived illocution. Examples of such 'grammatical converters' are tag questions, elements such as please, and alternative intonation patterns.

2.2. Mood and modality2

Modality, as opposed to illocution, pertains to the domain of propositional content. Lexical or grammatical elements giving expression to modal distinctions are part of the information S wishes to transmit when putting forward for consideration some predication. The different semantic distinctions generally subsumed under the heading 'modality' do not seem to represent a single and coherent semantic category. Instead of providing one definition of modality in general, I distinguish three types of modality and discuss the different kinds of modal distinctions to be made within each of these types, which may be defined as follows:

Inherent modality: All those linguistic means through which S can characterize the relation between a participant in a SoA and the potential actualization of that SoA.
Objective modality: All those linguistic means through which S can evaluate the actuality of a SoA in terms of his knowledge of possible SoA's.
Epistemological modality: All those linguistic means through which S can express his commitment with regard to the truth of a proposition.

2.2.1. Inherent modality

The different distinctions to be made within this modality type are all SoA-internal, as follows from the definition given in the preceding paragraph. The only possible way to give expression to these distinctions is the use of a limited number of (derived) predicates. Therefore this modality type cannot have any bearing on the use of mood inflection. The main inherent modal distinctions, given in (13), are ability, volition, and a number of instances of obligation and permission, namely those in which it is reported that some participant in a state of affairs is under the obligation or has received permission to perform in that state of affairs.

(13) Inherent modality
     Ability (physical / acquired)    'Be able to' / 'Know how to'
     Volition                         'Be willing to'
     Obligation                       'Have to'
     Permission                       'Be free to'

2.2.2. Objective modality

Linguistic means giving expression to objective modal distinctions can be regarded as the output of an evaluation process on the part of S with regard to the actuality status of a SoA. Chung and Timberlake (1985: 241) note that "whereas there is basically one way for an event to be actual, there are numerous ways that an event can be less than completely actual". One might simply assume that saying that a SoA is presented as actual is tantamount to saying that it is not modalized. In that case, however, it would be difficult to account for expressions through which S can make explicit that he regards the SoA under consideration to be identical to the situation obtaining in reality, such as it is the case that. I therefore include the distinction 'actual' in the category of objective modal distinctions. In the same way that a SoA can be presented as actual, it can be presented as simply non-actual, as in it is not the case that. But within the non-actual domain, many other distinctions can be made. To arrive at a further classification of these distinctions a closer look at the evaluation process underlying objective modality is in order. The knowledge on which S has to base his evaluation of a SoA may be subdivided into:

(14) Two types of knowledge
     (i) Knowledge of possible situations obtaining in S's conception of reality or of a hypothesized situation.
     (ii) Knowledge of possible situations relative to some system of moral, legal or social conventions.

The labels 'epistemic' and 'deontic' modality are generally used to cover the modal distinctions which depend on S's evaluation in terms of (i) and (ii) respectively.
Some languages do not distinguish systematically between knowledge (i) and (ii), underlying the distinction between epistemic and deontic modality. The English modal must can be used to express both certainty and obligation, whereas may can be used to express both possibility and permission, although they do impose different restrictions on their complements in these different uses (see Bolkestein 1980; Goossens 1985a). The multiple function of these modals seems to be conditioned by a similar degree of compatibility of the SoA under consideration with S's knowledge of one of the two types. One might ask, then, how this compatibility is measured. A possible answer to this question may be found if we return to the definitions given for the two types of knowledge in (14). It is indicated there that S's knowledge of possible situations is the standard for his epistemic or deontic evaluation of a SoA. These possible situations may be represented as combinations of related SoA's. I use the label 'State of the World' (SoW)3 for each representation of a possible situation. S's evaluating a SoA may now be interpreted as his checking a SoA against SoW's. If all SoW's contain the SoA designated by a predication, then S will arrive at the conclusion 'certain' if he refers to his type (i) knowledge, or 'obligatory' if he refers to his type (ii) knowledge. If only some SoW's contain the SoA under consideration, then S will arrive at the conclusion 'possible' if he refers to his type (i) knowledge, or 'permissible' if he refers to his type (ii) knowledge. Following this analysis, the following distinctions can be said to be roughly equivalent:

(15) SoA is contained in ... SoW's in domain    (i)            (ii)
     a. All                                      Certain        Obligatory
     b. Most                                     Probable       Customary
     c. Some                                     Possible       Permissible
     d. Few                                      Conceivable    Acceptable
     e. No                                       Impossible     Forbidden
     f. ?                                        Doubtful

Further distinctions can be made. If we assign the value 100 to (a) and 0 to (e), in principle any value in between them might be expressed, although one would not expect languages to have special devices to express, for instance, a value of 78. 'Doubtful' is analyzed in (15f) as S's expressing his inability to provide an evaluation in terms of his knowledge of the SoA under consideration. The fact that complements of adjectives like doubtful and verbs like doubt take the same form as embedded questions supports this analysis. The different objective modal distinctions discussed are summarized in (16):

(16) Objective modality
     Actual
     Non-actual:
          Epistemic      Deontic
          Certain        Obligatory
          Probable       Customary
          Possible       Permissible
          Conceivable    Acceptable
          Impossible     Forbidden
          Doubtful
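The checking of a SoA against SoW's that underlies (15) can be sketched as a small procedure. The sketch rests on assumptions that are not in the text: each SoW is modelled as a set of SoA labels, the numeric cut-offs separating 'most', 'some' and 'few' are arbitrary choices, and case (15f) 'doubtful' is left out. The names `LABELS`, `quantify` and `evaluate` are hypothetical.

```python
# Epistemic and deontic labels for each quantifier, following table (15).
LABELS = {
    "all": ("certain", "obligatory"),
    "most": ("probable", "customary"),
    "some": ("possible", "permissible"),
    "few": ("conceivable", "acceptable"),
    "no": ("impossible", "forbidden"),
}


def quantify(soa, sows):
    """Say in how many of the given SoWs the SoA is contained."""
    if not sows:
        raise ValueError("no SoWs to check the SoA against")
    total = len(sows)
    hits = sum(1 for sow in sows if soa in sow)
    if hits == total:
        return "all"
    if hits == 0:
        return "no"
    if 2 * hits > total:      # assumed cut-off for 'most': more than half
        return "most"
    if 5 * hits >= total:     # assumed cut-off for 'some': at least a fifth
        return "some"
    return "few"


def evaluate(soa, sows, knowledge="epistemic"):
    """Evaluate a SoA against type (i) or type (ii) knowledge, as in (14)."""
    epistemic, deontic = LABELS[quantify(soa, sows)]
    return epistemic if knowledge == "epistemic" else deontic


sows = [{"rain"}, {"rain", "wind"}, {"sun"}, {"rain"}]
print(evaluate("rain", sows))             # probable
print(evaluate("rain", sows, "deontic"))  # customary
print(evaluate("snow", sows))             # impossible
```

The same count yields the epistemic or the deontic label depending only on which type of knowledge the SoW's are taken to represent, which mirrors the rough equivalences listed in (15).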
It follows from the analysis proposed here that elements expressing objective modality take a SoA as designated by a predication in their scope. This means that objective modality can be expressed through predicate operators or embedding predicates only.

2.2.3. Epistemological modality

To start my discussion of epistemological modality, I will go into the differences between subjective modality, in my view a subcategory of epistemological modality, and objective modality. These differences have been discussed by Lyons (1977, chs. 16, 17), Bolkestein (1980) and Palmer (1983), among others. Objective modality concerns S's evaluation of a SoA in terms of his knowledge, whereas subjective modality concerns S's expression of the degree of his commitment with regard to the truth of the content of the predication he puts forward for consideration, i.e. it modifies a statement. Modal adverbs give expression to subjective modality, modal adjectives to objective modality. Some of the main differences between objectively and subjectively modalized predications are:

(i) Objectively modalized predications can be questioned, subjectively modalized ones cannot:

(17) Is it possible that John will come?
(18) *Possibly John will come?

(ii) Objectively modalized predications can be hypothesized in a conditional sentence, subjectively modalized ones cannot:

(19) If it is possible that John will come, I am going home.
(20) *If possibly John will come, I am going home.

(iii) Subjective modality can be formulated in positive terms only:
(21) *Impossibly John will come.
(22) *Uncertainly John will come.

The non-existence of negative modal adverbs corresponds with the fact that the English modals, when used to give expression to subjective modality, cannot appear under negation:

(23) *John may-not be ill.
(24) *John mustn't be ill.

Objective modality can be formulated in both positive and negative terms:

(25) It is impossible that John will come.
(26) It is not certain that John will come.

(iv) Subjective modality is bound to the moment of speaking, objective modality is not. Although some of the English modals which can be used to express subjective modality can take the past tense form, this form never has temporal reference but rather expresses a higher degree of reservation on the part of S (see 3.2.2.3.):

(27) It may / might be true.

Past tense inflection on modal adjectives does have temporal reference:

(28) It was possible that John would come, so I went home.

(v) In reaction to an objectively modalized predication the source of the information contained in that predication may be questioned:

(29) A. It is possible that it will rain tomorrow.
     B. Who says so?

The same question would seem clearly out of place as a reaction to a subjectively modalized predication:

(30) A. Possibly it will rain tomorrow.
     B. *Who says so?

An appropriate reaction would be:

(31) B. Do you think so?
These differences indicate that subjective modality should be located outside the predication proper, i.e. outside the scope of tense and negation, and protected from the possibility of being hypothesized. Furthermore, the interpretation of subjective modality as concerning the expression of S's commitment with regard to the truth of the content of the predication is confirmed by the fact that it is impossible to question a subjectively modalized predication. Characteristic of questions is the absence of truth commitment on the part of S. As the question of truth value is irrelevant in the case of imperatives, subjective modality is restricted to declarative sentences. I will return to the formalization of subjective modality in 2.3 and for the time being represent subjective modality as expressing S's commitment with regard to the content of a predication as:

(32) C_S(predication)

The difference between objective and subjective modality noted under (v) needs some closer attention. The fact that the source of a subjectively modalized predication cannot be questioned indicates that by subjectively modalizing a predication S reveals himself as the source, as the one who gives a judgement about the information contained in that predication. However, S is not the only possible source. Chung & Timberlake (1985) use the term 'epistemological mode' for those modal distinctions which "... evaluate the actuality of an event with respect to a source". They do not explicitly include subjective modality within this category, but as illustrated by Foley & Van Valin (1984) evidentials do not behave differently from modal adverbs expressing subjective modality. There seems to be reason to speak of one modality type, the members of which have the presence of a source in common. The different modal distinctions mentioned by Chung & Timberlake (1985: 244) are:

(i) Inferential mode, "... in which the event is characterised as inferred from evidence."
(ii) Quotative mode, "... in which the event is reported from another source."
(iii) Experiential mode, "... in which the event is characterized as experienced by the source."
(iv) "The submode in which the event is a construct (thought, belief, fantasy) of the source."

The last, unlabeled category might receive the name 'subjective modality'. Within this category different subdistinctions can be made, expressing different degrees of commitment on the part of S. Partly these distinctions parallel the distinctions made within the category of objective epistemic
modality. A decreasing degree of commitment is reflected in the following series of modal adverbs:

(33) Certainly - Probably - Possibly

There appears to be a minimum to the degree of commitment S may express, as reflected in the ungrammaticality of impossibly. A possible explanation for this phenomenon might be that a less than minimal degree of commitment would be in conflict with the very act of asserting. Apart from modal adverbs, the first person present tense forms of some verbs may be used to give expression to subjective modality, such as I think, I suppose.⁴ Some differences in the syntactic behaviour of these forms as opposed to other forms of the same verb will be illustrated for Spanish in chapter 3. Impersonal expressions may be used for evidential, quotative and experiential modality. Examples are It seems, It appears.

I think at least one more category should be classified as a subjective modal distinction. S may also reveal himself as a source in expressing his wishes, hopes and desires. One might say that S expresses his emotional commitment in these cases. The inclusion of boulomaic modality in the category of subjective modality provides the means to explain the existence and underlying structure of sentences like:

(34) I wish he came more often.

This sentence is modalized in two ways: S expresses his wish for a certain situation to obtain while at the same time characterizing this situation as non-actual. So, in a sense, S creates a domain to be evaluated in terms of his knowledge. Anticipating the proposals to be made in 2.3, sentence (34) may be analyzed as modalized at two different levels, the subjective and the objective level (Boul is used as a shorthand notation for boulomaic subjective modality):

Apart from a number of adverbs like hopefully, the first person present tense forms of verbs like wish, hope, and want may be used to express boulomaic subjective modality. A problem in the analysis of the latter forms is that they are probably not only used in their 'world creating' sense but also in a self-descriptive one. In the latter case they should rather be analyzed as expressing inherent modality (see 2.2.1.). The different modal distinctions of the epistemological type may be summarized as follows:
(36) Epistemological modality

     Subjective
       Epistemic      Certainty (Strong commitment)
                      Probability (Belief)
                      Possibility (Weak commitment)
       Boulomaic      Wishing, Hoping etc.
     Inferential
     Quotative
     Experiential

2.2.4. The expression of modality

In the preceding paragraphs three types of modality were discussed. The different ways in which these modality types were analyzed have their repercussions on the different ways in which they may be expressed. Inherent modality was analyzed as operating SoA-internally, objective modality as operating on a SoA as designated by a predication, and epistemological modality as operating on the content of a statement. These different levels and the possible means of expression for the different modality types are represented in (37):

(37) Modality types and their expression

     Modality type      Level               Expression
                                            Lexical                           Grammatical
     Inherent           SoA                 (Derived) predicate               -
     Objective          K_S(SoA)            Embedding predicate               Operator
     Epistemological    C_S(Predication)    Embedding predicate, Adverbial    Operator

Both objective and epistemological modality may be expressed through embedding predicates and operators. In 2.3 it will be shown that there is a difference in what they embed or operate on.

2.3. A model for predication and clause

Returning to Searle's (1969) analysis of the speech act, some further subacts may be distinguished, as represented in (38):
(38) Decomposition of the speech act

     Speech Act
       Illocutionary Act
       Propositional Act
         Referential Act
         Predicational Act

Using the terminology introduced earlier, the clause may be analyzed in an analogous way:

(39) Decomposition of the clause

     Clause
       Illocution E
       Predication
         Predicate Frame
         Terms

In order to be able to represent these formal correlates of the subacts of a speech act in a model for the analysis of utterances I would like to consider an alternative to the approach in which operators represent the basic illocutions of linguistic expressions (see 2.1). Basic to my proposal, which is presented in greater detail in Hengeveld (1988), is the idea that every utterance can be analyzed at two levels: the representational and the interpersonal level. At the representational level a State of Affairs is described in such a way that the addressee is able to understand what external situation is referred to. At the interpersonal level this situation is presented in such a way that the addressee is able to recognize the communicative intention of the speaker. Thus the representational level is concerned with the narrated event, the interpersonal level with the speech event. Narrated events can be represented in Functional Grammar in the form of predications, as in (40):

(40) The representation of narrated events
     (e1: [Write_V (x1)_Ag (xi: book (xi))_Go] (e1))
The predication, between square brackets, is built on the basis of a predicate frame, which contains a number of argument positions, each provided with a semantic function, and a predicate which specifies the relation between these arguments (see section 1). The predication as a whole is presented here as a restrictor of the state of affairs variable e, as proposed in Vet (1986).

Speech events can be analyzed in an analogous way. Here the participants are the speaker, the addressee, and the content of the utterance. The relation between these three participants is expressed by the basic illocution of the linguistic expression, as specified by the speaker. Basic illocutions can be represented in the form of abstract illocutionary frames, as in:

(41) DECL (S) (A) (X1)
     Speaker wishes the Addressee to add the content X1 to his pragmatic information.

An illocutionary frame should be regarded as expressed by the total of illocutionary force indicating devices of a clause, in particular the formal properties of the sentence type, such as word order and sentence mood. Given illocutionary frames of the type illustrated in (41), the general schema for the representation of speech events is as in (42):

(42) The representation of speech events
     (E1: [ILLE (S) (A) (X1: [proposition] (X1))] (E1))

Here the abstract illocutionary frame specifies the relation between the speaker (S), the addressee (A), and the content of the utterance (X). The clause as a whole is presented as a restrictor of the utterance variable E. The representations of narrated event and speech event may be combined into a single representation of the utterance, as in (43):

(43) The representation of utterances

     clause
     (E1: [ILL (S) (A) (X1: [proposition] (X1))] (E1))
     predication
Starting from the innermost layer, the predication, the functions of the different layers distinguished in (43) should be understood in the following way: A predication gives a description of a set of possible SoA's. By inserting a predication into a narrated event slot (e) it becomes an expression referring to the external situation S has in mind, i.e. a token of the SoA type designated by a predication. By inserting a fully specified predication into the content slot (X) of an illocutionary frame it becomes an expression referring to the information unit or content transmitted in some speech act. The illocutionary frame contains instructions for A about what S wants him to do with this information unit. By inserting a clause into a speech event slot it becomes an actual speech act, where the speech event variable E provides the deictic center for temporal, spatial and personal reference.

Two aspects of this approach need some elaboration: the illocutionary frames themselves, and their content-argument, represented by (X).

The (X)-variable introduced in the illocutionary frames is a content phrase variable. A predication in (X) represents a third order entity: the content of an utterance. The introduction of this variable makes it possible to distinguish between the two functions of predications: designating states of affairs at the level of the narrated event; and representing contents at the level of the speech event. To distinguish these two functions I use the term 'proposition' to refer to the content function of predications. Possibly (X) may also be taken to be the basic unit of knowledge, where knowledge is regarded as a set of propositions (see Dik 1986a). I will return to this question in my discussion of the Spanish data.

There are at least two parts of grammar in which a separate content phrase variable proves to be useful. Firstly, anaphoric reference may be made to full content phrases, as exemplified in (44):

(44) A. The weather will be nice tomorrow.
     B. Do you think so?

Anaphoric reference to full content phrases is expressed in English by means of so, as in (44B). The interchange in (44) may be represented as:⁵

(45) A. DECL (S) (A) (XI: [(Fut ei: [Nice_A (xi: the weather (xi))_Ø] (ei): tomorrow (ei))] (XI))
     B. INT (S) (A) (XJ: [(Pres ej: [Think_V (xk: 2s (xk))_ØExp (A XI)_Go] (ej))] (XJ))

Secondly, the difference between belief de re and de dicto shows the necessity of the distinction between the two uses of predications. In Spanish, this difference is formally reflected in negative contexts, as will be shown in 3.2.2.1.
Some apparent advantages which follow from the representation of basic illocutions in the form of abstract illocutionary frames are the following:

(i) Restrictions on the type of predication to be used with a specific ILLE can be formulated as selection restrictions. For instance, only predications which designate +control SoA's may be used in imperative constructions, as in:

(46) IMP (S) (A) (e1: [(+control)] (e1))⁶

(ii) Illocutionary conversion can be dealt with by means of a set of illocutionary frame formation rules, paralleling the rules by means of which derived predicates are accounted for.

(iii) The 'framing' analysis may be further expanded by considering clauses as the fundamental units to be inserted in discourse frames, thus providing a means to link syntactic description more accurately to other branches of language theory.

The clause model proposed here allows for the application of operators over four different layers:

(47) π1: predicate operators
     π2: predication operators
     π3: proposition operators
     π4: illocution operators

Illocution operators represent grammatically reflected modifications of basic illocutions. Some Spanish examples will be given in 3.2.2.3. But first the different types of modality discussed earlier should be assigned a position in this configuration. As inherent modality can be expressed lexically only, I restrict myself to objective and epistemological modality. Objective modality has been characterized in terms of S's evaluation of the SoA designated by a predication, epistemological modality in terms of S's commitment with regard to the content of his statement. Given these characterizations of the two modality types, they may be assigned the positions in (48):
Take, for instance, the following sentence in which linguistic means are applied to express inherent, objective and epistemological modality:

(49) It seems that it is possible that he can cure blindness.

The underlying structure of this sentence, in a language in which both epistemological and objective modality are expressed through grammatical means, is represented in (50):

(50) DECL (S) (A) (Quot XI: [Pres Poss ei: [Can_V Cure_Vinf (xi: p3 (xi))_Ag (xj: blindness (xj))_Go] (ei)] (XI))

In English, both objective and epistemological modality are expressed through lexical means. This may be represented as:

(51) DECL (S) (A) (XI: [Seem_V (XJ: [Pres ei: [Possible_A (ej: [Can_V Cure_Vinf (xi: p3 (xi))_Ag (xk: blindness (xk))_Go] (ej))_Ø] (ei)] (XJ))_Ø] (XI))

Two differences in the syntactic behaviour of objectively and epistemologically modalized sentences in Spanish support the analysis of epistemological modality as a modality that should be assigned a position outside the predication proper, as opposed to objective modality, which has been assigned a position inside the predication.

Firstly, predicates expressing epistemological modality do not allow clitic promotion (see Aissen & Perlmutter 1976; Lujan 1979), unlike verbs expressing objective modality. Compare epistemological parecer 'seem' in (52) with objectively used deber 'must' in (53):

(52) a. Parece saberlo poco.
     b. *Lo parece saber poco.
     'He seems to know little about it.'

(53) a. Debe querer hacerlo bien.
     b. Lo debe querer hacer bien.
     'He must want to do it well.'

Secondly, the two groups of predicates behave differently with regard to negative raising (see Lujan 1979; Rivero 1979). This difference may be illustrated by means of sentences (54) - (55), which contain the preposition hasta 'until'. This preposition requires a negative context, a condition which is apparently not fulfilled in (54b).
(54) a. Parece que no llega (I) hasta las diez.
        'It seems that he will not arrive until ten.'
     b. *No parece que llega (I) hasta las diez.
        'It doesn't seem that he will arrive until ten.'

(55) a. Es probable que no llegue (S) hasta las diez.
        'It is probable that he will not arrive until ten.'
     b. No es probable que llegue (S) hasta las diez.
        'It is not probable that he will arrive until ten.'

An explanation for the ungrammaticality of (52b) and (54b) is that both the clitic and the negative element are part of the content phrase with regard to which S expresses his commitment. Promotion of the clitic or transportation of the negative element to a position outside the scope of the elements through which S expresses his commitment leaves a gap in this content phrase. Therefore (54b) and the non-modalized (56) are ungrammatical for the same reason:

(56) *Llega (I) hasta las diez.
     'He comes until ten.'

The formal correlate of this restriction is that no element may pass the (X) boundaries.⁷ The restriction also holds the other way round. The (X) boundary blocks the scope of negative elements situated outside the proposition restricting (X), as will be illustrated for Spanish in 3. One could therefore say that (X) functions as an inseparable and closed unit.

Further evidence for the correctness of the different positions I have assigned to the different modality types discussed may be derived from the order in which elements expressing modal distinctions of the three types appear in linguistic expressions. On the basis of observations in a number of languages⁸ Foley & Van Valin (1984) arrive at the following model of the 'layered structure of the clause':

(57) (IF (EVID (TENSE (STATUS [... (MOD [NP (NP) (ASPECT [Predicate])])]))))

     IF = Illocutionary Force, EVID = Evidentials & Subjective modality,
     STATUS = ± Objective modality, MOD = ± Inherent modality.

The order in which the different modality types appear in their model is identical to the one I have given. One language in which this order is neatly reflected is Turkish. Consider the following example (Gerjan van Schaaik, personal communication).
(58) Her müslüman Kur'an-ı Kerim-i okuy-abil-meli-ymiş.
     Every muslim Koran-conn Holy-acc read-able-obl-quot.
     'It seems that every muslim should be able to read the Koran.'

Inherent modality is expressed by means of a derived stem okuyabil-, produced by a productive predicate formation rule. To this stem two affixes are attached, one indicating moral obligation (-meli), and one indicating that S obtained the information from a third person (-miş). The order given in (58) is the only possible one.
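The relation between scope and affix order in (58) can be made concrete with a small sketch. It assumes, purely for illustration, that each modality type is given a numeric scope rank (innermost first) and that in a suffixing pattern a marker with narrower scope stands closer to the stem; the function name and the rank values are hypothetical and not part of the analysis itself.

```python
# Scope rank of each modality type, from innermost to outermost, following
# the positions argued for in the text. The numeric values are illustrative.
SCOPE_RANK = {"inherent": 1, "objective": 2, "epistemological": 3}


def linearize(stem, markers):
    """Attach suffixes to a stem, innermost scope first.

    `markers` is a list of (suffix, modality_type) pairs in any order.
    Assumes a purely suffixing pattern in which scope mirrors distance
    from the stem.
    """
    ordered = sorted(markers, key=lambda marker: SCOPE_RANK[marker[1]])
    return "-".join([stem] + [suffix for suffix, _ in ordered])


# The three markers of (58), deliberately given in scrambled order.
markers = [
    ("ymiş", "epistemological"),  # quotative
    ("abil", "inherent"),         # ability
    ("meli", "objective"),        # moral obligation
]
print(linearize("okuy", markers))  # okuy-abil-meli-ymiş
```

Whatever order the markers are supplied in, only one linearization comes out, which mirrors the observation that the order in (58) is the only possible one.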
3. MOOD IN SPANISH

3.0. Introduction

In this section the different uses of the Spanish indicative and subjunctive in main clauses and constructions governed by a verbal or non-verbal predicate are studied. In 3.1 I discuss the use of both moods as illocutionary force indicating devices, i.e. as elements through which S can code his communicative intention in his utterance. In 3.2 the use of mood in modalized contexts is gone into. A distinction is drawn between those contexts in which subjunctive or indicative are used obligatorily, and those in which both moods may appear. In the latter case mood inflection can be said to add to the meaning of a sentence, unless the application of both categories can be attributed to differences in the underlying structure of the sentences in which they appear. In 3.3 the remaining uses of indicative and subjunctive in predications governed by a verbal or non-verbal predicate are presented.

3.1. Mood and illocutionary force

The relation between illocutionary force and the use of mood in Spanish is illustrated in the following sentences, in which a proposition is governed by a speech act verb:

(59) Te aseguro que no es (I) culpa mía.
     'I assure you that it's not my fault.'

(60) Te ordeno que lo hagas (S) cuanto antes.
     'I order you to do it as soon as possible.'

(61) Pregunto si vienes (I) mañana.
     'I ask whether you will come tomorrow.'

The indicative is used in propositions embedded under verbs of declaring
and questioning, the subjunctive in propositions embedded under verbs of ordering. If one compares these examples with their non-performative equivalents there is a partial parallelism with respect to the use of mood:

(62) No es (I) culpa mía.
     'It's not my fault.'

(63) ¡Hazlo cuanto antes!
     'Do it as soon as possible!'

(64) ¿Vienes (I) mañana?
     'Will you come tomorrow?'

The indicative is used in both embedded and non-embedded declarative⁹ and interrogative sentences, as can be seen by comparing (59) and (61) with (62) and (64). With respect to imperative sentences the situation is less clear cut. A special imperative inflection is used in (63). The use of this inflection type is restricted to second person familiar singular and plural affirmative in main clauses. Whenever the verb is embedded (60), negated (65) or in second person non-familiar (66) the subjunctive is used:

(65) ¡No lo hagas (S)!
     'Don't do it!'

(66) ¡Hágalo (S) Usted cuanto antes!
     'Do it (you-pol.) as soon as possible!'

I have no satisfactory explanation for the fact that the imperative and subjunctive mood are complementary in the way they are.¹⁰ The examples show, however, that both the imperative and the subjunctive mood intervene in the expression of IMP in main clauses. This suggests that application of the approach, proposed by Dik (forthcoming) and outlined in 2.1, in which sentences embedded under speech act verbs are provided with their own illocution is justified. I will adopt this approach in what follows.

Different speech act verbs allow the embedding of more than one sentence type. Among these verbs are those which specify the manner in which a speech act is executed, such as gritar 'yell', escribir 'write', or the intensity with which this is done, such as insistir 'insist' and sugerir 'suggest'. The widest range of possibilities is exhibited by the verb decir 'say', which allows the embedding of declarative, interrogative and imperative sentences. The nine possibilities of single embedding are:
(67) a. Dice (I) que vienes (I) mañana.            (DECL(DECL))
     b. Dice (I) que si vienes (I) mañana.         (DECL(INT))
     c. Dice (I) que vengas (S) mañana.            (DECL(IMP))
     d. ¿Dice (I) que vienes (I) mañana?           (INT(DECL))
     e. ¿Dice (I) que si vienes (I) mañana?        (INT(INT))
     f. ¿Dice (I) que vengas (S) mañana?           (INT(IMP))
        Say (he / she) (come (you) (tomorrow)).
     g. ¡Dile (Imp) que viene (I) mañana!          (IMP(DECL))
     h. ¡Dile (Imp) que si viene (I) mañana!       (IMP(INT))
     i. ¡Dile (Imp) que venga (S) mañana!          (IMP(IMP))
        Say (you) (him / her) (come (he / she) (tomorrow)).

What these examples show is that in embedded clauses there is a one to one relation between sentence type and mood inflection type, independent of the sentence type of the embedding clause.

Guitart (1984), from whom the following examples are taken, notes that two groups of verbs show a behaviour similar to that of speech act verbs: those designating mental acts and those designating acts of non-verbal signalling:

(68) ¡Piensa (Imp) que es (I) fácil!               (IMP(DECL))
     'Think that it is easy!'

(69) ¡Finge (Imp) que estás (I) contento!          (IMP(DECL))
     'Pretend that you're happy!'

The similarity of behaviour of speech act verbs and mental act verbs fits in nicely with the view of thinking as 'talking to oneself'. The similar pattern in (69) suggests that an illocutionary component is attributed to non-verbal means of communicating.

Although performative utterances are not very frequent in daily usage, Spanish speakers seem to express a certain decir-consciousness when adding the subordinator que to independent clauses, as in:

(70) ¡Que no me gusta (I) nada esa película!
     'I don't like that movie at all!'

(71) ¡Que no te marches (S) mañana!
     'Don't you leave tomorrow!'

(72) ¡Que se siente (S) Usted!
     'Sit down!'

In all these cases, the utterance is brought forward with more emphasis. A similar effect is produced by the addition of a performative verb (see Weijdema et al. 1982). A possible solution for the description of sentences like (70) - (72) is presented in 3.2.2.3.
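The one to one relation between embedded sentence type and mood inflection observed in (67) can be sketched as a simple lookup. The table and function names below are illustrative assumptions; the complementizer pattern (que versus que si) is read off the examples in (67) and is not stated as a rule in the text.

```python
# Mood of the embedded verb as a function of the embedded sentence type only
# (I = indicative, S = subjunctive), as observed in (67).
EMBEDDED_MOOD = {"DECL": "I", "INT": "I", "IMP": "S"}
# Complementizer pattern read off (67): embedded questions add `si`.
COMPLEMENTIZER = {"DECL": "que", "INT": "que si", "IMP": "que"}

SENTENCE_TYPES = ("DECL", "INT", "IMP")


def embedded_marking(matrix_type, embedded_type):
    """Return (complementizer, mood) for a clause embedded under decir."""
    if matrix_type not in SENTENCE_TYPES:
        raise ValueError(f"unknown sentence type: {matrix_type}")
    # The matrix type is validated but not consulted any further: the
    # marking depends on the embedded sentence type alone.
    return COMPLEMENTIZER[embedded_type], EMBEDDED_MOOD[embedded_type]


# Enumerate the nine possibilities of single embedding.
for matrix in SENTENCE_TYPES:
    for embedded in SENTENCE_TYPES:
        comp, mood = embedded_marking(matrix, embedded)
        print(f"({matrix}({embedded})): {comp} + verb ({mood})")
```

That the matrix type never enters the lookup is the formal counterpart of the independence noted for (67).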
3.2. Mood and modality

3.2.1. The use of mood in modal contexts

In most cases the occurrence of subjunctive or indicative follows automatically from the context in which the verb on which they are to be expressed appears. This also holds for modal contexts, i.e. contexts in which some modal distinction is expressed. All objectively modalized contexts require the application of either the indicative or the subjunctive and never allow both. In the following overview the different objective modal distinctions are given in the first column, the non-verbal and verbal predicates through which these modal distinctions may be expressed in the second and third column, and the mood these predicates require to be marked on the verbal predicate in their complements in the fourth column:

(73) Mood and objective modality

     MODALITY       LEXICAL EXPRESSION                                              MOOD
                    Non-verbal                           Verbal
     Actual         Es el caso 'It's the case'           Ver 'See'                  Ind
     Certain        Es cierto 'It's certain'             Creer 'Believe'            Ind
     Probable       Es probable 'It's probable'          -                          Subj
     Possible       Es posible 'It's possible'           -                          Subj
     Conceivable    Es concebible 'It's conceivable'     -                          Subj
     Impossible     Es imposible 'It's impossible'       -                          Subj

     Obligatory     Es obligatorio 'It's obligatory'     Hace falta 'It needs'      Subj
     Customary      Es conveniente 'It's suitable'       Conviene 'It suits'        Subj
     Permissible    Es permisible 'It's permissible'     -                          Subj
     Acceptable     Es aceptable 'It's acceptable'       -                          Subj
     Forbidden      Está prohibido 'It's forbidden'      -                          Subj

     Doubtful       Es dudoso 'It's doubtful'            Dudar 'Doubt'              Ind
The indicative is used:

(i) In predications governed by a predicate through which it is expressed that the SoA designated by the embedded predication is evaluated as actual or certain.
(ii) In predications governed by a predicate through which it is expressed that the SoA designated by the embedded predication is evaluated as doubtful.

In the case of the non-impersonal verbal predicates these evaluations are attributed to the subject of the matrix clause. An exception should be made for the first person present tense of creer 'believe', to which I return in 3.2.2.2. Furthermore, only de re doubt ('doubt if') and de re belief are intended here.

If negated, the embedding predicates classified as expressing the modal distinctions 'actual' and 'certain' require the subjunctive in their complements. Negation does not affect the use of mood in all other cases.

I return to an explanation of the use of subjunctive and indicative in objective modal contexts in section 4, where the two different uses of the indicative are argued to be related to their use in straightforward declarative and interrogative sentences respectively.

In some of the epistemologically modalized contexts both moods may appear. An overview of their uses is given in (74):

(74) Mood and epistemological modality

     MODALITY          LEXICAL EXPRESSION                                        MOOD
     Subjective
       Epistemic
         Cert.         Seguramente 'Certainly'        -                          Ind(?Subj)
         Prob.         Probablemente 'Probably'       Creo 'I think'             Ind(Subj)
         Poss.         Quizás 'Maybe'                 -                          Ind(Subj)
       Boulomaic       Ojalá 'Hopefully'              Espero 'I hope'            Subj
     Inferential       Evidentemente 'Evidently'      Resulta 'It appears'       Ind
     Quotative         Aparentemente                  Parece 'It seems'          Ind
     Experiential      -                              -                          Ind
In 3.2.2.3. I offer an explanation for the fact that in contexts in which elements expressing subjective epistemic modality appear both indicative and subjunctive may be used. But first I present some other contexts allowing mood alternation.
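As a compact restatement of the mood requirements of objective modal contexts summarized in (73), including the effect of negating the embedding predicate, the following sketch may be helpful. The function name and the lower-case labels are illustrative assumptions; the sketch covers only the de re uses discussed in 3.2.1 and leaves aside the special case of first person present tense creer.

```python
# The twelve objective modal distinctions listed in (73).
OBJECTIVE_MODALITIES = {
    "actual", "certain", "probable", "possible", "conceivable", "impossible",
    "obligatory", "customary", "permissible", "acceptable", "forbidden",
    "doubtful",
}
# Distinctions whose embedding predicates take an indicative complement.
INDICATIVE = {"actual", "certain", "doubtful"}
# Distinctions whose mood requirement changes when the predicate is negated.
NEGATION_SENSITIVE = {"actual", "certain"}


def complement_mood(modality, negated=False):
    """Return 'Ind' or 'Subj' for the complement of an objectively
    modalizing predicate, following (73) and the negation rule."""
    if modality not in OBJECTIVE_MODALITIES:
        raise ValueError(f"not an objective modal distinction: {modality}")
    if negated and modality in NEGATION_SENSITIVE:
        return "Subj"
    return "Ind" if modality in INDICATIVE else "Subj"


print(complement_mood("certain"))                 # Ind
print(complement_mood("certain", negated=True))   # Subj
print(complement_mood("doubtful", negated=True))  # Ind
print(complement_mood("possible"))                # Subj
```

Negation only matters for 'actual' and 'certain'; for every other distinction the function returns the same mood with or without it.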
3 . 2.2. 1 . De dicto !de re alternation . A sentence like (75)
Creo que Juan esta (I) enfermo. ' I believe that Juan is ill. '
has two readings (cf. Burge 1 977): (i) (ii)
I believe the proposition 'J uan is ill . ' I have the impression that Juan is ill.
The first one is the de dicto, the second one the de re reading of (75). The difference is not visible in a positive context. Under negation, however, there is no ambiguity. Compare: (76)
a. b.
No creo que Juan esta (I) enfermo . No creo que Juan este (S) enfermo . 'I don't believe that Juan is ill . '
11
I n the de dieto variant (76a) the indicative is maintained. I n the de re variant (76b) the subjunctive appears, in line with the rules given for objective modality in 3 . 2 . 1 . In 2.3 the difference between content phrases and predi cations in their SeA-designating function has been formalized as a differ ence between (X) and (e). Application of this formalization to (76a) and (76b) yields: (77)
a. b.
DECL (S) (A) (X1 : [PresNeg ei : [Creerv (xi: S (xi))q,Exp (X 1 : [Pres ei : [Juan esta enfermo] (ei)] (X 1 ))G0] (ei )] (X1 )) DECL (S) (A) (X I : [PresNeg ei : [Creer v (xi : S (xi))q,Exp (Pres ei : [Juan este enfermo] (e i ))Go] (ei)] (X I ))
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
3 . 2.2. Mood alternation There are some contexts in which the indicative and subjunctive can be contrasted. In two of these contexts, differences in the underlying structure of the contrasting sentences account for the use of both moods. Condition ing factors are the difference between de re and de dicto interpretations (3 .2.2. 1 .) and the absence or presence of truth commitment on the part of S, combining with the former context (3 .2.2.2.). The third context is described in terms of the interaction between illocutionary force and modality (3 . 2 . 2 . 3).
What is represented in (77a) is that S rejects a proposition (X_J) which has been brought forward by one of the participants in the preceding conversation. The representation in (77b) is intended to reflect S's statement that he does not have the impression that the SoA (e_j) obtains in reality. The presence of the (X) boundary in (77a) blocks the influence of the negative element in the matrix clause, i.e. it limits the scope of negation. A similar effect was illustrated in 2.3 with respect to negative raising.

The scope differences between (77a) and (77b) may again be illustrated, as in 2.3, by testing the behaviour of sentences in which the embedded predication contains an adverbial expression which requires a negative context, in this case palabra de 'word of':

(78) a. *No creo que sabe (I) palabra del asunto.
     b. No creo que sepa (S) palabra del asunto.
     'I don't believe he knows a thing about that matter.'

The ungrammaticality of (78a) corresponds with the ungrammaticality of independent clauses like:

(79) *Sabe palabra del asunto.
     'He knows a thing about that matter.'

3.2.2.2. Truth commitment. In the preceding section I have restricted myself to de dicto / de re alternation in sentences in which the matrix verb is marked for present tense and first person singular, i.e. those cases in which subject and speaker are one and the same person. According to different authors (Lleo 1979; Klein 1974, 1977) a somewhat different interpretation should be given to the following sentences, in which the subject is non-first person:

(80) Antonio no cree (I) que Juan está (I) enfermo.
     'Antonio does not believe that Juan is ill.'

(81) Antonio duda que Juan está (I) enfermo.
     'Antonio doubts that Juan is ill.'¹²

These sentences have the following interpretation:

(i) Antonio does not believe / doubts that Juan is ill.
(ii) S does believe that Juan is ill.

The difference between these sentences and those presented in the preceding section is shown in the ungrammaticality of the following sentence:
(82) *Dudo que Juan está (I) enfermo.
     'I doubt that Juan is ill.'

Sentences like (82) with a matrix verb with incorporated negation cannot be used to reject a content phrase, unlike the combination no + creer.¹³ Therefore (82) contains a contradiction:

(i) S doubts that Juan is ill.
(ii) S believes that Juan is ill.

A content phrase with respect to which S expresses his commitment would receive the following structure in the approach presented in 2.3:

(83) (Cert X_I: [proposition] (X_I))

Embedding of this structure in the matrix clause of (80) yields the following representation:

(84) DECL (S) (A) (X_I: [Pres Neg e_i: [Creer_V (x_i: Antonio (x_i))_ØExp (Cert X_J: [e_j: [Juan está enfermo] (e_j)] (X_J))_Go] (e_i)] (X_I))

Confirmation for the analysis of the embedded construction as a subjectively modalized content phrase may be derived from the fact that S can contrast his judgement with the opinion of the subject of the matrix clause in positive terms only. The ungrammaticality of (85) corresponds with the non-existence of the modal adverb inseguramente 'uncertainly':

(85) *Antonio cree que Juan esté (S) enfermo.
     'Antonio believes that Juan is ill.'

The (not semantically anomalous) interpretation of this sentence would be:

(i) Antonio believes that Juan is ill.
(ii) S does not believe that Juan is ill.

Two groups of predicates behave in a way similar to that of believe predicates: verbs of saying, in so far as they refer to pronouncing rather than to executing an illocutionary act (see Lyons 1977:740), and cognitive predicates. With regard to the first category, compare the following sentences, taken from Guitart (1984):

(86) a. La carta no dice que la culpa es (I) mía.
     b. La carta no dice que la culpa sea (S) mía.
     'The letter doesn't say that I'm to blame.'
According to Guitart, S's usual intention in uttering (86a) is to express "... that the letter has failed to include the fact that he is indeed to blame", while in uttering (86b) his usual intention is "... to point out that this is not what the letter says (...) while at the same time not admitting that he is to blame ...". With regard to cognitive predicates, consider:

(87) a. Antonio no sabe que Juan está (I) enfermo.
     b. ?Antonio no sabe que Juan esté (S) enfermo.
     'Antonio doesn't know that Juan is ill.'

Sentence (87b) is highly marked. The semi-factive character of saber 'know' requires a positive judgment of S with regard to the content of the embedded proposition. If S is not able to express his positive commitment it would be more appropriate to use:

(88) Antonio no sabe si Juan está (I) enfermo.
     'Antonio doesn't know whether Juan is ill.'

Sentence (87b) may only be used as a free indirect speech report of:

(89) No sé que Juan esté (S) enfermo.
     'I don't know that Juan would be ill.'

in which S expresses his reservation with respect to a statement of another person. One might say that, apart from this use, verbs like saber 'know' cannot be used if S cannot commit himself to the truth of the content of the embedded proposition, as illustrated by the ungrammaticality of (90):

(90) *No sé que Juan está (I) enfermo.
     'I don't know that Juan is ill.'

This sentence contains a contradiction, contrary to (91) (see 3.2.2.1):

(91) No creo que Juan está (I) enfermo.
     'I don't believe that Juan is ill.'

Therefore, to account for the semi-factive character of cognitive predicates I assume that these predicates allow the embedding of (Cert X)-complements only. It is in this sense that speaker presupposition distinguishes itself from logical presupposition (see 3.3.2). This approach to cognitive predicates furthermore supports the idea expressed in 2.3 that (X) might be the basic unit of knowledge.
Special attention should be given to the use of cognitive predicates in what Guitart (1984:161) calls 'admission of cognitive failure'. An example is:

(92) No sabía que mi artículo tenía (I) errores.
     'I didn't know there were mistakes in my article.'

Through the use of the indicative in (92) S indicates that he does know now what he did not know at an earlier moment.¹⁴ Just as S may contrast his knowledge with that of a third person, he may contrast his actual knowledge with his knowledge in an earlier stage. If K stands for knowledge, S for speaker, X for third person, ts for the moment of speaking and tr for the moment referred to, the following two formulae hold for (87a) and (92):

(93) (87a)
     (92)

In both cases the embedded proposition represents K_{S,ts}.

3.2.2.3. Mitigation and reinforcement. The last case of contrasting use of mood in modalized contexts concerns the possibility of using both the indicative and the subjunctive in subjectively modalized contexts, as was indicated in the overview of the use of mood in epistemologically modalized contexts in (74). Some examples are:

(94) Quizás vienen / vengan (I / S) mañana.
     'Maybe they will come tomorrow.'

(95) Sospecho que vienen / vengan (I / S) mañana.
     'I assume that they will come tomorrow.'

The use of the subjunctive in sentences like (94)-(95) corresponds to a higher degree of reservation of S with regard to the truth of the content of the proposition (see Hooper 1974:30; Bergen 1978). The effect is comparable to the one produced by the use of the past tense forms of the English modals expressing subjective epistemic modality, as in:

(96) He may / might be on his way by now.

One might simply assume that the difference between the sentences of each pair should be accounted for in terms of different degrees of commitment in a subjective modal sense. However, I would like to consider a different solution, which is mainly based on the fact that a small group of Spanish
modal verbs may appear in both the indicative and the past subjunctive. Note that in (94)-(95) the verb inflected for indicative and subjunctive appears in a context which has already been modalized subjectively, while the modal verbs in the main clauses of the following sentences receive this inflection without other modalizing elements being present:¹⁵

(97) Usted debe / debiera (I / PastS) enseñarle su biblioteca.
     'You must / should show him your library.'

(98) Quiero / Quisiera (I / PastS) que Usted le enseñe / enseñase (PresS / PastS) su biblioteca.
     'I want / would like you to show him your library.'

(99) ¿Puede / Pudiera (I / PastS) enseñarle su biblioteca?
     'Can / could you show him your library?'

Apart from its use as an indicator of illocutionary force, these are the only possible uses of the subjunctive in independent clauses.

To account for the difference in meaning of the sentence pairs (94)-(99) I once more return to speech act theory. It should be noted that some of the modal distinctions discussed in 2.2 are at the same time central notions in speech act theory. Some of Searle's (1969) felicity conditions for the execution of a speech act, the addressee-based preparatory conditions and the speaker-based sincerity conditions, are given in (100):

(100) S-based and A-based felicity conditions

      Speech act    S-based cond.    A-based cond.
      Assertion     S believes p     A doesn't know p
      Command       S wants p        A is able to do p

It cannot be a coincidence that in the formulation of the S-based conditions the two basic notions of subjective modality are used. On the contrary, one would expect, given the analysis of the clause as representing the different levels of the speech act, that the S-based conditions, formulated in terms of speaker commitment, are incorporated in the structure of the clause, as in:

(101) DECL (S) (A) (Believes X_I: [proposition] (X_I))

(102) IMP (S) (A) (Wants e_i: [predication] (e_i))

These might be suitable representations for the logical structure of different
types of speech acts, but not for the general structure of linguistic expressions. For one thing, the S-based sincerity conditions are often just presupposed and not expressed. For another, (102) is incorrect. What S does when making explicit the sincerity condition of a command is precisely avoiding issuing an imperative by replacing it by a statement. Therefore, for those cases in which S gives expression to the sincerity condition of a command, the following structure should replace (102):

(103) DECL (S) (A) (Wants X_I: [proposition] (X_I))

In analyses of indirect speech acts (Searle 1975; Haverkate 1979) different strategies to decrease the directness of a speech act have been distinguished. One of these strategies is for S to give expression to the S-based condition, as in (98). As has been noted above, the S-based conditions of commands and assertions coincide with two of the subjective modal distinctions made. Another strategy is for S to question the A-based condition, as in (99). The A-based ability condition of commands coincides with one of the inherent modal distinctions made. The use of sentences like (97) represents a third strategy: instead of creating an obligation by issuing an order S states that an obligation exists. In the latter case A can question the source of this obligation, a feature of objective modality (see 2.2.2).

What may be derived from these coincidences between certain modal distinctions and central notions of speech act theory is that one of the functions of modalizing an utterance is to arrive at a lesser degree of directness of the speech act involved. Note that, with regard to the modal distinctions discussed here,

(i) the modal verbs which may be used to give expression to the S-based sincerity conditions have been analyzed earlier as elements which take a content phrase in their scope;
(ii) the modal verbs which may be used to give expression to the A-based preparatory condition have been analyzed earlier as SoA-internal;
(iii) the modal verbs which may be used to arrive at a lesser degree of directness without mentioning either of these conditions have been analyzed as occupying an intermediate position, taking the SoA in their scope.

An interesting hypothesis would be that these verbs represent a scale of decreasing directness the deeper they are embedded, corresponding to the different positions assigned to the three different types of modality discussed in 2.2, as in:
(104) Degrees of directness
      [diagram: scale of decreasing directness; only the position labels 2, 3 and 4 are recoverable]

The following series seems to support this hypothesis:

(105) a. ¡Enséñele su biblioteca!
         'Show him your library!'
      b. Quiero que Usted le enseñe su biblioteca.
         'I want you to show him your library.'
      c. Usted debe enseñarle su biblioteca.
         'You must show him your library.'
      d. ¿Puede Usted enseñarle su biblioteca?
         'Can you show him your library?'

If correct, the hypothesis may at the same time provide a partial explanation for the diachronic development of modal verbs, which tend to be reinterpreted along the following line (see Goossens 1985a-b):

(106) Diachronic development of modal verbs

                     3               2
      ILLE (S) (A) (Subj.Mod. X_I: [Obj.Mod. e_i: [Pred_β (x_1) (x_2) ... (x_n)] (e_i)] (X_I))

If one of the functions of modalizing an utterance is to decrease the degree of directness, the reinterpretation of modality might be related to the wearing off of politeness expressions and the conventionalizing of indirectness strategies.

Returning now to the use of mood in the modalized sentences presented at the beginning of this section, I interpret the use of the (past) subjunctive as the application of a grammatical means through which S further mitigates the force of his speech act, and not as a device to indicate a lesser degree of commitment with regard to the content of his speech act. This view implies that this use of mood should be explained in terms of the rules that govern verbal interaction and the ways in which these rules are reflected in
linguistic structure. By using the subjunctive in (94)-(95) or by using the past tense form of the modal may in (96), S leaves more room for A to disagree with him or for himself to withdraw from a position taken. By using the subjunctive in (97)-(99) S displays a higher degree of politeness and leaves more room for refusal. I will use the term 'mitigation'¹⁶ as a label for these different communicative strategies. Mitigating expressions, whether lexical or grammatical, should take the whole clause as representing the speech act in their scope. The underlying structure of the mitigated variants of (94)-(99) may therefore be represented as:

(107) Mit. ILLE (S) (A) (X_I: [proposition] (X_I))

That mitigation takes the whole clause in its scope is reflected in Spanish in the fact that mitigation affects all inflected forms of the clause, as in (108b)-(109b).

(108) a. Quizás es (I) seguro que la ceguera puede (I) ser vencida.
      b. Quizás sea (S) seguro que la ceguera pueda (S) ser vencida.
      'It may / might be possible that blindness can be cured.'

(109) a. Quiero (I) que Usted le enseñe (PresS) su biblioteca.
      b. Quisiera (PastS) que Usted le enseñase (PastS) su biblioteca.
      'I want / would like you to show him your library.'

Mitigation favours modalized clauses. However, not all modal distinctions are equally compatible with it. Mitigating a statement while at the same time expressing strong commitment with respect to its content in a subjective modal sense is not a likely combination, although it does occur, as can be seen in the following example, cited by Bolinger (1976:47):

(110) Segurísima estoy de que por culpa mía se mude (S) el tiempo.
      'I'm more than certain that it's my fault that times change.'

Some parenthetical verbs may be classified as lexical mitigating expressions:

(111) Juan viene (I) mañana, creo / temo.
      'Juan will come tomorrow, I think / I'm afraid.'

The counterpart of mitigation is reinforcement. Just as S may wish to express a higher degree of reservation, he may wish to impose his speech act more strongly upon A or, as it were, put an exclamation mark behind his utterance. One of the ways in which this is done in Spanish has been illustrated in 3.1.1. The addition of the subordinator que to a clause has a reinforcing effect, as in:
(112) ¡Que no me gusta nada esa película!
      'I don't like that movie at all!'

The scope of the reinforcing expression que is reflected in its initial position in the utterance. In Turkish, a V-final language, the reinforcing affix -dir takes the final position in the utterance:¹⁷

(113) Bu, Türkçe gazete değil-dir.
      this Turkish newspaper not-reinf.
      'This is not a Turkish newspaper at all!'

The underlying structure of sentences like (112)-(113) may be represented as:

(114) Reinf. ILLE (X_I: [proposition] (X_I))

Reinforcement may be expected to favour non-modalized clauses and clauses containing a compatible modal distinction, such as subjective strong commitment. The performative use of speech act verbs may be analyzed as a lexical reinforcing device.

In chapter 2 illocutionary force and modality were analyzed as pertaining to different layers of the clause. Mood in Spanish has been discussed with respect to each of these layers earlier in this chapter. In this section some cases have been presented in which illocutionary force and modality seem to coincide. I hope to have shown that, rather than making the distinction drawn invalid, it is by virtue of this distinction that these cases can be handled in terms of the interaction between the different layers of the clause.

3.3. Remaining uses of mood in embedded predications

Two groups of predicates which can take a predication as one of their arguments and do not give expression to a modal distinction have not yet been discussed: verbs of causation, and predicates of subjective feeling. Verbs of causation, whether negated or not, always require the subjunctive in their finite complements, which might indicate that the subjunctive is the unmarked mood in this grammatical context. Predicates of subjective feeling allow both moods in their complements, as illustrated in (115):

(115) Me molesta que Juan no está / esté (I / S) aquí.
      'It bothers me that Juan isn't here.'

Guitart (1982) shows that the indicative is used if S judges the information contained in the embedded predication to be new to A. This is reflected in the fact that the indicative is always used in focus constructions:
(116) Lo que es curioso es que Juan no lo sabe (I).
      'What is strange is that Juan doesn't know it.'

The non-clefted variant allows both moods:

(117) Es curioso que Juan no lo sabe / sepa (I / S).
      'It's strange that Juan doesn't know it.'

although the subjunctive is used most frequently. This is not surprising, as predicates of subjective feeling are factive and therefore have complements which are 'implied by the Speaker to be true' and designate 'identifiable SoA's' (Bolkestein 1981). Focus assignment to the complement as a whole is nevertheless possible, and it is this focus assignment which triggers the use of the indicative. Confirmation for this view can be derived from the changes Guitart (1982) observes in the use of the indicative in this context among Spanish-speaking immigrants in the USA. S may judge the information contained in the complement to be new to A in terms of:

(i) A's knowledge of the preceding conversation.
(ii) A's knowledge of the speech situation.
(iii) A's general knowledge.

Guitart notes a decreasing use of the indicative as a result of interference, which may be described as a change from (i) to (iii). Whereas speakers of Spanish who have not been subject to interference judge the newness of information in terms of (i)-(ii), Spanish-English bilinguals judge the newness of information in terms of (iii), and use the indicative only if the information contained in the complement is judged to be unexpected by A.

A certain grammaticalization of the Topic / Focus distinction seems to be responsible for the fact that el hecho de que 'the fact that' sentences in pre-matrix position take the subjunctive only, whereas in post-matrix position they allow both moods (see Guitart 1984; Terrell & Hooper 1974).

Before going into the formalization of factivity and the use of mood in factive complements I would like to return to the distinction that has been drawn earlier between the content-representing function of propositions and the SoA-designating function of predications. Given this distinction and its formalization, the following representations may be given to terms designating first, second and third order entities:

(118) First, second and third order entities

      First order:   (x_i: Pred_N (x_i))_sf
      Second order:  (e_i: [predication] (e_i))_sf
      Third order:   (X_I: [proposition] (X_I))_sf
These representations account for the possibility of referring to objects, SoA's and content phrases respectively. The way in which the difference between predications and propositions is reflected in Spanish has been discussed in 3.2.2.1. A further distinction should be drawn now to distinguish between the factive and non-factive uses of predications. In 2.2.2 it was argued that S may evaluate a SoA with respect to its actuality. One might say now that once S reaches the conclusion to which the objective modal distinction 'certain' has been applied, he stores the information that the predication under consideration refers to a particular situation obtaining in reality or in a hypothesized situation as part of his knowledge, as in:

(119)

Dik (1986a) uses representations similar to the one given in (119) as units of 'referential knowledge', but restricts himself to first order entities. Returning now to the difference between factive and non-factive complements, I assume that a factive complement refers to one of the entities available to S on the basis of his referential knowledge, whereas a non-factive complement refers to a possible SoA which is under evaluation. The difference between non-factive, factive and semi-factive (see 3.2.2.2) complements can be represented as follows:

(120) Non-factive, factive and semi-factive complements

      Non-factive:   (e_i: [predication] (e_i))_sf
      Factive:       (d1 e_i: [predication] (e_i))_sf
      Semi-factive:  (Cert X_I: [proposition] (X_I))_sf

Bolkestein (1981) proposes that factive complements be provided with a term operator 'f', but mentions the possibility of using the definiteness operator 'd' as an alternative. It follows from (120) that I prefer the latter analysis. Confirmation for this view may be derived from the fact that verbal nouns, if used to replace a finite factive complement, are necessarily definite, as the nominalized equivalent of (115) shows:

(121) Me molesta la ausencia de Juan.
      'The absence of Juan bothers me.'

The definiteness of factive complements is furthermore reflected in the fact that they may be preceded by the determiner el 'the', as in:

(122) Me molesta el que Juan no está / esté (I / S) aquí.
      'It bothers me (the) that Juan isn't here.'
The conditions for the use of the subjunctive and indicative in factive complements can now be represented as follows:

(123) Indicative and subjunctive in factive complements

      (d1 e_i: [predication] (e_i))_sfTop → Subj
      (d1 e_i: [predication] (e_i))_sfFoc → Ind
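Read procedurally, the rule for factive complements is a two-way lookup on the pragmatic function assigned to the definite complement. The Python fragment below is a minimal sketch of that reading; the function name `factive_mood` and the string labels are hypothetical conveniences and carry no theoretical weight.

```python
def factive_mood(pragmatic_function):
    """Mood of a factive (definite) complement, given its pragmatic function.

    Illustrative only: Topic complements take the subjunctive,
    Focus complements take the indicative.
    """
    moods = {"Top": "Subj", "Foc": "Ind"}
    if pragmatic_function not in moods:
        raise ValueError(f"unknown pragmatic function: {pragmatic_function!r}")
    return moods[pragmatic_function]


# The clefted complement in (116) is in focus, so only the indicative is expected.
cleft_mood = factive_mood("Foc")
# A complement presented as shared information is topical.
topic_mood = factive_mood("Top")
```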
4. DISCUSSION AND CONCLUSION

In the preceding chapter the different uses of mood in Spanish have been treated, following the general observations made in chapter 2 with regard to the different levels to be distinguished within the structure of the clause. The question remains to be answered now whether these uses are somehow related. In order to do so, I first go into the relation between two of the different uses of mood: in objectively modalized predications, and in propositions governed by an illocutionary frame.

Recall that two types of frames were distinguished in 2.3: illocutionary frames and predicate frames. Illocutionary frames govern propositions, which restrict (X) and represent the content of an utterance. Objective modal predicate frames govern predications, which restrict (e) and designate SoA's. A comparison of the use of mood in the constructions governed by these two types of frames reveals some interesting correlations. In (124) an overview is given of the use of indicative and subjunctive in propositions governed by an illocutionary frame, including reinforced and mitigated declaratives, and in non-factive predications governed by a predicate expressing some objective modal distinction. The use of mood, given in the right hand column, holds for both propositions in (X) governed by the corresponding illocutionary frame in the left hand column and predications in (e) governed by a predicate of the type indicated in the second column:

(124) Mood in propositions in (X_I) and predications in (e_i)

      (Qualified)            Predicate frame        Mood in embedded
      illocutionary frame    expresses              construction

                             Epistemic modality
      DECL (Neutral)         Actual (Neutral)       Ind
      DECL (Reinforced)      Certain                Ind
      DECL (Mitigated)       Less than certain      Subj
      IMP                    Deontic modality       Subj
      INT                    Doubt                  Ind
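The correlations summarized in (124) can be restated as a small lookup table in which each row pairs an illocutionary frame with its predicate-frame counterpart and the mood they share. The Python sketch below does no more than that; the names `MOOD_TABLE`, `mood_for_frame` and `mood_for_predicate` are hypothetical, and treating an unqualified declarative as neutral is an assumption made for convenience.

```python
# Each row of (124): (illocutionary frame, qualification) ->
# (corresponding objective modal distinction, mood of the embedded construction).
MOOD_TABLE = {
    ("DECL", "neutral"): ("actual", "Ind"),
    ("DECL", "reinforced"): ("certain", "Ind"),
    ("DECL", "mitigated"): ("less than certain", "Subj"),
    ("IMP", None): ("deontic modality", "Subj"),
    ("INT", None): ("doubt", "Ind"),
}


def mood_for_frame(frame, qualification=None):
    """Mood of a proposition governed by the given illocutionary frame."""
    if frame == "DECL" and qualification is None:
        qualification = "neutral"  # assumption: unqualified DECL counts as neutral
    return MOOD_TABLE[(frame, qualification)][1]


def mood_for_predicate(distinction):
    """Mood of a non-factive predication governed by an objective modal predicate."""
    for counterpart, mood in MOOD_TABLE.values():
        if counterpart == distinction:
            return mood
    raise KeyError(distinction)


mitigated = mood_for_frame("DECL", "mitigated")
doubt = mood_for_predicate("doubt")
```

Because both functions read the same rows, the parallel between the two columns holds by construction; the sketch therefore records the correlation and does not explain it.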
The correspondence between certain basic illocutions and categories of objective modality with respect to the use of mood seems to have the same explanation in all the different cases presented in (124): there is a connection between S's:

(i) presenting part of his knowledge in a declarative sentence / stating that something is consistent with his knowledge, with various subdistinctions at both levels;
(ii) creating an obligation in an imperative sentence / stating that according to his knowledge some obligation exists;
(iii) asking A for information in an interrogative sentence / stating that he is unable to evaluate a predication in terms of his knowledge.

Note with regard to the notion of doubt that only de re doubt ('doubt if') as opposed to de dicto doubt ('doubt that') is intended here.

The correspondence between the uses of mood related to illocutionary force and objective modality should not lead to the conclusion that one single principle underlies these different uses. Such an approach would not be able to account for the fact that the rules for the application of mood in objective modal contexts may be violated in the context of mitigation:

(125) (Neutral)   Quizás es (I) seguro que viene (I) mañana.
      (Mitigated) Quizás sea (S) seguro que venga (S) mañana.
      'It may / might be certain that he will come tomorrow.'

Of all the different uses of mood presented in (124) there is only one in which mood can be said to fulfil a distinguishing function: qualification of a declarative sentence is expressed through the use of indicative or subjunctive. In all other cases the use of indicative or subjunctive follows automatically from the predicate frame or illocutionary frame selected by S.

Some other contexts in which the indicative and subjunctive alternate have been presented in 3.2.2.1 and 3.2.2.2. In both cases the different underlying structures assigned to the alternating sentences, representing their de re and de dicto interpretation, obligatorily combined with positive truth commitment of S in some cases, account for the appearance of both moods.

The use of both moods in factive complements governed by a predicate of subjective feeling remains to be discussed here. It was argued in 3.3 that in the latter context the Topic / Focus distinction is reflected in the use of the subjunctive or indicative respectively. Comparing the use of the indicative in this context with its use in the only other context in which the alternating use of mood can be said to add to the meaning of a sentence, i.e. its use in non-mitigated declarative sentences, the definitions of focus and declarative shed some light on the relation between these uses:
FOCUS: 'The Focus presents what is relatively the most important or salient information in the given setting.' (Dik 1978:19)

DECL: 'S wishes A to add the content of the linguistic expression to his pragmatic information.' (Dik forthcoming)

Again, there are both coinciding and differentiating elements in the triggering conditions for the indicative: FOCUS and DECL coincide in as far as in both cases S acts upon the assumption that the information contained in the predication is new to A; they differ in as far as in the case of DECL S intentionally wants A to take into account the information contained in the proposition restricting (X) in his future behaviour, whereas in the case of Focus assignment to factive complements, S presupposes that the SoA designated by the predication restricting (e) is new to A. In the latter case it is S's subjective feeling about this SoA that he wants A to take into account in his future behaviour. Consequently, in those cases in which the indicative and subjunctive can be said to add to the meaning of a sentence, S has to answer the following questions:

(i) Whether or not he wishes to express his reservation with regard to his assertion, as reflected in mitigation of a declarative sentence;
(ii) Whether or not he judges part of his (referential) knowledge to be unshared by A, as reflected in Focus assignment to a factive complement.

These questions relate to two kinds of general distinctions which have been made in this paper to account for the alternating and non-alternating uses of mood in Spanish: the distinction between different layers of the clause as representing the different subacts of a speech act; and the distinction between non-factive, factive and semi-factive complements as representing a SoA under evaluation, an identifiable referent and a unit of S's knowledge respectively.

Department of Spanish
University of Amsterdam

NOTES

1. I would like to thank three anonymous referees of the Journal of Semantics for their comments on an earlier version of this paper, which is a revised version of Working Papers in Functional Grammar 22, 'The Spanish mood system'. Parts of sections 2.2 and 2.3 were presented earlier in Hengeveld (1987). Abbreviations used in this paper: General: FG = Functional Grammar, SoA = State of Affairs, A = addressee, S = speaker (in text) or subjunctive (in examples), I = indicative; Wordclasses: β = any wordclass, N = noun, V = verb, A = adjective; Variables: E = speech act, X = propositional content, e = state of affairs, x = individual; Illocutionary frames: DECL = declarative, INT = interrogative, IMP = imperative; Semantic functions: sf = any semantic function, Ag = agent, Go = goal, Ø = zero, Exp = experiencer, Freq = frequency; Term operators: d = definite, i = indefinite, 1 = singular, m = plural; Predicate operators: Progr = progressive; Predication operators: Pres = present, Fut = future, Poss = possible, Non-act = non-actual, Neg = negation; Proposition operators: Quot = quotative, Boul = boulomaic modality, Cert = certain.

quand je dis je crois (que ...)? Sûrement non. L'opération de pensée n'est nullement l'objet de l'énoncé;"

5. See Vet (1986) for the representation of tense and time adverbials used here.
6. I assume that there is no intermediate propositional level in imperatives.
7. Although subject raising seems to violate this restriction, it does not if it is regarded as the result of double syntactic function assignment in the underlying predication, as proposed in Dik (1979).
8. Partly based on Bybee (1985).
9. Some examples of the limited use of the Subjunctive in main and embedded declarative sentences are presented in 3.2.2.3.
10. A purely formal explanation for the difference between the use of the Imperative mood in affirmative main clauses and the Subjunctive mood in negative main clauses and embedded clauses in the second person familiar might be that the Imperative forms are bare forms of the predicate which do not allow the application of a predicate operator (Tense in embedded clauses, Negation in main clauses).
11. Note that sentences like (76a) and (81) can be used only in a context in which the proposition under consideration is explicitly or implicitly present in the context.
12. In some Spanish dialects this example is judged to be ungrammatical.
13. I therefore disagree with Klein (1977), who proposes to classify dudar 'doubt' and no + creer as two members of one group of predicates.
14. In this sentence the subjunctive could be substituted for the indicative in the complement, in which case the speaker would not commit himself to the truth of the embedded proposition.
15. An explanation for the use of the past subjunctive in these cases might be that the use of the present subjunctive would cause ambiguity as it is also used for imperatives. In subjectively modalized contexts this ambiguity cannot arise, as only declaratives can be subjectively modalized.
16. The terms 'mitigation' and 'reinforcement' are taken from Haverkate (1979), in which a discussion of a number of mitigating and reinforcing devices in Spanish may be found.
17. This is only one of the uses of -dir. See Lewis (1967).
REFERENCES Aissen, J. & D. Perlmutter 1 976: Clause reduction in Spanish. In: Procudings of th� 2nd
annual metting of th� B�rkel�y linguistic soci�ty.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
2 . The distinction between three different types of modality presented in this section has been inspired by Lyons' ( 1 977, ch. 16- 1 7) discussion of modality. Other sources which have been used are Allwood et al. ( 1 977), Bolkestein ( 1 980) , Chung & Timberlake ( 1 985), Foley & Van Valin ( 1 984), and Mateus et al. ( 1 983). 3. cr. Dik's ( 1 986b) 'pictures'. 4 . Compare the following quote from Benveniste. ( 1 966) : "Est-ce-que je me decris croyant
Allwood, J., L. Andersson & Ö. Dahl 1977: Logic in Linguistics. Cambridge: Cambridge University Press.
Austin, J.L. 1962: How to do things with words. Oxford: Clarendon Press.
Auwera, J. van der & L. Goossens (eds.) 1987: Ins and outs of the predication. Dordrecht: Foris.
Benveniste, E. 1966: Problèmes de linguistique générale. Paris: Gallimard.
Bergen, J.J. 1978: One rule for the Spanish Subjunctive. Hispania 61: 218-34.
Bolinger, D. 1976: Again - One or two subjunctives? Hispania 59: 41-49.
Bolkestein, A.M. 1977: The relation between form and meaning of Latin subordinate clauses governed by verba dicendi. Mnemosyne 29: 155-75 & 268-300.
Bolkestein, A.M. 1980: Problems in the description of modal verbs. Assen: van Gorcum.
Bolkestein, A.M. 1981: Factivity as a condition for an optional rule: the "Ab Urbe Condita" construction and its underlying representation. In: Bolkestein et al.
Bolkestein, A.M., H.A. Combé, S.C. Dik, C. de Groot, J. Gvozdanović, A. Rijksbaron & C. Vet 1981: Predication and Expression in Functional Grammar. London etc.: Academic Press.
Bolkestein, A.M., C. de Groot & J.L. Mackenzie (eds.) 1985: Predicates and terms in Functional Grammar. Dordrecht: Foris.
Burge, T. 1977: Belief De Re. Journal of Philosophy 74: 338-62.
Bybee, J.B. 1985: Morphology. Amsterdam: Benjamins.
Chung, S. & A. Timberlake 1985: Tense, aspect and mood. In: Shopen (ed.), 202-58.
Cole, P. & J.L. Morgan (eds.) 1975: Speech acts. London etc.: Academic Press.
Dik, S.C. 1978: Functional Grammar. Amsterdam: North Holland (1981³, Dordrecht: Foris).
Dik, S.C. 1979: Raising in a Functional Grammar. Lingua 47: 119-40.
Dik, S.C. 1986a: Linguistically motivated knowledge representation. Working Papers in Functional Grammar 9.
Dik, S.C. 1986b: Concerning the logical component of a natural language generator. Paper for the First European Workshop on Language Generation.
Dik, S.C. forthcoming: The theory of Functional Grammar.
Foley, W.A. & R.D. Van Valin 1984: Functional syntax and universal grammar. Cambridge: Cambridge University Press.
Goossens, L. 1985a: Modality and the modals: a problem for Functional Grammar. In: Bolkestein et al. (eds.), 203-17.
Goossens, L. 1985b: The auxiliarization of the English modals. Working Papers in Functional Grammar 7.
Guitart, J.M. 1982: On the use of the Spanish subjunctive among Spanish English bilinguals. Word 33: 59-67.
Guitart, J.M. 1984: Syntax, semantics and pragmatics of mood in Spanish noun clauses. Hispanic Journal 6: 159-74.
Haverkate, W.H. 1979: Impositive sentences in Spanish. Theory and description in Linguistic Pragmatics. Amsterdam: North Holland.
Hengeveld, K. 1987: Clause structure and modality in Functional Grammar. In: van der Auwera & Goossens (eds.).
Hengeveld, K. 1988: Layers and operators. Working Papers in Functional Grammar 27.
Heny, F. & B. Richards (eds.) 1983: Linguistic categories: auxiliaries and related puzzles. Vol. 2. Dordrecht: Reidel.
Hooper, J.B. 1974: On assertive predicates. IULC-reproduction.
Klein, Ph.W. 1974: Observations on the semantics of mood in Spanish. Unpubl. diss., University of Washington.
Klein, Ph.W. 1977: Semantic factors in Spanish mood. Glossa 11: 3-19.
Levinson, S.C. 1983: Pragmatics. Cambridge: Cambridge University Press.
Lewis, G.L. 1967: Turkish grammar. Oxford: Oxford University Press.
Lleó, C. 1979: Some optional rules in Spanish complementation. Tübingen: Max Niemeyer Verlag.
Luján, M. 1979: Clitic promotion and mood in Spanish verbal complements. IULC-reproduction.
Lyons, J. 1977: Semantics. Cambridge: Cambridge University Press.
Mateus, M., A. Brito, S. Duarte & I. Hub Faria 1983: Gramática da língua portuguesa. Coimbra: Livraria Almedina.
Palmer, F.R. 1983: Semantic explanations for the use of the English modals. In: Heny & Richards (eds.): 205-217.
Rivero, M.L. 1979: Estudios de gramática generativa del Español. Madrid: Cátedra.
Searle, J.J. 1969: Speech acts. An essay in the philosophy of language. Cambridge: Cambridge University Press.
Searle, J.J. 1975: Indirect speech acts. In: Cole & Morgan (eds.): 59-82.
Shopen, T. (ed.) 1985: Language typology and syntactic description. Vol. 3. Cambridge: Cambridge University Press.
Terrell, T. & J.B. Hooper 1974: A semantically based analysis of mood in Spanish. Hispania 57: 484-494.
Vet, C. 1986: A pragmatic approach to tense in Functional Grammar. Working Papers in Functional Grammar 16.
Weijdema, W., S.C. Dik, M. Oehlen, C. Dubber & A. de Blauw 1982: Strukturen in verbale interaktie. Muiderberg: Coutinho.
Journal of Semantics 6: 271-307
THOUGHT AND CIRCUMSTANCE*
MARK ARONSZAJN
ABSTRACT

A long-standing logical and philosophical tradition holds that there are such things as objects of thought, things of the sort a person may be said to be thinking - objects not only of doxastic thoughts (thoughts to the effect that something or other is the case), but of wonderings, wishings, hopings and desirings, etc. Virtually all proponents of this tradition have supposed that the objects of thought are propositions, the (primary) bearers of truth-value. There are various proposals within the tradition about what propositions are, but all standard conceptions hold, roughly, that a proposition is circumstantial in character - something akin to a state, or condition, a way things could be.

I argue that objects of thought are not circumstantial in character. So the view that they are propositions, standardly conceived, cannot be right. The argument centers on the case of non-doxastic thoughts - wonderings and wishings, in particular. The bulk of this paper, then, is devoted to laying out an alternative conception of the objects of thought. This conception supports the traditional idea that objects of thought are what we express by our utterance of sentences. Moreover, on this new view, a partial account is afforded of what things are expressed by non-assertoric sentences - by sentences in moods other than the indicative.

* I am grateful to the organizers of the Fourth Cleves Conference for giving me the opportunity to present (an initial segment of) this paper. I am especially grateful to Rob van der Sandt for his generosity and hospitality during my stay in Nijmegen. On the home front, thanks go to Tom Blackson, Eva Bodanszky, Lee Bowie, Edmund Gettier, Larry Hohm, Paul McNamara and Ed Zalta for discussions, advice and encouragement. The material presented here is extracted from my doctoral thesis which is still under production. In that connection, special thanks are due to Barbara Partee, my thesis adviser, for her constant support, ready cheer and wise guidance.

1. SOME TRADITIONS CONCERNING OBJECTS OF THOUGHT

The Tradition

There is a long-standing logical and philosophical tradition - I shall call it 'The Tradition' - whose followers have included Frege, Russell and Moore, to name a few. One of the cornerstones of The Tradition is the thesis that there are objects of thought, things of the sort whose instances can properly be said to be what a person is thinking.¹ This thesis is sometimes supported by appeal to ordinary usage. Proponents point out, for example, that we
often speak of what a person is thinking, and we commonly have occasion to say of two people that they are thinking the same thing. More colloquial idioms also may be used: it is common to say of two people that the same thing has crossed their minds, or occurred to them. Moreover, it's not as if we commonly say such things, but are always mistaken when we do so; the fact is, so it would be claimed, many times when we say such things, we are saying something true. A natural way to account for the truth of such claims is to allow that there are objects of thought.

Unfortunately, the term 'object of thought' has been tainted by a host of theory-laden connotations. Still, I think that in this case, familiarity more than compensates for the risk of misunderstanding, so I propose to use the expression, 'object of thought', but by stipulation, my usage shall accord with the following definition:

(D1) x is an object of thought =df it is possible that x is what a person is thinking.

(I use the symbol '=df' as an abbreviation, roughly, for 'means, by definition'.) It will prove helpful to keep in mind that my use of the term 'object of thought' is governed strictly by (D1). Some comments about this definition:

i) The phrase 'it is possible that' is intended to express a conceptual or broadly "logical" notion of possibility.

ii) What (D1) tells us about the meaning of 'object of thought' depends on the notion of thinking expressed on the right side. In this connection, it is important to note that there are (at least) two senses of 'thinking'; in speaking of "what a person is thinking" in one of these senses one would mean something very different than what one would mean, so speaking, in the other sense. One is a broad, generic concept; the other notion is narrower, a species of the generic concept. In the narrow sense I have in mind, 'thinking' stands for the sort of event one may properly report in English by sentences involving the construction ⌜thinking that Φ⌝ where Φ is a sentence, e.g.:

William is thinking that Sarah is Dutch.
Sarah is thinking that William is nervous.

In this sense, what is meant by 'thinking' is occurrent belief - mental activity that is judgmental or assertoric in character. In the broad, generic sense, on the other hand, 'thinking' may refer to any mental event that is an occurrent counterpart of one of the so-called "propositional attitudes". In this sense, when one says that a person is thinking, one may mean that the person is engaged in occurrent belief, but one may also mean that the person
is wondering or wishing or hoping, etc. In the right side of (D1), it is this broader sense of 'thinking' that is intended.

It's helpful to adopt some special terminology to keep this distinction clear cut. Hereafter, when I intend the narrower concept of thinking, I'll express it in one or the other of two ways: 1) by using the construction ⌜thinking that Φ⌝; 2) by using phrases formed from the verb 'think' and its cognates by prefixing a 'd'. (The 'd' is short for 'doxastic' or 'doxastically', indicating that the sort of thinking being expressed is judgmental, assertoric in character.) Examples of the latter terminology: I may speak of a person "d-thinking" something, or say that an event is a "d-thinking", or that a "d-thought" has a certain object; I shall then be speaking of events or activities falling under the narrower conception of thinking. On the other hand, occurrences of 'thinking' or 'thought' that are not part of the construction ⌜thinking that Φ⌝, and to which no 'd' has been prefixed, should be understood to express the broader, generic concept of thinking.

iii) It is useful to have a phrase with which to refer to events of all manner of thinking.² I've adopted the phrase noetic event for the purpose here. So, d-thinkings, wonderings, wishings, hopings, etc. are all examples of what I shall be calling noetic events.

iv) Proponents of The Tradition, without exception as far as I know, have taken there to be objects of thought associated with noetic events of all varieties. The view is suggested by considerations of ordinary usage parallel to those produced before. We may say of two people that they are, for example, wondering or wishing or hoping the same thing. Again, it seems to be the case that at least sometimes when such things are said, we speak truly. It is natural to account for the truth of such talk by positing things that are what we can thus be said to be wondering, wishing, etc.

It will be useful in this connection to be able to speak of a thing's being "the object of" a noetic event.³ The notion can be explained somewhat loosely as follows. Let e be a noetic event - some particular person's doing some thinking. Then to say that a thing, x, is the object of e is to say: if e occurs, x is what that person is thinking (d-thinking, wishing, wondering, hoping, as the case may be). To say, for example, that a thing is the object of an event of wishing will be to say that this thing is what is being wished. Consequently, when I use 'object of thought' according to (D1), I may be understood to mean anything that could be an object of a noetic event. The first cornerstone of The Tradition, then, may be put simply as follows:

(P1) There are objects of thought.

The Tradition has another cornerstone thesis: that there are things that may properly be said to be expressed by the utterance of sentences. This
tenet is also supported by considerations of ordinary usage. If, addressing Jones, I utter:

(1) You are a violinist.

and agreeing, she utters:

(2) I am a violinist.

It seems proper to say that our utterances express the same thing. A natural way to account for the truth of such claims is to allow that there are things expressed by sentences.

What is expressed by an utterance of a sentence will be determined in part by the language spoken. It is possible that the very same utterance be made in two languages, expressing a different thing in one language than it does in the other. Even specifying the language involved may not suffice to determine what is expressed, for if a sentence is ambiguous in a given language, an utterance of it may be taken to express different things, depending on how the utterance is to be interpreted. Perhaps,⁴ then, our basic locution should be:

u expresses x under interpretation i in language L.

where 'u' is understood to range over utterances of sentences. It will be convenient, though, whenever there is no risk of misunderstanding, to suppress reference to utterance, interpretation and language, and speak simply of what is expressed by a sentence. So for the second cornerstone, let us simply put:

(P2) There are things that are expressed by sentences.

These cornerstones of The Tradition, (P1) and (P2), are not uncontroversial. It is reasonable to question, for example, whether the support of ordinary usage typically cited really warrants positing such things as "objects of thought" or "what sentences express." However, my concern in this paper is parochial. I am a follower of The Tradition, and there is some "in-house" business that I wish to address.

Built upon these two cornerstones, a stronger thesis may be attributed to The Tradition: that at least some of the things expressed by sentences are things that persons could be thinking. That is,

(P3) Some sentences express objects of thought.

Here, again, supporting evidence comes from ordinary usage. For example,
we commonly say such things as that a certain sentence expresses exactly what a person was thinking, or that a person expressed just what was on her mind (by her use of some sentence), or what had occurred to her the moment before.

It should be realized that in following The Tradition one is not committed to the view that every object of thought is possibly expressed by a sentence in some language. Followers may consistently hold some view, for example, to the effect that there are objects of thought too "large" or too "complex" to be captured by the sentences of any possible language. The Tradition does not commit one to the view that all sentences express objects of thought. Followers may consistently hold that some sentences do not express things at all. Furthermore, following The Tradition does not commit one to the view that all things expressed by sentences are objects of thought. Nevertheless, I think it is safe to say that followers of The Tradition would hold that some (perhaps proper) subset of the objects of thought includes the things expressed by a vast number and great variety of sentences.

Now we get to the "in-house" business I wished to address. Although I've just said followers of The Tradition would hold that objects of thought are expressed by a great variety of sentences, (P3) implies nothing about the breadth of that variety. I also suggested that followers of The Tradition have allowed that there are objects of thought of various species - objects of d-thinkings, of wishings, of wonderings, etc. (P3) implies nothing about which species of object of thought can be expressed by sentences.

I wish to propose a strengthening of (P3) that does imply something about which species of object of thought can be expressed by which sentences; more specifically, the thesis guarantees that sentences of certain moods express the objects of noetic events of particular species. The idea is pretty simple:

(P4) i) some imperatives express the objects of certain wishings,
ii) some interrogatives express the objects of certain wonderings,
iii) some indicative sentences express the objects of certain d-thinkings.

(P4) shall be the leading idea for the present study, so I wish to consider briefly what support there is for the thesis. As with (P1) - (P3), appeal can be made to considerations of ordinary usage. Perhaps I may take for granted that indicative sentences sometimes express the objects of d-thoughts. What support is there for the other two cases?

If I am wondering when Jones will arrive, can't I express what I am wondering by uttering:

(3) When will Jones arrive?
Won't I thereby express exactly what I am thinking, what is occurring to me, what is crossing my mind? It seems plain that I will. So, (an utterance of) this interrogative expresses the object of a wondering. Similarly, if I am wishing that Jones would shut a certain window, and my wishing is sufficiently demanding in character - say, I am thinking to myself: "Shut that blasted window, Jones" - then again it seems natural to say that I am thinking something, that something is on my mind, and that I can express what I am thinking if I utter (addressing Jones):

(4) Shut that blasted window.

So, (an utterance of) this imperative expresses the object of a wishing. Some interrogatives, then, express the objects of wonderings, some imperatives express the objects of wishings. It should be pretty clear that examples such as these can be multiplied readily. This seems to me to be good reason for thinking (P4) true.

We can ask what entities, besides (utterances of) sentences, are required for the truth of (P4). This question is the starting point of my paper. One answer, of course, would be: "objects of thought." But what are objects of thought? What sort of thing is it whose instances satisfy (D1)? An informative characterization of some appropriate class of entities is wanted. My chief project here is to provide such a characterization.

The Fregean Tradition and circumstances

Interest in objects of thought within The Tradition has typically existed because it's been assumed that these items are the primary bearers of logical and semantic properties and relations. They are assumed, for example, to be the terms of the semantic relations of implication and inconsistency, the premises and conclusions of arguments, etc.

If this assumption of The Tradition is correct, then the project of investigating the nature of objects of thought ought to be central to the philosophy of logic and of language. But this hasn't typically been precisely the project that people working in these fields have set for themselves. Here included are virtually all the prominent members of The Tradition. Instead attention has been restricted to a certain sub-class of objects of thought: the ones that have truth-values. This restriction of attention has been long-standing and widespread enough as to constitute a sub-tradition, one that we would do well to distinguish from The Tradition itself, even though the two are virtually coextensive in their followers. Naturally, a central project for proponents of this other tradition has been to account for the nature of those items that satisfy, not (D1), but instead the following definition:
(D2) x is a proposition =df it is possible that x is what a person is thinking, and x is either true or false.
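Read side by side, (D1) and (D2) differ only in the added truth-value clause. The following is a rough formal rendering, offered as a sketch only. The predicate symbols OT, Prop, Thinks, True and False are labels introduced here for illustration, the diamond stands for the broadly logical possibility mentioned in connection with (D1), and placing the truth-value clause outside the scope of the diamond is an assumption about how (D2) is to be read.

```latex
\begin{align*}
\mathrm{OT}(x)   &=_{df} \Diamond\,\exists p\,\mathrm{Thinks}(p, x) \tag{D1}\\
\mathrm{Prop}(x) &=_{df} \Diamond\,\exists p\,\mathrm{Thinks}(p, x)
                   \,\wedge\, \bigl(\mathrm{True}(x) \vee \mathrm{False}(x)\bigr) \tag{D2}
\end{align*}
```

On this rendering every proposition is an object of thought, since the first conjunct of (D2) is just the right side of (D1). Whether every object of thought is a proposition is left open by the definitions themselves.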
I shall call this other tradition 'the Fregean Tradition', because the restriction of attention to propositions that is its hallmark came about at least in part through the influence of Frege's view that only things that are true or false can bear the standard logical and semantic properties.⁵

Consensus has not been reached in the Fregean Tradition about what items fit the billing of "proposition". There are various sorts of things people have taken propositions to be: sets of worlds, functions from worlds to truth-values; others have taken propositions as primitive and let worlds be the constructed things. There is an old Russellian picture according to which a proposition is a structured entity, a possible arrangement of objects under a particular property or relation.

Still, all these conceptions have one important feature in common: each serves to represent the intuitive idea of a possible circumstance or situation, a possible condition or state of affairs, a way things could be. Thus, a set of worlds serves to determine a possible circumstance; one that would prevail if any of the worlds in the set were actual. A function from worlds to truth-values serves the purpose as well: the characteristic function of a set of worlds will determine a circumstance that would prevail if any of the worlds mapped by the function to the value, 1, were actual. Russell's notion also serves as a representation of the idea of a possible situation or state of affairs, for these can be thought of as relationships or arrangements of objects (in this connection, think of 2.01 in the Tractatus: "A state of affairs is a combination of objects").

Since virtually all conceptions of propositions in the Fregean Tradition have shared this feature, and since that tradition has been preeminent, as philosophical usage now stands, what one ordinarily gets taken to mean when speaking of "propositions" are things that are circumstantial in nature.
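The correspondence between the two possible-worlds conceptions just mentioned can be made concrete in a minimal sketch. It assumes a toy, finite stock of worlds; the world labels and the helper names `characteristic_function` and `worlds_of` are hypothetical, introduced only for illustration. A set of worlds and its characteristic function carry the same information, so either may be taken to determine the same circumstance.

```python
from typing import Callable, FrozenSet

World = str  # hypothetical labels standing in for possible worlds

# A toy, finite stock of worlds (an assumption made for illustration only).
WORLDS: FrozenSet[World] = frozenset({"w1", "w2", "w3"})


def characteristic_function(prop: FrozenSet[World]) -> Callable[[World], bool]:
    """Turn a set of worlds into a function from worlds to truth-values."""
    return lambda w: w in prop


def worlds_of(fn: Callable[[World], bool]) -> FrozenSet[World]:
    """Recover the set of worlds that a function maps to True."""
    return frozenset(w for w in WORLDS if fn(w))


# The circumstance that would prevail if w1 or w2 were actual.
some_circumstance = frozenset({"w1", "w2"})
chi = characteristic_function(some_circumstance)

# Round trip: the function determines the very set it was built from.
round_trip = worlds_of(chi)
```

Nothing here decides between the two conceptions; the sketch only shows why the text can treat them as interchangeable ways of representing a way things could be.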
To make matters clear, a note on subsequent terminology is in order. I propose to use the term circumstance in what follows to stand for things fitting one of these circumstantial conceptions of proposition in particular - the Russellian one. However, I intend that any central points I make about the Russellian account of propositions should apply, mutatis mutandis, to other circumstantial accounts. Let me stress, though: we are going to be keeping the term 'proposition' fixed according to (D2). We shall be able to ask and have it as an open question whether, for example, propositions are circumstances.

In adopting Russell's conception, I am committing myself to a certain view of the sort of things I am calling circumstances, things of the sort that Russell had in mind when he spoke of "propositions" or "facts". As Russell saw it, these things are complex, structured entities. Roughly, each such
complex consists either of a property and a thing that exemplifies the property, or else of a relation and a group of things that together bear the relation to one another. Following David Kaplan's idea,⁶ we shall represent these Russellian "propositions" by ordered sequences each having as its first component a property or relation. A relevant sequence will have exactly length two if its first component is a property, length three if its first component is a dyadic relation, length four if its first component is a triadic relation, and so forth. Intuitively, any such sequence may be thought of as the circumstance consisting of the first component holding among the other components in the order in which these others appear in the sequence. The idea is that the arrangement of constituents in a circumstance is adequately reflected by the order of components in the representative sequences.

It will be useful for intuitive motivation and discussion of particular claims and examples, to employ some nominalized forms that occur commonly in ordinary discourse, and whose instances are standardly supposed to denote the things we are calling circumstances. When we speak, for example, of Michael's being astute or John's loving Mary, what we are speaking of, it shall be supposed, are circumstances. Simple instances of this construction, like these two just cited, are formed by taking a true or false sentence, S, with a singular subject term, putting the verb phrase of S in its gerundive form and putting the subject term in the possessive. It will suffice for the purposes of this presentation to consider only such simple cases.

These phrases may be associated with the terms of ordered set notation that constitute our formal means of designating circumstances; roughly, the association is this: let N be some singular term, VP an intransitive verb phrase, then the phrase

N's VP-ing

denotes the circumstance having the property expressed by VP as its first component and the thing denoted by N as its subsequent component. Some simple examples; we may say:

'Michael's being astute' denotes the circumstance, ⟨Being astute, Michael⟩
'John's loving Mary' denotes the circumstance, ⟨(the relation of) Loving, John, Mary⟩

2. THE CIRCUMSTANTIAL ACCOUNT OF OBJECTS OF THOUGHT

In the Fregean tradition, attention has been focused primarily on providing
an account of the objects of thought expressed by indicative sentences. The project of accounting for the things expressed by imperatives or interrogatives has not been widely addressed. The reason for this, I think, is fairly plain: it is strongly counterintuitive to suppose that an imperative or interrogative expresses anything having a truth-value, yet there is no question that the things expressed by indicatives are truth-valued; the Fregean Tradition supposes that the logical and semantic relations are borne only by truth-valued things, and the principal concern in that tradition has been to account for the nature of the terms of such relations.

Nevertheless, it is widely held now that the things expressed by imperatives and interrogatives enter into these relations. There are, for example, inferences involving imperatives that appear to have important characteristics of deductively conclusive arguments. There are semantic relations - that of presupposition, for example - that seem plainly to have the things expressed by interrogatives among their terms. There is at least some reason, then, to question the assumption that only truth-valued things are terms of the logical and semantic relations. And so we may share the general concern of accounting for the nature of things that make up the fields of the logical and semantic relations without sharing confidence in the prevalent outlook of the Fregean Tradition, that those fields are made up exclusively of the things expressed by indicatives.

This paper may be viewed as taking up the project of investigating the nature of things expressed by sentences at large, a project that I think has been unwarrantedly ignored by the Fregean Tradition. Within that tradition it's commonly been held that indicatives express propositions - the truth-valued objects of thought, and that these, in turn are things of a circumstantial nature.
I wish to begin, then, by considering whether this account can be made out with respect to the items expressed by indicatives, imperatives and interrogatives alike.

It may be noted that there are some difficulties, by now well-known, facing circumstantial accounts of propositions. On some standard conceptions, those according to which propositions are sets of worlds, or functions from worlds to truth-values, there cannot be, contrary to intuitions, distinct, yet necessarily equivalent propositions. Some other notorious problems are due originally to Frege and concern the proper semantic treatment of propositional attitude ascriptions. Many otherwise plausible, circumstantial conceptions of propositions are faced with difficulties in this connection.

It may turn out that these problems can be got around without giving up a circumstantial conception. However, there is a difficulty for such conceptions that is less notorious, but, I think, at least as severe as the ones just cited. The problem becomes evident when you broaden your perspective, step outside the Fregean Tradition, and consider the matter of accounting for the nature of objects of thought generally.
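The well-known difficulty about necessarily equivalent propositions, mentioned above, can be illustrated with a small sketch. It assumes a toy model in which a world is just a pair of facts; the example truth conditions and the helper `proposition` are hypothetical and serve only as illustration. On the sets-of-worlds conception, two truth conditions that hold at exactly the same worlds are assigned one and the same proposition, however different they seem.

```python
from typing import Callable, FrozenSet, Tuple

# A world is modelled, hypothetically, as a pair of facts:
# (is it raining?, is it cold?).
World = Tuple[bool, bool]

WORLDS: FrozenSet[World] = frozenset(
    (rain, cold) for rain in (True, False) for cold in (True, False)
)


def proposition(truth_condition: Callable[[World], bool]) -> FrozenSet[World]:
    """The set of worlds at which a truth condition holds."""
    return frozenset(w for w in WORLDS if truth_condition(w))


# Two intuitively distinct necessary truths.
rain_or_not = proposition(lambda w: w[0] or not w[0])
cold_or_not = proposition(lambda w: w[1] or not w[1])

# A contingent proposition, for contrast.
raining = proposition(lambda w: w[0])

# Both necessary truths are the set of all worlds, so they are identical.
same_proposition = rain_or_not == cold_or_not
```

The contingent case still comes out distinct from the necessary ones, which is why the collapse is confined to necessarily equivalent propositions.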
The problem of non-doxastic thoughts
The question I wish to address, then, is: are objects of thought circumstances? In cases of d-thinking, circumstances serve fairly well. In general, we need to assign to each event of a person thinking something a circumstance that we take to be the relevant object of thought, what the person is thinking. Let N be a singular term, VP an intransitive verb phrase such that the sentence ⌜N VP⌝ is indicative and expresses something true or false. The standard association in the case of d-thinkings works as follows. With any thinking reported by a sentence of the form,
x is thinking that N VP,
(recall that the instances of this construction serve to report d-thinkings) the associated object of thought will be the circumstance of N's VP-ing. So for example, the object of a d-thought reported by
(5) Laurie is thinking that Mark is anxious.
or by
(6) Ed is thinking that Mark is anxious.
will be the circumstance: Mark's being anxious. If (5) and (6) are both true, and Ed and Laurie have the same person named 'Mark' in mind, then we ought to be able to say that Ed and Laurie are thinking the same thing, that the two cases involve the same object of thought. And indeed, the circumstantial proposal allows us to say this, for as we have just seen, the same circumstance is associated with both events of thinking.
The standard view of those offering circumstantial accounts has it that the object of a d-thought is a proposition. So, for example, in the case of the thoughts reported by (5) and (6), the object of thought - the circumstance of Mark's being anxious - would be taken to be a proposition: the proposition that Mark is anxious. There is a very neat fit here. The embedded clause of (5) and (6), that Mark is anxious
denotes precisely the proposition which according to this circumstantial account is the object of the thoughts reported by (5) and (6). The surface grammar of the reports reflects the claimed structure of the noetic events reported. This may seem to be just right.
Trouble arises, though, in accounting for the objects of non-doxastic noetic events. Consider, for example, these reports:
(7) Barbara is wondering whether Mark is careful.
(8) Fred is wondering whether Mark is careful.
Assume that (7) and (8) are true with respect to Barbara and Fred, and that Barbara and Fred both have the same person named 'Mark' in mind on the occasions reported. If we are going to propose some circumstance as the object of thought here, one plausible constraint on which circumstance to propose would be this: it ought somehow to involve or concern Mark and the property of being careful. After all, it seems natural to say that if a person is wondering whether Mark is careful, then what the person is wondering has to do, among other things, with Mark and with being careful.
A first proposal, then, might be that the shared object of the events reported by (7) and (8) is just the circumstance of Mark's being careful. The suggestion has merits. For one thing, the circumstance proposed certainly involves Mark and the property of being careful in a very straightforward way: they are the constituents of the circumstance. Also, this circumstance is such that Barbara and Fred are both wondering whether it is the case; this is perhaps some grounds for holding that the circumstance is what Barbara and Fred are wondering. Furthermore, associating this circumstance with the thinkings reported by (7) and (8), we get the desired result that it will be correct to say that Barbara and Fred are thinking the same thing.
But on this proposal, the object of thought in the cases reported by (7) and (8) would be the same as the one involved in a case reported by the following:
(9) Ed is thinking that Mark is careful.
For according to the standard association, the object of thought in this case is also going to be the circumstance of Mark's being careful. So, the proposal has the result:
(R1) The object of the events reported by (7) and (8) and the object of the event reported by (9) are identical.
The standard view of this result is that the object of thought common to the
wonderings reported by (7) and (8) is a proposition - the proposition that Mark is careful - which is exactly what is proposed as the object of the d-thought reported by (9).
(R1) is also a consequence of a very common point of view about noetic events, namely that such events, generally, are propositional attitudes (or occurrent counterparts of such attitudes). Let N be a name, 'is θ' an intransitive verb phrase. On this view, the events reported by claims of the forms
(f1) s is thinking that N is θ
(f2) s is wondering whether N is θ
(f3) s is wishing that N would be θ
are alike with respect to their objects - each one has as its object the proposition denoted by ⌜that N is θ⌝. The events differ in how the person in question is related to that proposition: he or she is related by d-thinking in the first case, by wondering in the second, by wishing in the third. This view of noetic events is extremely prevalent and entrenched; indeed, it is certainly the received view on the objects of such events.
What I wish to argue now is that, despite this agreement with the received view, the proposal at hand as to which circumstance may be taken to be the object of the events reported by (7) and (8) is unacceptable. For it is not the case that the events reported by (7) - (9) have the same noetic object - (R1) is false. This is established, I believe, by two separate considerations.
First, the point seems plain enough just by reflection on the following. Suppose (7) - (9) are true, and that someone who knows what Ed is thinking wishes to know what Barbara is thinking. Surely it would be disinformation to tell this person that Barbara is thinking the same thing as Ed. When one person is wondering something and another is d-thinking something (and this is all they have on their minds) it is correct to say that each is thinking something, but it is simply incorrect to say that they are thinking the same thing! Then it follows that what Fred and Barbara are thinking and what Ed is thinking are different things.
Second, suppose again that (7) - (9) are true, and that what each person is thinking has just occurred to him or her. It does not follow from this that what just occurred to Ed can be expressed by standard use of an interrogative sentence. On the other hand, it does follow that what has just occurred to Barbara can be expressed by standard use of an interrogative, for what just occurred to her is a question; questions are just what we express by interrogatives. But then, again, it follows that what just occurred to Fred and Barbara and what just occurred to Ed are different things.
So it is not the case that the object of thought involved in the cases reported by (7) and (8) is the same as that involved in the case reported by (9). So,
(R1) must be rejected. Hence, the proposal at hand as to what circumstance is the object of thought in the cases reported by (7) and (8) must be rejected, too, for it entails (R1).
The trouble here doesn't lie simply in which circumstance has been picked as the object of these wonderings; with any other proposal of some circumstance as the object of the thoughts reported in (7) and (8), we seem to face similar objections. For let c be any circumstance that might be proposed as the object of thought in those cases; we will always be able to find some instance of d-thinking that according to the standard association has c as its object. Let e denote such an event, and let s be the person who's doing the thinking. Then with respect to e, since it is a d-thinking, not a wondering, we shall not be inclined to hold that s is thinking the same thing as Fred and Barbara, for what is crossing their minds but not s's is a question. And plainly this point about wonderings reported by (7) and (8) can be generalized. The problem is that no object of a wondering is the object of a d-thinking, contrary to the circumstantial account of objects of thought we have been considering.
What's more, the problem doesn't lie solely in accounting for the objects of wonderings. An analogous problem can be posed for the view that the objects of wishings are circumstances. For on the circumstantial account, every object of a wishing will also be the object of some d-thinking, but considerations parallel to those canvassed above show that the objects of wishings are in general distinct from the objects of d-thinkings. Suppose that a person, s, is wishing that a certain man would be careful; say that Jones is the fellow in question. We may imagine that s is thinking to herself: Be careful, Jones, for heaven's sake! Suppose further that Jones has a friend, s', who knows that Jones is careful, and happens to be thinking this at the moment. Now ask whether s and s' are thinking the same thing, whether the same thing is crossing their minds, and ask, too, whether what s is wishing and what s' is d-thinking can both be expressed by their uttering the imperative, 'Be careful' (while addressing Jones). Surely, "no" is the answer to these questions.
If my contention here is correct, then at the very least, we have a problem for circumstantial accounts of the objects of non-doxastic noetic events given the standard assignment of circumstances to d-thoughts. But it is not easy to see how any assignments for d-thoughts other than the standard one would have any plausibility. Let me explain.
I proposed a constraint just above on what circumstance should be associated with an instance of someone's wondering whether Mark is careful. The requirement was that a circumstance be associated which in some sense concerns or involves Mark and the property of being careful. But the same constraint would seem to apply to d-thoughts. For example, an instance of someone's thinking that Mark is careful ought also to get associated with a circumstance concerning Mark and the property of being careful. This, for a reason analogous to the one cited in the previous case: it seems plainly true that what a person is thinking, in thinking that Mark is careful, is something that has to do with Mark and with being careful. But this constraint seems to require exactly the standard association. I submit that this standard association, if any, is the appropriate one for d-thoughts, and hence, that circumstantial accounts of objects of thought cannot meet the objection I have raised concerning the objects of thought in non-doxastic cases.
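The shape of the objection can be made concrete with a minimal sketch in Python. The names used here (Circumstance, Report, standard_object) are assumptions introduced purely for illustration and are no part of the account under discussion; the sketch only models the standard association as a function that returns the embedded circumstance and ignores the attitude.

```python
from typing import NamedTuple


class Circumstance(NamedTuple):
    prop: str        # e.g. "being careful"
    individual: str  # e.g. "Mark"


class Report(NamedTuple):
    thinker: str
    attitude: str    # "d-thinking", "wondering" or "wishing"
    content: Circumstance


def standard_object(report: Report) -> Circumstance:
    """Standard association (illustrative): the object of thought is
    simply the embedded circumstance; the attitude plays no role."""
    return report.content


mark_careful = Circumstance("being careful", "Mark")
r7 = Report("Barbara", "wondering", mark_careful)   # report (7)
r8 = Report("Fred", "wondering", mark_careful)      # report (8)
r9 = Report("Ed", "d-thinking", mark_careful)       # report (9)

# Desired result: Barbara and Fred are thinking the same thing.
same_7_8 = standard_object(r7) == standard_object(r8)

# Unwanted result (R1): a wondering and a d-thinking share their object.
same_7_9 = standard_object(r7) == standard_object(r9)
```

On this toy model both comparisons come out true, which is exactly the consequence (R1) that the two considerations above are meant to rule out.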
Remaining options
My starting question was whether objects of thought, generally, are circumstances. I conclude that this is not the case. A couple of options are left:
A) the objects of d-thoughts are circumstances, but the objects of non-doxastic noetic events are not,
B) objects of thought, generally, those of d-thoughts included, are not circumstances.
The first option is exhausted by the following two alternatives:
A1) there is a natural genus of which circumstances are one species; the objects of non-doxastic noetic events make up other species of this genus,
A2) there is no single, natural genus in which the objects of doxastic as well as of non-doxastic noetic events may be located.
Neither A1) nor A2) seems right to me, though I have no knock-down arguments against them. Very briefly and roughly, my grounds for being disinclined to accept these alternatives are as follows. A1) requires species of items that are not circumstances, but are nevertheless like in kind to circumstances, falling under the same genus. No intuitively recognizable species fit this bill. A2) requires an ad hoc categorial distinction among objects of thought.
The project of this paper is to develop a proposal falling instead under option B). It should be noted that proposals have already been made falling under this option. I have in mind the views, for example, of Stenius and Hare, who postulated pairings of mood and content as representations of objects of thought.7 The problem with these proposals is just that no natural genus of thing was produced displaying the relevant characteristics represented by such pairings. In effect, objects of thought were represented by sentences in a formal language which included a component indicating something
akin to circumstance, and a component indicating mood or attitude, but still no clear conception was offered of what these formulas were designed to express. Hence, no account was offered of what the natural language counterparts of these formulas were supposed to express. It was presumed that indicatives, imperatives and interrogatives alike expressed things that are non-circumstantial in character, but exactly what the nature of such things might be was never squarely addressed, let alone resolved.
I claim:
i) an alternative under option B) is available that posits a single, recognizable genus in which the objects of d-thinkings, wishings and wonderings may all be located;
ii) members of this posited genus support the truth of (P4), the guiding principle of the present study, and
iii) the alternative in question is not faced with objections akin to those raised above, which undercut circumstantial accounts of the objects of non-doxastic thoughts.
Development of this alternative shall occupy the rest of this paper.
In passing, let me stress that I do hold that a notion of circumstance or condition or state of affairs deserves a central role in a theory of propositions, and in a theory of objects of thought generally. I contend, however, that we should not accept the role traditionally proposed, namely, that circumstances just are the objects of thought.
3. AN ALTERNATIVE CONCEPTION OF OBJECTS OF THOUGHT
According to the proposal I wish to advance, objects of thought are non-circumstantial entities: they are event types; indeed they are event types whose instances have been under discussion here all along - the noetic events. To prepare ground for this proposal, the present section is devoted to certain ontological matters. I want to give a reasonably careful account of how noetic events fit into the ontological framework of Russellian circumstances adopted here. Then I'll say briefly what I mean when I speak of types of event. The purpose will be to provide a clear conception of a certain genus of entity - noetic event types - that includes, so I'll claim, the objects of thought among its members. Let us consider, then, the ontological foundations of this view.
The ontology of noetic events
The notion of circumstance is basic to the thesis. On the perspective I favor, circumstances are seen to divide into at least two non-overlapping bunches: events and states of affairs. The main concern here is with the category of events. Events are typically distinguished from circumstances of other sorts in that they may properly be said to happen, occur or take place at a time; they are occurrent or "punctual" circumstances.
I shall suppose that the reader has an acceptable grasp of what I mean when I speak of "mental" events. Noetic events are a subspecies of the mental. The characterization of noetic events that I gave at the start was, roughly, this: they are events of intellect, they are the occurrent counterparts of the (so-called) propositional attitudes. I do not know how to do much better than this. I hope, though, that the reader is by now familiar with examples of events of the sort I have in mind. It should not be supposed, though, that the species of noetic event include only those whose instances we have been considering as examples. I am not claiming that d-thinkings, wishings, wonderings and hopings exhaust the kinds of noetic event.
Not all mental events are noetic. My experiencing a certain shade of color or experiencing a certain pain in my left knee might either one be accompanied by noetic events of all manner, but neither is a noetic event itself.
Perhaps a tree diagram will be useful (the branchings cited are not intended to be exhaustive):
Circumstances
    States of Affairs
    Events
        Mental
            Noetic
                D-thinkings
                Wishings
                Wonderings
            Sensory
I have adopted the Russellian view of circumstances according to which they are complex, structured entities. Consequently, since I take noetic events to form a subspecies of the category of circumstance, these events themselves shall be regarded here as complex, structured things. I adopt a canonical view on the constituency of noetic events: each noetic event, I shall suppose, consists in a particular relation holding between a
person and a circumstance (traditionally said, of course, to be a proposition). What are the relevant relations? For simplicity, I'll suppose that some (though not all) are the relations of d-thinking, wishing and wondering, and that these are the constituent relations of the noetic events that were our standard examples in the previous sections.
With this canonical view about the structure of noetic events (and following Kaplan in representing circumstances, generally, as ordered sequences), I'll be representing noetic events as certain ordered triples. Each of these triples contains (in this order): a relation, a person (or other creature capable of noetic activity), and a circumstance (a further ordered sequence). Consider a simple example: If I am thinking that Michael is astute, this event will be represented by the triple whose components are the relation of d-thinking, myself, and the circumstance of Michael's being astute:
⟨D-thinking, Mark, ⟨Being astute, Michael⟩⟩
A reader who has been keeping track of things may find that what I am calling the "canonical" view about the constituents of noetic events is reminiscent of what I called the "received" view about the objects of noetic events. The latter view was roughly that circumstances ("propositions" so-called) are the objects of noetic events. I am now proposing to adopt the canonical view, yet the heart of my criticism of circumstantial accounts, which in turn was the motivation for considering an alternative conception of objects of thought, was an argument against a certain consequence of the received view. So, it may seem that there is this view - the "canonical" or "received" view - the rejection of which is at the heart of my paper, but which I am now proposing to adopt.
This is a misapprehension; the received view and the canonical view are independent. Suppose that a person, call her s, is thinking something. Let e be this event of s's thinking. The canonical view adopted here proposes that this event is a complex entity that consists in a relation holding between s and a circumstance. I am a proponent of The Tradition; I take the quantification in 's is thinking something' seriously. I agree that there must be some relation that holds between s and what s is thinking iff s is thinking something. It does not follow from this or the canonical outlook on the constituency of e, that when we speak of what s is thinking we are referring to one of the constituents of e.
Consider an analogy. Plainly, we may speak of what a person is dancing: a person may dance the twist or the boogaloo, for example. What are we referring to when we speak of "the twist" or "the boogaloo"? It seems natural to say that it is a sort or type of dance. Then, if we suppose the person is dancing some dance, does that require that we view the event itself as consisting of a relation holding between this person and some type of dancing? This does not seem to be required. We may say instead that when one speaks of what a person is dancing, one thereby indicates, not a constituent of the event, but rather some particular type of dance of which the event is an instance.8
According to the thesis I wish to develop, objects of thought are types of noetic event. In the preceding discussion, I intended to give a reasonable idea of how noetic events fit into the ontological framework adopted here. Finally, then, before stating my proposal about the nature of objects of thought, I need to discuss briefly the sort of thing I have in mind when speaking of noetic event types.
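The ordered-triple representation, and the difference between a constituent of an event and a type the event instantiates, can be sketched as follows. This is only an illustration under stated assumptions: the class names, the use of strings for relations and persons, and the modelling of a type as a Python predicate are all choices of the sketch, not of the text.

```python
from typing import NamedTuple


class Circumstance(NamedTuple):
    relation: str        # the property or relation exemplified
    constituents: tuple  # the individuals it is exemplified by


class NoeticEvent(NamedTuple):
    relation: str           # d-thinking, wishing, wondering, ...
    subject: str            # the person doing the thinking
    content: Circumstance   # a further ordered sequence


# The triple for my thinking that Michael is astute.
michael_astute = Circumstance("being astute", ("Michael",))
e = NoeticEvent("d-thinking", "Mark", michael_astute)


def thinking_that_michael_is_astute(event: NoeticEvent) -> bool:
    """A type of noetic event, modelled (by assumption) as a predicate
    that is true of exactly its instances."""
    return event.relation == "d-thinking" and event.content == michael_astute


# The event is an instance of the type ...
is_instance = thinking_that_michael_is_astute(e)

# ... but the type is not one of the three constituents of the event,
# just as the twist is not a constituent of a particular dancing.
object_is_constituent = thinking_that_michael_is_astute in e
```

The point of the last line is the one made by the dance analogy: accepting the canonical view about what an event is built from leaves open whether what is thought is one of those building blocks.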
Types
There is a distinction commonly drawn in philosophical discussions between concrete, particular events, on one hand, and event types, on the other. John's kicking Bill at a particular time is an example of a concrete event. This event may be said to be an instance of many event types; for example, it's an instance of the somewhat specific type, kicking Bill - but the event is also, simply, a kicking. In other words, John's kicking Bill is an instance of a broader event type than that of kicking Bill; it is an instance of kicking, simpliciter. This latter type has concrete kickings of Andrew as well as concrete kickings of Bill among its instances.
When we speak of "kickings" we may mean either concrete events of the kicking type, or we may mean some of the various types of kicking. A similar ambiguity arises in our talk of noetic events. When we speak ordinarily of "thoughts" or "judgments", "wishes" or "questions", and when I spoke earlier of "d-thinkings", "d-thoughts", "wishings" and "wonderings", there are at least two sorts of things that we may mean. We may mean concrete noetic events - like Fred's thinking at a certain point in time that Mark is anxious, or Barbara's wishing right now that Mark would be careful. Alternatively, though, we may mean to speak of types of noetic event; just as we may mean to speak of types of event, not their concrete instances, when we speak of "kickings". It is true that there are a variety of things we can refer to by using the derived nominals: 'a thought', 'a judgment', 'a wish', 'a question', etc. Sometimes we thereby refer to an event or action, sometimes to the result of an event or action. But I believe that, in particular, when we use nominals of the forms,
the thought that θ,
the judgment that θ,
the wish that θ,
the question whether θ,
where θ is a sentence, what we refer to are not events, but rather certain kinds, sorts or types of event - noetic event types, in particular.
The nature of types is a controversial topic. Presumably, they are kinds of thing ... but then what are kinds? A common view is that a kind is a special property. But what kind of property would this be? Not all properties are kinds or types. Being red is a property, but many would resist the claim that it is a type of thing. According to another view about types and kinds, they are pluralities of a certain special sort, where a plurality is understood to be a set-like entity. A plurality is like a set in having members, but unlike a set in that it may be said to undergo change in membership over time. On this view we would say that event types are pluralities having particular concrete events as members. It may be that we still would be left with a question parallel to the one just raised for the view that types are properties. We would still want to know which of all the pluralities can be counted as kinds or types.
In any case, in what follows, I'll adopt the view that types are properties. I don't believe that the central proposal of this paper - that objects of thought are noetic event types - is committed to any particular treatment of types. I choose the property conception for the following reasons. First, properties were already countenanced in the ontological background for the Russellian conception of circumstance adopted here - so my proposal is committed to the notion of property anyway. I think if properties have already been acknowledged, we may as well have them do as much work for us as possible. Second, I am simply unfamiliar with the conception of types as pluralities, and do not know what characteristics types would be saddled with on that conception. It is easier for me to develop the proposal along lines with which I am better acquainted.
Formalities
Before proceeding, it will be useful to introduce a more regimented vocabulary in which we may state subsequent definitions and principles concerning the properties and logical behavior of the noetic event types that I shall take to be the objects of thought.
- Variables: 'P', 'R', 'T' (perhaps superscripted) will be variables for properties and relations, and may stand either in subject or predicate position in sentences. The latter series, the "T's", shall be understood to be property variables restricted to noetic event types. 'x', 'y' (perhaps with superscripts)
will be variables for individuals. 'c' (perhaps with superscripts) will be a variable for circumstances. 'e' (perhaps with superscripts) will be a variable restricted to noetic events. A class of variables restricted only to events at large will not be needed. We'll need variables for expressions of the language too:
For variables: 'v' (perhaps with superscripts) (also: 'v1', 'v2', ...)
and for expressions generally (typically, sentences or sentential clauses):
'θ' (perhaps with superscripts)
- Terminology for the standard logical notions should be obvious. □ shall serve as a sentence operator expressing logical necessity. I use 'λ' to form complex terms that may either stand as predicates or as singular terms for properties and relations. The formation of these terms goes according to the following (standard property abstraction): in general
'λv1 ... λvn[θ]'
is a term, provided v1 - vn each has at least one free occurrence in θ. An example here may be helpful: if 'x is careful' is the substituend for θ, then according to this schema we get
λx[x is careful]
as a complex term; on its intended interpretation, it is a term standing for the property of being careful. In what follows the proviso should be kept in mind that formulas may contain these terms in either subject or predicate positions.
It will be useful to adopt a few definitions for the constituents of noetic events. Every circumstance has a relational constituent, a property or relation such that the circumstance consists in that property or relation being exemplified. In the case of noetic events this is always the first component of the representative triple:
(D3) R is the relational constituent of e =df ∃x∃c(e = ⟨R,x,c⟩)   ('RELe = R' for short)
We may speak of "the subject" of a noetic event, the person who is doing the thinking, defined:
(D4) x is the subject of e =df ∃R∃c(e = ⟨R,x,c⟩)   ('SUBJe = x')
Also, on the canonical view of noetic events adopted here, there will be exactly one circumstance associated with any noetic event that may be thought of as "the content" of the event, according to:
(D5) c is the content of e =df ∃R∃x(e = ⟨R,x,c⟩)   (for short: 'CONTe = c')
It will be convenient to have some defined predicates expressing properties each of which is, for some particular species of noetic event, the property of being an event of that species. We do this by appeal to the predicates introduced above standing for the basic noetic relations:
(D6) JUDGe =df RELe = d-thinking
(D7) WISHe =df RELe = wishing
(D8) QUESe =df RELe = wondering
These are to be read, respectively: e is a judgment, e is a wish, e is a question. To say, according to this usage, that a noetic event is a judgment, a wish or a question, then, is to identify its relational constituent, and thereby, its species. A noetic event is a judgment if it is an event of d-thinking, a wish if it is an event of wishing, a question if it is an event of wondering.
With these defined monadic predicates, we may form terms that will be of central use in what follows, terms denoting noetic event types. For example: 'λe[QUESe]' denotes the property of being a question (which, given the conception of types adopted here, will be the general noetic event type of wondering, the most generic type all of whose instances are things I am calling "questions"). As a further example, suppose we had a predicate 'RASHe' standing for the property a noetic event has if it is rash (I am supposing that a judgment, for example, or a wish, may be said to be rash). We may then form a λ expression denoting the property of being a rash judgment:
'λe[JUDGe and RASHe]'
This completes the development of the ontological foundations. Before proceeding, let me take stock and consider what's to come.
At the outset of this section, I said I wished to advance the thesis that objects of thought are non-circumstantial entities, that they are types of noetic event. So I propose:
(T1) Objects of thought are noetic event types
Not just any noetic event type is an object of thought, though. And I should stress that I do not intend to provide an informative account of exactly which noetic event types are objects of thought. Such an account is not required by the central project that I set for myself in this paper. That project was, roughly, to find a class of items which could support the truth of the following natural extension of the cornerstone tenets of The Tradition:
(P4) i) Some imperatives express the objects of certain wishings,
     ii) some interrogatives express the objects of certain wonderings,
     iii) some indicative sentences express the objects of certain d-thinkings.
What I propose to do then is to specify a certain subclass of noetic event types whose members, I shall claim, are objects of thought. I intend to show that the members of this class do support the truth of (P4). We shall also see that in taking these noetic event types to be objects of thought, we are not faced with the problem concerning the objects of non-doxastic thoughts discussed in Section 2.
A subclass of the objects of thought
Consider, then, all noetic event types that satisfy the following definition:
(D9) T is a content-specific noetic event type =df □(e and e' are instances of T iff
     a) RELe = RELe'
     b) CONTe = CONTe')
The intuitive idea behind (D9) is this: a content-specific noetic event type is one that specifies the relational constituent of its instances, specifies the content of its instances, but specifies no other features of its instances besides these two. The types satisfying (D9) shall occupy our attention in the rest of the paper, so it may be helpful to consider some simple examples of how the definition is to be applied.
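Definitions (D3) - (D8) and the λ-terms built from them lend themselves to a small executable gloss. The following is a sketch only: the Event class, the string labels for the noetic relations, and the set RASH_EVENTS standing in for the extension of the hypothetical predicate 'RASHe' are assumptions made for the illustration.

```python
from typing import NamedTuple


class Event(NamedTuple):
    relation: str   # first component of the representative triple
    subject: str    # second component
    content: tuple  # third component: a circumstance, e.g. (property, individual)


def REL(e: Event) -> str:     # (D3) relational constituent
    return e.relation


def SUBJ(e: Event) -> str:    # (D4) subject
    return e.subject


def CONT(e: Event) -> tuple:  # (D5) content
    return e.content


def JUDG(e: Event) -> bool:   # (D6) e is a judgment
    return REL(e) == "d-thinking"


def WISH(e: Event) -> bool:   # (D7) e is a wish
    return REL(e) == "wishing"


def QUES(e: Event) -> bool:   # (D8) e is a question
    return REL(e) == "wondering"


e7 = Event("wondering", "Barbara", ("being careful", "Mark"))
e9 = Event("d-thinking", "Ed", ("being careful", "Mark"))

# Hypothetical extension of 'RASHe'; which events are rash is stipulated.
RASH_EVENTS = {e9}


def RASH(e: Event) -> bool:
    return e in RASH_EVENTS


# λ-terms as predicates: the property of being a question, and the
# property of being a rash judgment.
being_a_question = lambda e: QUES(e)
rash_judgment = lambda e: JUDG(e) and RASH(e)
```

Note that e7 and e9 agree on content but differ in relational constituent, so they fall under different species even before (D9) is brought in.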
We noted above an over-arching noetic event type of questions; there is also a most generic type of judgments:
J: λe[JUDGe]
This fails to satisfy (D9) because there are instances of J that have different contents. So, clause ii) b) isn't met. How about the following more specific type?
J': λe[JUDGe & ∃y(CONTe = ⟨λx[x is anxious], y⟩)]
This is the type: thinking of some person that he or she is anxious. Think of it as the most generic type all of whose instances are de re judgments of some person or other that he or she is anxious (not a type whose instances are de dicto judgments that an anxious person exists). J' also fails to satisfy the definition. Here although the content is partly specified - the first component of the content has to be the property of being anxious - the content is not completely specified. To see what I mean, consider the following two noetic events, each of which is an instance of J':
e1: ⟨d-thinking, Michael, ⟨λx[x is anxious], Mark⟩⟩   the event of Michael's thinking that Mark is anxious
e2: ⟨d-thinking, Mark, ⟨λx[x is anxious], Michael⟩⟩   the event of Mark's thinking that Michael is anxious
Though both are instances of J', they do not involve the same contents: ⟨λx[x is anxious], Mark⟩ is the content of e1, ⟨λx[x is anxious], Michael⟩ is the content of e2. So J' doesn't satisfy clause b): there are instances of the type that do not share contents.
Some noetic event types are, in a sense, too specific to count. They specify the noetic relation and content of their instances but specify more besides. For example, consider the following noetic event types:
Q: wondering furtively whether Fred is happy
W: wishing at 12:00 noon that Mark would not be anxious
J'': thinking rashly that Laurie is happy
It can be seen that these do not satisfy (D9). For each of these types we can produce a pair of noetic events that have their noetic relation and content in common, where one member of the pair is an instance of the type, while the other is not; thus, clause a) will not be met. Consider, for example, J″, which may be formulated

λe[JUDGe & RASHe & CONTe = ⟨λx[x is happy], Laurie⟩]

Now consider the noetic events described in the following two cases:

a) Mark is thinking that Laurie is happy, but he is doing so very rashly: he has just considered the matter, and without any good evidence has hastily come to the conclusion that she is happy because, he "reasoned", all blondes are happy. Let 'e3' denote the noetic event here of Mark's thinking that Laurie is happy.

b) Fred is thinking that Laurie is happy, but he is not doing so at all rashly: he has just seen Laurie, seen that she is smiling and knows from much past experience that Laurie is a person who smiles only if happy. A cautious person, Fred has reflected for a moment and concluded that Laurie is happy. Call the noetic event here of Fred's thinking that Laurie is happy 'e4'.

Plainly, e3 is an instance of J″, e4 is not, for e3 does while e4 does not have the property of being rash. However, both e3 and e4 are judgments, hence the events have the same relational constituent. And both have the circumstance of Laurie's being happy as their content. Since they have relational constituent and content in common, but one is an instance of J″ while the other is not, J″ fails to satisfy clause a) of (D9). Similar considerations would also show that neither W nor Q satisfies this clause of the definition.

What are some examples of noetic event types that do satisfy the definiens of (D9)? Here are some:

J*: Thinking that Fred is careful
W*: Wishing that Mark would not be anxious
Q*: Wondering whether Laurie is happy

Take J* (analogous points can be made, mutatis mutandis, with respect to the other two). The first component of any event that is an instance of J* is the relation of d-thinking; the content of any such event is the circumstance of Fred's being careful. Unlike the cases of J and J′, no two instances of this type will involve different contents. Unlike the case of J″, no two noetic events that agree on noetic relation and content will split with respect to instantiating J*. Either both will be instances or both not.
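The two clauses of (D9) discussed above can be tried out mechanically on a small, finite stock of events. The following Python sketch is only an illustration under simplifying assumptions: events are flattened into records, types are modeled as Boolean tests on events, and the check ranges over a fixed list of events rather than over all possible events, as the definition itself presumably requires. The names `Event`, `key`, `is_content_specific` and the extra event `e5` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Event:
    relation: str                            # noetic relation: "JUDG", "WISH", "QUES"
    subject: str                             # who is doing the thinking
    content: Tuple[str, str]                 # (property, individual), e.g. ("anxious", "Mark")
    features: FrozenSet[str] = frozenset()   # anything else, e.g. {"rash"}


EventType = Callable[[Event], bool]          # a type is modeled as a test on events


def key(e: Event) -> Tuple[str, Tuple[str, str]]:
    """The two things a content-specific type is allowed to specify."""
    return (e.relation, e.content)


def is_content_specific(t: EventType, domain: List[Event]) -> bool:
    instances = [e for e in domain if t(e)]
    # clause b): any two instances agree on relation and content
    uniform = all(key(a) == key(b) for a, b in product(instances, repeat=2))
    # clause a): events that agree on relation and content never split on t
    no_split = all(t(a) == t(b)
                   for a, b in product(domain, repeat=2) if key(a) == key(b))
    return uniform and no_split


e1 = Event("JUDG", "Michael", ("anxious", "Mark"))
e2 = Event("JUDG", "Mark", ("anxious", "Michael"))
e3 = Event("JUDG", "Mark", ("happy", "Laurie"), frozenset({"rash"}))
e4 = Event("JUDG", "Fred", ("happy", "Laurie"))
e5 = Event("JUDG", "Barbara", ("careful", "Fred"))   # hypothetical extra event
DOMAIN = [e1, e2, e3, e4, e5]

J = lambda e: e.relation == "JUDG"
J_prime = lambda e: e.relation == "JUDG" and e.content[0] == "anxious"
J_double_prime = lambda e: key(e) == ("JUDG", ("happy", "Laurie")) and "rash" in e.features
J_star = lambda e: key(e) == ("JUDG", ("careful", "Fred"))

for name, t in [("J", J), ("J'", J_prime), ("J''", J_double_prime), ("J*", J_star)]:
    print(name, is_content_specific(t, DOMAIN))
```

On this toy domain only J* passes: J and J′ fail the instance-uniformity check (clause b), and J″ fails the no-split check (clause a) because e3 and e4 agree on relation and content but differ on rashness.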
I propose, then, that the noetic event types satisfying (D9) are objects of thought; that is:

(T2) All content-specific noetic event types are objects of thought

What is common to diverse events of thinking in virtue of which we may correctly say that they have the same object is, I claim, at least in some cases the content-specific noetic event type of which the diverse events are all instances. So, I propose that at least in some such cases, it is that content-specific type, the one having the diverse events of thinking as its instances, the very property itself, that is what we are referring to when we speak of what the subjects of those events are thinking. The object of thought just is that type. Let us turn now to consider some applications of this proposal, in particular, how (P4) may be upheld.

4. THE NEW CONCEPTION AT WORK

I wish to consider an assignment relating sentences and events of thinking to content-specific noetic event types. The assignment applies to a very simple fragment of English including imperatives, interrogatives and indicatives, and some rudimentary that- and whether-clauses that can be embedded in sentences that serve to report events of thinking. The assignment is intended to work as follows. The type assigned to a sentence will serve as the object of thought expressed by the sentence; the type assigned to a that- or whether-clause is intended to serve as the object of thought for an event reported by a sentence embedding that clause. In the long run, I should want to consider a much richer fragment than the one to be considered here. For the purposes of this paper, only a simple illustration is needed, sufficing to give an idea of i) the association of noetic event types as objects of thought to sentences and to events of thinking, and ii) how my proposal circumvents the problem faced by circumstantial accounts in identifying the objects of non-doxastic thoughts. I'll begin by citing the vocabulary and formation rules of the fragment. Then I'll specify a compositional assignment of sentences, and of that- and whether-clauses, to content-specific noetic event types.

The English fragment F

Most of the names and verb phrases of the fragment are already present in the language in which we have been preparing to discuss the assignment.⁹
- The Basic Vocabulary of F:
Names: 'Michael', 'Barbara', 'Fred', 'Laurie', 'Mark'
Basic one-place predicates: 'is careful', 'is anxious', 'is happy'
Two-place predicates: 'is thinking', 'is wondering', 'is wishing'
Subjunctive and imperative copula: 'be'
Modal auxiliary: 'would'
Complementizers: 'that', 'whether'
Punctuation: ',' '.' '?'

Besides the basic vocabulary cited here, there are complex expressions in F.

- Sentential Clauses and One-place Predicates: The definitions here proceed by simultaneous induction:

i) Let N and ⌜is Θ⌝ be, respectively, a name and one-place predicate; then Φ is a sentential clause if either
a) Φ is an indicative clause (= ⌜that N is Θ⌝),
b) Φ is an interrogative clause (= ⌜whether N is Θ⌝), or
c) Φ is a subjunctive clause (= ⌜that N would be Θ⌝).

ii) Θ is a one-place predicate iff either
i) Θ is a basic one-place predicate of F, or
ii) Θ is the result of attaching either
a) an indicative clause to the right of the two-place predicate 'is thinking',
b) an interrogative clause to the right of the two-place predicate 'is wondering', or
c) a subjunctive clause to the right of the two-place predicate 'is wishing'.

iii) Nothing is a sentential clause or a one-place predicate unless obtained from steps i) or ii).

The definition of 'sentence of F' will appeal to three syntactic operations acting on name/predicate pairs.

- The sentential operations, IND, INT and IMP: Where N is a name and ⌜is Θ⌝ a one-place predicate of F,

IND(N, ⌜is Θ⌝), an indicative, = ⌜N is Θ.⌝
INT(N, ⌜is Θ⌝), an interrogative, = ⌜Is N Θ?⌝
IMP(N, ⌜is Θ⌝), an imperative, = ⌜N, be Θ.⌝

All indicatives, interrogatives and imperatives are obtained by application of the above operations to pairs of name and one-place predicate. Now the definition of sentence of F is straightforward.
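The formation rules of F are simple enough to be sketched as a handful of string-building functions. The Python below is a hypothetical illustration, not part of the paper's apparatus: a one-place predicate ⌜is Θ⌝ is represented by the string Θ alone (what follows 'is'), the class `Clause` and the helper `complex_predicate` are invented names, and membership of names and basic predicates in the vocabulary is not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    kind: str   # "indicative", "interrogative" or "subjunctive"
    text: str


def indicative_clause(name: str, pred: str) -> Clause:
    return Clause("indicative", f"that {name} is {pred}")


def interrogative_clause(name: str, pred: str) -> Clause:
    return Clause("interrogative", f"whether {name} is {pred}")


def subjunctive_clause(name: str, pred: str) -> Clause:
    return Clause("subjunctive", f"that {name} would be {pred}")


# Each two-place predicate accepts exactly one kind of clause (step ii of the induction).
ATTITUDE_VERBS = {
    "thinking": "indicative",
    "wondering": "interrogative",
    "wishing": "subjunctive",
}


def complex_predicate(verb: str, clause: Clause) -> str:
    """Attach a clause to the right of 'is thinking' / 'is wondering' / 'is wishing'."""
    if ATTITUDE_VERBS[verb] != clause.kind:
        raise ValueError(f"'is {verb}' cannot take a clause of kind {clause.kind}")
    return f"{verb} {clause.text}"


def IND(name: str, pred: str) -> str:
    return f"{name} is {pred}."


def INT(name: str, pred: str) -> str:
    return f"Is {name} {pred}?"


def IMP(name: str, pred: str) -> str:
    return f"{name}, be {pred}."


nested = IND("Michael", complex_predicate(
    "thinking", indicative_clause("Fred", complex_predicate(
        "wondering", interrogative_clause("Laurie", "happy")))))
print(nested)
```

Because complex predicates are again one-place predicates, they can be fed back into the clause builders, which is what yields multiply embedded sentences.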
Sentence of F: Φ is a sentence of F iff Φ is either an indicative, an imperative, or an interrogative.

Besides simple sentences such as

Laurie is happy.
Mark is anxious.

this definition of sentence, together with the recursive specifications of sentential clause and one-place predicate, gives us much more complex sentences consisting of multiply embedded sentential clauses. For example, the following are well-formed sentences of F:

Michael is thinking that Fred is wondering whether Laurie is happy.
Mark is wishing that Barbara would be wishing that Mark would be careful.

The next step is to specify a function assigning to each sentence S of F a noetic event type that fits our intuitions about what is expressed by S.

The assignment

The assignment function, *, is compositional, operating on names, one-place predicates, sentential clauses and sentences, as follows.

i) Where N is a name, N* is just the denotation of the name.

Examples: 'Barbara'* = Barbara, 'Fred'* = Fred

Specification of * for one-place predicates goes in three steps: first to basic one-place predicates, then to sentential clauses, then to the complex, one-place predicates formed with the sentential clauses.

ii) Where P is a basic one-place predicate of F, P* is the denotation of the λ-term formed from that predicate. (Keep in mind F has some predicates in common with our background language.)
Example: 'is careful'* = λx[x is careful] (i.e., the property of being careful)

iii) Where Φ is a sentential clause,
a) if Φ is ⌜that N is P⌝, Φ* = λe[JUDGe & CONTe = ⟨⌜is P⌝*, N*⟩]
b) if Φ is ⌜that N would be P⌝, Φ* = λe[WISHe & CONTe = ⟨⌜is P⌝*, N*⟩]
c) if Φ is ⌜whether N is P⌝, Φ* = λe[QUESe & CONTe = ⟨⌜is P⌝*, N*⟩]

Example: 'that Barbara is careful'*
= λe[JUDGe & CONTe = ⟨'is careful'*, 'Barbara'*⟩]
= λe[JUDGe & CONTe = ⟨λx[x is careful], Barbara⟩]

Next, the complex, one-place predicates. In general, for any complex, one-place predicate of F, there is a largest sentential clause from which it has been formed; call this "the sentential clause" of the predicate. Then we posit:

iv) Where P is a complex, one-place predicate, and Φ is the sentential clause of P: P* = λx[∃e(Φ*e & SUBJe = x)]

An example: 'is thinking that Barbara is careful'*
= λx[∃e('that Barbara is careful'*e & SUBJe = x)]
(See the previous example to confirm the *-assignment cited here for 'that Barbara is careful'.)
= λx[∃e(λe′[JUDGe′ & CONTe′ = ⟨λy[y is careful], Barbara⟩]e & SUBJe = x)],

and by eliminating the λe′-term inside, we can simplify further:

= λx[∃e(JUDGe & CONTe = ⟨λy[y is careful], Barbara⟩ & SUBJe = x)].

The result: the predicate 'is thinking that Barbara is careful' is assigned the property that a person has in virtue of engaging in a judgment whose content is the circumstance of Barbara's being careful.

Finally, * acts on sentences according to the following:

v) Where S is a sentence of F, N a name of the basic vocabulary, and P a one-place predicate of F, then either
a) S = IND(N, P) and S* = λe[JUDGe & CONTe = ⟨P*, N*⟩], or
b) S = IMP(N, P) and S* = λe[WISHe & CONTe = ⟨P*, N*⟩], or
c) S = INT(N, P) and S* = λe[QUESe & CONTe = ⟨P*, N*⟩].
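Clauses iii) to v) of the assignment can be mirrored in a few lines of Python. This is a minimal sketch under stated assumptions: properties such as λx[x is careful] are stood in for by plain strings, a content-specific type is represented by the pair of relation and content it specifies rather than by a λ-term, and the existential quantifier in clause iv) ranges over an explicitly supplied list of events. All identifiers (`Circumstance`, `ContentSpecificType`, `star_clause`, `star_sentence`, `star_complex_predicate`) are hypothetical.

```python
from typing import Callable, List, NamedTuple


class Circumstance(NamedTuple):
    prop: str          # stands in for a property such as λx[x is careful]
    individual: str    # the denotation of the name


class ContentSpecificType(NamedTuple):
    relation: str      # "JUDG", "WISH" or "QUES"
    content: Circumstance


class Event(NamedTuple):
    relation: str
    subject: str
    content: Circumstance


# Clause iii): sentential clauses, keyed by their syntactic frame.
RELATION_OF_CLAUSE = {"that_is": "JUDG", "that_would_be": "WISH", "whether_is": "QUES"}
# Clause v): sentences, keyed by the sentential operation that built them.
RELATION_OF_MOOD = {"IND": "JUDG", "IMP": "WISH", "INT": "QUES"}


def star_clause(frame: str, name: str, pred: str) -> ContentSpecificType:
    return ContentSpecificType(RELATION_OF_CLAUSE[frame], Circumstance(pred, name))


def star_sentence(mood: str, name: str, pred: str) -> ContentSpecificType:
    return ContentSpecificType(RELATION_OF_MOOD[mood], Circumstance(pred, name))


def instantiates(e: Event, t: ContentSpecificType) -> bool:
    return (e.relation, e.content) == (t.relation, t.content)


def star_complex_predicate(clause_type: ContentSpecificType,
                           events: List[Event]) -> Callable[[str], bool]:
    """Clause iv): the property of being the subject of some event of the clause's type."""
    return lambda x: any(instantiates(e, clause_type) and e.subject == x for e in events)


events = [Event("JUDG", "Michael", Circumstance("careful", "Barbara"))]
clause = star_clause("that_is", "Barbara", "careful")
is_thinking_that_barbara_is_careful = star_complex_predicate(clause, events)
print(is_thinking_that_barbara_is_careful("Michael"))
print(clause == star_sentence("IND", "Barbara", "careful"))
```

One consequence visible in the sketch is that the indicative 'Barbara is careful.' and the clause 'that Barbara is careful' receive the same type, which is what lets the type serve both as what the sentence expresses and as the object of a reported thinking.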
I'm set now to fulfill some promises. I wish to advance two proposals, one identifying the things expressed by the sentences of F, the other identifying the objects of certain noetic events. The proposals place constraints on what things are expressed by sentences (of F), and what things are the objects of thought (of d-thinkings, wishings and wonderings), and together these two proposals entail (P4). The first is very straightforward:

(T3) ∀Φ(if Φ is a sentence of F, then Φ expresses Φ*)

The claim, then, is that with respect to the sentences of this simple fragment of English, what such a sentence expresses is the noetic event type assigned to it by the * function. I'll consider some examples shortly to motivate the claim. The second thesis associates noetic events with their objects:

(T4) ∀e ∀c:
i) the object of e = λe[JUDGe & CONTe = c] iff JUDGe & CONTe = c
ii) the object of e = λe[WISHe & CONTe = c] iff WISHe & CONTe = c
iii) the object of e = λe[QUESe & CONTe = c] iff QUESe & CONTe = c

Put a little roughly, (T4) tells us that the objects of events of d-thinking, wishing and wondering are simply their respective content-specific types, and that only such events have those types as their objects. Let's turn to examine consequences of (T3) and (T4), and to see how these proposals accommodate (P4).

What some indicatives, imperatives and interrogatives express

From (T3) and (T4) together with some facts we noted about the assignment function * we get the following result:

(R2)
i) indicatives of F express objects of d-thinkings.
ii) imperatives of F express objects of wishings.
iii) interrogatives of F express objects of wonderings.

It is not hard to see how (R2) is derived. Take the case of indicatives, clause i). Let Φ be any indicative sentence of F. By (T3), Φ expresses Φ*. But if you examine how * operates on indicative sentences, you'll see that Φ* is a
content-specific judgment type. By (T4), such types are the objects of d-thinkings. Since choice of Φ was arbitrary among the indicatives of F, we get that the indicatives of F express objects of d-thinkings. Exactly analogous reasoning yields clauses ii) and iii). (P4) follows directly from (R2).

It has been a long-standing puzzle what to say are the things expressed by imperatives and interrogatives. Moreover, consider the following sentences of F:

(16) Fred is careful.
(17) Fred, be careful.
(18) Is Fred careful?

One is strongly inclined to say that no one of these sentences expresses the same thing as any of the others. Nevertheless, there does seem to be a sense in which, though these sentences express different things, the things they express may be said to concern the same state of affairs or circumstance, that of Fred's being careful. These points concerning (16)-(18) seem to hold, generally, for all such trios of indicative, imperative and interrogative formed from the same name and predicate. Finally, I think it is natural to hold that, in general, there is a similarity in kind between things expressed by any of the three moods, a common genus of things that are expressed by our utterance of sentences. Here, then, are some intuitive desiderata:

For any trio of indicative, interrogative and imperative formed from the same name and predicate,
a) each one expresses something different from the thing expressed by either of the others, but
b) the things they express may all be said in a certain sense to concern the same circumstance, and
c) things expressed by (utterances of) sentences fall under a common, natural genus.

On the present view, with respect to the sentences of the fragment, F, each of these points is satisfied. Before showing this to be the case, let me review briefly some matters that were raised when I first discussed the notion of event type. I noted then that our ordinary usage of the terms 'judgment', 'wish', 'question' and 'thought' is ambiguous. When we use one of these terms we may mean to be speaking of a noetic event, but we also may mean a noetic event type. From here on, when I wish to speak of noetic events, I'll use 'judgment' (for d-thoughts), 'wishing', or 'wondering'. When I mean to speak of noetic event types, I adopt a slight contrivance in order to disambiguate: I'll use capital letters; in particular, I'll use 'JUDGMENT', 'WISH', and 'QUESTION', respectively, for types of judgment, wishing and wondering.

I also contended earlier that when we speak of "the judgment that Φ", "the wish that Φ", or "the question whether Φ", where Φ is a sentence, our reference is not ambiguous; we are speaking of types. In fact, when the embedded sentence in such a phrase is a sentence of F, the type denoted is, I claim, a content-specific noetic event type. This claim may now be formulated precisely; we can specify for every such term exactly which content-specific noetic event type it denotes. Let N be a proper name of F, and Θ be any expression such that ⌜is Θ⌝ is a one-place predicate of F; then I propose that all instances of the following schemas are true:

(T5) the JUDGMENT that N is Θ = λe[JUDGe & CONTe = ⟨'is Θ'*, N*⟩]
(T6) the WISH that N would be Θ = λe[WISHe & CONTe = ⟨'is Θ'*, N*⟩]
(T7) the QUESTION whether N is Θ = λe[QUESe & CONTe = ⟨'is Θ'*, N*⟩]

Armed with these identities, let's see how we arrive at the desiderata, a)-c), cited above. I'll work backwards.

The three categories of sentence in the fragment are associated by * with different categories of object of thought. Every item assigned by * to a sentence of F is of the same basic genus as any other: they are all noetic event types. Indicatives (true/false ones) express JUDGMENTS, interrogatives express QUESTIONS, and imperatives express WISHES. It sounds natural enough, but moreover, we have provided a precise account of what these things are, and which of them gets expressed by which indicative, imperative or interrogative in our fragment. So point c) is satisfied.

Now consider what's expressed according to the present proposals by the sample sentences (16)-(18). These get assigned, respectively, to the following types (the reader is invited to check the results):

(16*) λe[JUDGe & CONTe = ⟨λx[x is careful], Fred⟩]
(17*) λe[WISHe & CONTe = ⟨λx[x is careful], Fred⟩]
(18*) λe[QUESe & CONTe = ⟨λx[x is careful], Fred⟩]

(16*)-(18*) are different things.¹⁰ Nevertheless, there is a clear sense in which these things expressed by (16)-(18) concern the same circumstance: each one is instantiated solely by events having the same circumstance, that of Fred's being careful, as their content. And it is easy to see that analogous points hold true with respect to the * assignments to any trio of sentences of F formed from the same name and predicate. So we get points a) and b).
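The three desiderata can be checked directly for the trio formed from 'Fred' and 'is careful'. The sketch below is illustrative only and rests on the same simplifying assumptions as any finite model of the assignment: a type is represented by the relation and content it specifies, the property of being careful is stood in for by a string, and the names `NoeticEventType` and `expressed_by` are hypothetical.

```python
from typing import NamedTuple, Tuple


class NoeticEventType(NamedTuple):
    relation: str                 # "JUDG", "WISH" or "QUES"
    content: Tuple[str, str]      # (property, individual)


MOOD_TO_RELATION = {"IND": "JUDG", "IMP": "WISH", "INT": "QUES"}


def expressed_by(mood: str, name: str, predicate: str) -> NoeticEventType:
    """What a sentence of the given mood expresses, following the sentence clause of *."""
    return NoeticEventType(MOOD_TO_RELATION[mood], (predicate, name))


# 'Fred is careful.', 'Fred, be careful.', 'Is Fred careful?'
trio = [expressed_by(mood, "Fred", "careful") for mood in ("IND", "IMP", "INT")]

pairwise_distinct = len(set(trio)) == 3                              # desideratum a)
same_circumstance = len({t.content for t in trio}) == 1              # desideratum b)
common_genus = all(isinstance(t, NoeticEventType) for t in trio)     # desideratum c)

print(pairwise_distinct, same_circumstance, common_genus)
```

Nothing in the check depends on the particular name and predicate chosen, which reflects the claim that the points hold for any such trio in F.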
Objects of thought

Let us see how the problem of the objects of non-doxastic thoughts gets handled on the present view. First, let's consider the case of judgments. If both of these sentences

(5) Laurie is thinking that Mark is anxious
(6) Ed is thinking that Mark is anxious

are true, then we should say that Laurie and Ed are thinking the same thing. The circumstantial account that was considered and rejected in section II gets this right; according to it, both noetic events reported by these sentences have the same object: the circumstance of Mark's being anxious. Can the present account use noetic event types to equal advantage? Yes. According to the present account, the object of each of the events reported by (5) and (6) is a JUDGMENT, and moreover, the same JUDGMENT is associated with both cases, namely:

λe[JUDGe & CONTe = ⟨λx[x is anxious], Mark⟩]
the JUDGMENT that Mark is anxious

But now what about non-doxastic thoughts? Consider, again, sentences:

(7) Barbara is wondering whether Mark is careful
(8) Fred is wondering whether Mark is careful

The trick is to assign a common object to the thoughts reported by these sentences, distinct from the object of the thought reported by

(9) Ed is thinking that Mark is careful

The circumstantial account could not handle this problem. On the present account there's no difficulty. (7) and (8) both report thoughts whose object is the type,

(type1) λe[QUESe & CONTe = ⟨λx[x is careful], Mark⟩]

this is a different thing from the object of the judgment reported by (9), namely
(type2) λe[JUDGe & CONTe = ⟨λx[x is careful], Mark⟩]

(type1) is a QUESTION, (type2) is a JUDGMENT. This precisely fits the intuitions I cited for claiming that we should not say on the basis of the truth of (7)-(9) that Ed is thinking the same thing as Fred and Barbara. I said then that it is a question that is crossing the minds of the latter two whereas a judgment is crossing Ed's mind; that was correct, but now the claim can be made without ambiguity: it is a QUESTION that is crossing their minds. Moreover, as desired, we can say that what is crossing Fred's and Barbara's minds is something that, like what's crossing Ed's mind (if (7)-(9) are true), concerns Mark's being careful. For all events that instantiate either (type1) or (type2) have the circumstance of Mark's being careful as their content.

Let me summarize what has been accomplished so far. I am advancing a view that consists of proposals (T1)-(T7). Here are some of its features:

1) the view affords a conception of the objects of thought according to which they are seen to fall under a single, natural and recognizable genus,
2) the view entails (P4), and
3) the resulting conception of the objects of thought does not succumb to the problem posed in Section 2 concerning the objects of non-doxastic thoughts.

These were the merits I claimed for the account at the end of Section 2. Moreover, with respect to our fragment, F, the view affords a clear and natural answer to the question: What do imperative and interrogative sentences express? The answer accords with what the untutored might have been inclined to say all along: that imperatives express wishes and interrogatives express questions (the untutored don't know to use capitals).

5. PROPOSITIONS AND CONCLUSIONS

Propositions, judgments and truth

So far, I have isolated a class of JUDGMENTS, WISHES and QUESTIONS all of which, I claim, are objects of thought. But I haven't said anything yet about the objects of thought that have been the principal concern in the Fregean Tradition; I haven't said anything yet about propositions. According to (D2), a proposition is a truth-valued object of thought. Then, a full account of the nature of propositions ought to do two things: 1) it should pick out a class of objects of thought that bear truth-values, and 2) it should explain what it is for such items to be true or false. The account I am about to give parallels in certain respects the preceding account of objects of thought, generally. From (T1), the thesis that all
objects of thought are noetic event types, together with (D2), the definition of proposition, it follows that all propositions are noetic event types. But since I do not claim that all such types are objects of thought, a fortiori they are not all propositions. Nor can I say in any informative way which noetic event types are propositions. What I shall do, though, analogous to the course taken above in addressing the general question of which noetic event types are objects of thought, is to mark off a subclass of types whose members all are propositions. The character of the members of this subclass is illustrative of the character of propositions at large. Or so I claim. In the previous section, content-specific JUDGMENTS were associated by the function * to the indicative sentences of F. My proposal is that all such noetic event types are propositions:

(T8) All content-specific JUDGMENTS are propositions

I also propose to accept certain identifications concerning which propositions are which JUDGMENTS. Where N and ⌜is Θ⌝ are, respectively, a name and one-place predicate of F, then instances of the following hold:

(T9) The proposition that N is Θ = λe[JUDGe & CONTe = ⟨'is Θ'*, N*⟩]

As a consequence of (T9) and (T3), we get all instances of:

(19) 'N is Θ' expresses the proposition that N is Θ.

The claims asserted by instances of this schema seem to me to be exactly what we should want. I wish to turn now to the matter of truth-value. If content-specific JUDGMENTS are propositions, then such JUDGMENTS must all be possibly such as to have truth-values. What makes a JUDGMENT true or false?

Roughly, my proposal is this: a JUDGMENT is true just in case its content obtains. This is rough in part because, strictly speaking, the notion of content has been defined for noetic events, not their types. But we can fix that:

(D10) contT =df the c such that □(∀e(if e instantiates T, then CONTe = c))

The content of a type, then, is the circumstance such that necessarily, any event instantiating the type has that circumstance as its content. Now we can say, speaking strictly of the content of a type:

(D11) T is true =df T is a JUDGMENT & contT obtains.

Let's consider a simple illustration of how (D11) applies to the particular JUDGMENTS I have singled out as propositions. Here is a representative proposition:
(type3) λe[JUDGe & CONTe = ⟨'is astute'*, 'Michael'*⟩]

This is a proposition because it is a content-specific JUDGMENT. Since it is a proposition, it is possibly true or false. Under what conditions is it true? By (D11): just in case its content obtains. So we need to establish which circumstance is the content of the type. No event can instantiate (type3) unless the content of that event is the circumstance of Michael's being astute (Hint: apply * to the relevant vocabulary items and consider the fact that (type3) is specific with respect to content). Then by (D10), Michael's being astute is the content of the type. We have the result, then,

(20) The proposition that Michael is astute is true iff the circumstance of Michael's being astute obtains.

This accords, it seems to me, with what we should have expected to be the case, a priori.

Conclusions

Let's consider a brief list of some of the important things not accomplished in this paper; then, in defense, I'll conclude by pointing out how, in many ways, far less might have been accomplished. A short list of shortcomings:

i) No account has been given here of how the logical or semantic relations can be seen as operating with these "new" objects of thought making up their fields.
ii) No set of necessary and jointly sufficient conditions has been given for being an object of thought.
iii) The fragment considered was meager; it included no complex singulars (e.g. definite descriptions) and no quantification, let alone such difficult items as indexicals, demonstratives, or any sort of intensional operator.

I think that it is reasonable to undertake study of the logical and semantic relations independently of having a clear conception of what the terms of those relations are. By the same token, I think it a reasonable task to undertake a study of the nature of the terms of those relations independently of exploring the logical characteristics of those terms. This latter task is the one I have taken up here. It might be claimed that the logical and semantic relations are definitive of their terms, that there simply are no characteristic features of those terms other than those features having to do with how such terms behave under the logical and semantic relations (a number just is whatever obeys the Peano Postulates). Any proponent of The Tradition who holds that the objects of thought are terms of logical and semantic relations should not accept this perspective. The cornerstones of the Tradition and related extensions, like our thesis (P4), provide a wealth of characteristic features independent of the logical and semantic ones.

It is true that I have offered no set of necessary and jointly sufficient conditions for a thing's being an object of thought. However, I have offered a necessary condition: a new conception of what genus of entity the objects
of thought fall under. I have suggested that we look at the objects of thought as types of thinking, rather than as circumstantial entities. One advantage of this new conception is that, in contrast to the circumstantial conceptions that have arisen within the Fregean Tradition, we have a coherent account of the objects of non-doxastic thinkings. I have also made what seems to me a good start at deploying this new conception, for I have used it to isolate the things expressed by (utterances of) sentences of several very familiar albeit simple forms. The results of this deployment seem to me to accord exactly with our intuitions.

It is true that no logically tricky or complex constructions are included in the fragment. But my purpose in presenting a fragment of English has not been to show how this new view could afford insights into the proper treatment of the logic and semantics of natural language. It was intended to serve in illustrating how the present account supports the truth of (P4) while handling the particular problem addressed in section II, of identifying the objects of non-doxastic thoughts. It seems to me the fragment serves this purpose well enough. I do tend to think the alternative conception of objects of thought laid out here offers new handles in grappling with some long-standing problems in various areas of semantic analysis. Discussion of such putative applications, though, will have to be left for occasions other than the present one.

NOTES

1. See, for example, Frege (1956), Russell (1918) and Moore (1953, chap. 3).

2. According to the standard dictionary entry, to say something is "noetic" is to say that it is "of or pertaining to ... the intellect; characterized by, or consisting in intellectual activity". This seems reasonably appropriate. Husserl made use of a related locution, 'noema', in his work on intentionality. My use of 'noetic' is actually derived from some of Alvin Plantinga's recent work in epistemology. Plantinga speaks of a person's "noetic structure", by which he means, roughly, the structure of propositions that comprises the person's beliefs, ordered according to their epistemic status for the person. I don't know whether Plantinga's terminology is derived directly from Husserl's.

3. Certainly states of belief have objects as well; perhaps objects of the same sort. Attention shall be focused here, though, on treatment of occurrent events and acts of thinking. I do not think this policy jeopardizes any central contentions or proposals in the paper.

4. Aren't contextual features such as the time and place of an utterance, who is the utterer, who is being addressed essential features of an utterance? Otherwise, such features as these would also have to determine parameters of our basic locution.
5. There are two important qualifications here. (1) Frege was certainly not the first to propose the view that truth-valued objects of thought are the primary bearers of semantic and logical properties and relations. His teachers, Windelbandt and Lotze, preceded Frege in holding this view. Indeed, the view was already expressed in the work of the medieval logicians. I owe these points to Goran Sundholm, whose paper at the Cleves conference tracked the historical development of the view with great care. I stick with the phrase 'Fregean Tradition' only because Frege is unquestionably a leading proponent of this view in the recent history of logic, philosophy and semantics. (2) Though Frege held that there are items satisfying the right side of (D2), he did not use the term 'proposition' to apply to them, but rather the term 'thought'. Cf. Frege, op. cit.

6. Kaplan (1974).

7. For Hare, see (1952, chapts. 2 & 12); for Stenius, (1967).

8. I recommend to the reader's attention R. Montague's (1969). His proposal there on how to treat "object talk" of various sorts (what is experienced, what is reported, etc.) is analogous to the proposal made in this paper about proper treatment of some of our talk of objects of thought.

9. This section was influenced by some material in Terry Parsons' (1985).

10. Whether these are distinct noetic event types depends on the proper analysis of their instances. In the relevant cases, I think the matter is pretty clear-cut. However, I am not claiming that an event instantiating one of the generic types λe[JUDGe], λe[WISHe] and λe[QUESe] can't also instantiate one of the others. Maybe, for example, it can be made out that questions comprise a species of wish. This idea is actually at the heart of a fairly current proposal on the proper semantic treatment of questions, cf. Åqvist (1965).

REFERENCES

Åqvist, L. 1965: A New Approach to the Logical Theory of Interrogatives, Part I: Analysis. Uppsala.
Frege, G. 1956: The thought: a logical inquiry. Translated by A. Quinton and H. Quinton, Mind LXV: 289-311.
Hare, R. M. 1952: The Language of Morals. Oxford.
Kaplan, D. 1974: The ramified theory of types. Unpublished.
Montague, R. 1969: On the nature of certain philosophical entities. Monist 53: 159-194.
Moore, G. E. 1953: Some Main Problems of Philosophy. Allen & Unwin.
Parsons, T. 1985: Underlying events in English. In E. LePore and B. McLaughlin (eds.), Actions and Events: Perspectives on the Philosophy of Donald Davidson, Basil Blackwell, Oxford, pp. 235-267.
Russell, B. 1918: The philosophy of logical atomism. Reprinted in R. Harsch (ed.), Logic and Knowledge, Allen & Unwin, 1956.
Stenius, E. 1967: Mood and language-game. Synthese 17: 254-274.
Journal of Semantics 6: 309-344
A COMPUTATIONAL ACCOUNT OF SYNTACTIC, SEMANTIC AND DISCOURSE PRINCIPLES FOR ANAPHORA RESOLUTION
NICHOLAS ASHER and HAJIME WADA
ABSTRACT

We present a unified framework for the computational implementation of syntactic, semantic, pragmatic and even "stylistic" constraints on anaphora. We build on our BUILDRS implementation of Discourse Representation (DR) Theory and Lexical Functional Grammar (LFG) discussed in Wada & Asher (1986). We develop and argue for a semantically based processing model for anaphora resolution that exploits a number of desirable features: (1) the partial semantics provided by the discourse representation structures (DRSs) of DR theory, (2) the use of syntactic and lexical features to filter out unacceptable potential anaphoric antecedents from the set of logically possible antecedents determined by the logical structure of the DRS, (3) the use of pragmatic or discourse constraints, noted by those working on focus, to impose a salience ordering on the set of grammatically acceptable potential antecedents. Only where there is a marked difference in the degree of salience among the possible antecedents does the salience ranking allow us to make predictions on preferred readings. In cases where the difference is extreme, we predict the discourse to be infelicitous if, because of other constraints, one of the markedly less salient antecedents must be linked with the pronoun. We also briefly consider the applications of our processing model to other definite noun phrases besides anaphoric pronouns.

1. INTRODUCTION

The analysis of anaphora, together with an efficient method of searching for anaphoric antecedents, is a central problem of computational linguistics. Intrasentential anaphora involving singular noun phrase antecedents has been for several years a central topic of syntax. More recently, intrasentential and intersentential anaphora has received much attention from those working in semantics, pragmatics and discourse theory. The problem of anaphora is also of central concern to computer scientists working on natural language understanding systems. This research has led to a host of constraints within different paradigms on the anaphoric process. We present a semantic framework that integrates these different constraints - syntactic, semantic, pragmatic and even "stylistic" - into a unified model of anaphora resolution. From our perspective anaphora resolution is the process by means of which we come to identify or otherwise relate a "discourse referent," in the sense of Karttunen (1976) and Kamp (1981), introduced by an anaphoric
pronoun, with a discourse referent introduced by a possible antecedent. This identification (or other relation) helps to determine the truth conditions of a discourse containing an anaphoric pronoun, but the information that determines the identification itself is not purely semantic, syntactic or pragmatic. Thus, modelling the process of anaphora resolution forces us to confront something that everyone knows in principle but doesn't often face up to in practice: the tools of syntax and semantics won't suffice to determine truth conditions for discourses in general (in particular those containing anaphoric elements, but many others as well). So in order to have some hope of mechanically generating reasonable, determinate truth conditions (presumably the task of semantics), we need a theory which integrates semantic and syntactic information with less well understood notions relevant to pragmatics and discourse structure. We view our work on anaphora, as well as other work on anaphora within DR theory and related paradigms, as an attempt toward the construction of such a theory.

Building on the BUILDRS implementation of Discourse Representation (DR) Theory and Lexical Functional Grammar (LFG) discussed in Wada & Asher (1986), we will describe a computational implementation of our theory of anaphora resolution. We will develop a processing model for anaphora resolution that exploits a number of desirable features: (1) the partial semantics provided by the discourse representation structures (DRSs) of DR theory, (2) the use of syntactic and lexical features to filter out unacceptable potential anaphoric antecedents, (3) the use of pragmatic or discourse constraints, noted by those working on the notion of local focus or topic.¹ The process of anaphora resolution is a process that finds the appropriate anaphoric antecedent discourse referent to be identified or otherwise related to a discourse referent introduced by an anaphoric pronoun.
The various constraints provided by syntax, semantics, pragmatics and discourse theory "weed out" potential candidates for anaphoric linkage until a unique or appropriate discourse referent is found. The constraints are ordered in such a way as to minimize backtracking in the search for anaphoric antecedents. There are two sorts of constraints: absolute and interpretive. The absolute constraints, which include constraints based on the logical structure of the DRS as well as constraints relying on syntactic and lexical information, rule out possible anaphoric antecedents absolutely; the absolute constraints are applied first in our implementation. The interpretive constraints impose a salience-based preference ordering on the set of potential antecedents that are lexically, semantically and syntactically acceptable. According to our processing model, the salience ordering works in the following way. Where there is no marked difference in the degree of salience among the grammatically and logically acceptable potential antecedents of some anaphoric pronoun, the salience ordering provides no useful information. Where there is a marked difference in the degree of salience among the antecedents, it allows us to make predictions on preferred readings. In cases where the difference is extreme, we predict the discourse to be infelicitous if, because of world knowledge, one of the markedly less salient antecedents must be linked with the pronoun. We also apply our processing model to the processing of definite descriptions, which on the familiarity theory of definiteness² exhibit anaphoric behavior similar to that of anaphoric pronouns. In the concluding section, we compare our work to some alternatives extant in the linguistic and computational literature.
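The two-stage ordering just described (absolute constraints first, salience ranking second) can be illustrated with a small sketch. This is not the BUILDRS code, which is written in PROLOG; the `Candidate` record, the numeric salience scores and the `MARKED_GAP` threshold are all hypothetical devices introduced only to make the control flow concrete.

```python
from dataclasses import dataclass

# Hypothetical threshold: how large a salience gap counts as "marked".
MARKED_GAP = 2.0


@dataclass(frozen=True)
class Candidate:
    referent: str
    accessible: bool    # passes the constraint based on DRS logical structure
    features_ok: bool   # passes the syntactic/lexical feature constraint
    salience: float     # hypothetical numeric salience score


def resolve(candidates):
    """Filter by absolute constraints, then rank the survivors by salience.

    Returns (preferred, ranked). `preferred` is None when no antecedent
    survives or when the salience difference is not marked, in which case
    the ordering is taken to provide no useful information.
    """
    survivors = [c for c in candidates if c.accessible and c.features_ok]
    ranked = sorted(survivors, key=lambda c: c.salience, reverse=True)
    if not ranked:
        return None, ranked
    if len(ranked) == 1:
        return ranked[0].referent, ranked
    gap = ranked[0].salience - ranked[1].salience
    preferred = ranked[0].referent if gap >= MARKED_GAP else None
    return preferred, ranked


marked = [
    Candidate("u1", True, True, 5.0),
    Candidate("u2", True, True, 1.0),
    Candidate("u3", False, True, 9.0),  # salient, but ruled out absolutely
]
close = [
    Candidate("u1", True, True, 3.0),
    Candidate("u2", True, True, 2.5),
]
preferred_marked, ranked_marked = resolve(marked)
preferred_close, ranked_close = resolve(close)
```

Applying the absolute filters before ranking means that a highly salient but inaccessible referent such as `u3` never competes, which is one way of keeping backtracking low.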
2. THE BASIC FRAMEWORK: BUILDRS

BUILDRS is an implementation of an LFG parser and a DR-theoretic semantic component in PROLOG.³ It constructs a set of DRSs, semantic representations for a discourse, from the set of f-structures of the discourse's constituent sentences delivered by the parser. It also includes an anaphora resolution module for finding antecedents to anaphoric pronouns. The f-structures derived by the parsing component of BUILDRS are standard. The semantic theory underlying DRSs is discussed in detail in Kamp (1981), (1983), Asher (1986), (1987) and to some extent in Wada and Asher (1986). To recapitulate some of those discussions very briefly, a DRS is a pair of sets ⟨U, Con⟩, where U is a set of discourse referents and Con a set of conditions, i.e. property ascriptions, whose arguments may be discourse referents in U. Discourse referents come in various types; we use lower case letters for discourse referents of individual type and K1, ..., Kn for discourse referents of proposition type. The other types of discourse referents will not be at issue here. DRSs are in effect partial models for a discourse. By embedding the DRS for a discourse within a total model, we provide the discourse with a truth conditional interpretation. On a proper embedding g of a DRS K in a model M, discourse referents in U_K are mapped onto individuals in the domain of M such that all the conditions in Con_K are satisfied in M relative to the assignment of objects to discourse referents provided by g.⁴

The mapping from f-structures to DRSs works basically as follows. Suppose D is a discourse consisting of sentences S1, ..., Sn. The parser component begins by mapping S1 into an f-structure F1. Then the DRS construction algorithm converts F1 into a DRS D1. Now the process starts over again and S2 is mapped into an f-structure F2; the DRS construction algorithm now builds D2 from F2. Now, however, a new step must be performed. The construction algorithm now incorporates the new information in D2 into the already given information in D1 to get a new DRS for the two sentence discourse ⟨S1, S2⟩. D1 acts as a context used in the interpretation of the new information in D2. The processes of f-structure construction, DRS construction, and DRS "amalgamation" continue until all the sentences of the discourse are processed.

The parser component of the system generates a discourse referent with a unique numerical tag whenever it encounters the determiner of a singular or plural NP, a proper name or a bare plural. This discourse referent serves as an index for a noun phrase, but it is used only to coindex gaps, pro and NPs in such phenomena as long distance dependencies and control phenomena. Our parser contrasts with others attempting to incorporate complex syntactic theories like Government and Binding or various extensions to the original LFG in that we do not attempt to use the syntactic representation or a coindexing mechanism in anaphora resolution.⁶ Anaphora resolution is a process that operates only on semantic representations, DRSs, in our system, although it depends on many sources of information - including of course syntactic information. Syntactic information relevant to anaphora resolution is stored in a database and used or translated into the level of semantic representation. We hope thus to avoid the redundancies that come from straightforwardly mating some established syntactic theory of binding with DR theory.⁷ But we also analyse anaphora at the level of semantic representation because we believe that anaphora must be understood semantically, as part of the determination of truth conditions of a discourse.

As an example of the DRS construction component of BUILDRS at work, consider (1). The DRS in (2) results from (1) using BUILDRS:

(1) A man loves a woman. She loves him.

(2) u1 u2 x1 x2
    man(u1)
    woman(u2)
    loves(u1, u2)
    loves(x1, x2)
    x1 = u2
    x2 = u1

Here is a rough sketch of how BUILDRS arrives at (2). The discourse is read into the parser one sentence at a time. The parser returns an F-structure for the first sentence, which looks like this:
(2')  subj   det     a - u1
             pred    man
             gender  masc
             number  sing
      obj    det     a - u2
             pred    woman
             gender  fem
             number  sing
      pred   loves((subj)(obj))
The components of this F-structure are now translated into partial DRS structures. These are essentially lambda abstracted formulae in which the lambda abstraction is either over discourse referents or sets of conditions. For example, in the subj (subject) F-structure in (2'), the a - u1 value of the det (determiner) slot receives the translation [u1] λP λQ [[u1], [P, Q]]. This structure, which we call a partial DRS, will become a DRS when the property abstracts P and Q are filled in with the appropriate values. λP and λQ abstract over sets of conditions, where those conditions are themselves understood as having at least one lambda abstracted argument. The partial DRS specifies that u1 is to be applied to the lambda abstracted argument in the sets of conditions that will fill in P and Q after λ-conversion. The translation process also yields the structure λx [[], man(x)] as the value of the pred slot in the subj F-structure. We call this an incomplete DRS. The information given by the slots gender and number is stored in a database for the discourse referents under the entry for the discourse referent introduced by the determiner. This information will be used later in anaphora resolution. Once the various entries in the F-structure have been translated, the program calls the routine conversion, which simultaneously applies incomplete DRSs to property abstracts and discourse referents to abstracted arguments in the conditions of incomplete DRSs. So for example the conversion of the incomplete DRS and the partial DRS in the subj F-structure of (2') is the partial DRS [u1] λQ [[u1], [man(u1), Q]]. A similar partial DRS results for the obj F-structure: [u2] λQ [[u2], [woman(u2), Q]]. The main verb or pred slot of the whole F-structure in (2') is another incomplete DRS, λxλy [[], loves(x,y)].
The information that the second argument of loves is the Object F-structure while the first argument is the Subject F-structure guides how the application of the partial DRSs to this incomplete DRS is to proceed; the program will always convert the partial DRSs with this incomplete DRS in such a way that u1 ends up in the first position of loves, while u2 ends up in the second. Because of this encoding, λxλy loves(x,y) is equivalent to λyλx loves(x,y). The program can legitimately first convert either the Subj or Obj F-structure's translations with the main pred's translation. If we first use the Obj F-structure's translation, we get the incomplete DRS λx [[u2], [woman(u2), loves(x, u2)]]. Converting this with the Subj F-structure's translation yields the completed DRS [[u1, u2], [man(u1), woman(u2), loves(u1, u2)]], here given in list notation.⁸ Now the program turns to processing the second sentence. First the parser returns an F-structure, and the translation and conversion routines yield a DRS, K2, for the second sentence.
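The conversion of the first sentence of (1) can be mimicked with ordinary function application. The following is only an illustrative sketch: a DRS is modelled as a pair of Python lists (universe, conditions), conditions are plain strings, and the helper names `merge`, `indefinite`, `man`, `woman` and `loves` are assumptions made for this example rather than routines of the PROLOG program.

```python
def merge(k1, k2):
    """Concatenate the universes and the condition lists of two DRSs."""
    return (k1[0] + k2[0], k1[1] + k2[1])


def indefinite(u):
    """Partial DRS for 'a - u': [u] λP λQ [[u], [P, Q]]."""
    return lambda P: lambda Q: merge(merge(([u], []), P(u)), Q(u))


def man(x):
    """Incomplete DRS λx [[], man(x)]."""
    return ([], [f"man({x})"])


def woman(x):
    """Incomplete DRS λx [[], woman(x)]."""
    return ([], [f"woman({x})"])


def loves(x, y):
    """Incomplete DRS λxλy [[], loves(x,y)]; x is the subject slot."""
    return ([], [f"loves({x},{y})"])


subj = indefinite("u1")(man)    # [u1] λQ [[u1], [man(u1), Q]]
obj = indefinite("u2")(woman)   # [u2] λQ [[u2], [woman(u2), Q]]

# Object first: λx [[u2], [woman(u2), loves(x,u2)]], then the subject.
obj_first = subj(lambda x: obj(lambda y: loves(x, y)))

# Subject first: λy [[u1], [man(u1), loves(u1,y)]], then the object.
subj_first = obj(lambda y: subj(lambda x: loves(x, y)))
```

In this sketch both orders place u1 in the first argument position of loves and u2 in the second; they differ only in the order in which referents and conditions are listed, which does not matter when the universe and the conditions are read as sets.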
K2:  x1 x2
     loves(x1, x2)
     x1 = [ ]
     x2 = [ ]

K2 does not yet look like some portion of (2), however, because of the peculiar conditions introduced by the anaphoric pronouns. Anaphoric pronouns like she and him introduce discourse referents that must be identified with other discourse referents; the conditions they introduce are partially filled in equations that the anaphora resolution process must complete. In K2 they are the conditions x1 = [ ] and x2 = [ ]. But the anaphora resolution process cannot complete these equations until K2 has been merged with the DRS for the previous sentence in the following way: the discourse referents in the universe of K2 are appended to the set of discourse referents in the universe of the first DRS, and the conditions of K2 are appended to the set of conditions of the first DRS. Now the anaphora resolution process can find the appropriate antecedents and produce the final result in (2). (2) says that there are objects correlated with u1, u2, x1 and x2 (call these objects u1, u2, x1 and x2 respectively) such that u1 is a man, u2 is a woman, u1 loves u2, x1 = u2, x2 = u1, and x1 loves x2.

Here is an example with a slightly more complex semantic structure and anaphora resolution problem.

(3) Every man loves a woman. She is beautiful.

(4) a. [[z], [[[u1], [man(u1)]] => [[u2], [woman(u2), loves(u1, u2)]], beautiful(z), z = [ ]]]
    b. [[u2, z], [woman(u2), [[u1], [man(u1)]] => [[], [loves(u1, u2)]], beautiful(z), z = u2]]
BUILDRS returns two different DRSs for (3), the first corresponding to the wide scope reading for every man, the second corresponding to the wide scope reading for a woman. The differences result from the order in which subj and obj F-structures are combined with the translation of the main pred of the first sentence. The determiner every yields under translation a partial DRS with a complex structure; in list notation it is [u1] λP λQ [[[u1], P] => Q]. When filled in, this structure yields a complex condition of the form K => K1, where K and K1 are DRSs. This condition should be read informally as saying that for any objects corresponding to the discourse referents declared in the universe of K such that the conditions in K are all satisfied, there are objects corresponding to the discourse referents declared in K1 such that all the conditions in K1 are satisfied.⁹ K and K1 are termed subordinate DRSs or subDRSs of the principal DRS. More generally, we say that K1 is subordinate to K2 just in case K1 occurs in a condition of Con_K2 or as a component of a condition of the form K1 => K2. SubDRSs in general arise with the introduction of logical operators on DRSs such as the conditional (the operator =>), other quantificational operators, negation, modal operators or attitude predicates.¹⁰

The construction of the rest of the DRS for the discourse proceeds as before, and we will not go into details. The anaphora resolution process, however, yields different results for the two DRSs constructed from (3). But to explain this we need to turn to our first and principal semantic constraint on anaphora, accessibility.¹¹

(ACCESS) A discourse referent x is accessible to a discourse referent y just in case (i) x ∈ U_K and y ∈ U_K, or (ii) x ∈ U_K and y ∈ U_K' and there are K1, ..., Kn such that K' is subordinate to K1 and K1 is subordinate to K2 and ... and Kn is subordinate to K.

The point of using DR theory and the DRS construction algorithm as a component in a system of anaphora resolution is to exploit the notion of accessibility; it furnishes the core of our constraints on anaphora. Our approach to anaphora resolution hinges on being able to recover the discourse referents accessible to a given discourse referent.¹² The discourse referent database constructed during the production of DRSs for a discourse reflects the relations between subordinate and "superordinate" DRSs. The database has the form of a tree structure; each node in the tree represents the universe of some DRS, and the paths between nodes represent the relation of accessibility. So once a DRS has been constructed, it is an easy matter to find the accessible discourse referents for any discourse referent in the database.

Let us see how accessibility affects anaphora resolution in some detail. The anaphora resolution module is the last module in BUILDRS, and it takes as input the complete DRS formed at the end of processing each sentence in the discourse. The reason it needs to operate on complete DRSs is that the constraints on anaphoric relations imposed by the accessibility relation require that the logical structure of the sentence be determined. The scope of various quantifiers and truth functional operators is only fully determined at the level of a complete DRS. So once a DRS K for a discourse has been constructed, BUILDRS goes back and searches for conditions of the form 'x = [ ]'. A discourse referent introduced by an anaphoric pronoun will occur on a certain node n_m in the tree; the discourse referents accessible to it are all those on nodes in the path from n_m to the root n_0 of the tree, which stores the universe of the top level DRS for the whole discourse. We will call nodes on this path n-accessible to n_m. The program searches back along that path to find all the discourse referents accessible to x. We will call the set of discourse referents accessible to x Acc_x.
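The tree-shaped database and the search along the path to the root can be sketched as follows. This is an illustration under stated assumptions, not the implemented database: the `Node` class and the function names are invented for the example, and the consequent box of a conditional is modelled as a child of its antecedent box so that referents of the antecedent lie on the consequent's path to the root. The two trees correspond to the two scope readings of discourse (3).

```python
class Node:
    """One node of the accessibility tree: the universe of some DRS."""

    def __init__(self, universe, parent=None):
        self.universe = list(universe)
        self.parent = parent


def n_accessible(node):
    """Nodes on the path from `node` up to the root, nearest first."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def acc(x, node):
    """Acc_x: referents on n-accessible nodes, excluding x itself."""
    return [y for n in n_accessible(node) for y in n.universe if y != x]


# Wide scope for 'every man': z sits at the root, u2 in the consequent box.
root_a = Node(["z"])
antecedent_a = Node(["u1"], parent=root_a)
consequent_a = Node(["u2"], parent=antecedent_a)

# Wide scope for 'a woman': u2 sits at the root together with z.
root_b = Node(["u2", "z"])
antecedent_b = Node(["u1"], parent=root_b)

acc_z_a = acc("z", root_a)           # u2 is not reachable from the root
acc_z_b = acc("z", root_b)           # u2 is accessible to z
acc_u2_a = acc("u2", consequent_a)   # a lower box sees everything above it
```

Because every node stores only a parent pointer, computing Acc_x is a single walk toward the root, which matches the description of the search as an easy matter once the DRS is complete.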
Our accessibility constraint, ACCESS, says that a discourse referent x introduced by an anaphoric pronoun may only be identified with a discourse referent y ∈ Acc_x. Our program, however, imposes a series of constraints on the set of accessible discourse referents to "filter out" unsuitable discourse referents. The first such constraint is a feature constraint, FEAT, and it relies on the number and gender features stored in the discourse referent database.¹³ The feature constraint, FEAT, says that a discourse referent x introduced by an anaphoric pronoun may only be identified with a discourse referent y having the same number and gender features.

To understand how these constraints work, let us return to the DRSs in (4). Note that the logical structure of the DRS representing the left-right scope assignments precludes any successful match for the discourse referent introduced by the anaphoric pronoun. The discourse referent introduced by a woman is not accessible to the discourse referent introduced by she. When the scope assignments are rearranged as in the second DRS, however, the discourse referent introduced by a woman does become accessible to the discourse referent introduced by she, and a successful match is made.

Accessibility is a powerful semantic constraint. Exploiting principles of DRS construction for various constructions, the accessibility constraint rules out or allows the identifications of discourse referents that the indexings in (5) suggest. Note that for us two NPs are coindexed iff the discourse referents they introduce are identified during the anaphoric resolution process (on at least one reading). The coindexings that are starred are those that the constraints do not allow to occur.

(5) a. Every farmer_i who owns a donkey_j beats it_j. He_i* is unhappy.
    b. If a farmer_i owns a donkey_j, he_i beats it_j. It_j* is unhappy (assuming narrow scope for a donkey).
    c. Every cadet_i receives a rigorous training. First he_i goes to boot camp. Then he_i gets intensive flight instruction in a basic trainer. Finally, he_i is given 200 hours of flight time in a supersonic aircraft.
    d. It's false that a man_i came to visit yesterday. He_i* left a while ago (assuming narrow scope for a man).
    e. John suspects that a woman_i broke into his apartment. He believes that the police will never catch her_i (assuming narrow scope for a woman).
    f. John doubts that a woman_i broke into his apartment. He believes that the police will never catch her_i* (assuming narrow scope for a woman).

The admissible coindexings in the famous pair (5a) and (5b) fall straight out of the analysis of universally quantified sentences and DRT's analysis of conditionals.¹⁴ The program yields almost the same DRS for the first sentences of (5a) and (5b); we give here the DRS for (5b) with the narrow scope reading for a farmer and a donkey and wide scope for the conditional operator:

[[z3], [[[u1, u2], [farmer(u1), donkey(u2), owns(u1, u2)]] => [[z1, z2], [beats(z1, z2), z1 = u1, z2 = u2]], unhappy(z3), z3 = [ ]]]
The discourse referents u1 and u2 are accessible to the discourse referents z1 and z2 given the definition of accessibility, and since the identifications of z1 with u1 and z2 with u2 pass the test required by FEAT, the anaphora resolution component of the program adds those equations to the DRS. By the definitions, however, none of the discourse referents introduced earlier in the DRS are accessible to z3, and so the DRS for (5b) with this scope reading cannot be completed. A similarly straightforward explanation involving accessibility and the narrow scope of the NP a man vis-à-vis negation accounts for the impossibility of (5d). A more complex account of the interpretation of the attitudes, combined with the constraints ACCESS and FEAT, is needed to explain the data in (5e) and (5f), but it should be noted that this interpretation is independently motivated.¹⁵

(5c) is to be explained once again by appealing to accessibility. But here we also require a particular interpretation of the second, third and fourth sentences in the discourse. They all describe what every cadet does and as such can be seen as amplifications of the conditions in the consequent of the => operator introduced by the first sentence. The construction process allows the material of subsequent discourse to be entered into the consequent, subordinate DRS, when such redescription or amplification occurs.¹⁶ As Roberts (1987) has observed, the subordination of a larger than clausal chunk of a discourse to some supposition requires assigning an implicit quantificational or modal force to the subordinated material.¹⁷ In (5c) the subordination succeeds because the discourse cues make clear that the material after the first sentence amplifies on the training each cadet receives, and so it is taken to have an implicit quantificational force. This subordinated reading is apparently not possible for (5a) and (5b). Unfortunately, it is very difficult to give rules for when such subordinated readings are possible. We use cues like the enumeration of tasks or the presence of modals or quantificational adverbs to check for possible subordinated readings, but that isn't sufficient. It is really a combination of such cue words and lexical knowledge that enables a discourse like (5c) to pass the ACCESS constraint but not (5a) and (5b).¹⁸

None of the analyses of the examples (5a-f) are new theoretically. But note that they rely heavily on an analysis of scope as well as a certain amount of interpretive flexibility.¹⁹ The definition of accessibility and the structure of the discourse referent database presuppose a determinate scope assignment for all noun phrases. We have also seen that in the DRT framework the scope of NPs is not limited by sentence or clause boundaries but by logical structure. Thus, an indefinite NP in a discourse may have scope over a large, multisentential chunk of discourse, and even a quantificational NP as in (5c) may have scope over a multisentential chunk of discourse. In the original formulation of our program (Wada & Asher 1986), the algorithm placed no constraints on scope possibilities and returned separate DRSs with each scope possibility. In view of the data in (5a-c) (but see also May (1985)), however, this is simplistic and does not do justice to the facts. The formulation of scope constraints is complex and involves perhaps as many diverse sources of information as anaphora itself. We have unfortunately not yet been able to formulate a very thorough list of principles for scope assignment, but we will list those principles of which we are aware and which we believe might be useful.²⁰
We have already discussed to some extent principles relevant to scope with respect to subordinated readings as in (5c). In order to set down our other principles for analysis of scope, we need to discuss an important difference between discourse referents introduced by definite NPs and those introduced by indefinite NPs. We distinguish between two sorts of definites: dependent and non-dependent. All definites generate presuppositions of familiarity; some of these are part of the non-linguistic and pragmatically given context of utterance, but some also depend on explicitly introduced linguistic elements. Non-dependent definites are those NPs whose presuppositions are met (or assumed to have been met) within the non-linguistic component of the context. Dependent definites are those NPs whose presuppositions depend on already introduced, explicitly linguistic elements in the discourse. Within the theory, the "familiarity" presupposition of a definite is interpreted as the requirement that the discourse referent it introduces be identified with some antecedently occurring discourse referent (in the case of a dependent definite) or some contextually given object. Often (but not always) a dependent definite is a complex term containing an anaphoric element as a component. The denotation of a dependent definite will typically be a function of the denotation of some other NP. Dependent definites have a restriction on their scope possibilities. Our claim is that they prefer as wide a scope as is possible, relative to the NPs that function as their linguistic antecedents or on which they are functionally dependent.
We have built our claim into the theory by requiring that a discourse referent x introduced by a dependent definite be copied into a separate list DEF_i, where i is the node of the accessibility tree which contains the discourse referent with which x, or some discourse referent also introduced ("co-introduced") by the NP that introduces x, is to be identified.²¹ x is then deleted from its original position in the accessibility tree. Dependent definites create certain processing complexities. In order to find the appropriate anaphoric attachment for the pronouns within a dependent definite α, the discourse referents introduced by α must be introduced as low down in the accessibility tree as possible. We must then resolve the equations for the discourse referents introduced by pronouns in α first, then raise these discourse referents up as high as possible in the accessibility tree, then resolve other equations introduced by other anaphoric pronouns. We
have yet to implement this order of processing in our program, however.

Most, if not all, uses of proper names, demonstratives and many uses of definite descriptions are non-dependent definites. Such NPs naturally take a "referential" interpretation (or at least a widest possible scope with respect to other operators) and so introduce discourse referents that are accessible to any other discourse referent introduced in the discourse. We capture this observation in the implemented version of the theory by copying the discourse referents introduced by non-dependent definites into a separate list DEF on the root node of the accessibility tree in the database and by introducing the following constraint.

(DEF) Suppose x is a discourse referent introduced by an anaphoric pronoun into the accessibility tree t at node n. Then x may be identified with y only if y ∈ ACCESS(x) ∪ (⋃_i DEF_i), where i is any node n-accessible to n.

Together with our analysis of dependent definites, (DEF) and (ACCESS) appear to account for the following data:

(6) a. Everyone who likes Gary Cooper is a good judge of films. He was a great actor.
    b. *Everyone who likes his favorite movie star_i is a good judge of films. He_i is a great actor.
    c. If Mary likes everyone who likes the best dressed senior_i at Austin High, then she likes him_i.
    d. *If Mary likes everyone_j who likes his_j mother_i, then she likes her_i.

(6a) is predicted to be good, because Gary Cooper is a non-dependent definite; so the discourse referent introduced by the name moves to the DEF list at the root node of the accessibility tree. Thus, that discourse referent is always accessible to any discourse referent introduced in subsequent discourse. In (6b), on the other hand, his favorite movie star is a dependent definite that moves onto the DEF list for the node in the accessibility tree which contains the discourse referent introduced by everyone. That discourse referent is inaccessible to the discourse referent introduced by he in the subsequent sentence.²² (6c-d) show similar predictions of DEF together with ACCESS for more complex DRS structures.

Indefinites force us to add further to the constraints on anaphora resolution. Indefinites appear to be sensitive to surface order in a way that definites are not. With ACCESS alone the theory predicts (7a) (from Sidner (1983)) to be acceptable; (7b) is predicted to be bad when the indefinite is given narrow scope with respect to the conditional operator. The explanation for this prediction is that the discourse referent provided by John is copied onto the root node of the accessibility tree for (7a); thus that discourse referent is accessible to the discourse referent introduced by he. On the other hand, if the indefinite is interpreted as having narrow scope with respect to the conditional operator, then it introduces a discourse referent x in the DRS for the consequent of the conditional in (7b), while the pronoun introduces a discourse referent z in the DRS for the antecedent. By the accessibility constraint, z cannot be identified with x.

(7) a. If he_i comes before the show, give John_i these tickets and send him to the show.
    b. *If he_i comes before the show, give a man_i these tickets and send him to the show.

(7b) would be perfectly acceptable if a man were assigned wide scope over the conditional. Indeed, as Fodor and Sag (1982) have argued, indefinites may sometimes be interpreted specifically, and under such an interpretation are typically assigned maximal scope.²⁴ Nevertheless, it appears difficult to assign a scope to a man that is wider than that of the conditional operator.
Besides the logical structure of the DRS that affects anaphora resolution through the accessibility constraint , there is another scope-like relation on discourse referents that aff� s anaphora resolution. Suppose that a is an NP in sentence Si in the discourse ( S p . . . , Si, . . . Sn > · Suppose that a
introduces a discourse referent X0 at node n on the accessibility tree t for the discourse. The F-structure to DRS mapping must first process Sp then
S 2 , and so on, so that any discourse referent introduced in sentence Sj at node n for j s i must be introduced into the DRS prior to the introduction
of the discourse referent x
a at level n. Further the construction procedure
we have devised says that if the partial DRS that is the translation of a1 is
applied to the translation of a PRED that is of the form Ax1
•
•
•
�(x 1 ,
. . . , �) after the application of the partial DRS that is the translation o f
a ' then the discourse referent introduced b y ai at level n i n t will precede i the discourse referent introduced by aj in the discourse referent list at level
n in t. These facts about the translation procedure from F-structures to
DRSs determine a strict ordering, order (,) on discourse referents in the
universe of DRS (or at each level in the accessibility tree) prior to the con struction of the DEF lists. We will adopt the following scope-like constraint on anaphora that exploits this ordering: 25
(SCOPE) Suppose x is a discourse referent introduced by an anaphoric
pronoun into the accessibility tree t at node n and suppose y is
also introduced at node n. Then:
(a) if y f. U i Defi and order (x, y) then x may not be identified with y .
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
him to the show. 23
322 (b) If z is introduced at node n, order (y, x), z .,t. x, order (x, z) and x may be identified with y, then x may not be identi fied with z.
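The (DEF) and (SCOPE) constraints stated above can be illustrated with a small sketch. This is not the BUILDRS code: the `Referent` record, the integer `position` that encodes order( , ), and the precomputed `definite` flag are all assumptions introduced here for illustration, and part (b) is read as "some surviving candidate precedes x".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Referent:
    name: str
    node: str       # accessibility-tree node where the referent was introduced
    position: int   # index in that node's referent list (smaller = introduced earlier)
    definite: bool  # True if the referent sits on some DEF list (assumed precomputed)


def order(a: Referent, b: Referent) -> bool:
    """order(a, b): a precedes b in the referent list of their shared node."""
    return a.node == b.node and a.position < b.position


def def_pool(access, def_lists, accessible_nodes):
    """(DEF): ACCESS(x) together with the DEF lists of the accessible nodes."""
    pool = set(access)
    for node in accessible_nodes:
        pool |= set(def_lists.get(node, ()))
    return pool


def scope_filter(x: Referent, candidates):
    """Apply parts (a) and (b) of (SCOPE) to the candidate antecedents of x."""
    # (a) a non-definite y at x's node that x precedes is excluded.
    kept = {y for y in candidates if y.definite or not order(x, y)}
    # (b) if some admissible y precedes x, no z that x precedes may be chosen.
    if any(order(y, x) for y in kept):
        kept = {z for z in kept if not order(x, z)}
    return kept


# Example (11): "Sam ... He ... Fred ...", all introduced at the root node.
sam = Referent("Sam", "root", 0, True)
he = Referent("he", "root", 1, False)
fred = Referent("Fred", "root", 2, True)
pool = def_pool({sam}, {"root": [sam, fred]}, ["root"])
print(sorted(r.name for r in scope_filter(he, pool)))  # ['Sam']
```

On this reading, cataphora to a definite survives only when no earlier candidate is available, which is the pattern the text attributes to (10c) versus (11).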
To use this constraint fully, we would like to specify some constraints on the possible orderings for discourse referents in a DRS using order. One absolute constraint from the work of May (1985) is the following: when a sentence ψ has a SUBJECT NP α containing an anaphoric pronoun β introducing a discourse referent x at level n in τ, then order(x, y), where y is a discourse referent introduced by an NP in non-SUBJECT position in ψ. If we constrain the mapping from f-structures to DRSs so that this constraint on order is always satisfied, then part (a) of SCOPE rules out the weak crossover data concerning quantifiers and pronouns in (8):26
(8)
a. ?His_i mother loves John_i.
b. *His_i mother loves someone_i.
c. Someone_i loves his_i mother.
d. *His_i mother is loved by someone_i.
Our notion of scope allows one discourse referent x in U_K to have scope over a discourse referent y if x is added to U_K prior to y.27 Although clearly the different scopes that result from the processing of two indefinite NPs won't affect truth conditions or anaphora, the scope relations that obtain between a discourse referent introduced into a universe U_K by an indefinite and a discourse referent introduced into U_K by a pronoun do affect anaphora.28 The notion of order operative in our constraint SCOPE must be strongly distinguished from the notion of the precedence ordering of the NPs that give rise to the discourse referents in the sentences of the discourse: it is certainly not the case, for instance, that an antecedent NP must precede the pronoun in the sentence for felicitous anaphora to take place. The data in (9) show that definite and indefinite antecedents do not obey a simple precedence restriction. More precisely, if an indefinite introduces a discourse referent x at node n in the accessibility tree τ and an antecedently occurring pronoun introduces a discourse referent z at some node m such that n is accessible to m in τ and the antecedent and pronoun both occur in the same sentence, then the identification of x with z is perfectly permissible.
(9)
a. The fact that she_i had already climbed this mountain before encouraged Rosa_i to try again.
b. The fact that he_i had already climbed this mountain before encouraged a man_i to try again.
DR theory predicts (9b) to be acceptable if the rules for processing complex NPs like the fact that φ from Asher (1987) are adopted. Those rules yield a DRS like the following:29
K, u
K = [ z | z had already climbed this mountain before; z = u ]
fact(K)
K encouraged u to try again
The precedence of the indefinite with respect to the pronoun becomes more important, however, when both the discourse referent introduced by the indefinite and the one introduced by the pronoun are accessible to each other or when antecedent indefinite and pronoun occur in different sentences. The second case is especially significant for our theory. Our mapping from F-structures to DRSs in the intersentential case requires that the discourse referent introduced into U_K by a noun phrase from the prior sentence take wide scope over any discourse referent introduced into U_K by a noun phrase from the second sentence, unless the latter is introduced by a definite. Cataphora with definites is acceptable in our theory because of the way we have formulated SCOPE. Part (a) of SCOPE, however, in conjunction with the other constraints of our theory, predicts (10d) and (10f) to be bad, unless one can have specific readings of the indefinite antecedents.
(10)
a. In his_i wallet Bill_i found a dollar.
b. ?In his_i wallet someone_i found a dollar.30
c. First he_i lost his_i wallet. Then his_i car got stolen. Fred_i was having a bad day.
d. *?First he_i lost his_i wallet. Then his_i car got stolen. Someone_i was having a bad day.
e. Everyone likes him_i. Fred_i is very engaging.
f. *?Everyone likes him_i. Some cat_i is very engaging.
g. Some cat_i is very engaging. Everyone likes him_i.
In a contrast like that between (10a) and (10b), the problem is once again a matter of scope, but nothing in our mapping forces the discourse referent introduced by the pronoun to have wide scope. The precedence of the pronoun strongly suggests wide scope, however, and that, we claim, accounts for the marginality of (10b) as opposed to (10a). The possibilities for cataphora even among definites are also limited. This is what the second part of the SCOPE constraint addresses. We should note that cataphora with anaphoric pronouns (though not anaphorically used demonstratives like this) really is a form of accommodation of the interpreter to an unusual discourse situation. The recipient considers the identification of a pronoun in a previous sentence with a discourse referent introduced by a definite in a subsequent sentence only if there are no antecedently available discourse referents. If there are discourse referents introduced antecedently to the occurrence of the pronoun or introduced in the same sentence, cataphora does not appear to be a possibility. Consider for instance the following possibilities:
(11) Sam was watching TV. He had prepared the meat the night before so that the meal would be easy to make. Fred was now in the kitchen washing up.
(11) is an attempt to have a neutral discourse. According to common-sense knowledge, it is clearly possible that Fred could have done the preparation. In fact, if the first sentence of (11) were dropped, the antecedent of he would be Fred and there would be no difficulty with this judgment. But no matter how you fill in the details, it is extremely difficult to get Fred to be the antecedent of he once we add the first sentence of (11).31 Such data point to limitations on accommodation with cataphora. Part (b) of SCOPE predicts this fact.
3. DISJOINT REFERENCE IN BUILDRS
The notion of accessibility alone will not rule out certain ungrammatical discourses. For instance, ACCESS together with DEF and SCOPE permit the discourse referents introduced by him and himself to either be identified with the discourse referent introduced by John or not. Nevertheless, in (12a) the identification cannot be made while in (12b) it must be made.
(12)
a. John likes him.
b. John likes himself.
Sentences like (12) require at least a distinction between reflexive and non-reflexive pronouns, familiar from the literature on syntax, and some constraint making use of this feature. We have introduced a feature Refl: a discourse referent x in U_K gets the feature +Refl if it is introduced into U_K by a reflexive pronoun; otherwise x gets the feature -Refl. Many have argued that syntactic configuration imposes constraints governing the anaphoric behavior of reflexive and non-reflexive pronouns. To appeal to some familiar examples (taken from Lasnik & Freidin (1981) (13a-c) and Reinhart (1983) (13f-g)):
(13)
a. I met a man_i who he_*i said that Mary had seen e_i.
b. The man_i who he_*i wants the woman to like e_i.
c. The man_i who he_*i thinks e_i likes the woman.
d. A man who hardly knows her_i loves Mary_i.
e. *He_i loves a woman_j who_j hardly knows Bill_i.
f. Near him_i Dan_i saw a snake.
g. *Near Dan_i he_i saw a snake.
Examples like these have led many to accept that something like Reinhart's c(onstituent)-command or Bach and Partee's (1980) function-argument constraint determines a constraint of disjoint reference that governs the distinction between reflexive and non-reflexive pronouns.32
There are certain difficulties with syntactic approaches that lead us to take a non-syntactic, semantic interpretation of disjoint reference constraints. Standard syntactic treatments, as Roberts (1987) has argued, assign indices to NPs that do not have a clear interpretation. The intended interpretation seems to be that NPs with the same index are coreferential, while NPs with different indices are not coreferential. There are examples in the literature, however, that belie this simple view. Consider for instance (14), which is an example discussed in Evans (1980):
(14) Reagan voted for Reagan.
According to a binding theory like that put forward in Chomsky's Lectures on Government and Binding, the two occurrences of Reagan cannot be assigned the same index without apparently violating some of the principles of the binding theory; yet it appears clear that the two occurrences could be coreferential.33 An example due to Peter Sells shows another apparent difficulty with this simple view:
(15) Everyone likes John. Fred likes him. Mary likes him. Even John likes him.
In (15) John and him cannot be coindexed without violating the binding theory, yet again the obvious intent of the discourse is to make him and John coreferential. The moral we draw from these examples is that certain syntactic features determine constraints on which discourse referents may be identified with other discourse referents by means of the equations explicitly introduced by the anaphora resolution process. We take disjoint reference to be a constraint on discourse referents with certain features; the features have to do with whether the discourse referent in question was introduced by a reflexive or non-reflexive pronoun. Disjoint reference requires that a discourse referent with the appropriate feature not be identified with other discourse referents that are introduced by NPs in certain positions. This of course does not preclude that the discourse referents are mapped onto the same individual in a proper embedding of the DRS of which they are part. So our view differs from the simple view: even though two discourse referents cannot be identified explicitly by means of an equation introduced by the anaphoric process, they may nevertheless turn out to be "coreferential." But further, we interpret disjoint reference constraints as requiring in (15) that the discourse referent z introduced by him in the last sentence not be identified with the discourse referent introduced by the second occurrence of the name John. That is, disjoint reference constraints must be intra-sentential. But as we see it, nothing in the disjoint reference constraints precludes that we identify z with the discourse referent introduced by the first occurrence of John.
We have chosen not to generate a separate C-structure with an independent syntactic constraint on coindexing. It proves nevertheless relatively easy to adapt a C-command-like constraint to our mapping from f-structures to DRSs; so we have stored information from the f-structure relevant to disjoint reference in the discourse referent database. This strategy yields a suitable disjoint reference constraint on discourse referents, and it has the benefit that all our constraints apply at a single, representational level, eliminating redundancies. But further, our strategy reflects a commitment to only one interpretation of anaphora: anaphora involves a relation between discourse referents. We do not believe in a variety of different sorts of anaphoric binding relations - some determined by syntax, others by discourse effects. Rather, we believe that anaphoric binding relations always concern relations between discourse referents, although these relations may be determined by a number of constraints, employing syntactic, semantic, and pragmatic information.34
Disjoint reference phenomena reveal an important facet of the representation of the content of a sentence - the semantic topic of the sentence. The subject - or in DR theoretic terms, the discourse referent introduced by the SUBJ f-structure - is, in one sense of aboutness, what the sentence is about. It is the primary topic. But there are other aspects to the representation of the content of a sentence that syntactic treatments of disjoint reference have noticed. Every verb introduces a state or event discourse referent into a DRS; we think of this discourse referent as having certain "thematic role slots" described by the lexical subcategorization of the verb. These slots are filled by other discourse referents introduced by NPs in the sentence; which discourse referent fills which slot is determined by the grammatical function of the NP introducing it, as is supposed in Case Grammar.35 These discourse individuals play a secondary role to the subject (they are secondary topics), although they play more important roles in the representation of the content than the discourse referents introduced by ADJUNCTS.36 Very roughly, non-reflexive pronouns must not, in the predication of a property to a primary or secondary topic of the predication, be used to ascribe a reflexive property, while reflexive pronouns must be so used.
As with syntactic notions of disjoint reference, our "semantic" notion assigns NPs filling certain thematic slots (like SUBJECT or other subcategorized functions) more stringent disjoint reference constraints than others. The crucial feature accounting for the difference in judgments above in our use of semantic disjoint reference constraints involves the notions of local and proper predication. Suppose that a discourse referent x_α is introduced by a DET, or PRED (in the case of proper names or anaphoric pronouns), in the f-structure f_n. Then the translation of f_n under the construction algorithm (a partial or predicative DRS) yields a property Tr(f_n) locally predicated of x_α. Now suppose that f_n is a subcategorized argument of a PRED P_{n+1} in a superordinate f-structure f_{n+1}. Then the translation of P_{n+1} yields a property Tr(P_{n+1}) properly predicated of x_α.
We now define a domain of disjoint reference for a reference marker x_α.
(i) Suppose that a discourse referent x_α is introduced by a DET or PRED in the SUBJ f-structure f_n of a superordinate f-structure f_{n+1} with Pred P_{n+1}. Then Domain(x_α) = {y : y is a discourse referent occurring as an argument in either Tr(f_n), Tr(P_{n+1}) or in the property locally predicated of an argument of Tr(P_{n+1}); or y is a discourse referent related in a predication to the event described by Tr(P_{n+1})}.
(ii) Suppose that a discourse referent x_α is introduced by a DET or PRED in a non-SUBJ f-structure f_n that is a subcategorized grammatical function in a superordinate f-structure f_{n+1} with Pred P_{n+1}. Then Domain(x_α) = {y : y is a discourse referent occurring as an argument in Tr(P_{n+1}) or Tr(f_n)}.
(iii) Suppose that a discourse referent x_α is introduced by a DET or PRED in an f-structure f_n that is not a subcategorized grammatical function in a superordinate f-structure f_{n+1}. Then Domain(x_α) = {y : y occurs as an argument in Tr(f_n)}.
Using this definition of domain, we can now formulate a disjoint reference constraint.
(DISREF)
(a) A discourse referent x with the feature -Refl cannot be identified with a discourse referent y with feature -Refl in the domain of x.37
(b) A discourse referent x with feature +Refl must be identified with a discourse referent y with -Refl in the domain of x.
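A minimal sketch of how (DISREF) could act as a filter on candidate antecedents is given below. It assumes that the domains of disjoint reference have already been computed from the f-structures; the function name `disref_filter` and the dictionary encodings of the Refl feature and of Domain(x) are hypothetical and are not taken from BUILDRS.

```python
def disref_filter(x, candidates, refl, domain):
    """Return the candidates that (DISREF) allows x to be identified with.

    refl[r] is True for +Refl and False for -Refl; domain[x] stands for
    Domain(x), assumed to be precomputed from the f-structure.
    """
    local = {y for y in candidates if y in domain[x] and not refl[y]}
    if refl[x]:
        # (b) a +Refl referent must pick a -Refl referent inside its domain.
        return local
    # (a) a -Refl referent may not pick a -Refl referent inside its domain.
    return set(candidates) - local


# (12a) "John likes him" and (12b) "John likes himself":
# j is the referent for John, x the referent for the pronoun.
domain = {"x": {"j", "x"}}            # arguments of the translation of 'likes'
refl_him = {"j": False, "x": False}
refl_himself = {"j": False, "x": True}
print(disref_filter("x", {"j"}, refl_him, domain))       # set()
print(disref_filter("x", {"j"}, refl_himself, domain))   # {'j'}
```

With a further candidate from an earlier sentence that lies outside the domain, the non-reflexive pronoun keeps that candidate and the reflexive loses it, which matches the intended contrast in (12).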
For any discourse referents x and y, the constraint DISREF stipulates that if y ∈ Domain(x) then x and y must not be identified by BUILDRS's anaphora resolution module. So if we examine, for example, the DRSs for (13d) and (13e) in (16a) and (16b) below, we see that the discourse referents introduced by 'a man' and 'Bill' (u1 and b) are in each case accessible to the discourse referents introduced by the pronouns. But in (16b), b lies in the domain of x1, and since x1 has the feature -Refl, b may not be identified with x1. Similarly with the adjunct prepositional phrases in (13f) and (13g) or the relative adjunct clauses in (13a-c), the domain relations are such that though ACCESS, DEF and SCOPE would permit either coindexing, DISREF rules out (13a-c) and (13g), where the pronoun gives the discourse referent it introduces the feature -Refl.
(16)
a. man(u1), hardly-knows(u1, x1), x1 = m, Mary(m), loves(u1, m)
b. loves(x1, u1), woman(u1), x1 = [ ], Bill(b), hardly-knows(u1, b)
Together the constraints ACCESS, DEF, SCOPE and DISREF account, as far as we are aware, for the data concerning disjoint reference and crossover. But there is one other traditional domain of the syntactic theory of binding, viz. cases of "reconstruction" involving picture nouns, to which they ought to apply and with which they have some difficulty. But picture nouns by themselves already pose problems even without reconstruction. What, for instance, ought to be the relationship between the PP and the NP that picture in (17a)? We suspect that there is a complex interaction between verbs like see and picture nouns, so that if one takes the PP as an adjunct, then DISREF would permit (17a). On the other hand, if the PP is part of the object f-structure or some subcategorized grammatical function of see (as in see oneself in a picture), then DISREF would permit only (17b).
(17)
a. Fred saw that picture of him.
b. Fred saw that picture of himself.
We suspect that there are, as suggested by Roberts (1987), two constructions, since these two sentences do differ in meaning. Other verbs don't permit both forms; usually they license only reflexive anaphoric constructions like that in (17b). This phenomenon appears puzzling from our point of view, since it contrasts with the behavior of anaphoric pronouns in other PPs like those in (13f) and (13g). In view of these difficulties, however, any account concerning reconstruction phenomena seems premature.38
4. PRAGMATIC AND DISCOURSE CONSTRAINTS ON ANAPHORA IN BUILDRS
4.1. An absolute discourse constraint?
The last "absolute" constraint on anaphora resolution in the current implementation of BUILDRS is one that governs permissible patterns of anaphoric linkage.39 It may be only one of many such constraints, but we have found only this one so far. Perhaps the best way to understand this constraint is to look at a violation of it. Consider the following discourse.
(18) Mary_i invited Susan_j over for dinner. She_i prepared sukiyaki. She_j arrived late. #She_i served her_i apologetic guest sullenly.
(18) strikes many speakers as simply incomprehensible because of the shifting subject. It appears that one can easily shift from Mary to Susan as subject in the sentences of a discourse, as in the first three sentences of (18). But one cannot switch back again even though world knowledge demands that 'she' in the fourth sentence have 'Mary' as an antecedent.40 We shall simply take as a constraint on "discourse presentation" the following. Let [ACC] be the set of equivalence classes of discourse referents under ACC. Then
(PRES) Suppose z1 and z2 are discourse referents introduced by anaphoric pronouns in a DRS K, and that z1 is identified with x and z2 with y in K, and x = y is not a condition in K. Now suppose z3 is introduced by an anaphoric pronoun into K and [ACC_z1] = [ACC_z2] = [ACC_z3]. Then z3 cannot be identified with x in K.
(ACCESS), (FEAT), (DEF), (SCOPE), (DISREF) and (PRES) are all the absolute constraints on anaphora resolution in our current implementation. BUILDRS does not actually construct the set of all accessible discourse referents for x for each such x introduced by an anaphoric pronoun. Instead, it constructs a smaller set from the internal database of all discourse referents for the discourse - a set that we shall call ACC'_x. ACC'_x is that subset of ACC_x of discourse referents that agree in number and gender with x and which also have passed the constraints (DEF), (SCOPE), (DISREF) and (PRES).41 Every discourse referent outside of ACC'_x is excluded from further consideration as a possible anaphoric antecedent for x, so our program applies these constraints prior to applying the constraints that are not absolute. The order in which these constraints may be applied is not necessarily determined in the implementation. These constraints also involve only feature checking and matching of the features in the discourse referent database. Since the database is easy to construct while the DRS itself is being constructed, these constraints are also quite efficient.
If ACC'_x is a singleton {z}, the algorithm simply specifies that x = z. If ACC'_x is not a singleton, then BUILDRS invokes another, independent module that computes a salience ranking on ACC'_x. This ranking enables BUILDRS to prefer certain anaphoric links to others and to rule out some candidates for anaphora to which the salience ranking assigns an extremely low rank. We turn now to describing the salience ordering in more detail.
4.2. Discourse constraints and salience
The other constraints of a pragmatic or discourse nature are more subtle than the absolute constraints we have just detailed; these discourse constraints impose preferences that may be overturned by other evidence. We distinguish two kinds of such constraints. The first pertains to the creation of a salience ranking on discourse referents, the second to world knowledge. An essential component of our analysis and implementation is that the absolute, salience and world knowledge constraints operate as independent modules. Constraints pertaining to world knowledge are well recognized to be important factors in determining coreference, but we have little to offer in this area that is not already well-known.42 We shall assume the existence of a world knowledge component that "checks" the predictions of the salience filter for coherence, but we will not examine what its structure is.43 Henceforth, we shall concentrate on salience.
Salience constraints form yet another filter used to narrow down the set of potential anaphoric antecedents in BUILDRS. The phenomena that we try to accommodate with salience constraints are usually associated with the literature on local focus or topic. Roughly, the most salient discourse referent at a particular point in the discourse (or in the construction of the DRS from the discourse) is the entity that others have called the "entity in focus."
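The absolute filtering stage described in Section 4.1, which builds the reduced candidate set and hands over to the salience module only when more than one candidate remains, can be sketched as follows. The function names, the representation of features as (number, gender) pairs, and the treatment of each absolute constraint as a simple two-place predicate are assumptions made for illustration; constraints such as (SCOPE) and (PRES) actually depend on more context than a pair of referents.

```python
def acc_prime(x, acc, features, constraints):
    """Members of ACC_x that agree with x in number and gender and pass
    every absolute constraint (each modelled as a predicate ok(x, y))."""
    return [y for y in acc
            if features[y] == features[x]
            and all(ok(x, y) for ok in constraints)]


def resolve(x, acc, features, constraints, salience_module):
    """Pick an antecedent for x, deferring to salience only when needed."""
    candidates = acc_prime(x, acc, features, constraints)
    if not candidates:
        return None                      # no admissible antecedent
    if len(candidates) == 1:
        return candidates[0]             # singleton {z}: set x = z
    return salience_module(x, candidates)


# (D2): 'she' can only be linked to Jill, so salience is never consulted.
features = {"she": ("sg", "fem"), "sam": ("sg", "masc"), "jill": ("sg", "fem")}
print(resolve("she", ["sam", "jill"], features, [], lambda x, c: c[0]))  # jill
```

Because the salience module is passed in as an argument, the sketch keeps the absolute and salience stages independent, in the spirit of the modular design the text describes.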
We have chosen to elaborate our own notion of a salience ranking rather than use the more familiar notion of focus.44 We do not think that the notion of the focus, the topic or the backward center of the sentences in the discourse prior to the one containing the pronoun is the right one for the analysis of anaphora.45 It is not clear to us that for every clause there is just one focus or perhaps any focus at all, although it is undeniable that something like a focus exists in many discourses. Unfortunately, tests for local focus do not seem to capture a clearly defined notion, and recent attempts to clarify the notion of focus tend to make predictions that go beyond any supporting data from speakers' intuitions.46 One simple modification to the concept of focus, however, that seems to solve at least some of the difficulties is to abandon the idea of one local focus in favor of a degree of focus, or a degree of how much more salient one discourse referent is than another among the set of syntactically and semantically acceptable potential antecedents. We will use the factors associated with focus to determine a degree of focus or salience. Where several discourse referents all have pretty much the same degree of salience, we claim that the salience filter does not choose among these candidates. This is important, because in many cases the salience ranking does not clearly distinguish between two possible antecedents, and BUILDRS would come to the wrong conclusion if it always had only one choice to make in those cases and that choice had to be based on rather delicate considerations of salience. Only in cases where a significant disparity in degree of salience exists does the salience filter indicate a preference for one antecedent over another, and only in cases of extreme disparity does the salience filter install a very strong preference, which, if not obeyed because of constraints pertaining to world knowledge, leads to predictions of discourse infelicity. When salience and world knowledge both agree on a preference for an anaphoric antecedent, the discourse exhibits coherence; when the salience filter instills a very strong preference for an antecedent that is ruled out because of world knowledge, the discourse is infelicitous or awkward.
Grosz, Joshi and Weinstein (1986) provide a convincing example showing that in some cases the salience of one discourse referent is so extreme that a failure to choose it as an antecedent for a discourse referent subsequently introduced by an anaphoric pronoun leads to infelicity.
(D1) (0) Sam_i really goofs sometimes. (1) Yesterday was a beautiful day and he_i was excited about trying out his_i new twin. (2) He_i wanted Fred_k to join him on a practice flight. (3) He_i called him_k at 6am. (4) He_k was furious at being awakened at that hour (just to go for an airplane ride).
332
(02)
Sami really goofs sometimes . Yesterday was a beautiful day and hei was excited about trying out hisi new twin. Hei wanted Jilli to join him on a practice flight. He1 called heri at 6am. Shei was furious at being awakened at that hour.
Any salience ranking between the discourse referents introduced by Sam and Jill is irrelevant to our choice of antecedent for she in the fourth sen tence of (02). We conclude that salience rankings apply only to those possi ble antecedent discourse referents that have passed the absolute filters. I n this example the set ACC '1, where x i s the discourse referent introduced b y she i s the singleton consisting only of the discourse referent introduced by Jill. It is important to note that none of the facts that we have mentioned so far are absolutely decisive in determining anaphoric antecedents. Consider for instance the following variant. (03)
Sami really goofs sometimes. Yesterday was a beautiful day and hei was excited about trying out hisi new twin. Hei wanted Fredk to join him on a practice flight. Hei called himk u p at 6 am to invite himk out flying. Fredk was fast asleep. Hek was furious at being awakened at that hour.
In (03) factors that call attention to what others have called a "shift" in focus come into play and they in effect balance out the factors that in (0 1 ) were decisive in determining the salience o f Sam over Fred. Suppose that the occurrence of ' he' in the sixth sentence of (03) introduces a discourse refer ent z. Because of an aspectual shift and the reference to Fred again in the
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
World knowledge dictates that he in (0 1 .4) be coindexed with ' Fred ' , but the salience ordering prefers Sam very strongly, because Sam is so much more salient than Fred. The result is a clash between salience and world knowledge filters and the coherence of the discourse suffers. The salience o f Sam over Fred is due to a number of factors, none of which by themselves would be sufficient to make the coreference of he and Fred infelicitous. The fact that there has been repeated anaphoric linkage to Sam and not to Fred and the fact that there is a parallelism between the structure of (0 1 . 2), (0 1 .3) and (0 1 .4) and that the subjects in (0 1 .2) and (0 1 . 3) are anaphoric pronouns coindexed with Sam lead the reader of (0 1 ) to expect with near certainty that h e i n subject position o f ( 0 1 .4) should be coindexed with Sam . It is also crucial for such examples that the absolute constraints by themselves don't dictate a clear choice for the anaphoric an tecedent of he in (0 1 .4). So much is obvious from considering the perfectly felicitous (02) t hough it is very close in form to (0 1 ) .
333
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
fifth sentence, Fred is in fact slightly more salient than Sam, but not so much so that were the discourse continued in a slightly different way, in felicity would result from coindexing the ' he' in the fifth sentence. 47 We draw two morals from these examples. (I) Salience is a matter of degree. The difference between (D l ) and (D3) seems to be one of degree of salience. The most salient discourse referent in ACC 'x may be so much more salient than the next most salient discourse referent in ACC 'x that if the most salient discourse referent cannot, because of world knowledge constraints , b e identified with the discourse referent introduced by a n anaphoric pronoun that has all the appropriate grammatical features relevant to such an identification, the resulting discourse is infelicitous or awkward. (2) Salience is the result of adding up the various preferences of the factors rele vant to determining salience in order to assign a salience ranking. Features like recency, parallelism, reiteration, and the like each add something to the overall salience ranking, but none alone is decisive. Our implementation of the salience ranking consists in a series of con straints. These constraints act as filters on ACC 'x• the set of syntactically and semantically acceptable antecedents for x. Each one of the constituent filters determines a particular salience ranking on ACC 'x that must be combined in some way to yield a total salience ranking. One might allow each constraint to have a "vote" on the candidate discourse referents, and the discourse referent with the most votes would win. But not all of these constraints have equal sway. 
The literature on focus has already made clear in general what sorts of constituent filters are required to determine the overall salience ranking: recency, grammatical function, parallelism, reiteration, and "focus shifting" expressions and configurations like complex demonstratives and other definite expressions, clefts, fronting and aspectual shift. From the few empirical studies we know of on this topic, it appears that recency should be weighted relatively heavily and that discourse referents introduced in the same sentence should get the same weighting from this filter.⁴⁸ Two more observations from the literature on focus are that the next most influential factors in determining the total salience ranking are reiteration and parallelism and that these factors are of roughly equal importance. Reiteration occurs when a discourse referent is repeatedly linked with discourse referents subsequently introduced in the discourse by anaphoric pronouns. Whenever anaphoric pronouns repeatedly refer back to some previously introduced discourse referent, this discourse referent tends to become the focus of the discourse. Parallelism concerns both syntactic and semantic constructions.⁴⁹ Parallelism also seems to come in degrees; a parallel structure may be strongly reinforced over several sentences as in (D1) or there may be only a weak syntactic parallelism; also an otherwise weak parallel structure may be reinforced by words like too and also.⁵⁰ Finally, focus shifting mechanisms also tend to act cumulatively and in
degrees. They often give a relatively high salience to a discourse referent that previously was accorded a low salience.
These observations have led us to develop a quantitative and additive model for salience. Since each interpretive constraint provides its own ranking on ACC'x, we will require that these rankings be additive; if S1 and S2 are rankings on ACC'x, then so is the ranking obtained by summing the particular assignments by S1 and S2 to each member of ACC'x (we will write this ranking as S1 + S2). Each filter adds its assignment to the members of ACC'x to the sum of the weightings of the previously applied salience filters. The last filter in the sequence of filters is the sum of all the previous filters; we will also sum together the weights assigned to candidates in ACC'x that have been already explicitly identified in the DRS. We shall call this final sum the total salience ranking, or the salience ranking simpliciter when no confusion results. The particular numerical values our filter model uses in the implementation are of course totally arbitrary, since what our model is looking for is differences in degrees of salience, not absolute values of salience. We must also choose cutoff values for the relevant differences in degree of salience. Here the choice of numerical values does make a difference. Good writing will minimize ambiguity, and so in easily understandable texts one discourse referent (the intended antecedent) will be much more salient. If we are generous and fix the bounds for infelicity quite high, then BUILDRS will usually return more than the intended interpretation; fixing the bound for infelicity low makes BUILDRS able to capture intended interpretations in good writing but perhaps misinterpret ambiguous and less good writing. The parameters relevant to the constraints mentioned so far are mostly recoverable either from the f-structures of the discourse's constituent sentences or from the DRS.
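The additive scheme and the role of the cutoff can be illustrated with a short sketch. This is not the BUILDRS code: the function names, the dictionary representation of a ranking, and the way the cutoff is applied (every candidate whose total lies within the cutoff of the best total is kept) are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, List

# A ranking maps each candidate discourse referent in ACC'x to a weight.
Ranking = Dict[str, int]


def add_rankings(s1: Ranking, s2: Ranking) -> Ranking:
    """Return S1 + S2, the pointwise sum of two rankings."""
    return {c: s1.get(c, 0) + s2.get(c, 0) for c in set(s1) | set(s2)}


def total_salience(filters: Iterable[Ranking]) -> Ranking:
    """Sum the assignments of all constituent filters."""
    total: Ranking = {}
    for ranking in filters:
        total = add_rankings(total, ranking)
    return total


def preferred_antecedents(total: Ranking, cutoff: int) -> List[str]:
    """Keep every candidate whose total is within `cutoff` of the best one.

    The cutoff is a hypothetical stand-in for the "bound" discussed in the
    text: a high value returns more candidates, a low value fewer.
    """
    if not total:
        return []
    best = max(total.values())
    return sorted(c for c, v in total.items() if best - v < cutoff)


if __name__ == "__main__":
    total = total_salience([{"t1": 4, "t2": 4}, {"t1": 4, "t2": 0}])
    print(total)
    print(preferred_antecedents(total, cutoff=3))
    print(preferred_antecedents(total, cutoff=5))
```

Only differences between totals matter in this sketch, which matches the remark that the absolute numerical values are arbitrary while the choice of cutoff is not.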
Let us investigate each one of these constituent filters in a bit more detail. First, the recency of a particular discourse referent y in ACC'x in a DRS K for a discourse D refers to the point at which y was introduced into K relative to the point at which x was introduced. In the current implementation, this filter assigns values to discourse referents in ACC'x as follows: (i) if y is the most recently introduced discourse referent in ACC'x, then salience(y) = 4; (ii) if y is introduced into K during the processing of the same sentence as the one in which the most recently introduced discourse referent in ACC'x is introduced, then salience(y) = 3; (iii) if y is introduced during the processing of one sentence previous to the one in which the most recently introduced discourse referent in ACC'x is introduced, then salience(y) = 2; (iv) if y is introduced two sentences previous to the one in which the most recently introduced discourse referent in ACC'x is introduced, then salience(y) = 1; (v) else salience(y) = 0. Given this assignment of values for the recency filter, we have been to some degree constrained on the numerical values that the reiteration filter assigns to discourse referents in ACC'x: if y ∈ ACC'x and if y has been already linked in K with other reference markers n times, then salience(y) = n, but the maximum value reiteration can assign is 4 (that is, repetition should be no more important than recency can be).
The parallelism filter is quite complex and also quite unsatisfactory as it stands. Parallelism is a relatively local effect - it usually operates on two successive clauses, though it can occur throughout a more extended chunk of discourse. There are a variety of clues suggesting parallelism to consider. First, clue words like too and also may add to the salience of a particular discourse referent in ACC'x.⁵¹ Suppose that such a clue word occurs. The filter then looks to see what is the surface position or grammatical function of the NP introducing x in ACC'x. It then looks for the NP in the previous clause with the same grammatical function. If that NP introduces y and y ∈ ACC'x, the filter assigns salience(y) = 2, and it adds 2 for the presence of the clue word. It also checks to see whether in the DRS there are conditions C1 containing x as an argument and C2 containing y as an argument such that C1 ≈ C2(y/x), where this means that C2 contains x in the same argument position where C1 contains y and C2 is otherwise an alphabetic variant of C1. If so, the parallelism filter adds 2 to whatever other value it assigns to y by means of the other tests. Second, the parallelism filter may be triggered by a repeated pattern in previous clauses. If x was introduced in K0 by an NP with grammatical function F and in (at least) the last two previous clauses y and z were introduced in K0 by NPs with F and there is a u such that u = y and u = z are already conditions in K0, then salience(u) = 3; otherwise, salience(y) = 0.⁵²
The least important factor is the surface grammatical function of the NP introducing the discourse referent. Apparently if y in ACC'x is in subject position, it tends to be more salient than one in object position. Our grammatical function filter assigns the value 2 to a discourse referent in ACC'x introduced by an NP whose grammatical function is SUBJ, the value 1 to a discourse referent in ACC'x introduced by an NP whose grammatical function is OBJ or OBJ2, and 0 otherwise.
Finally, with respect to salience shifting constructions, we presently only consider aspect shift and the use of proper names or definite descriptions. But since we encode aspectual character using event-type discourse referents and we have left these out of our description of the DRS construction process here for simplicity, we shall not discuss this in detail. The way our constraint works roughly is as follows. Consider a discourse with an aspectual shift from simple past actions or activities to statives in combination with the use of a proper name or definite description introducing a discourse referent x ∈ ACC'z, where z is a discourse referent introduced by an anaphoric pronoun which occurs in a clause after the aspect shift. If y ∈ ACC'x is the most salient discourse referent in ACC'x, then in ACC'z salience(z) = salience_ACC'x(y) + 3. In general, however, if there is a salience shift indication, the program stores the final values as features on the set of discourse referents, and they are used in the next computation of salience needed when resolving a subsequent anaphoric pronoun. Salience shift also affects other factors; the program will not consider repetitions of a discourse referent in the next calculation, if those repetitions occurred prior to the focus shift.
Let us now return to (D1) and (D3). Suppose (D1) yields a DRS containing discourse referent t1 for Sam and t2 for Fred. Let's focus on the conditions and discourse referents contributed by (D1.3) and (D1.4). Consider first the processing of (D1.3):
    z1 called z2
    z1 = [ ]
    z2 = [ ]
Our anaphora resolution module begins with z1. ACC'z1 = {t1, t2} (collapsing equivalence classes of discourse referents under =). Since the absolute constraints do not provide a unique solution, BUILDRS begins a salience ranking computation. First it looks to recency. Since Fred and Sam are both mentioned in (D1.2) (Sam must be the anaphoric antecedent of the pronoun in (D1.2) by the absolute constraints), recency (REC) weights them equally with value 4. But now the repetition filter (REP) assigns t1 the value 4 + 4 = 8 and t2 the value 4 + 0 = 4. Parallelism (PAR) also prefers t1 to t2; it assigns t1 the value 8 + 3 = 11, and t2 the value 4 + 0 = 4. Finally, the grammatical function filter (GF) assigns t1 the value 11 + 1 = 12 and t2 the value 4 + 2 = 6. There is a huge gap between the value assigned to t1 and t2, so the program strongly prefers t1 as an antecedent and replaces z1 = [ ] with z1 = t1. For z2, DISREF forces ACC'z2 = {t2}, so BUILDRS never uses the salience filter. We now pass to the contribution made by (D1.4):
    called(z1, z2)
    z1 = t1
    z2 = t2
    z3 was furious ...
    z3 = [ ]
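As a check on the arithmetic for z1 in (D1.3), the running sums quoted above (REC, then REP, then PAR, then GF) can be reproduced with a few lines of code. This is only an illustrative sketch: the per-filter contributions are copied from the worked figures in the text rather than derived from a parsed discourse, and the names `Z1_CONTRIBUTIONS` and `running_totals` are hypothetical.

```python
from typing import Dict, List, Tuple

# Per-filter contributions for z1 in (D1.3), read off the sums in the text:
# t1 is the discourse referent for Sam, t2 the one for Fred.
Z1_CONTRIBUTIONS: List[Tuple[str, Dict[str, int]]] = [
    ("REC", {"t1": 4, "t2": 4}),  # recency weights both equally
    ("REP", {"t1": 4, "t2": 0}),  # reiteration favours t1
    ("PAR", {"t1": 3, "t2": 0}),  # parallelism favours t1
    ("GF", {"t1": 1, "t2": 2}),   # grammatical function
]


def running_totals(
    contributions: List[Tuple[str, Dict[str, int]]]
) -> List[Tuple[str, Dict[str, int]]]:
    """Return the cumulative salience of each referent after each filter."""
    totals: Dict[str, int] = {}
    rows: List[Tuple[str, Dict[str, int]]] = []
    for name, weights in contributions:
        for referent, weight in weights.items():
            totals[referent] = totals.get(referent, 0) + weight
        rows.append((name, dict(totals)))  # copy so later updates don't leak
    return rows


if __name__ == "__main__":
    for name, totals in running_totals(Z1_CONTRIBUTIONS):
        print(name, totals)
```

The final row gives 12 for t1 against 6 for t2, which is the gap that leads the program to prefer t1 for z1.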
With respect to z3, BUILDRS again must resort to the salience filter. Salience_REC(t1) = 4 = Salience_REC(t2); Salience_REP(t1) = 4 + 5 = 9 and Salience_REP(t2) = 4 + 1 = 5; Salience_PAR(t1) = 9 + 3 = 12 and Salience_PAR(t2) = 5 + 0 = 5. Salience_GF(t1) = 12 + 1 = 13 and Salience_GF(t2) = 5 + 2 = 7. The program again strongly prefers t1 as an antecedent (almost twice as much), so that the program predicts infelicity when world knowledge forces us to choose t2.
Turning to (D3), we note that the discourse referent introduced by the last occurrence of 'he' in (D3), call it z4, also has as its ACC' set {t1, t2}. But here the computation for salience provides almost balanced results due to the presence of a salience shifting expression. Salience_REC(t2) = 4; Salience_REC(t1) = 2; Salience_REP(t1) = 2 + 4 = 6 and Salience_REP(t2) = 4 + 1 = 5; Salience_PAR(t1) = 6 + 0 = 6 and Salience_PAR(t2) = 5 + 2 = 7. Salience_GF(t1) = 6 + 2 = 8 and Salience_GF(t2) = 5 + 2 = 7. Salience_FS(t2) = 7 + 3 = 10, Salience_FS(t1) = 8. Although t2 is now the preferred antecedent for z4 in (D3), the discrepancy between t1 and t2 is not so great that infelicity occurs should world knowledge dictate that t1 rather than t2 be the antecedent. But if (D3) is continued in such a way that t2 becomes repeated frequently, then it will become infelicitous to identify a discourse referent introduced by an anaphoric pronoun with t1.
5. OTHER DEFINITE NPS AND ANAPHORA RESOLUTION
We return now to examine briefly how other definite NPs besides anaphoric pronouns fare with respect to the filter model for anaphora resolution we have developed. As DR theory and the familiarity theory of definiteness require, discourse referents introduced by definites need to be linked up to the appropriate, accessible discourse referents. A discourse referent x introduced by a definite description or proper name, however, appears to be able to be identified with a discourse referent that is not the most salient in ACC'x far more felicitously than were x introduced by an anaphoric pronoun. Consider the following variants of (D1):⁵³
(D4) Sam_i really goofs sometimes. Yesterday was a beautiful day and he_i was excited about trying out his_i new twin. He_i wanted Fred_k to join him on a practice flight. He_i called him_k at 6am. Fred_k was furious at being awakened at that hour.
(D5) Sam_i really goofs sometimes. Yesterday was a beautiful day and he_i was excited about trying out his_i new twin. He_i wanted Fred_k to join him on a practice flight. He_i called him_k at 6am. The potential passenger_k was furious at being awakened at that hour.
Although also very close to (D1), (D4) and (D5) are perfectly felicitous. As with (D2), here again the relative salience of one discourse referent over another is irrelevant. In keeping with the line of explanation that we advanced in the case of (D2), we hypothesize that definite descriptions and proper names introduce another absolute or deterministic filter on the set of accessible discourse referents. For a definite description of the form "the α" introducing a discourse referent x, this filter simply checks for each y in ACC'x the conditions applicable to y in a database that contains all those conditions explicitly introduced in the DRS or derived from them by a restricted inferencing component.⁵⁴ If α(y) is a condition in the database, then y passes the filter; otherwise not. We will call the result of filtering ACC'x with this content filter ACC*x. An analogous filter is defined for proper names. The salience ranking for potential antecedents for x, where x is introduced by a definite, is defined on ACC*x. The presence of the extra filter on the set of potential antecedents appears to account for many cases where a definite description makes reference back to a discourse entity that is "no longer in focus."
6. CONCLUSION
According to the filter model proposed here, the list of accessible discourse referents, which is itself determined by the logical structure of the discourse, has been pared down by various "grammatical feature" filters. These filters on the list of potential antecedents for an anaphoric pronoun or other definite NP are absolute. If they fail to produce a unique antecedent, BUILDRS resorts to ranking the candidates according to their salience. The salience ranking is the result of applying several filters, each assigning a particular weighting to the members of ACC'x, where x is the discourse referent introduced by the anaphoric pronoun or definite. BUILDRS allows the user to determine weightings and even the constituent filters determining the salience ranking. The model suggests that by postponing the most difficult tasks in the process of anaphora resolution, we may actually be able to avoid them in all except the worst possible cases. Of course, the list of feature mechanisms relevant to resolving pronominal anaphora and understanding definite descriptions discussed here should not be taken to be complete. We have not talked in detail of the constraints dependent on world knowledge or at all of those concerning global focussing or conversational planning discussed in Grosz & Sidner (1985). But we must leave the question of where these fit in within the DR theoretic approach to discourse for another time.
ACKNOWLEDGEMENT
We have benefitted from discussions with Hans Kamp, Lee Baker, Franz Guenthner, Werner Frey, Andy Schwartz, Henk Zeevat and Ede Zimmerman. We thank the Center for Cognitive Science at the University of Texas at Austin and the Seminar fuer Natur-Sprachliche Systeme at the University of Tuebingen for research support.
Center for Cognitive Science
Department of Philosophy and Department of Linguistics
The University of Texas at Austin
NOTES
1. In the recent computational literature see Reichman (1978), Sidner (1979), (1983) and Grosz, Joshi & Weinstein (1983, 1985; Joshi & Weinstein 1986). Of course many of the linguistic observations concerning focus and topic have been around for a much longer time. For a survey and bibliography see Smith (1985).
2. We take the familiarity theory of definiteness to be an integral part of the story DR theory has to tell about definiteness and indefiniteness. The DR-theoretic formalization of the familiarity theory is developed in Heim (1982).
3. For an introduction to DR theory as we will be assuming it, see Kamp (1986), Asher (1986), and Wada and Asher (1986). The LFG component we have used is that detailed in Bresnan and Kaplan (1982). The fragment treated by BLDRS discussed in Wada & Asher (1986) contained indefinite singular and quantificational singular noun phrases, anaphoric pronouns, intransitive, transitive, ditransitive verbs, control and attitude verbs, relative clauses, possessives, and compound sentences using sentential conjunction; we have since expanded the implementation to handle definite descriptions and some prepositional phrases.
4. Although there are extensions to the original fragment covered by DR theory that incorporate discourse referents of other than individual type, we have not implemented any rules that employ them and so we shall ignore them here. For more details, see Kamp (1981), (1986), Wada & Asher (1986).
5. The mapping from F-structures to DRSs is discussed in detail in Wada & Asher (1986). It owes a good deal to the work in Frey (1985). A good deal of work in the area of DRT implementation is now published. Although we are aware of several efforts in this area besides our own (Klein & Johnson (1986), Sedogbo (1986), Guenthner, Lehmann & Schonfeld (1985)), it is the latter that is most relevant. Guenthner & Lehmann advocate a hierarchy of distinct filters on potential antecedents that is stricter and more sequentially bound than ours. They apply morphological, syntactic and semantic constraints using separate mechanisms, whereas we have integrated the anaphora resolution into one process using constraints from a variety of sources. Guenthner et al. have a separate set of constraints defined on the syntactic structures of sentences in a discourse; we do not. We have also incorporated an explicit set of discourse constraints which they do not discuss in any detail.
6. Some of the developments in LFG binding theory are reported in Sells (1985).
7. See for instance Roberts's (1986) mating of GB syntactic constraints on anaphora with DR theory's.
8. Note that conversion here in these complex cases involves merging lists of discourse referents as well as simple application. This description is only intended as a sketch, however; the actual implementation of conversion is quite complicated. For a treatment of more complex NPs such as those containing relative clauses see Wada and Asher (1986).
9. For more details see Wada and Asher (1986).
10. For details see Van Eyck (1985), Asher (1986).
11. The definition of accessibility here differs from that discussed and implemented in the original BUILDRS of Wada & Asher (1986). We have separated out the implicit processing constraint in that earlier definition and made it part of the separate constraint SCOPE defined below.
12. One might also use other theories of discourse semantics like Seuren's to furnish logical constraints on anaphora. See Seuren (1985).
13. In the original BUILDRS (1986) program, the only features in the list were the number and gender of the NP that introduced the discourse referent and the feature (+)Refl.
14. See Kamp (1981, 1988).
15. For details on d,e see Asher (1987). With some additional machinery developed in that paper, we can also explain the anaphoric links involved in Geach's Hob Nob sentence given below:
    Hob believes that a witch_i has blighted his mare. Nob believes she_i has killed his cow.
16. Apparently, Partee has called such a phenomenon the "telescoping effect." See Roberts (1987) for a discussion.
17. See Roberts (1987) for a treatment of this and similar cases involving modalities.
18. Some examples (see Fodor and Sag (1982)) appear to indicate that subordinated readings are possible even without the presence of quantificational or modal elements (which may of course only emerge upon close reanalysis). These readings appear to rely, however, on some sort of notion of thematic continuity. We have no idea how to capture such cases without appeal to detailed world knowledge; we hope to investigate such subordination cases using knowledge bases in future work.
19. As in subordinated readings. There are other complicated sorts of constructions that require interpretation to fit the accessibility constraint. Consider for instance (5g) (again due to Barbara Partee originally we think):
    (5) g. Either Fred does not own a car, or it is in the garage.
The accessibility constraint would dictate that if this sentence is translated in the obvious fashion into a DRS, the discourse referent introduced by the pronoun cannot be identified with the discourse referent introduced by a car. But the intended meaning of this sentence is, we think, very close to an exclusive disjunction with a hidden ellipsis: Fred does not own a car, or he does own a car and it is in the garage. If this elliptical reading of disjunction is plausible, then the accessibility constraint predicts that the intended anaphoric link is admissible. Disjunctive sentences where the subject of the first disjunct is a universally quantified or definite NP seem particularly susceptible to this interpretation. We first heard of this solution from Hans Kamp.
20. David Gadbois of the University of Texas at Austin is currently expanding and refining the implementation of scope in BUILDRS.
21. The latter condition formulates within DRT a criterion of functional dependency of one NP on another.
22. Assuming of course that a subordinated reading of the subsequent material is not available. Our tests indicate that there is none, and most speakers seem to concur with the prediction.
23. This was already implemented in Wada & Asher (1986). This asymmetric behavior of indefinites and definites also occurs in certain discourse contexts:
    (a) If he_i were to come home before we clean up this mess, I would be afraid. John_i would get so angry that he might do anything.
    (b) If he_i were to come home before we clean up this mess, I would be afraid. A man_i would get so angry that he might do anything.
By appealing to the notion of modal subordination in the construction of the DRSs for (a) and (b), we explain the discrepancy between (a) and (b) in an exactly analogous way to the explanation for the discrepancy between (7a) and (7b).
24. It appears that indefinite descriptions with "enough content" can have referential or specific indefinite readings. Such "specific" indefinites would have just the same status vis a vis accessibility as non-dependent definites. We believe that it is the specificity or the definiteness of the crossing coreferential NPs considered together as a pair that makes Bach-Peters sentences with indefinites like the one below acceptable even though it closely resembles the questionable examples in (10b, d, f):
    A man_i who hardly knows her_j loves a woman_j who scorns him_i.
The ungrammaticality or difficulties with (10b, d, f) stem from speakers' difficulties in getting a referential reading for the indefinite NPs in them with minimal descriptive content. We conclude from such examples and others that definiteness is not determined simply by the determiner but by the total structure and content of the NP. With respect to (DEF), we will interpret referentially interpreted indefinites like non-dependent definites.
25. We are indebted to Andy Schwartz for this formulation.
26. A similar constraint might take account of the data concerning questions, but our fragment does not yet encompass them. David Gadbois of the University of Texas is working on this aspect of the program too.
27. This rule for scope is consonant with an observation made about an order dependence of indefinites that was built into DR theory from the start: indefinites are as a rule (but see Fodor and Sag (1982) for some exceptions) supposed to introduce new discourse individuals into the discourse.
28. It is also essential in analysing distributivity readings of plural definites and indefinites.
29. We will not fully process the conditions in the DRS below.
30. The acceptability of (6d) depends on whether someone has wide or narrow scope over the PP in his wallet. Our theory currently treats the semantic structure of b as completely linear. If it is not linear, DR theory will predict the sentence to be good as in (6b). In DR theory currently temporal adverbs do contribute scope distinctions, and perhaps the same might be said of locative adverbs. If that is the case, scope rearrangement even of existentials might be important to get certain anaphoric facts to work out.
31. Substituting the indefinite NP a man produces, as we would hope, the same effect.
32. There are, however, well-known difficulties for those who adopt such a conclusion. Specifically, these difficulties occur when clauses occur in ADJUNCT position. Compare for instance (13h, i) with (13f, g) above:
    (13) h. Near the place he_i lived, Dan_i saw a snake.
         i. Near the place Dan_i lived, he_i saw a snake.
Neither C-command nor function-argument constraints easily account for the difference.
33. For a discussion of this example and of the difficulties of interpreting indices generally in syntactic anaphoric binding theories, see Roberts (1987).
34. This distinguishes our approach from that of Reinhart (1983) or Roberts (1987). Reinhart distinguishes bound anaphora and other sorts of coreference. In bound anaphora the pronoun is treated as a bound variable. She claims that only bound anaphora obeys the principles of the binding theory. She also develops a pragmatically based account of disjoint reference along the following lines. Where a speaker uses a pronoun as a bound anaphor (and thus obeys the relevant principles of binding theory), he may not use a pronoun that is not interpreted as a bound variable. Roberts also contemplates two sorts of binding relations, C-command binding and discourse binding. The C-command binding is completely determined at the syntactic level and forces coreference where the same indices are assigned to two NPs. The accessibility relation of DR theory determines the constraint on discourse binding she explicitly mentions. Roberts also adopts a pragmatic construal of discourse reference. She assigns rankings to the various kinds of bindings and the pragmatic strategy is that the speaker must always use the strongest binding potential of the sentence's grammatical structure. Roberts also has a pragmatic rule for interpretation, which is that the hearer must assume that if the speaker does not take advantage of the strongest binding potential, then unless there are reasons to avoid that binding, he does not intend the expressions to corefer.
35. For a theory linking surface grammatical functions with thematic roles, see Dowty (1987).
36. We hypothesize also an indefinite number of open slots to be filled by discourse referents introduced by NPs in ADJUNCT position; typically ADJUNCTs provide arguments on the event of the main clause. Note that when fronted sentential PPs appear to change "thematic role," they function as a backgrounding role for the event being talked about rather than some role within the event. Nevertheless, this remark does not imply that sentences with fronted PPs differ in truth conditions from sentences with sentential PPs in normal position. This is a difference.
37. Note that our disjoint reference constraint also allows the following discourse to be grammatical: John_i thinks that everyone hates him_i. Well, it's not true. Jim likes him_i. Mary likes him_i and even John likes him_i. All discourse referents introduced by the occurrences of him in the second sentence are identified with each other and with the discourse referent introduced by John in the first sentence. It is an implicature that the John in the second sentence is the same as the John in the first sentence, but nothing forces this identification in the DRS or, more to the point, the identification of the discourse referent introduced by the occurrence of him in the last clause with the discourse referent introduced by the occurrence of John in the last clause.
38. Perhaps some of the explanatory machinery devised by P. Sells (1986) might be useful here but that is just a guess. We hope to do some more work on this topic.
39. We are indebted to Carlota Smith for pointing out this constraint to us.
40. This appears to be a rule of "focus shifting," but unlike the other rules concerning focus and salience, it does appear to be absolute.
41. In a full theory that treats the problem of discourse segmentation and global focus, ACC'x will be subject to further constraints. For a discussion of global focussing mechanisms, see Grosz (1977), Grosz & Sidner (1985), and Guindon et al. (1986). Such constraints would limit the size of ACC'x.
42. Since anaphoric pronouns themselves carry little in the way of lexical semantic information, one way of using world knowledge within the BUILDRS program comes to mind that is similar to the content filter used for definite descriptions (see below, section 5). It is a type of lexical constraint defined relative to the "thematic roles" of the verb arguments in which the pronoun and its potential antecedents occur. Among such lexical constraints might be some that were absolute; others might resemble more closely the behavior of the salience constraints in BUILDRS.
43. Sidner (1983) also advocates the use of world knowledge to check bindings suggested by a focuslike mechanism.
44. We are, however, indebted to the work of Sidner (1979, 1983) and Grosz, Joshi and Weinstein (1981, 1983, 1986) on focus and centering. We have used many of the factors they apply to determining focus to define a preference ordering for the anaphora resolution module by means of a salience ranking.
45. Many researchers like Grosz, Joshi and Weinstein talk about the backward center of the previous sentence in a discourse, but it seems to us that one needs to generalize this notion at least so that the center is the most salient discourse individual so far introduced in the discourse. Perhaps Grosz, Joshi and Weinstein's backward center for the previous sentence is the most salient individual in the discourse, but we are not completely sure on this point.
46.
Smith ( 1 985) contains a detailed study of such problems. Sidner ( 1 983) countances two
foci - a discourse focus and an actor focus, it should be noted. But
as
she always ranks one
of these foci above the other, her theory also embodies essentially the one focus view. 47. Consider as a replacement sentence for (04.5), He1 1uzd to go flying alone. 48 . 49. 50.
See Guindon et al. (1986).
Sidner (1979) uses the constraints of reiteration and parallelism. For instance compare Sidner's (1979) example with 'too' to one without it. We find that our intuitions on these examples are quite different:
(a) The violet_i is commonly found near the green whittierleaf_j. The wild rose is found near it_j too.
(b) The violet_i is commonly found near the green whittierleaf_j. The wild rose is found near it_i/j.
(b) strikes us as completely ambiguous, even somewhat infelicitous, whereas in (a) there is a clear preference for the coindexing that Sidner notes.
51.
This filter should be generalized to look also for contrast as a way of making a discourse referent salient, and perhaps other rhetorical relations should be looked for. But this is not an issue that we can go into here.
52.
We need to look two clauses back so that the parallelism filter will not be confused by the "listing" phenomenon: "My friends Bill_i and Harry_j are so crazy. He_i is going around the world in a dinghy, and he_j wants to hanglide in the Himalayas." Note, however, that we do not need to look back more than two clauses for such a parallelism effect, since that will involve a violation of PRES. Note also that given the current state of DRS structure, the parallelism filter is not sufficient to capture parallelism of "argument" or discourse structure. We plan to remedy this by augmenting the purely logical structure of a DRS with information pertaining to the global structure of the discourse.
53.
Data concerning this phenomenon is also well known; we have adapted an example of Grosz et al. (1986) again.
54.
We have not as yet implemented any such inferencing component. This should be taken to be a placeholder for either some extant program or future research.
REFERENCES

Asher, N. 1986: Belief in Discourse Representation Theory. Journal of Philosophical Logic 15.
Asher, N. 1987: A Typology for Attitude Verbs and Their Anaphoric Properties. Linguistics and Philosophy 10.
Bach, E., & Partee, B. 1980: Anaphora and Semantic Structure. In: K.J. Kreiman & A.E. Ojeda (eds.), Papers from the Parasession on Pronouns and Anaphora, Chicago Linguistic Society, Chicago.
Bresnan, J., & Kaplan, R. 1982: Lexical-Functional Grammar: A Formal System for Grammatical Representation. In: J. Bresnan (ed.), The Mental Representation of Grammatical Relations, MIT Press.
Evans, G. 1980: Pronouns. Linguistic Inquiry 11.
Freidin, R., & Lasnik, H. 1981: Disjoint Reference and Wh-Trace. Linguistic Inquiry 12.
Fodor, J., & Sag, I. 1982: Referential and Quantificational Indefinites. Linguistics and Philosophy 5.
Grosz, B., Joshi, A., & Weinstein, S. 1983: Providing a Unified Account of Definite Noun Phrases in Discourse. ACL Proceedings.
Grosz, B., Joshi, A., & Weinstein, S. 1986: Toward a Computational Theory of Discourse Interpretation. Manuscript.
Grosz, B., & Sidner, C. 1985: The Structure of Discourse Structure. SRI Technical Note 369.
Guenthner, F., Lehmann, H., & Schonfeld, W. 1985: A Theory for the Representation of Knowledge. IBM J. Res. Develop., January.
Guindon, R., Sladky, P., Brunner, H., & Conner, J. 1986: The Structure of User-Adviser Dialogues: Is there Method in their Madness? MCC report.
Heim, I. 1982: The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation, University of Massachusetts.
Joshi, A., & Weinstein, S. 1981: Control of Inference: Role of Some Aspects of Discourse Structure: Centering. Proc. IJCAI.
Kamp, H. 1981: A Theory of Truth and Semantic Representation. In: J. Groenendijk, Th. Janssen, & M. Stokhof (eds.), Formal Methods in the Study of Language, Mathematisch Centrum Tracts, Amsterdam.
Kamp, H. 1986: SID Without Time or Questions. Forthcoming CSLI report.
Kamp, H. 1988: Conditionals in DR Theory. Manuscript.
Karttunen, L. 1976: Discourse Referents. In: J.D. McCawley (ed.), Syntax and Semantics, Academic Press, New York.
Klein, E., & Johnson, M. 1986: Discourse, Anaphora and Parsing. COLING Conference Proceedings.
May, R. 1985: Logical Form: Its Structure and Derivation. MIT Press.
Reichman, R. 1978: Conversational Coherency. Cognitive Science 2.
Reinhart, T. 1983: Anaphora and Semantic Interpretation. University of Chicago Press.
Roberts, C. 1986: Modal Subordination, Anaphora, and Distributivity. Ph.D. Thesis, University of Massachusetts.
Sedogbo, C. 1986: Extending the Expressive Capacity of the Semantic Component of the OPERA System. COLING Conference Proceedings.
Sells, P. 1985: Lectures on Contemporary Syntactic Theories. CSLI Lecture Notes Vol. 3.
Sells, P. 1986: On the Nature of Logophoricity. In: A. Zaenen (ed.), Studies in Grammatical Theory and Discourse Structure, Volume 2: Logophoricity and Bound Anaphora.
Seuren, P. 1985: Discourse Semantics. Blackwell.
Sidner, C. 1979: Toward a Computational Theory of Definite Anaphora Comprehension in English. MIT Technical Report AI-TR-537.
Sidner, C. 1983: Focusing in the Comprehension of Definite Anaphora. In: M. Brady, & R. Berwick (eds.), Computational Models of Discourse, MIT Press.
Smith, C. 1985: Sentence Topic in Texts. Studies in the Linguistic Sciences 15.
Van Eyck, J. 1985: Aspects of Quantification in Natural Language. Ph.D. Thesis, University of Groningen.
Wada, H., & Asher, N. 1986: BUILDRS: An Implementation of DR Theory and LFG. COLING Conference Proceedings.
Journal of Semantics 6: 345-367
MOTION IMPERATIVES
ROLF MAYER
ABSTRACT
In this paper a restricted sample of motion imperatives is treated within the framework of discourse representation theory. In order to pave the way for this treatment, the concept of path and linguistic aspects of path connection are discussed. The semantic analysis is then extended in the pragmatic direction: It is shown how semantic inferences may be filtered out via pragmatic considerations. We suggest that a level of execution structure is needed to supplement the level of semantic representation. Motion imperatives are evaluated against maps, and aspects of executability are discussed. It is finally shown how the deontic function of motion imperatives can be fulfilled by texts in the indicative mood and that the criteria of adequacy valid for motion imperatives then have to be met by motion indicatives.

1. INTRODUCTION¹

We are taking as our starting-point a fictitious map in the following form where movement along the edges is possible in any direction:

Fig. 1: The Map M*

The object to be moved is represented as a black circle located at one of the "places". A crucial part of the ideas to be presented here has been implemented in PROLOG. The system does the following: If the instructions given as an input are linguistically and logically free of errors (or can be "interpreted" by the object), the object complies with the instructions. After execution, further instructions may be given. The implementation simulates the situation where the object has complete "knowledge" of the map (which means the object can "find" a path from any place to another place).² The instructions make use of a small lexicon of German containing the following items (in some places we will, however, go beyond the fragment):

(i) Verbs in the imperative mood: Reise ("Travel"), Fahre ("Go")
(ii) Prepositions: von ("from"), über (in the sense of "via"), nach ("to")
(iii) Place names: A, B, C, D, E, F, G, H, I
(iv) Adverbs: dann ("then"), dabei (in the sense of "when doing so"), weiter ("farther"), zurück ("back"), (von) dort ("(from) there")

Consider an example (assume the object is at A):

First input:
Fahre von A nach C. ("Go from A to C.")
Fahre weiter nach F. ("Go farther to F.")
Execution: The object moves from A to C and then to F.

Second input:
Fahre zurück. ("Go back.")
Execution: The object moves from F to A.

In the following we are not so much interested in matters of implementation but in the interaction of semantics and pragmatics at the level of discourse structure. Although our interest in this paper is mainly focused on imperatives,³ much that will be said is also relevant with regard to motion texts in the indicative mood.⁴ The aspects discussed below include the following:

a) conceptual machinery related to spatial locations, paths, and path connection
b) coherence (how is the connection of paths linguistically expressed?)
c) discourse representation structures for motion imperatives
d) the logic of (motion) imperatives; semantically versus pragmatically licensed inferences
e) the evaluation of motion imperatives against maps; issues of executability.
2. LOCATIONS AND PATHS

I assume that an object x located in space defines a minimal set of spatial points Loc(x,t) (determined by the object's boundaries) relative to a point of time t. We call this set its minimal location. The following principle holds:

P: If Loc(x,t) = l and Loc(x,t) = l', then l = l'.

P says that an object x (located in space) has exactly one minimal location relative to a point of time t. A path can be defined in the following way:

Def.: A path p is, relative to an object x, a continuous function from a temporal interval T onto a set of minimal (spatial) locations Loc(x,t). We call p(t(s)) the starting position, p(t(e)) the ending position of p.

This definition corresponds to the specification of "path" given in Wunderlich and Herweg (1986). Spatial locations as occurring in the above definition may be conceived of as the constituents of "absolute Newtonian space". However, as such they are never perceived. Instead, commonsense experience (and modern science) relies on objects to define locations. In particular, language allows the possibility of locating objects and events x by referring to frame locations l° (fixing "search domains" in the sense of Miller and Johnson-Laird (1976)) such that Loc(x,t) ⊂ Loc(l°,t) (where ⊂ denotes the subset relation); typical examples of frame locations are towns and countries. If we model paths by sequences of spatial frame locations f: ⟨l°(1) ... l°(n)⟩ (where we assume that the spatial intersection of l°(i) and l°(i+1) is empty, with i ranging from 1 to n-1), we arrive at a concept of path frame that will play a major role in section 6. The following definition explicitly relates paths and path frames:

Def.: We call path p a realization of path frame f: ⟨l°(1) ... l°(n)⟩ iff the following holds:
(1) p(t(s)) ⊂ Loc(l°(1), t(s)); p(t(e)) ⊂ Loc(l°(n), t(e))
(2) there are t(i) < t(i+1) (for i = 1 ... n-1) such that p(t(i)) ⊂ Loc(l°(i), t(i)) and p(t(i+1)) ⊂ Loc(l°(i+1), t(i+1)).
(< denotes the relation of temporal precedence)

Before we define two notions of path connection, we will introduce the concepts of a stationary path and a strictly stationary path as auxiliary notions (where the former is implied by the latter).

Def.: p is a stationary path relative to l° iff p(t(s)) ⊂ Loc(l°, t(s)) and p(t(e)) ⊂ Loc(l°, t(e)).

For p to be a stationary path relative to l°, it will do if l° is the start location and the goal location of p.

Def.: p is a strictly stationary path relative to l° iff p(t) ⊂ Loc(l°, t) for all t in the domain of p.

A strictly stationary path p relative to l° is e.g. established if after moving to l° a person stays there (before possibly continuing his/her movement). We are now ready to define the notions of weak and strong path connection (we presuppose that the paths to be connected have the same moving object).

Def.: Path p (domain: T) is strongly connected with path p' (domain: T') iff
(i) T meets T'
(ii) p(t(e)) = p'(t(s)),
where the meet-relation is defined as follows:

Def.: If T and T' are temporal (closed) intervals, T meets T' iff the end point of T and the starting point of T' coincide.

If path p is strongly connected with path p', p ≈ p' represents the set-theoretical union of p and p'. For reasons of conceptual elegance we assume existence of the empty path p_e such that the following holds: p ≈ p_e = p for all p.

Def.: Path p is weakly connected with path p' (relative to l°) iff
(a) p is strongly connected with p' or
(b) (i) there is a path p'' such that p'' is stationary relative to l°
    (ii) p is strongly connected with p''
    (iii) p'' is strongly connected with p'.

If path p is weakly connected with path p' relative to l°, p ~ p' represents the set-theoretical union of p, p' and p'' (with p'' the (possibly empty) linking path in the sense of the above definition). Note that strong path connection implies weak path connection. Keeping l° fixed, we can say that weak path connection is an asymmetric relation: If p is weakly connected with p' (relative to l°), p' is not weakly connected with p (relative to l°).

If paths p and p' are weakly (strongly) connected relative to l°, we could also say that the corresponding motion events are weakly (strongly) connected relative to l°. With regard to our fragment, the parameter l° as occurring in the definition of weak path connection is assumed to be filled up via goal adverbials.
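As an illustration only, the definitions of strong and weak path connection can be sketched in Python. The sketch rests on several assumptions that are not part of the analysis above: paths are discretized into time-stamped samples, the object's minimal location is reduced to a single grid point, a frame location is a finite set of points, and the existential condition "there is a path p''" is replaced by a candidate linking path supplied by the caller. All names (`Path`, `strongly_connected`, `stationary`, `weakly_connected`) are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Point = Tuple[int, int]          # a spatial point (assumed 2-D grid)
Sample = Tuple[float, Point]     # (time, position of the object)


@dataclass(frozen=True)
class Path:
    """A path as time-ordered samples; a discrete stand-in for p: T -> Loc(x, t)."""
    samples: Tuple[Sample, ...]

    @property
    def start(self) -> Sample:   # (t(s), p(t(s)))
        return self.samples[0]

    @property
    def end(self) -> Sample:     # (t(e), p(t(e)))
        return self.samples[-1]


def strongly_connected(p: Path, q: Path) -> bool:
    """T meets T' and p(t(e)) = p'(t(s)): same end/start time and position."""
    return p.end == q.start


def stationary(p: Path, frame: FrozenSet[Point]) -> bool:
    """Start and end positions both lie inside the frame location."""
    return p.start[1] in frame and p.end[1] in frame


def weakly_connected(p: Path, q: Path, frame: FrozenSet[Point],
                     link: Optional[Path] = None) -> bool:
    """Strong connection, or connection via a linking path stationary in `frame`."""
    if strongly_connected(p, q):
        return True
    return (link is not None
            and stationary(link, frame)
            and strongly_connected(p, link)
            and strongly_connected(link, q))
```

In this encoding a journey that ends somewhere in a town and a later journey that starts elsewhere in the same town are weakly but not strongly connected, provided a linking path inside the town is available.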
3. COHERENCE

"Coherence" is taken to mean that sentences are properly linked with regard to each other. Conditions of coherence determine the conceptual and linguistic connection of paths; they tell us in particular when the construction of a complex path is linguistically licensed. Since in the context of the present paper we are primarily interested in "unmarked" sequences of motion imperatives, weak path connection counts among the conditions of coherence.
With regard to our small lexicon, the weak connection of paths can be established not only by the presence of properly chosen start adverbials (i.e. adverbials that "link up" with goal adverbials of the preceding sentence⁵) but also by text connectors such as dann and weiter. Here is a sample of miniature texts:

(1) Fahre nach A. Fahre von A nach F. ("Go to A. Go from A to F.")
(2) Fahre nach A. Fahre von dort nach F. ("Go to A. Go from there to F.")
(3) Fahre nach A. Fahre dann nach F. ("Go to A. Go then to F.")
(4) Fahre nach A. Fahre weiter nach F. ("Go to A. Go farther to F.")
(5) Fahre nach A. Fahre nach F. ("Go to A. Go to F.")
(5') Fahre nach A. Fahre nach F. Beide Fahrten sind interessant. ("Go to A. Go to F. Both journeys are interesting.")
(6) (Object is assumed to be at A) Fahre von B nach F. ("Go from B to F.")

(1) - (4) represent "unmarked" instructions, in contrast to (5) and (6). If we want an object to move to A and then to F, (5) isn't the conventionalized way of expressing oneself. Neither is (6) if we want to tell the object to move from A to F via B. This does not mean, however, that no appropriate contexts can be found for (5) and (6); (5') is a case in point.⁶ We will now further concentrate on the discourse particles dann and weiter; later on we will also deal with the text connectors zurück and dabei (in so far as these particles are related to our subject matter).

(7) and (8) serve to bring out a semantic difference between dann and weiter:

(7) Fahre von E nach A. Fahre dann von F nach G. ("Go from E to A. Go then from F to G.")
(8) ?Fahre von E nach A. Fahre von F weiter nach G. ("Go from E to A. Go from F farther to G.")

In both (7) and (8) the paths are not weakly connected. (7) certainly could not be used to tell somebody to get to G via A and F where equal value is attached to each path section ("bridging" in the sense of Clark (1977) seems hardly possible). However, as in the case of (5) and (6), a possible context can be constructed. Just think of a situation where only the path sections from E to A and from F to G are focused on. (8), however, doesn't seem to be licensed by a possible context. This would demonstrate that weiter (as a path connector) requires the weak connection of paths.
In the context of the present paper there is another distinction that is worth noting. We take the following configuration as our starting-point:

First input:
Fahre von A nach C. ("Go from A to C.")
Execution: possible path frame: ⟨A, B, C⟩

Second input:
(a) Fahre nach F. ("Go to F.")
(b) Fahre weiter nach F. ("Go farther to F.")
(c) *Fahre dann nach F. ("Go then to F.")

The contrast in acceptability between (b) and (c) can be accounted for in the following way: (b) presupposes that a path has been covered beforehand, but it does not presuppose that another path still has to be covered before travelling to F takes place. (c), however, presupposes that another action has to be executed before travelling to F takes place. The continuations (a) and (b), on the other hand, are both acceptable. However, the perspectives in (a) and (b) are different: In (a) no conceptual link to the "old" path is explicitly established, in (b) such a connection is focused on by means of weiter; the hearer/reader is invited conceptually to join the two paths into a complex one. The example is related to what we call the path connection problem: When are paths conceptually related to each other and when can they be taken as subpaths of a more inclusive path?
The path connection problem can be illustrated with regard to zurück ("back") (we will only consider the presuppositional use of zurück here). Note that it would be wrong to assume that the movement of an object to a place l° can always be described by means of zurück if the object has ever been at l° before. Our miniature robotic system can be used to make this point clear. Suppose that our "object" has already been to all of the available places and that its current position is A. We also assume that a user (not knowing anything about the object's former "journeys") now starts playing with the system. Certainly the following initial input would be inappropriate:

Input:
Fahre zurück nach I. ("Go back to I.")

Use of zurück requires that the events involved be conceptually connected, which suggests that a solution to the path connection problem has to be found within a theory of event structures.
We will conclude this section by taking a short look at the text connector dabei (in its temporal sense). We take (9) - (11) as our starting-point:

(9) Fahre über A nach C. ("Go via A to C.")
(9') Fahre über A nach C. Das ist wichtig. ("Go via A to C. That is important.")
(10) ?Fahre nach C. Fahre über A. ("Go to C. Go via A.")
(11) Fahre nach C. Fahre dabei über A. ("Go to C. When doing so, go via A.")
(11') Fahre nach C. Fahre dabei über A. Das ist wichtig. ("Go to C. When doing so, go via A. That is important.")

(9) and (11) are truth-conditionally equivalent but create different embeddability conditions for subsequent discourse. With regard to a formal representation, I assume that both (9) and (11) explicitly induce one path marker. However, while (9) explicitly induces only one event-marker, (11) induces two. Note that event-markers are potential anchors for sentential anaphors. Das in (11') preferably refers to the event introduced by the second sentence; if the transitional adverbial in (9') is not accented, das refers to the whole complex event denoted by the initial sentence.⁷
Dabei (like von dort) is a local anaphor in the sense that it finds its antecedent in the preceding clause, in contrast to presuppositional uses of zurück, where the distance to the antecedent may be arbitrarily large.
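The contrast between "unmarked" texts such as (1) - (4) and marked ones such as (5) - (7) can be made concrete with a small Python check. This is a deliberately crude sketch under stated assumptions: a clause is encoded by its goal, an optional start adverbial (a place name or the proadverbial "dort") and an optional connector; "unmarked" is treated as a yes/no property; and the names `Clause`, `is_linked` and `unmarked` are hypothetical. It does not model zurück or dabei, nor the contexts that may rescue a marked text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    goal: str
    start: Optional[str] = None      # place name, or "dort" for the proadverbial
    connector: Optional[str] = None  # "dann", "weiter", or None


def is_linked(prev: Clause, cur: Clause) -> bool:
    """Start adverbial is absent, anaphoric (dort), or names the previous goal."""
    return cur.start in (None, "dort", prev.goal)


def unmarked(text: List[Clause], position: Optional[str] = None) -> bool:
    """True if the connection of successive paths is signalled and consistent.

    `position` is the object's current place, if known; a first clause whose
    start adverbial names a different place counts as marked, as in (6).
    """
    first = text[0]
    if position is not None and first.start not in (None, position):
        return False
    for prev, cur in zip(text, text[1:]):
        # The link must be expressed by a start adverbial or a text connector.
        explicit = cur.start is not None or cur.connector in ("dann", "weiter")
        if not (explicit and is_linked(prev, cur)):
            return False
    return True
```

Under this encoding, (1) is `[Clause("A"), Clause("F", start="A")]` and counts as unmarked, whereas (5), which lacks both a linking start adverbial and a connector, does not.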
4. DRSs FOR MOTION IMPERATIVES

As has been observed by various authors (see for instance Seuren (1985)) and has also been illustrated in this paper, discourse particles are primarily related to the preceding context and not so much to the world. Discourse representation theory as inaugurated by Kamp (1981a, b) is well-suited to do justice to this insight by providing a level of discourse representation structures that is meant to mediate between language and the world. In discourse representation theory the meaning of a sentence can be conceived of as a function from discourse representation structures into discourse representation structures, which simply is to say that the preceding context determines how new verbal input has to be processed; the processing is to be algorithmic in nature and based on the syntactic structure of the input.
In the following we show by means of a sample text how the results from the preceding sections could be incorporated into a DRT framework.⁸ Our presentation is not algorithmic (note that we have not given an explicit syntax of our fragment) but is hopefully explicit enough such that an algorithm could be constructed. The text to be processed into a DRS is the following:

(1) Fahre von A nach C. Fahre dabei über B. Fahre von C über F weiter nach E. Fahre dann von dort zurück nach A. ("Go from A to C. When doing so, go via B. Go from C via F farther to E. Go then from there back to A.")

We first present the DRS for (1) and then comment on its build-up:
Discourse referents: e, e', e'', e''', x, p, p', p'', i, j, k, l, m, n ...

K:
Going(e), Agent(e, x), x = ADR, Path(e, x, p), VON(e, p, A, ST, i), NACH(e, p, C, GO, k), i < k
Path(e', x, p), VIA(e', p, B, TR, j), Part-of(e', e), i < j
Going(e''), Agent(e'', x), Path(e'', x, p'), VON(e'', p', C, ST, k), VIA(e'', p', F, TR, l), NACH(e'', p', E, GO, m), l < m, Conn(p, p', C), e <* e''
Going(e'''), Agent(e''', x), Path(e''', x, p''), VON(e''', p'', E, ST, m), NACH(e''', p'', A, GO, n), m < n, Conn(p', p'', E), e'' <* e'''

Fig. 2: DRS K'

Our representation is related to Davidson's (1967) logical form of action sentences, Sondheimer's (1978) treatment of path-oriented prepositions and Bäuerle's (1987) analysis of event-structures in the framework of DRT; it is in particular Bäuerle who convincingly argues for a (non-compositional) framework that specifies clausal information with regard to thematic roles. Let me now give more specific comments on the DRS presented above:

The markers e ... e''' are discourse referents for action types (action types in turn are conceived of as a subclass of event types); event markers tell us which clauses are derived from the same clause.
As should be evident from the notation so far introduced, p ... p'' are discourse referents for paths.
x is the addressee of the imperative.
"Path" is a relation between events, objects, and their paths.
Path-oriented prepositions are taken as inducing predicates; the third argument position is filled by a frame location,⁹ the fourth by a path adverbial index distinguishing start adverbials, transitional adverbials and goal adverbials (where the information is calculated from information on the verb (subcategorization frames!) and the prepositions).
We assume the following implications:

VON(e, p, l, ST, i) → p(t(s)) ⊂ Loc(l, t(s))
NACH(e, p, l, GO, i) → p(t(e)) ⊂ Loc(l, t(e))
VIA(e, p, l, TR, i) → p' (a proper subpath of p) goes through l, ¬ p(t(s)) ⊂ Loc(l, t(s)), and ¬ (p(t(e)) ⊂ Loc(l, t(e)))

Numerical variables i, j ... (related by means of <) have been added to reflect the spatio-temporal order of a path. The ordering corresponds to a numerical indexing where (in the general case, relative to a clause) the indexing starts with the start adverbial (if present) and carries over to transitional adverbials (where I assume that the linear order of transitional adverbials iconically mirrors the "real world" order, which is a sound default assumption) and then to goal adverbials; goal adverbials of the same clause receive the same index; start adverbials get the same index as goal adverbials if they are "linked". A procedure like this helps to model cross-sentential deduction with regard to path information.
For the sake of notational simplicity we have not distinguished between entities that may serve as antecedents for anaphora and those that are inferred or rather have the status of theoretical entities (the event markers and the numerical variables have a different status in this respect).
"Part-of" is taken as the reflexive, antisymmetric and transitive relation known from mereology.
"Conn(p, p', l)" says that the paths p and p' are weakly connected relative to l (= the linguistically induced ending position of p) and that the coherence conditions are met such that a complex path p ~ p' can be deductively obtained.
e <* e' is to express the information that e temporally completely precedes or meets e' (weiter and dann function as triggers).

As far as the construction algorithm is concerned, the following remarks concerning the processing of syntactic structure may suffice: I assume that the lexicon links verb arguments with thematic roles ("path" would be such a role; the lexicon is also supposed to say, relative to a certain verb, whether the path is related to the agent, the theme, or both). I assume that NPs are processed first. Path-adverbial structures are transformed in a way that is evident through the sample clauses. Discourse particles like weiter and zurück (the latter in co-occurrence with at least one other goal adverbial) trigger an embeddability check but are not spelt out in the DRS themselves. The latter is also true of the proadverbial von dort which takes up the "current" location, which, with regard to our fragment, always occurs in the preceding clause. Dabei (in its temporal sense) induces a Part-of-relation as is obvious from the above DRS; moreover, the indexing is extended.
In the interest of simplicity of presentation, I have left out the introduction of special discourse referents for place names; neither have I formulated "novelty conditions" with regard to the introduction of discourse referents.
The imperative operator (!) is to be interpreted in the sense that the addressee is to bring about a situation in which K' is true; since a realization period is not lexically specified, this depends on pragmatic parameters.¹⁰ The operator has the whole "propositional" DRS K in its scope (a justification follows in the next section). However, the operator must not block anaphoric accessibility of the events in its scope (and combinations thereof). This is demonstrated by (2), the concatenation of (1) and a clause containing a sentential anaphor.

(2) (1) + Das ist wichtig. ("That is important.")

Text (1) is a simple (uninterrupted) sequence of motion imperatives addressed to the same addressee, which (together with the presence of discourse particles) contributes to the possibility of looking upon (1) as one complex imperative.
(1) licenses a number of semantic inferences four of which are stated below.¹¹

(3) Fahre von A nach C. ("Go from A to C.")
(4) Fahre von B nach C. ("Go from B to C.")
(5) Fahre von A nach E. ("Go from A to E.")
(6) Fahre von E nach A. ("Go from E to A.")

I will demonstrate in the following two sections (with regard to (motion) imperatives) that semantically licensed inferences need not be pragmatically licensed.
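The indexing procedure described in this section can be sketched in Python. The sketch is an assumption-laden illustration, not the PROLOG implementation mentioned in the introduction: clauses are given in pre-parsed form, events and paths are named e0, e1, ... and p0, p1, ... instead of e, e', ..., indices are fresh symbols ordered by explicit `<` pairs, and only the conditions VON, VIA, NACH, Part-of and Conn are generated. The helper `path_frame` additionally linearizes the indices to read off a sequence of frame locations; the names `Clause`, `build` and `path_frame` are hypothetical.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from itertools import count
from typing import List, Optional


@dataclass
class Clause:
    start: Optional[str] = None          # place name or "dort"
    via: List[str] = field(default_factory=list)
    goal: Optional[str] = None
    dabei: bool = False                  # elaborates the preceding event


def build(text: List[Clause]):
    """Return (conditions, order, place_of) for a sequence of clauses."""
    fresh = (f"i{n}" for n in count())
    conds, order, place_of = [], [], {}
    prev = None  # (event, path, start index, goal index, goal place)

    def add(pred, ev, path, place, role, after):
        idx = next(fresh)
        conds.append((pred, ev, path, place, role, idx))
        place_of[idx] = place
        if after is not None:
            order.append((after, idx))   # after < idx
        return idx

    for n, cl in enumerate(text):
        ev = f"e{n}"
        if cl.dabei and prev is not None:
            pev, path, last, hi, _ = prev
            conds.append(("Part-of", ev, pev))
            for place in cl.via:         # indices fall between start and goal
                last = add("VIA", ev, path, place, "TR", last)
            if hi is not None and last is not None:
                order.append((last, hi))
            continue
        path, last, lo = f"p{n}", None, None
        if cl.start is not None:
            linked = (prev is not None and prev[3] is not None
                      and cl.start in ("dort", prev[4]))
            if linked:
                lo = last = prev[3]      # linked start shares the goal index
                conds.append(("Conn", prev[1], path, prev[4]))
                conds.append(("VON", ev, path, prev[4], "ST", lo))
            else:
                lo = last = add("VON", ev, path, cl.start, "ST", None)
        for place in cl.via:
            last = add("VIA", ev, path, place, "TR", last)
        hi = add("NACH", ev, path, cl.goal, "GO", last) if cl.goal else None
        prev = (ev, path, lo, hi, cl.goal)
    return conds, order, place_of


def path_frame(order, place_of):
    """Linearize the indices and read off the frame locations."""
    ts = TopologicalSorter()
    for idx in place_of:
        ts.add(idx)
    for a, b in order:
        ts.add(b, a)                     # a must precede b
    return [place_of[i] for i in ts.static_order()]
```

For the sample text (1), encoded as four clauses with the second marked `dabei=True` and the fourth using `start="dort"`, the ordering constraints form a single chain, so the linearization is unique and visits A, B, C, F, E and A in that order.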
5. MOTION IMPERATIVES AND DEDUCTION

Our introduction of the imperative operator ! in the last section might suggest that the logic of imperatives could be simply reduced to the logic of their realization states (and thus to the logic of assertions). My claim is that this strategy is indeed viable (apart from some critical cases to be noted below) but that the concept of semantic inference with regard to (motion) imperatives is not enough and must be supplemented by pragmatic machinery.
Let us start with some basic issues in the logic of imperatives; we begin with the following informal definition by Rescher (1966: 84):

Def.: A command conclusion is validly inferred from a certain group of command premises if every possible world in which all the premises are terminated will also have to be such that the conclusion is terminated.

This definition verifies the validity of the following argument:

(1) Fahre über A nach B. ("Go via A to B.")
    Fahre dann nach C. ("Go then to C.")
    ----------
    Fahre nach C. ("Go to C.")

Now look at the following pattern (2) and its nonsensical instantiation (2') (produced by Rescher himself to show the invalidity of (2)):

(2) Do A!
    Doing A entails doing B.
    ----------
    Do B!

(2') Recite Gunga Din!
     Reciting Gunga Din entails being able to remember Gunga Din.¹²
     ----------
     Be able to remember Gunga Din!

The lack of parallelism between the logic of declaratives and the logic of imperatives can also be seen from Ross's paradox in the following guise:

(3) Fahre nach B. ("Go to B.")
    ----------
    Fahre nach B oder (Fahre) nach C. ("Go to B or (go) to C.")

(3) would be verified via (3'), though (3) itself is not valid.

(3') x geht nach B. ("x goes to B.")
     ----------
     x geht nach B oder x geht nach C. ("x goes to B or x goes to C.")

(3) (in one reading), for instance, is not valid because its conclusion illegitimately increases the options fixed by the premise. This implies that a rule such as !A → !(A ∨ B) must not be accepted.¹³
(3) would turn out valid according to the logic of Chellas (1971). This logic would also have !(A & B) → (!A & !B) as a theorem and would thus also legitimize an inference pattern such as (1). This principle may however lead to inappropriate consequences with regard to natural language discourse if it is not only taken as a semantic scheme but also taken as a pragmatic pattern fixing a sequence of individual actions. Just think of cases where a sequence of imperatives does not induce an isomorphic sequence of actions but where the information has to be "merged" to enable acting. Let me repeat an example:

(4) Fahre von A nach C. Fahre dabei über B. ("Go from A to C. When doing so, go via B.")

Mechanically we may derive (4') and (4''):

(4') Fahre von A nach C. ("Go from A to C.")
(4'') Fahre über B. ("Go via B.")

The speaker of (4) naturally does not mean the addressee to perform (4') and then perform (4''). (4'') pragmatically does not even make sense if uttered relative to an empty context. What this simple example suggests is the necessity of a discourse-oriented approach where discourse representation structures for imperatives provide the input to calculate execution structures that represent a procedural codification of the information. Loosely speaking, discourse representation structures for imperatives are "declarative" in the sense of the programming language PROLOG, execution structures are "procedural" in the sense of PASCAL.¹⁴ In the specific context of the present paper one might suggest that (4) induces the path frame ⟨A, B, C⟩ as its execution structure.
Let me give a further example (going beyond our core fragment) showing the necessity of a discourse-oriented approach.

(5) Gehe geradeaus. Biege beim Krankenhaus nach links ab. ("Go straight ahead. Turn left at the hospital.")

What (5) tells us is to go ahead until we reach the hospital and then turn left. The first clause provides us with a direction but does not give us enough information to decide when to change direction. Realizing the first clausal imperative in (5) without taking the information in the second clause into consideration might well get us beyond the hospital!¹⁵
Let's now go back to (4). (4) semantically implies (6) but it need not imply (6) pragmatically in a sense to be explained below.

(6) Fahre nach B. ("Go to B.")

Über B in (4) may just be intended by the speaker as giving information to the addressee on how to get to C. In such a case one could say that getting to B functions as a subsidiary goal, with reaching C being the goal proper. These distinctions gain relevance if compliance (or partial compliance) with imperatives is at issue. If going via B is taken as a subsidiary goal, we would not say that a person going to B but not to C has partially complied with (4); partial compliance is to mean that some positive value is attached to the realization of (4) (where the question of who will enjoy the fruits of realization depends on the speech act: with regard to an order it is usually the person issuing the command, in the case of advice it is the addressee).
Consider a further example: (1) (from the last section, repeated here as (7)) need not imply (3) and (4) (from the last section, repeated here as (8) and (9)) pragmatically (in the sense just defined) if the only proper goal states envisaged are E and A.

(7) Fahre von A nach C. Fahre dabei über B. Fahre von C über F weiter nach E. Fahre dann von dort zurück nach A. ("Go from A to C. When doing so, go via B. Go from C via F farther to E. Go then from there back to A.")
(8) Fahre von A nach C. ("Go from A to C.")
(9) Fahre von B nach C. ("Go from B to C.")

In the next section we will introduce additional pragmatic aspects into the discussion.
6. MAPS
Let us have a look at the following one-sentence input (with A being the "current" starting position of our object):
Fahre von A nach B. ("Go from A to B.")
Taking our map into consideration, various paths may be claimed to lead from A to B. To describe this situation better, we introduce the concept of path frame extension in the following way:
Def.: Let f = ⟨l°(1) ... l°(n)⟩ be a path frame and M a map (M is conceived of as a set of ordered pairs ⟨v(i), v(j)⟩, where v(i) and v(j) are supposed to be vertices linked by an edge; ⟨v(i), v(j)⟩ ∈ M licenses moving from v(i) to v(j)). f' = ⟨v(1) ... v(k)⟩ is a path frame extension with regard to f and M iff
(i) the l°(i) from f are contained in f' (where their order in f' is the same as in f),
(ii) v(1) = l°(1),
(iii) v(k) = l°(n),
(iv) ⟨v(i), v(i+1)⟩ ∈ M (for i = 1 ... k-1).
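The four conditions of the definition can be checked mechanically. The following Python fragment is only a minimal sketch under stated assumptions: the edge list HYPOTHETICAL_EDGES is not the map of the paper (which is not reproduced in the text) but an invented one, and the function name is_path_frame_extension is introduced here purely for illustration.

```python
# Hypothetical map (an assumption for illustration, not the author's map).
# Each listed edge is taken to license movement in both directions.
HYPOTHETICAL_EDGES = [
    ("A", "B"), ("A", "D"), ("D", "E"), ("E", "B"), ("D", "G"), ("G", "H"),
    ("H", "I"), ("I", "F"), ("F", "C"), ("C", "B"), ("E", "H"),
]
M = {(a, b) for a, b in HYPOTHETICAL_EDGES} | {(b, a) for a, b in HYPOTHETICAL_EDGES}


def is_path_frame_extension(f, f_ext, M):
    """Check conditions (i)-(iv) of the definition for f' = f_ext."""
    if not f or not f_ext:
        return False
    # (ii) and (iii): f' starts and ends where f starts and ends
    if f_ext[0] != f[0] or f_ext[-1] != f[-1]:
        return False
    # (i): the locations of f occur in f' in the same order
    remaining = iter(f_ext)
    if not all(loc in remaining for loc in f):
        return False
    # (iv): every single step of f' is licensed by the map
    return all(step in M for step in zip(f_ext, f_ext[1:]))


frame = ("A", "B")
print(is_path_frame_extension(frame, ("A", "B"), M))
print(is_path_frame_extension(frame, ("A", "D", "E", "B"), M))
print(is_path_frame_extension(frame, ("A", "E", "I", "B"), M))
```

Under this hypothetical map the first two candidates satisfy the definition, while the third fails condition (iv) because ⟨A, E⟩ is not an element of the map. A path frame also counts as an extension of itself whenever all of its steps are licensed.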
Some comments: Maps can represent different degrees of granularity and bring in a pragmatic flavour. According to what we know about the fictitious map M* figuring in our paper, the following holds: ⟨A, B⟩, ⟨B, A⟩, ... ∈ M*. We assume that tuples like ⟨A, C⟩ or ⟨A, E⟩, however, aren't contained in M* because any route from A to C (from A to E) passes another vertex. Each path frame extension is itself a path frame. It is a harmless consequence of our definition that it does not exclude the possibility that a path frame f is a path frame extension with regard to f and some map. The above sample sentence yields the path frame ⟨A, B⟩. It has (among others) the following extensions with regard to our map:
⟨A, B⟩, ⟨A, D, E, B⟩, ⟨A, D, G, H, I, F, C, B⟩
⟨A, E, I, B⟩, however, isn't among the path frame extensions of ⟨A, B⟩.
Given a path frame, which path frame extension is to be chosen? The decision seems to be determined by pragmatic parameters (the distance involved, the number of places called on, the beauty of the countryside and maybe other factors). The parameters also depend on the speech-acts performed: an order (e.g. in the army) often has to be followed without "detours" being allowed, a piece of advice may leave more options to the addressee. One of the key concepts in this respect is the type of control. Let us take (1) to illustrate the problem involved (where (1) is made up of the clauses S(1) and S(2)):
(1)
Fahre von A nach B. Fahre von B weiter nach C. ("Go from A to B. Go from B farther to C.")
Supposing that S(1) induces path p and S(2) path p', the following holds according to our definition: p is weakly connected with p' relative to B. If (1) is to be realized by an addressee, should (s)he be allowed to leave B and then return to B before going to C (which means the "linking" path p'' is stationary but not strictly stationary)? This indeed depends on the kind of control the speaker of (1) has or wants to have over the addressee. Suppose (1) is meant as a suggestion to the addressee (with regard to a journey). Then to follow the suggestion would not imply that p'' must be strictly stationary. However, if the addressee is a robot, one would perhaps like to install some control such that p'' is strictly stationary.
Let us consider a further example where the pragmatic concept of control is involved: Suppose (1) is meant as a suggestion to a traveller and (s)he indeed goes from A via B to C, and then moves on to F. Clearly the traveller has then followed the suggestion. However, if (1) is meant as an order to a robot and the robot makes the same tour as the traveller (going in the end to F), we would in most cases regard its behaviour as inappropriate.
Workers in Artificial Intelligence have recently spent a great deal of time developing systems of non-monotonic reasoning where minimal models and default values play a dominant role (see e.g. McCarthy (1980, 1984), Reiter (1980)). One of the key questions underlying those studies is formulated by Hanks and McDermott (1987: 409):
"Given a logical theory that admits more than one model, what are the preferred models of that theory (that is, what is the preference criterion) (...)?"
I assume here that minimal paths are the default values in the calculation of path frame extensions ("minimality" may here refer to distance or the number of relevant places involved).¹⁶
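The idea that minimal paths serve as defaults can be illustrated by a shortest-path computation. The sketch below is an assumption-laden illustration, not the author's procedure: it reads "minimality" as the number of places visited, it uses an invented map (HYPOTHETICAL_EDGES), and the names shortest_path and minimal_extension are hypothetical.

```python
from collections import deque

# Hypothetical map (an assumption for illustration, not the author's map).
HYPOTHETICAL_EDGES = [
    ("A", "B"), ("A", "D"), ("D", "E"), ("E", "B"), ("D", "G"), ("G", "H"),
    ("H", "I"), ("I", "F"), ("F", "C"), ("C", "B"), ("E", "H"),
]
M = {(a, b) for a, b in HYPOTHETICAL_EDGES} | {(b, a) for a, b in HYPOTHETICAL_EDGES}


def shortest_path(M, start, goal):
    """Breadth-first search: a path from start to goal with the fewest places."""
    queue = deque([(start,)])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for a, b in sorted(M):  # sorted only to make the search deterministic
            if a == path[-1] and b not in seen:
                seen.add(b)
                queue.append(path + (b,))
    return None  # goal cannot be reached on this map


def minimal_extension(frame, M):
    """Default extension: chain a shortest leg between consecutive frame locations."""
    result = (frame[0],)
    for src, dst in zip(frame, frame[1:]):
        leg = shortest_path(M, src, dst)
        if leg is None:
            return None
        result += leg[1:]
    return result


print(minimal_extension(("A", "B"), M))
print(minimal_extension(("B", "H"), M))
```

Chaining the shortest legs keeps the order of the locations named in the path frame; whether the resulting path is also pragmatically acceptable (for instance, whether it is loop-free) is a separate question that this sketch does not decide.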
Let me now add some linguistic flesh by looking at (2)-(5), which are to be interpreted against the background of our map M* (where we assume M* to be common knowledge of the person issuing the imperatives and the addressee):
(2)
Fahre von B über H nach E. ("Go from B via H to E.")
(3)
Fahre von B nach H. Fahre von dort weiter nach E. ("Go from B to H. Go from there farther to E.")
(4)
?Fahre von B nach H. Fahre von dort zurück nach E. ("Go from B to H. Go from there back to E.")
(5)
Fahre von B nach H. Fahre dann von H nach E. ("Go from B to H. Go then from H to E.")
(2) can be paraphrased via (3), but not via (4). In order to comply with (3) properly, it seems that the path chosen to get from B to H should not contain E. (4) may be acceptable in a situation where it is part of the background assumption that the only path from B to H touches on E. How about (5)? If we take the paths induced by our map as the set of our "spatial possibilities", we can say the following: The first sentence of (5) has ⟨B, E, H⟩ as a minimal path frame, the second sentence gets us ⟨H, E⟩. Putting together these two path frames yields f = ⟨B, E, H, E⟩, which seems to be a possible path frame for (5). However, it contains a loop and thus does not seem pragmatically acceptable as an interpretation of (2).
I take it that (2) and (5) are semantically, but not pragmatically equivalent. A minimal semantic account of "x moving from A to C via B" would tell us that x is moving from A to B and then from B to C. Default reasoning, as a part of pragmatics which reflects things as they "typically" are, assumes that B is spatially "between" A and C and that the path covered is loop-free. If we come to know that a person moved from A to C via B, we would not picture a "pathological" path leading from A to C, from C to B and then to C again. Linguistic rules are here closely intertwined with rules of "mental economy".
Two parameters play a role in the discussion of movement. One is the current location Pos of the object whose movement is to be guided by imperatives. The other is the map M which tells the object about the potential routes that exist. These two parameters figure in the following definition of result-equivalence (this notion was suggested by M. Bottner (personal communication)).
Def.: Two (motion) imperatives I and I' are result-equivalent relative to ⟨Pos, M⟩ iff I and I' induce the same path frame extensions with regard to ⟨Pos, M⟩.
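Result-equivalence can be tested by comparing the sets of extensions that two imperatives induce. The following sketch rests on several assumptions that are not part of the definition: an imperative is encoded as a hypothetical triple (start, via, goal) with start set to None when no start adverbial is given, only loop-free extensions are enumerated (so that the sets stay finite), and both maps are invented for illustration.

```python
# Hypothetical maps (assumptions for illustration only).
SMALL_MAP = {("A", "B"), ("B", "C")}
EDGES = [
    ("A", "B"), ("A", "D"), ("D", "E"), ("E", "B"), ("D", "G"), ("G", "H"),
    ("H", "I"), ("I", "F"), ("F", "C"), ("C", "B"), ("E", "H"),
]
LARGE_MAP = {(a, b) for a, b in EDGES} | {(b, a) for a, b in EDGES}


def loop_free_extensions(frame, M):
    """All loop-free path frame extensions of `frame` with regard to map M."""
    found = set()

    def walk(path):
        if path[-1] == frame[-1]:
            remaining = iter(path)
            # keep the path only if it visits the frame locations in order
            if all(loc in remaining for loc in frame):
                found.add(path)
            return  # a loop-free path cannot pass through its goal twice
        for a, b in M:
            if a == path[-1] and b not in path:
                walk(path + (b,))

    walk((frame[0],))
    return found


def path_frame(imperative, pos):
    """Path frame of an imperative; a missing start defaults to the position Pos."""
    start, via, goal = imperative
    return (pos if start is None else start,) + tuple(via) + (goal,)


def result_equivalent(i1, i2, pos, M):
    """True iff both imperatives induce the same (loop-free) extensions."""
    return (loop_free_extensions(path_frame(i1, pos), M)
            == loop_free_extensions(path_frame(i2, pos), M))


GO_TO_C = (None, (), "C")
GO_VIA_B_TO_C = (None, ("B",), "C")
GO_FROM_A_TO_C = ("A", (), "C")

print(result_equivalent(GO_TO_C, GO_VIA_B_TO_C, "A", SMALL_MAP))
print(result_equivalent(GO_TO_C, GO_VIA_B_TO_C, "A", LARGE_MAP))
print(result_equivalent(GO_FROM_A_TO_C, GO_TO_C, "B", LARGE_MAP))
```

The encoding makes the dependence on both parameters visible: the same pair of imperatives may be result-equivalent on a sparse map and not on a richer one, and an imperative with an explicit start adverbial differs from one without it as soon as Pos differs from the named start.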
Examples:
a) (6) and (6') are result-equivalent relative to ⟨A, M*⟩, but not to ⟨B, M*⟩ (where M* is again our fictitious map).
(6)
Fahre von A nach C. ("Go from A to C.")
(6')
Fahre nach C. ("Go to C.")
b) Suppose M** = {⟨A, B⟩, ⟨B, C⟩}. (7) and (7') are then result-equivalent relative to ⟨A, M**⟩:
(7)
Fahre nach C. ("Go to C.")
(7')
Fahre über B nach C. ("Go via B to C.")
The following holds: If I and I' are result-equivalent relative to ⟨Pos, M⟩, then I and I' are also result-equivalent relative to any ⟨Pos, M'⟩ where M' ⊂ M.
The notion of result-equivalence may become relevant in cases like the following: The speaker thinks the hearer has the map M** (related to a certain situation). Suppose the speaker wants the hearer to move from A (where the hearer is) to C via B. He could either choose (7) or (7'), which are result-equivalent relative to ⟨A, M**⟩. However, if the speaker is quite sure about what the map of the hearer (= M**) is, he would for reasons of linguistic economy utter (7) and not (7').
We can give the following interpretation to the concept of a map M:
Def.: M is (relative to a person x) the ability set of x if ⟨v(i), v(j)⟩ ∈ M is to mean: If x is at the place v(i), (s)he is able to find her/his way to v(j).
It is not necessarily the case that M is identical to the transitive closure Cl(M) of M: ⟨v(i), v(j)⟩ and ⟨v(j), v(k)⟩ may be contained in M but ⟨v(i), v(k)⟩ needn't be. This possibility mirrors the general fact that people are often unable to change a situation s(i) into a situation s(k) without getting appropriate information about what the individual "chunks" of the transformation are. This information may for instance consist in specifying that first s(i) has to be changed to s(j), and then s(j) to s(k) (see Moore (1985), Morgenstern (1986)). The above considerations account - in our particular case - for the
potential usefulness of motion imperatives as guides of behaviour. Their paradigmatic function can be called deontic. However, the deontic function can also be fulfilled by sentences in the declarative mood, which is evident from the build-up of many route descriptions (see, for example, (10) below). In many cases the imperative verbal form is even left out (so that the ellipsis signals the deontic function). Typical examples are the authentic texts (8)-(9):
(8)
Links über die Brücke, dann zunächst auf der Straße weiter. Kurz nach der Kurve nach links in den Wald. ("Left over the bridge, then along the road. Just after the bend, left into the forest.")
(9)
Auf dem Weg nach links weiter in Richtung Tal und bei der nächsten Abzweigung rechts. Dann nach links über die Brücke und rechts über eine zweite Brücke zur Aumühle. ("Carry on, then turn left in the direction of the valley, then right at the next bend. Then left over the bridge and right over a second bridge to Aumühle.")
(10)
Etwa drei Kilometer nach Beginn der Wanderung wenden wir uns bei den Häusern nach links in Richtung Aumühle. 50 Meter vor dem Gasthaus geht es nach links über die erste, dann rechtshaltend und gleich wieder links über die zweite Brücke. ("About three kilometres after the beginning of the walk we turn left by the houses in the direction of Aumühle. 50 metres before the inn we go left over the first bridge, then immediately left again over the second bridge.")
As was pointed out above, instructions must be such that each step can be executed (which of course depends on the ability set of the addressee). Let me illustrate the linguistically relevant problems by finally looking at the issue of NP-evaluation. Obviously, in order to realize instructions like (8)-(10) one must be able to identify the referents of the NPs involved. Look again at (8)-(10): Note that the uses presented here are not anaphoric ones. This raises the question of how the definite NPs are evaluated. The intuitions are relatively clear. In all three cases "imaginary journeys" are constructed where the paths have a temporal ordering associated with them. (8), for instance, tells us to follow the road until a bend x is reached and then turn left; x is supposed to be the first bend that is reached when following the road. To be a bit more formal: x (a bend which we may for the sake of simplicity conceive of as a frame location) has to be chosen such that (i) p(t) ⊂ Loc(x(t), t) holds - where p is the corresponding path - and there is no t' < t such that (ii) p(t') ⊂ Loc(x'(t'), t'), and x' is a bend different from x.
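The choice of the first bend along a path can be sketched as a simple search over time steps. The fragment below is an illustration under simplifying assumptions: the path is a list of grid positions indexed by time, each landmark has a fixed (time-independent) location given as a set of positions, and all names (first_landmark, the landmark labels) are hypothetical.

```python
def first_landmark(path, landmarks, kind):
    """Return (name, t) for the first landmark of the given kind reached on the path.

    path      -- list of positions; path[t] plays the role of p(t)
    landmarks -- dict: name -> (kind, set of positions), a stand-in for Loc(x, t)
    """
    for t, position in enumerate(path):
        for name, (landmark_kind, region) in landmarks.items():
            # condition (i): the path is inside the landmark's location at time t;
            # scanning t in ascending order guarantees that no earlier t' exists
            if landmark_kind == kind and position in region:
                return name, t
    return None  # no landmark of this kind lies on the path


road = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
landmarks = {
    "far_bend": ("bend", {(4, 0)}),
    "near_bend": ("bend", {(2, 0)}),
    "bridge": ("bridge", {(1, 0)}),
}
print(first_landmark(road, landmarks, "bend"))
```

The ordering of the landmark table plays no role here; what singles out the referent is the temporal ordering of the path, which is the point made in the text.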
From a procedural point of view (10) presents additional difficulties: Let us assume (as is reasonable) that the reference of dem Gasthaus has to be established when (10), qua instruction, is used to find one's route. If (10) is to be procedurally optimal, it should be possible visually to identify the inn at a distance of 50 metres (in order to prevent the necessity of going farther towards the inn and then going back to the bridge). This suggests that "visual fields" (qua "resource situations" in the sense of Asher and Bonevac (1987)) relative to path locations p(t) are required in the descriptive apparatus. (10) also points to the possibility that a text may be descriptively adequate without being deontically appropriate (the example may become more convincing if we substitute 1 kilometre for 50 metres in (10)).
I have nothing definite to say about the distribution of definites and indefinites within "pragmatic" motion imperatives (where the latter term is to refer to texts in the imperative or declarative mood that are to guide one's movement). The possibility of both a second bridge and the second bridge in (9) and (10), respectively, suggests that perspectivization is involved (such that, for instance, in the second case a restricted resource situation is induced).¹⁷ The use of the definite article then pragmatically signals to the addressee that a unique referent can be found to guide his/her motion.
Let me end this paper on the following note. In robotics (as part of Artificial Intelligence) one is "interested in the automatic synthesis of robot motions, given high-level specifications of tasks and geometric models of the robot and obstacles" (B.R. Donald (1987: 295)). The present article was not meant to promote robotics as defined above but to take the basic idea from the field and sort out a number of linguistic aspects.
It is the hope of the author that the insights gained from the study of the small core fragment presented here are still useful when rich and "realistic" fragments in the area of "linguistic robotics" are thoroughly investigated.

University of Tübingen
Seminar für natürlichsprachliche Systeme

NOTES
1. I thank Franz Guenthner, Rob van der Sandt and the referees of this paper for a number of valuable suggestions.
2. Winograd (1972), Kuipers (1978), McDermott and Davis (1984), and others have devised elaborate programs I do not wish to compete with. The set-up presented here rather serves as a useful environment to bring linguistic aspects into focus.
3. Unless otherwise stated, in this paper the term imperative is paradigmatically used for clauses or texts in the imperative mood that function as orders or commands. However, the interpretative range of imperatives is much wider and includes requests, threats, exhortations, permissions, concessions, warnings, advice and wishes (see Huntley (1984: 103), Donhauser (1986: 164ff.)). - It is not among the aims of this paper to provide rules of speech act determination. - On the general problem of relating sentence types and semantic-pragmatic functions see Meibauer (1987) and Rosengren (1988); imperatives are discussed (among others) by Donhauser (1986, 1987), Wunderlich (1976: 150ff.) and Wunderlich (1984).
4. The present paper is in fact "complementary" to Mayer (1989) where a comprehensive treatment of spatial coherence (relative to a fragment in the indicative mood) is given.
5. "Linkage" with regard to our small fragment means that a goal location is taken up (lexically or proadverbially) by means of a start adverbial in the subsequent clause. For a more detailed analysis see Mayer (1989).
6. (5') is something like a conditional imperative in the sense of Donhauser (1986: 171ff.) and may be paraphrased as "If you go to A or to F, you will find those journeys interesting".
7. See Bäuerle (1987: 41ff.) for an explicit treatment of the sentential anaphor das.
8. I conceive of discourse representation theory as a programme rather than a full-fledged theory. If the reader prefers a different framework - for instance situation semantics (see Barwise/Perry (1983)) - a codification there might be easily possible.
9. See e.g. Herweg et al. (1987) who rightly point out that the calculation of frame locations as induced by spatial prepositional phrases "PRÄP NP" crucially depends on parameters characterizing the object denoted by the NP (which implies the important role of world knowledge).
10. Such an interpretation of sentences or texts in the imperative mood clearly fails when they express wishes the realization of which is not within the power of the addressee (example: "Sleep well", see Donhauser (1986: 164)).
11. Explicit inference rules are given in Mayer (1989).
12. R. van der Sandt has pointed out to me that the alleged entailment relation is in fact a relation of presupposition.
13. For an exposition see Åqvist (1984: 634ff.), Hintikka (1979), Kamp (1979), and Lewis (1979).
14. Procedural aspects of route finding have so far played a dominant role in the field of Artificial Intelligence. See Habel (1987) for an overview. - Linguistic aspects of route directions are e.g. dealt with by Klein (1979) and Wunderlich (1978).
15. There are two major options as far as a semantic representation of (5) is concerned: (i) When processing S(1) from (5), one "looks ahead" and makes use of the locative expression beim Krankenhaus to derive a goal location with respect to the path induced by S(1). (ii) One keeps aspects of execution outside the semantic representation and does not build a goal location into the representation of S(1). One might call this kind of representation "declarative". At a second level the execution structure is calculated. This level of evaluation is "procedural".
16. Psychologists, such as Johnson-Laird (1983), speak of "mental models" as prototypically representing situations by means of a finite number of tokens and relations between them. Herskovits (1985) has particularly stressed the role of paradigmatic configurations and typicality in the linguistic codification of spatial relations.
17. This interpretation is in accord with the account given in Löbner (1985: 304ff.) for cases like (i) Er brach sich das Bein. ("He broke his leg").

REFERENCES
Asher, N. and Bonevac, D. 1987: Determiners and Resource Situations. In: Linguistics and Philosophy 10, 567-596.
Åqvist, L. 1984: Deontic Logic. In: D. Gabbay and F. Guenthner: Handbook of Philosophical Logic. Vol. II; 605-714.
Bäuerle, R. 1987: Ereignisse und Repräsentationen. LILOG-Report 43.
Barwise, J. and Perry, J. 1983: Situations and Attitudes. Cambridge: MIT-Press.
Chellas, B.F. 1971: Imperatives. In: Theoria, Vol. 37, 114-128.
Clark, H.H. 1977: Bridging. In: P.N. Johnson-Laird and P.C. Wason (eds.): Thinking: Readings in Cognitive Science. London: Cambridge University Press.
Davidson, D. 1967: The logical form of action sentences. In: N. Rescher (ed.): The Logic of Decision and Action. Pittsburgh: University of Pittsburgh Press, 81-95.
Donald, B.R. 1987: A Search Algorithm for Motion Planning with Six Degrees of Freedom. In: Artificial Intelligence 31, 295-353.
Donhauser, K. 1986: Der Imperativ im Deutschen: Studien zur Syntax und Semantik des deutschen Modussystems. Hamburg: Buske.
Donhauser, K. 1987: Verbaler Modus oder Satztyp? Zur grammatischen Einordnung des deutschen Imperativs. In: J. Meibauer (ed.): Satzmodus zwischen Grammatik und Pragmatik. Tübingen: Max Niemeyer Verlag, 57-74.
Habel, Ch. 1987: Prozedurale Aspekte der Wegplanung und Wegbeschreibung. LILOG Report 17.
Hanks, S. and McDermott, D. 1987: Nonmonotonic Logic and Temporal Projection. In: Artificial Intelligence 33, 379-412.
Herskovits, A. 1985: Semantics and pragmatics of locative expressions. In: Cognitive Science 9, 341-378.
Herweg, M., Khenkar, M., Pribbenow, S. and Rehkämper, K. 1987: Elsaß-Wanderung für Linguisten. Exemplarische Analyse und Repräsentation eines Satzes aus einer Reisebeschreibung. LILOG Memo 8.
Hintikka, J. 1979: The Ross Paradox as Evidence for the Reality of Semantical Games. In: E. Saarinen (ed.) (1979): Game-theoretical Semantics, 329-345.
Huntley, M. 1984: The Semantics of English Imperatives. In: Linguistics and Philosophy 7, 103-133.
Johnson-Laird, P.N. 1983: Mental Models. Towards a Cognitive Science of Language, Inference and Consciousness. Cambridge: University Press.
Kamp, H. 1979: Semantics versus pragmatics. In: F. Guenthner and S.J. Schmidt (eds.): Formal Semantics and Pragmatics for Natural Languages. Dordrecht: D. Reidel, 255-287.
Kamp, H. 1981(a): A theory of truth and semantic representation. In: J.A. Groenendijk, T.M. Janssen, M.B. Stokhof (eds.): Formal Methods in the Study of Language, Bd. 1, Amsterdam: Mathematisch Centrum, 227-322.
Kamp, H. 1981(b): Evénements, représentations discursives et référence temporelle. In: Langages 64, 39-64.
Klein, W. 1979: Wegauskünfte. In: Zeitschrift für Literaturwissenschaft und Linguistik (9), 9-57.
Kuipers, B. 1978: Modelling spatial knowledge. In: Cognitive Science 2, 129-153.
Lewis, D. 1979: A Problem about Permission. In: E. Saarinen (et alii) (ed.): Essays in Honour of Jaakko Hintikka, 163-175.
Löbner, S. 1985: Definites. In: Journal of Semantics 4, 279-326.
Mayer, R. 1989: Coherence and Motion. To appear in: Linguistics 27/3.
McCarthy, J. 1980: Circumscription - a form of non-monotonic reasoning. In: Artificial Intelligence 13, 27-39.
McCarthy, J. 1984: Applications of Circumscription to Formalizing Common-Sense Knowledge. In: Artificial Intelligence 28, 89-116.
McDermott, D. and Davis, E. 1984: Planning routes through uncertain territory. In: Artificial Intelligence 22, 107-156.
Meibauer, J. 1987: Satzmodus zwischen Grammatik und Pragmatik. Tübingen: Max Niemeyer Verlag.
Miller, G.A. and Johnson-Laird, P.N. 1976: Language and Perception. Cambridge: Cambridge University Press.
Moore, R. 1985: A Formal Theory of Knowledge and Action. In: J.R. Hobbs and R.C. Moore (ed.): Formal Theories of the Commonsense World. Norwood: Ablex Publishing Corporation.
Morgenstern, L. 1986: A First Order Theory of Planning, Knowledge and Action. In: Theoretical Aspects of Reasoning about Knowledge. Proceedings. 99-115.
Reiter, R. 1980: A logic for default reasoning. In: Artificial Intelligence 13(1, 2), 81-132.
Rescher, N. 1966: The Logic of Commands. London: Routledge & Kegan Paul Ltd.
Rosengren, I. 1988: Die Beziehung zwischen Satztyp und Illokutionstyp aus einer modularen Sicht. In: Sprache und Pragmatik 6. Lund.
Seuren, P.A.M. 1985: Discourse Semantics. Oxford: Basil Blackwell.
Sondheimer, N. 1978: A semantic analysis of reference to spatial properties. In: Linguistics and Philosophy 2, 1978, 235-280.
Winograd, T. 1972: Understanding Natural Language. New York/London: Academic Press.
Wunderlich, D. 1976: Studien zur Sprechakttheorie. Frankfurt: Suhrkamp Verlag.
Wunderlich, D. 1978: Wie analysiert man Gespräche? Beispiel Wegauskünfte. In: Linguistische Berichte 58, 41-76.
Wunderlich, D. 1984: Was sind Aufforderungssätze? In: G. Stickel (ed.): Pragmatik in der Grammatik: Jahrbuch 1983 des Instituts für deutsche Sprache. Düsseldorf: Schwann, 92-118.
Wunderlich, D. and Herweg, M. 1986: Lokale und Direktionale. To appear in: Ch. Schwarze and D. Wunderlich (ed.): Handbuch der Semantik. Königstein/Ts.: Athenäum.
Jounwl of s�mantics 6: 369-385
CONDITIONS FOR MUTUALITY
JOSEF PERNER and ALAN GARNHAM
ABSTRACT
provides a participant a in that situation with grounds G for assuming that a and b, the other participant, mutually know some proposition p indicated by S. Our criterion derives from ana lytic criteria proposed by Lewis ( 1 969) and Schiffer ( 1 972). We discuss how our criterion ap plies in a series of test examples, and compare it with Clark and Marshall's ( 1 9 8 1 ) trip/�
copres�n� hn�ristic. We argue that triple copresence is empirically incorrect . It is neither a necessary nor a sufficient condition for mutuality, and it fails on a wide variety of examples. We also consider Sperber and Wilson's ( 1 986) recent claim that the concept of mutual
knowledge should be replaced by those of mutual manifestness and mutual cognitive environ
ments, and argue that this move fails to solve the problem of mutuality. Finally we discuss how community membership produces muuiality. We argue that mutuality can only be established if certain rules of common sense reasoning can be assumed, and discuss the these rules must be 'mutually' known.
sense
in which
INTRODUCTION
Mutuality is
the characteristic feature of a cluster of concepts that includes mutual knowledge, common knowledge, mutual belief, mutual manifest ness, and shared understanding. These concepts play a crucial role in the analysis of such notions as trading and bargaining (Aumann 1 976; Milgrom 1 98 1 ) norm, social practice, rule, role, social group and organization (Bach and Harnish 1 979), definite reference (Clark and Marshall 1 98 1 ), meaning (Schiffer 1 972), convention (Lewis 1 %9) and distributed processing (Hal pern and Moses 1 984). Indeed, the analysis of almost any cognitive inter action requires one or more of these concepts and they, therefore, form an essential part of any cognitive theory. In this paper our prime concern is with mutuality, though , for concreteness, we will couch our discussion mainly in terms of (reasons for assuming) mutual knowledge. The question we will focus on is: when can a person engaged in an inter action with another justifiably assume that they have mutual knowledge? This question is important because many consequences follow from the es tablishment of mutuality. To give just three examples: mutual knowledge about social roles produces expectations about how other people will be-
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
We present a finite psychological decision procedure for determining whether a situation S
370 have (Bach and Harnish 1979); mutual knowledge about an object licenses certain types of definite reference to that object (Clark and Marshall 1 98 1 ); mutual knowledge that an agreement (e.g. to meet) has been made pro duces the expectation of coordinated action at a later date (Lewis 1 969). One standard analysis of mutuality is what Barwise (1985) calls the iter ated attitude approach 1 • On this view two agents, a and b, mutually know some proposition, p, if the following conditions are satisfied: that p that p that b knows that p that a knows that p that b knows that a knows that p that a knows that b knows that p and so on, ad infinitum
This analysis led Clark and Marshall to identify the mutual knowledge Mutual knowledge is common, yet it would seem that two people can never verify mutuality, because it would take an infinite amount of time to verify an infinity of conditions. Clark and Marshall proposed a so lution to this paradox - a heuristic finite decision procedure for determin ing when knowledge is mutual. While we agree with Clark and Marshall that a finite decision procedure for mutuality is required, we will show that the heuristic they describe is not a good one. As an alternative we propose a psychological decision criterion for mutuality based on the analytic criter ia put forward by Lewis (1 969) and Schiffer ( 1 972). Sperber and Wilson ( 1 986), have drawn a more radical conclusion from the mutual knowledge paradox. They argue that the concept of mutual knowledge has no place in a theory of communication - and, by implica tion, no place in cognitive theory - because no two people can ever be sure they have mutual knowledge. However, their arguments are invalid, be cause they rest on the unwarranted assumption that mutuality must be both defined and testedfor in terms of the infinite sequence of iterated attitudes. Nevertheless, Sperber and Wilson make two valid points about mutual kno wledge. First, there are many circumstances in which ascription of mutual knowledge would be open to doubt. More generally, assumptions about knowledge or, indeed, about other mental states and achievements, are often mistaken. Assumptions about what people ought to be able to per ceive or infer - assumptions about cognitive environments, as Sperber and Wilson call them - are safer. However, this point is a point about knowl edge, not about mutuality. Sperber and Wilson's second valid point is that mutual knowledge is not a precondition for communication, as has sometimes been supposed. Speak-
paradox.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
a knows b knows a knows b knows a knows b knows
37 1
tion about who sent it establishes mutuality between sender and receiver.
Halpern and Moses also show that commonality of knowledge cannot be es tablished by processors sending acknowledgements back and forth, since an infinite number of acknowledgements would be needed, corresponding to the infinite series of checks that Clark and Marshall and Sperber and Wil son worry about. We part company, therefore, with both Clark and Marshall and Sperber and Wilson , by suggesting that mutuality should not be defined in term of iterated attitudes. Nor should the infinite series of iterated attitudes be regarded as the definitive test for mutuality. Rather, mutuality can be es tablished by a finite decision criterion, just as a person's knowledge of a fact, or whether that fact is manifest to them can be. The criterion we propose is intended to decide on mutuality, whether it be of knowledge or belief or manifestness. For 'historical' reasons, we for mulate the criterion for the case of mutual knowledge. However, it should not be criticised because it does not solve the very difficult problems of es tablishing when someone knows something. Its aim is to provide a method of deciding about mutuality. The adequacy of our criterion is primarily an empirical question - does it classify situations in the same way as people do? Its only rival is the triple copresence heuristic of Clark and Marshall ( 1 981 ), which often fails when our criterion produces the same result as our judgement. Although our criterion is principally an empirical one, intended to explain how people make judgements of mutuality, it also makes clear how mutuality relates to the infinite series of iterated attidudes. One can show that, for states such as having reasons to believe or manijestness, satisfaction of our criterion
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
ers who make a definite reference to a church do not necessarily assume that they and their audience have mutual knowledge of the church, merely that the audience will be able to work out which building they are referring to (1 986:43 -4). The church need not be mutually known, only mutually manifest. However, the fact that mutual knowledge is not necessary for suc cessful communication does not mean that the concept of mutuality can simply be dismissed. Sperber and Wilson themselves are forced to distin guish between manifest ness and mutual manifestness. The concept of mani festness itself does not avoid the infinite regress2 . The fact that mutuality need not be established by an infinite series of checks is also shown by Halpern and Moses' ( 1 984) analysis of when knowl edge is common among the processors of a computer system with distribut ed processing. They show that a processor can assume that information it has sent out will become common knowledge if it knows when and where the information will arrive and if the information includes details of where it comes from. A single message with a known time of arrival and informa
logically entails the infinite iteration (see Perner and Garnham, ms. Appendix B).
THE MUTUALITY CRITERION
Decision Rule for Mutuality:
Any situation S involving two participants a and b, which is perceived by one participant, a, as S[a], provides grounds G (where G is a subset of S[a]) for a to assume that the proposition p is mutually known by a and b iff participant a has reason to believe that G satisfies the following four conditions:
C1. G
C2. G → aRG & bRG
C3. G → aRp & bRp
C4. Whether G satisfies conditions C2 and C3 is established by common sense reasoning.
If q is any proposition and x any rational person then xRq means that person x has reason to believe that q. The symbol '→' stands for material implication and '&' for conjunction.
Condition C1 states the trivial fact that S can provide a with grounds for mutual knowledge only if it gives a reason to believe that G holds. Condition C2 is the central part of our criterion. It requires that G provides grounds for both participants a and b to know that G holds. In other words, G must be self-revealing or 'open' to both participants. Condition C3 states that G must also make the target proposition p known to a and b. Condition C4 requires that the judgement about whether G meets C2 and C3 must depend on application of common sense inference rules.
To apply our criterion we assume that the source situation S can be regarded as a set of propositions. Figure 1 gives a bird's eye view of Schiffer's candle gazing situation - a prototypical example of a situation that warrants the assumption of mutual knowledge - which we characterize as follows:
The work of Lewis (1969:52-3) and Schiffer (1972:34-5) suggests a decision procedure that allows each of two people (a and b) who are participants in an interactive situation S(a,b) to decide whether that situation provides them with good grounds to assume that a target proposition p is mutually known. In stating this procedure, we make more explicit than did Lewis and Schiffer the role of common sense inferences.
S = 'p1: Bob's eyes are within Ann's visual field.
p2: Ann's eyes are within Bob's visual field.
p3: The burning candle is within Ann's visual field.
p4: The burning candle is within Bob's visual field.
p5: There is no visual obstruction within the triangular area formed by Ann's eyes, Bob's eyes and the candle.'
Since Ann is fully informed about this situation we can equate S with her view of it (S[a]). The example has also been constructed so that no distinction need be made between S[a] and G. The actual situation S can therefore be equated with G. It is then relatively easy to show that situation S (= G) provides Ann with grounds to assume mutual knowledge of the fact that the candle is burning.³
The application of our criterion to Schiffer's candle gazing situation demonstrates how our decision criterion can be used as a precise algorithm in a concrete situation. It correctly shows that this situation provides grounds for mutual knowledge of the fact that the candle is burning. It also correctly rejects situations where mutual knowledge does not obtain. Clark and Marshall constructed several versions of a scenario in which Ann and Bob share knowledge about some target proposition p (that tonight's movie at the Roxy is 'Monkey Business') and yet fail to have grounds for assuming mutual knowledge of p. Version 5 is the most complicated of these scenarios:
"Version 5. On Wednesday morning Ann and Bob read the early edition of the newspaper and discuss the fact that it says that A Day at the Races is playing that night at the Roxy. Later, Bob sees the late edition, notices the correction of the movie to Monkey Business, and circles it with his red pen. Later, Ann picks up the newspaper, sees the correction, and
Fig. 1. Schiffer's candle gazing situation
recognizes Bob's red pen mark. Bob happens to see her notice the correction and his red pen mark. In the mirror Ann sees Bob watch all this, but realizes that Bob hasn't seen that she has noticed him. ..." (Clark & Marshall, 1981:14).
There is no way of selecting a set of propositions that satisfy our criterion from this scenario, so it does not provide grounds for mutual knowledge.⁴
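The decision rule can be illustrated with a minimal Python sketch. It is not part of the original analysis and rests on simplifying assumptions: propositions are plain strings, and the common sense reasoning demanded by C4 is not modelled but supplied by hand as a hypothetical table `reveals`, listing what G, if it holds, gives each participant reason to believe according to a's own judgement. The function name, the table and the 'screened' variant are all introduced here purely for illustration.

```python
from typing import Dict, FrozenSet

Proposition = str


def grounds_for_mutual_knowledge(
    a_has_reason_for: FrozenSet[Proposition],
    G: FrozenSet[Proposition],
    p: Proposition,
    reveals: Dict[str, FrozenSet[Proposition]],
) -> Dict[str, bool]:
    """Check C1-C3 from participant a's point of view.

    `reveals[x]` is a hand-coded stand-in for C4 (an assumption of this
    sketch): the propositions that, by a's common sense reasoning, G gives
    participant x reason to believe.
    """
    c1 = G <= a_has_reason_for                      # C1: a has reason to believe G
    c2 = all(G <= reveals[x] for x in ("a", "b"))   # C2: G -> aRG & bRG
    c3 = all(p in reveals[x] for x in ("a", "b"))   # C3: G -> aRp & bRp
    return {"C1": c1, "C2": c2, "C3": c3, "mutual": c1 and c2 and c3}


# Schiffer's candle gazing situation, with S = S[a] = G.
P1 = "Bob's eyes are within Ann's visual field"
P2 = "Ann's eyes are within Bob's visual field"
P3 = "The burning candle is within Ann's visual field"
P4 = "The burning candle is within Bob's visual field"
P5 = "No visual obstruction between Ann's eyes, Bob's eyes and the candle"
G = frozenset({P1, P2, P3, P4, P5})
p = "The candle is burning"

# Ann judges that G is 'open' to both of them and reveals p to both.
open_view = G | {p}
candle = grounds_for_mutual_knowledge(G, G, p, {"a": open_view, "b": open_view})

# Hypothetical variant (not from the text): Ann judges that G does not
# reveal P3 to Bob, e.g. because her line of sight is screened from him.
hidden_from_bob = open_view - {P3}
screened = grounds_for_mutual_knowledge(
    G, G, p, {"a": open_view, "b": hidden_from_bob}
)
```

In the candle case all three checks succeed. In the screened variant C3 still holds, because both participants can see the candle, but C2 fails, so the sketch denies grounds for mutual knowledge. All the real work lies in filling in `reveals`, which is exactly the part that C4 assigns to common sense reasoning.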
CRITIQUE OF TRIPLE COPRESENCE
Unlike our criterion, Clark and Marshall's triple copresence heuristic is difficult to apply to Version 5 of their scenario. Clark and Marshall (1981:32) point out that to have grounds for assuming mutual knowledge Ann needs 'evidence of triple copresence - of certain events in which Ann, Bob, and the target object are copresent, as when Ann, Bob, and the notice about Monkey Business were openly present together Wednesday morning.' This criterion is difficult to apply because the expression 'openly present together' is too vague to serve as part of a fully explicit decision criterion. As the authors admit: "The trick is to say what counts as triple copresence - as being 'openly present together'" (1981:32). Unfortunately this trick is not revealed in an explicit definition, but only exemplified by reference to Schiffer's candle gazing situation. There, Ann's evidence for things being 'openly present together' is 'evidence that she and Bob are looking at each other and the candle simultaneously' (1981:32-3).
In applying this idea to Version 5 of Clark and Marshall's scenario one has to decide how the expression 'looking at each other' should be interpreted. If it requires eye contact then Version 5 fails the test for triple copresence. However, if it can be construed as 'Bob looks at Ann and Ann looks at Bob' then Ann would have grounds for assuming mutual knowledge in Version 5. For Ann has simultaneous evidence from the mirror that Bob is looking at her and the notice about Monkey Business and evidence that she is looking at Bob and the notice in the mirror.
Of course, Bob does not have corresponding evidence of triple copresence and therefore he has no grounds for assuming mutual knowledge. However, for Ann the triple copresence heuristic makes the wrong prediction that she has grounds to assume mutual knowledge.
The fact that Clark and Marshall's criterion leads to a different result for Ann and Bob indicates that it is incorrect, at least under the current interpretation. Any satisfactory criterion should give the same result for both participants except in cases of mistaken beliefs. It seems, therefore, that 'looking at each other' should be construed as 'having eye contact'.
One problem with this interpretation is that there are obvious examples
Fig. 2. The 'hole in the menu' candle gazing situation
of mutual knowledge based on seeing that do not depend on eye contact. Imagine yourself sitting side by side with a friend at night on a quiet bench in the park and suddenly a flash of lightning lights up the surroundings. In this situation you don't have to look each other in the eye to mutually know that there was a flash of lightning. In contrast to Clark and Marshall's triple copresence heuristic, our criterion is not affected by the absence of eye contact. In fact it does not even demand copresence.
A further problem is that even triple copresence with eye contact (the strongest interpretation of Clark and Marshall's formulation) does not provide sufficient grounds for mutual knowledge. To demonstrate this fact we assume that Ann and Bob are again seated at a table. Ann is holding the menu in front of her reading it. The waiter brings a lighted candle and puts it on the table. After the waiter has left, the candle goes out. Bob sees it go out. He can also see Ann's eyes and the half of the menu folder facing him but not the other half. Ann can see Bob's eyes. She could not see that the candle had gone out if it were not for a little hole in the side of the menu folder that Bob cannot see. This situation is shown in Figure 2.
In this situation Ann has evidence that she and Bob are looking at each other (eye contact), that Bob can see the extinguished candle, and that she herself can see the candle. So she has evidence for triple copresence (with eye contact) but no grounds for assuming that the candle's extinction is mutually known. The triple copresence heuristic has again led to the wrong decision.
Our criterion, in contrast, correctly denies grounds for mutual knowledge in this situation. The reason why it fails our criterion can be easily seen. For Ann to know that the candle has gone out it is essential that she can see through the hole in the menu folder. This fact must therefore be part of G (to satisfy C3).
C2 requires that Ann has reason to believe that G must reveal this fact to both participants. However there is nothing in the situation S that would
give her reason to believe that Bob thought that she could see the candle. On the contrary there is every reason for her to assume that Bob cannot see the hole in the menu. Therefore this situation fails condition C2 and Ann has no reason to assume mutual knowledge.
The last few examples have shown that triple copresence is neither a sufficient nor a necessary condition for assuming mutual knowledge. It might be argued, against our objections, that triple copresence is a heuristic test for grounds for mutual knowledge. It is not intended to provide a set of necessary and sufficient conditions. However, the wide range of straightforward examples that this 'heuristic' fails on renders it almost useless as part of a psychological decision criterion.
The basic problem with physical copresence as a test for mutuality is that even if Ann can establish copresence she has not thereby established that Bob will have grounds to assume copresence. Our criterion shows that triple copresence only provides grounds for mutuality when copresence indicates itself to both parties.
Much the same problem arises with linguistic copresence (Clark and Marshall, 1981:35-42). Again, Clark and Marshall do not state precise conditions for linguistic copresence, but illustrate the idea by an example: "Imagine Ann saying to Bob I bought a candle yesterday. By uttering a candle, she posits for Bob the existence of a particular candle. If Bob hears and understands her correctly, he will come to know about the candle's existence at the same time as she posits it. It is as if Ann places the candle on the stage in front of the two of them so that it is physically copresent. The two of them can be said to be in the linguistic copresence of the candle." (Clark & Marshall, 1981:39).
From this quote it appears that linguistic copresence holds if the participants and a spoken or written version of the target proposition are physically copresent, and the auxiliary assumptions of rationality, attention and simultaneity, plus some new ones specific to language, are satisfied. This interpretation would capture the case in which Ann and Bob are present when a third person mentions the candle, but what about a telephone conversation in which Ann and Bob are not physically copresent? Is it sufficient that Ann and Bob hear the mention of the candle simultaneously? This criterion would permit mutual knowledge in a telephone conversation, but would wrongly suggest that a radio announcement by Ann heard by Bob in a different place would provide grounds for mutual knowledge. A possible modification of linguistic copresence to cover the telephone conversation but exclude the radio announcement might be a rather vague formulation that says: 'Ann and Bob have to hear each other's voices at about the same time'. But even this formulation, and probably any other attempt to capture the notion of linguistic copresence, will not be able to account for the mutual knowledge established by a reliable messenger (M) that Ann sends to Bob to tell him that tonight's movie at the Roxy has been changed to
THE PROBLEM OF UNCERTAINTY
One could object that our criterion succeeds on the messenger example only because of the unreasonable assumption of perfect reliability. Real life is never perfectly predictable and Sperber and Wilson argue that even the slightest uncertainty poses an insurmountable problem for establishing mutual knowledge. They argue (1986:20) that because uncertainty in establishing one step of the infinite iteration affects all later stages multiplicatively, the probability of establishing mutual knowledge itself is vanishingly small.
Sperber and Wilson's argument, however, only holds if mutual knowledge is defined as an infinite iteration of higher-order knowledge states, each of which has to be established separately if mutuality is to be demonstrated. If, as we argue, mutuality can be established on the basis of
Monkey Business (p) and that she sent the message. Our criterion leads to a judgement of mutual knowledge. Take Ann's point of view. Condition C3 is met since S (= S[a] = G) implies that both Ann and Bob know p. C2 is met, since Ann knows what she told M and since Ann has good reason to anticipate that M will give the message to Bob as instructed. C2 is also met for Bob since Bob can reconstruct from what M told him that Ann must have instructed M to tell him that p and to tell him that the message came from her. Also C4 is met since the judgement that C2 and C3 apply does not depend on any special expertise.
In contrast, this messenger scenario would not provide grounds for mutual knowledge according to Clark and Marshall's triple copresence heuristic since Ann and Bob and the spoken form of p are never copresent. Ann and Bob do not even hear each other's voices at about the same time.
Although Clark and Marshall clearly intend linguistic copresence to mean that Bob and Ann are physically copresent and only the mutually known fact or object is linguistically present, one might try to abandon the requirement of physical copresence of the participants and allow that one of them need only be linguistically present. One could then argue in the messenger example that at the time of M's delivery of the message Ann and p are linguistically and Bob is physically present. However, under this interpretation, the copresence heuristic would give the wrong decision in simpler message situations, for example one in which Ann instructs M to tell Bob that p, and M later tells Bob that p. Since Ann is physically and Bob and p are linguistically present when Ann gives her instruction the copresence heuristic would suggest mutual knowledge. Intuition and condition C2 would clearly reject this suggestion, since Bob has no grounds for knowing who had sent the message (C2 fails) and hence he would not have grounds to assume that Ann knew p.
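The contrast between the two messenger situations can be made concrete in a small Python sketch. It is only an illustration under strong simplifying assumptions: the situation is reduced to three hypothetical yes/no facts, and the mapping from those facts to C2 and C3 is a reading of the discussion above, not a definition taken from it. The class and function names are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MessengerSituation:
    # All three fields are assumptions of this sketch.
    ann_expects_delivery: bool  # Ann has good reason to expect M to act as instructed
    m_tells_p: bool             # M tells Bob that p
    m_names_ann: bool           # M also tells Bob that the message came from Ann


def ann_judgement(s: MessengerSituation) -> Dict[str, bool]:
    """Ann's verdict on C2 and C3 for the messenger situation."""
    # C3: the situation gives both Ann and Bob reason to believe p.
    c3 = s.ann_expects_delivery and s.m_tells_p
    # C2 for Ann: she knows what she told M and expects M to deliver it.
    c2_ann = s.ann_expects_delivery
    # C2 for Bob: he can reconstruct Ann's instruction only if M names her.
    c2_bob = s.m_tells_p and s.m_names_ann
    c2 = c2_ann and c2_bob
    return {"C2": c2, "C3": c3, "mutual": c2 and c3}


# Reliable messenger who says that the message comes from Ann.
with_source = ann_judgement(MessengerSituation(True, True, True))

# Simpler situation: M tells Bob that p but not who sent the message.
without_source = ann_judgement(MessengerSituation(True, True, False))
```

In the second case C3 is still satisfied, since Bob does come to know p, but C2 fails for Bob and so no grounds for mutual knowledge result, which matches the intuition described in the text.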
COMMUNITY MEMBERSHIP
Besides physical and linguistic copresence the third major way of establishing mutual knowledge, according to Clark and Marshall, depends on community membership. "Even when Ann is not acquainted with Bob, she can assume there are generic and particular things the two of them mutually know. The basic idea is that there are things everyone in a community knows and assumes that everyone else in that community knows too." (Clark and Marshall, 1981:35).
There are two distinct problems in establishing mutual knowledge via community membership. The first is: how can strangers establish mutual knowledge about which community they belong to? The second is to decide what knowledge can be assumed to be mutual between members of a community once they are mutually known to one another.
In many cases the first problem is solved quite straightforwardly. Ann says to Bob "Hello, I am from Newick." and Bob replies "So am I". This exchange establishes mutual knowledge of 'Ann and Bob are both from Newick', as our decision criterion shows.
The second problem is different. The important part of its solution is to show why Ann and Bob, after establishing that they both come from Newick, have grounds to believe that they mutually know 'Newick is in Sussex'. A positive decision about mutual knowledge of this fact can be made by our criterion if we allow as a rule of common sense inference: 'if somebody is from Newick then that person knows Newick is in Sussex'. This rule will then be admissible by condition C4 for establishing C2 and C3. The critical question is: under what conditions should such a rule be admitted? It is tempting to argue that all that is needed is that everybody in Newick knows that they live in Sussex. Surely it follows from this fact, as C3 requires, that if Ann and Bob are from Newick, then both know that Newick is in Sussex.
a finite series of tests, any compounding of uncertainties will be strictly limited. Using our criterion, Ann's uncertainty about whether mutual knowledge is established by her message will only be marginally greater than her uncertainty about whether Bob knows the target proposition p. Her confidence that Bob knows p will depend on how certain she is that the messenger will inform Bob of p. Her confidence that she and Bob have mutual knowledge of p depends, in addition, on the probability that her messenger will not forget to tell Bob that Ann wanted him to know that the message came from her and the probability that Bob will apply the mutuality criterion. If these two probabilities are high Ann has good grounds to assume that mutual knowledge will have been achieved, despite the uncertainty.
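The arithmetic behind this contrast can be sketched in a few lines of Python. The sketch is illustrative only: it assumes that the uncertainties are independent and can be multiplied, and the particular probability values are invented for the example rather than taken from any data.

```python
from typing import Tuple


def iterated_confidence(step_confidence: float, depth: int) -> float:
    """Confidence remaining after `depth` separately established steps.

    Models the iterated-attitudes view: each level must be established on
    its own, so uncertainties compound multiplicatively (independence is
    an assumption of this sketch).
    """
    return step_confidence ** depth


def criterion_confidence(
    p_delivers: float,      # M informs Bob of p
    p_names_source: float,  # M also says the message came from Ann, given delivery
    p_bob_applies: float,   # Bob applies the mutuality criterion
) -> Tuple[float, float]:
    """Ann's confidence that Bob knows p, and that p is mutually known."""
    bob_knows_p = p_delivers
    mutual = p_delivers * p_names_source * p_bob_applies
    return bob_knows_p, mutual


# With 1% doubt per step, a deep iteration leaves almost no confidence.
deep = iterated_confidence(0.99, 1000)

# Under the finite criterion only two further factors are involved.
bob_knows_p, mutual = criterion_confidence(0.95, 0.98, 0.98)
```

With these made-up values the confidence in mutual knowledge is only slightly lower than the confidence that Bob knows p, whereas the iterated product shrinks towards zero as the depth grows.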
Clark and Marshall (1981:37) argued in exactly this way: "It is instructive to spell out the two main assumptions required here for mutual knowledge of proposition p ('Newick is in Sussex'). First, Ann must believe that she and Bob mutually know they belong to a particular community. Let us call this assumption community membership. And second, Ann must believe that everyone in that community knows that particular proposition p. Let us call this assumption universality of knowledge."
Before we give the correct characterization of how Ann establishes C3 in the previous example, we will show, in a further example, that universality of knowledge cannot be the correct criterion. The vicar of Newick visited every inhabitant in private and told each one of them the following: "I have told everybody in Newick that Mr Mainwaring-Knight is possessed by Satan, but you are the only one who I have told that I told everybody." According to Clark and Marshall's criterion of universality of knowledge within the community of Newick it follows that if Ann and Bob meet and identify each other as being from Newick they then mutually know that Mr Mainwaring-Knight is possessed by Satan. But this conclusion is obviously unwarranted. In fact, Ann has firm evidence that Bob would not know that she knew about Mr Mainwaring-Knight's possession, since the supposedly trustworthy vicar has assured her of that fact.
This unwarranted conclusion can be avoided by our criterion if we require that the rules used for establishing conditions C2 and C3 must be, not just universally but, mutually known. We tried to capture this requirement informally by stipulating that conditions C2 and C3 must be established by rules of common sense reasoning, which are mutually assumed. Indeed, we have had to use a contrived example to illustrate the difference between rules that are merely universally known and those that can be used to establish mutual knowledge.
With this requirement the 'lying vicar' example fails our criterion. Condition C3 requires that situation S (which contains: "Ann and Bob are both from Newick") implies that Ann and Bob have reason to believe that Mr Mainwaring-Knight is possessed by Satan. C4 requires that this implication be established by a mutually known rule 'if somebody is from Newick then that person knows that Mr Mainwaring-Knight is possessed by Satan'. The way the vicar informed his flock ensures that this rule is universally known (everybody in Newick knows it) but not mutually known (Ann, for instance, thinks that she is the only one who knows that everybody knows). The rule, therefore, fails to satisfy condition C4 in our present formulation, which requires mutuality.
Had the vicar told his flock a different story: "I have told everybody that Mr Mainwaring-Knight is possessed by Satan and I have assured them that I will tell everybody else", then Mr Mainwaring-Knight's possession would be mutually known within the community of Newick. That this is so can be
established by our criterion. The situation S is the sum of personal encounters with the vicar. Under the assumption that the vicar is reliable and will do what he says, everybody in Newick has reason to believe S (C2 satisfied). And by commonly assumed rules about communication it can be established that everybody who was part of S would have reason to believe that Mr Mainwaring-Knight is possessed by Satan (C3 satisfied). Let us call the body of knowledge that is shared by a community in this way communal knowledge.
COMMON SENSE REASONING AND COMMUNAL KNOWLEDGE
We have seen that, contrary to Clark and Marshall's claim, it is not sufficient that the rules used to determine whether G satisfies conditions C2 and C3 are universally known within a community. They must be, in some sense, mutually known. However, two people do not mutually share the common assumptions of a community until they mutually know that they are members of that community. This fact causes no problem when community membership is established on the basis of some other knowledge, such as knowledge of language. If Ann and Bob tell each other that they are Scottish, they mutually know the folklore of Scots, and can use that knowledge to establish other pieces of mutual knowledge. However, mutual knowledge of community membership can, in some cases, be established on the basis of knowledge about the common beliefs of that community. Ann and Bob may mutually establish that they are sighted using their knowledge of how sighted people behave, or that they are Freemasons by certain characteristic, but subtle, bodily movements. Therefore, just as it is too weak a condition to require that the rules used to establish C2 and C3 are universally known, it is too strong a condition to require that they are mutually known, in the usual sense.
We, therefore, distinguish between mutual knowledge (between a and b
One can see from the last example of the truthful vicar how a body of communal knowledge can be built up if a group of people is authoritatively instructed, as children are in school. In school everybody is told (roughly) the same facts. Since this instruction is carried out in the open, nobody can have any suspicion that there is any secret about it, unlike the case of the lying vicar. So educated people have reason to assume that they mutually know the facts typically taught at school. For instance, 'Newick is in Sussex' is taught at the infant school in Newick and can therefore be assumed to be mutually known by all members of that community older than 4 years. So, to return to our unfinished example, Ann can reason from the fact that Bob is from Newick to the fact that he knows 'Newick is in Sussex' using only rules that are mutually known to people from Newick.⁵
S = 'p1: Ann is sighted.
p2: Bob is sighted.
p3: Ann's eyes are within Bob's visual field.
p4: Bob's eyes are within Ann's visual field.'
Can Ann assume mutual knowledge of p1 and p2? Trivially, aRp1 and aRp4. Ann also has reason to believe p2 and p3 on the basis of the common knowledge of what sighted people look like and of what they can see. In particular, her visual abilities enable her to establish that Bob is sighted before that fact becomes mutually established. Since he is sighted she can assume he shares common assumptions about seeing, since these assumptions are communal among sighted people. This attribution allows her to establish, straightforwardly, that S → bRS. It follows that S provides grounds for mutual knowledge of all of p1, p2, p3 and p4, and hence of p1 and p2 in particular.
that p), and communal knowledge - knowledge that all members of a community 'mutually' entertain. Mutually occurs in quotes here, since the mutuality of communal knowledge is, for two arbitrarily selected members of a community, not actual but conditional on their recognising each other as members of the community. Community members know that other community members share communal knowledge. But they do not necessarily know who the other community members are. When two people meet they may not know which community memberships they have in common. None of their communal knowledge is mutual until they have established (mutually) which communities they belong to. It is often claimed (e.g. Sperber and Wilson, 1986:19) that mutual knowledge must be known to be mutual. But in one sense two members of a community who have met but not established that they are members of that community do not mutually know the relevant communal knowledge.
The rules used to establish C2 and C3 can be either mutually known or communally known, and as communal knowledge these rules can be attributed to someone identified as a member of a community, before that membership is mutually established. In some cases, therefore, communal knowledge of these rules can be used to establish mutual knowledge of community membership and hence mutual knowledge, in the strong sense, of the rules of reasoning used in the community. This method of establishing mutual knowledge works only when the communal knowledge of a community provides a way of identifying members of that community.
The following example illustrates how knowledge of the common sense psychology of seeing - knowledge that is communal but not yet mutual - can be used to establish mutual knowledge of sightedness. Ann and Bob have met for the first time, and have not yet spoken:
Mutual knowledge of sightedness and of the common sense psychology of seeing can be established by this 'bootstrapping' method because vision can be used to establish whether another person is sighted. However, sighted people can only mutually establish that they are sighted in this way if conditions like p3 and p4 hold. For each of two people to know that the other is sighted is not enough. The fact that each is sighted must be part of the situation or inferable from it. Other types of community membership have the same self-recognising property as seeing. For example, by becoming a Freemason one learns the secret signs that allow one to recognise other Freemasons. Freemasons can establish, between themselves, that they are Freemasons by using these signs before they mutually know, in the usual sense, that they are Freemasons. To do so they use knowledge about Freemasons that is communal and, hence, potentially mutual. However, even a Freemason will not necessarily recognise another Freemason if neither makes the appropriate signs. We are now in a position to reformulate condition C4:
C4: Whether G satisfies C2 and C3 must be established by rules that are tacitly assumed (or that have been previously established) EITHER to be mutually known between participants a and b OR to be communal knowledge of a community of which G establishes that a and b are members.
What is important in this formulation of C4 is that the mutuality of the rules for testing the other conditions must not itself be tested. It must either be tacitly assumed or be explicitly established by the criterion on a previous occasion. If mutual knowledge of these rules had to be tested when the criterion was applied, an infinite regress would result.
Another important feature of this formulation of C4 is that it allows inference rules which are only communally known. Why is it safe to attribute to someone communal knowledge before mutual knowledge of community membership has been established? The reason lies in the nature of common sense psychology and communal knowledge. Two people either are members of a community, and (potentially) mutually know the folklore of the community, or they are not. There is no question of these psychological rules having an intermediate status, such as being known to everyone, but only being known by a select few to be known to everyone. Communal knowledge can never be like the knowledge imparted by the lying vicar. Therefore, if Ann identifies Bob as a member of a certain community, and she knows that she is a member of that community, she already knows that the common sense psychology of that community can become mutual. She will not usually come to any mistaken conclusions if she assumes Bob will use that psychology in his reasoning. If G includes information about community membership, Ann can use her knowledge of Bob's knowledge of common sense psychology to establish mutual knowledge of community membership.
CONCLUSION
We have presented a finite decision criterion that a person, a, can use to decide whether a and b have mutual knowledge of a proposition p in situation S that relates a and b (or whether their reasons to believe p are mutual or whether p is mutually manifest). The criterion can only be used if (a has reason to believe that) a stock of mutual or communal knowledge is already available to a and b. We have also shown how this stock of knowledge can be increased by application of our criterion. A criterion that uses mutual knowledge of one fact to establish mutual knowledge of another is not circular in the way that a definition of mutual knowledge in terms of mutuality would be. However, it follows that our criterion cannot be used to establish all the mutual knowledge that a and b share. Some of that knowledge must be established in other ways. Nevertheless, providing that some mutual knowledge can be established in a different way, and providing that people's knowledge is organised in a suitable manner, our criterion can be used to build a large stock of mutual knowledge from very modest beginnings.
But what precisely are those beginnings? The simplest way in which mutual knowledge can be 'established' without using our criterion is for it to be assumed. Two people might assume any bit of knowledge to be mutual. Such assumptions are risky, in the sense that they might be incorrect. But because the assumption that a piece of knowledge is mutual typically has many consequences, such assumptions can be revised in the light of subsequent evidence. Indeed, the most general assumptions, for example of the mutuality of the minimum rationality needed to apply our decision criterion or of (near) universal perceptual and cognitive abilities, are both the safest and the ones having the most widespread and immediate consequences. Mistaken assumptions about the mutuality of specific pieces of knowledge, on the other hand, can be harder to correct.
In such cases mutuality would be better established by our criterion. Nevertheless, the frequent misunderstandings in human interactions are probably attributable, at least in part, to unfounded assumptions of mutuality. Developmentally, the idea of initially assuming wholesale mutuality from which one gradually retreats is reflected in Piaget's (1923/1926; 1948/1956) doctrine of 'egocentrism', according to which young children operate as if facts known to them are 'open' to everybody. Recent evidence (Wimmer, Hogrefe and Perner, 1988), however, suggests that this idea is incorrect. Even children below the age of 4 years judge whether someone else knows a fact independently of whether they themselves know it. The opposite extreme to a wholesale assumption of mutuality would be to assume the mutuality of only minimal rationality. In any particular case this assumption might have to be given up on the basis of good evidence, such as the gross behaviour of an insane person or the failure of someone to respond normally in a straightforward interaction. More generally, any assumption of mutual knowledge may be abandoned if a notices that b reacts inappropriately to behaviour based on that assumption. However, such an overly cautious approach to mutuality is implausible. We typically impute much more mutual knowledge to strangers than just the assumption of minimal rationality. We do not first test out whether they can hear but tacitly assume mutuality of hearing. Only when their reactions to our noises violate this assumption do we retract from it. Children, probably, make even stronger a priori assumptions about mutuality. For example, they may well assume that everybody understands their language until they have contact with a foreigner. There is, therefore, an important empirical question about which abilities and pieces of knowledge children tacitly assume to be mutual and which ones they attempt to establish using our criterion. However, we must leave this question for another occasion.

Laboratory of Experimental Psychology
University of Sussex
Brighton BN1 9QG
England

ACKNOWLEDGEMENT
The authors express their gratitude to Steve Isard and Richard Power for valuable suggestions and critical comments. This paper is a shortened version of a more extensive unpublished manuscript, which is available from the authors.

NOTES
1. The term 'attitude' is used in the technical sense of 'propositional attitude', which simply means that the verb know has a proposition as one of its arguments. This use is compatible with the idea that knowledge is a mental state but it is also compatible with Ryle's (1949) suggestion that knowing is an achievement.
2. Sperber and Wilson claim that "the situations which establish a mutual cognitive environment are essentially those that have been treated as establishing mutual knowledge" (1986: 45) and they refer in a footnote to Lewis and to Clark and Marshall. Clark and Marshall explicitly propose a finite, though heuristic, test for mutuality. And as we shall show, Lewis's conditions can be regarded as providing a finite decision criterion for mutuality. By accepting Lewis's criterion, Sperber and Wilson undermine their own arguments against mutual knowledge. If there is a finite decision criterion, there is no infinite series of checks.
3. Perner and Garnham (ms.) give the full details. To draw a conclusion about mental achievements or states, rather than about cognitive environments (manifestness / reason to believe), it is necessary to import some additional assumptions about what Ann and Bob are paying attention to, but these assumptions are about (ordinary) knowledge or beliefs, not about mutuality.
4. A step-by-step analysis is given in Perner and Garnham (ms.).
5. It is worth noting that Lewis, in his demonstration of how common knowledge follows from his three conditions, assumes "mutual ascription of some common inductive standards and background information, rationality, mutual ascription of rationality, and so on." (1969: 56-7), though earlier he talked of merely "share[d] ... inductive standards and background information" (1969: 53), an idea that is shown to be inadequate by our lying vicar example. However, Lewis did not make this assumption explicit in a fourth condition corresponding to our C4. It is probably for this reason that Clark and Marshall (1981: 37), who relied on Lewis's conditions in their treatment of mutual knowledge established by community membership, settled for the insufficient criterion of universality instead of mutuality.

REFERENCES
Aumann, R.J. 1970: Agreeing to disagree. Annals of Statistics 4: 1236-1239.
Bach, K. and R.M. Harnish 1979: Linguistic Communication and Speech Acts. MIT Press. Cambridge, MA.
Barwise, J. 1985: Modeling shared understanding. Unpublished manuscript. Department of Philosophy and CSLI, Stanford University.
Clark, H.H. and C.R. Marshall 1981: Definite reference and mutual knowledge. In: A.K. Joshi, B. Webber, and I. Sag (eds.): Elements of Discourse Understanding. Cambridge University Press. Cambridge.
Halpern, J.Y. and Y.O. Moses 1984: Knowledge and common knowledge in a distributed environment. Proceedings of the Third ACM Conference on Principles of Distributed Computing, pp. 50-61.
Hughes, G.E. and M.J. Cresswell 1968: An Introduction to Modal Logic. Methuen. London.
Hintikka, J. 1969: Knowledge and Belief. Cornell University Press. Ithaca, NY.
Lewis, D.K. 1969: Convention: A Philosophical Study. Harvard University Press. Cambridge, MA.
Milgrom, P. 1981: An axiomatic characterization of common knowledge. Econometrica 49: 219-222.
Moore, R.C. 1980: Reasoning about knowledge and action. SRI International, Technical Note 191.
Perner, J. and A. Garnham (ms.): Conditions for mutuality: Extended version. Unpublished manuscript: Laboratory of Experimental Psychology, University of Sussex.
Piaget, J. 1926: The Language and Thought of the Child. Routledge and Kegan Paul. London. (Originally published, 1923).
Piaget, J. and B. Inhelder 1956: The Child's Conception of Space. Routledge and Kegan Paul. London. (Originally published, 1948).
Ryle, G. 1949: The Concept of Mind. Hutchinson. London.
Schiffer, S. 1972: Meaning. Oxford University Press. Oxford.
Sperber, D. and D. Wilson 1986: Relevance: Communication and Cognition. Blackwell. Oxford.
Wimmer, H., G.-J. Hogrefe and J. Perner 1988: Children's understanding of informational access as source of knowledge. Child Development 59: 386-396.