JOURNAL OF SEMANTICS
AN INTERNATIONAL JOURNAL FOR THE INTERDISCIPLINARY STUDY OF THE
SEMANTICS OF NATURAL LANGUAGE
MANAGING EDITOR: PETER BOSCH (IBM Scientific Centre, Heidelberg and University of Osnabrück)
ASSOCIATE EDITORS: NICHOLAS ASHER (University of Texas, Austin), ROB VAN DER SANDT (University of Nijmegen)
REVIEW EDITOR: ANKE LÜDELING (University of Tübingen)
ASSISTANT EDITOR: ANKE LÜDELING (University of Tübingen)
EDITORIAL BOARD: MANFRED BIERWISCH (MPG and Humboldt University, Berlin), BRANIMIR BOGURAEV (Apple Computer Inc), MARIO BORILLO (University of Toulouse), KEITH BROWN (University of Essex), GENNARO CHIERCHIA (University of Milan), ANN COPESTAKE (Stanford University), ÖSTEN DAHL (University of Stockholm), PAUL DEKKER (University of Amsterdam), CLAIRE GARDENT (University of Saarbrücken), BART GEURTS (University of Osnabrück), MICHAEL HERWEG (IBM Scientific Centre, Heidelberg), LAURENCE R. HORN (Yale University), JOACHIM JACOBS (University of Wuppertal), PHILIP N. JOHNSON-LAIRD (Princeton University), MEGUMI KAMEYAMA (SRI International, Stanford), HANS KAMP (University of Stuttgart), SEBASTIAN LÖBNER (University of Düsseldorf), SIR JOHN LYONS (University of Cambridge), JAMES D. McCAWLEY (University of Chicago), MARC MOENS (University of Edinburgh), FRANCIS J. PELLETIER (University of Alberta), MANFRED PINKAL (University of Saarbrücken), ANTONY SANFORD (University of Glasgow), ARNIM VON STECHOW (University of Tübingen), MARK STEEDMAN (University of Pennsylvania), ANATOLI STRIGIN (Max Planck Gesellschaft, Berlin), HENRIËTTE DE SWART (University of Utrecht), BONNIE WEBBER (University of Pennsylvania), HENK ZEEVAT (University of Amsterdam), THOMAS E. ZIMMERMANN (University of Stuttgart)
EDITORIAL ADDRESS: Journal of Semantics, c/o Dr P. Bosch, IBM Germany Scientific Centre, Vangerowstr. 18, D-69115 Heidelberg, Germany. Phone: (49-6221-) 59-4251/4437. Telefax: (49-6221-) 59-3200. Email: [email protected]
© Oxford University Press
All rights reserved; no part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise without either the prior written permission of the Publishers, or a licence permitting restricted copying issued in the UK by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1P 9HE, or in the USA by the Copyright Clearance Center, 222 Rosewood Drive, Danvers, Massachusetts 01923, USA.
Journal of Semantics (ISSN 0167-5133) is published quarterly in February, May, August and November by Oxford University Press, Oxford, UK. Annual subscription is US$140 per year. Journal of Semantics is distributed by MAIL America, 2323 Randolph Avenue, Avenel, New Jersey 07001, USA. Periodical postage paid at Rahway, New Jersey, USA and at additional entry points.
US POSTMASTER: send address corrections to Journal of Semantics, c/o MAIL America, 2323 Randolph Avenue, Avenel, New Jersey 07001, USA.
For subscription information please see inside back cover.
JOURNAL OF SEMANTICS Volume 15 Number 2
CONTENTS
Special issue on Underspecification and Interpretation
Guest Editors: Reinhard Blutner and Rob van der Sandt
(continued from Volume 15 Number 1)
REINHARD BLUTNER
Lexical Pragmatics 115

ANATOLI STRIGIN
Lexical Rules as Hypotheses Generators
Please visit the journal's world wide web site at http://www.oup.co.uk/semant
Subscriptions: The Journal of Semantics is published quarterly.
Institutional: UK and Europe £78.00; USA and Rest of World US$140.00. (Single issues: UK and Europe £23.00; USA and Rest of World US$40.00.)
Personal:* UK and Europe £38.00; USA and Rest of World US$71.00. (Single issue: UK and Europe £11.00; USA and Rest of World US$21.00.)
*Personal rates apply only when copies are sent to a private address and payment is made by personal cheque/credit card. Prices include postage by surface mail or, for subscribers in the USA and Canada, by Airfreight, or in Japan, Australia, New Zealand and India by Air Speeded Post. Airmail rates are available on request.
Back Issues. The current plus two back volumes are available from Oxford University Press, Great Clarendon Street, Oxford OX2 6DP. Previous volumes can be obtained from Dawsons Back Issues, Cannon House, Park Farm Road, Folkestone, Kent CT19 5EE, tel +44 (0)1303 850101, fax +44 (0)1303 850440. Volumes 1-6 are available from Swets and Zeitlinger, PO Box 830, 2160 SZ Lisse, The Netherlands.
Payment is required with all orders and subscriptions are accepted and entered by the volume. Payment may be made by cheque or Eurocheque (made payable to Oxford University Press), National Girobank (account 500 1056), credit card (Access, Visa, American Express, Diners Club), or UNESCO coupons. Please send orders and requests for sample copies to the Journals Subscriptions Department, Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK. Telex 837330 OXPRES, tel +44 (0)1865 267907, fax +44 (0)1865 267485.
Tel/fax: +44 (0)1235 201904. Email: [email protected]
Scope of this Journal The JOURNAL OF SEMANTICS publishes articles, notes, discussions, and book reviews in the area of natural language semantics. It is explicitly interdisciplinary, in that it aims at an integration of philosophical, psychological, and linguistic semantics as well as semantic work done in artificial intelligence and anthropology. Contributions must be of good quality (to be judged by at least two referees) and should relate to questions of comprehension and interpretation of sentences or texts in natural language. The editors welcome not only papers that cross traditional discipline boundaries, but also more specialized contributions, provided they are accessible to and interesting for a wider readership. Empirical relevance and formal correctness are paramount among the criteria of acceptance for publication.
Information for Authors: Papers for publication should be submitted in 3 copies to the managing editor. They should be typed on A4 (or similar format), one-sided, double-spaced, and with a wide margin, and must be accompanied by an approx. 200 word summary. Notes and bibliographical references must appear at the end of the typescript. All bibliographical references in the text are by author's surname and year of publication. Diagrams must be submitted camera-ready. The submission should be accompanied by a PostScript file of the paper on diskette. Final versions after acceptance of a paper for publication must be accompanied by a file in source code. All papers submitted are subject to anonymous refereeing and are considered for publication only on the assumption that the paper has neither as a whole nor in part already been published elsewhere nor has elsewhere been submitted or accepted for publication. Unless special arrangements have been made, copyright rests with Oxford University Press. Authors receive 20 offprints of their published articles and 10 offprints of their published reviews, free of charge. Larger numbers can be supplied at cost price by advance arrangement.
Copyright: It is a condition of publication in the Journal that authors assign copyright to Oxford University Press. This ensures that requests from third parties to reproduce articles are handled efficiently and consistently and will also allow the article to be as widely disseminated as possible. In assigning copyright, authors may use their own material in other publications provided that the Journal is acknowledged as the original place of publication, and Oxford University Press is notified in writing and in advance.
Advertising: Advertisements are welcome and rates will be quoted on request. Enquiries should be addressed to Helen Pearson, Oxford Journals Advertising, PO Box 347, Abingdon, OX14 5XJ, UK. Tel/fax: +44 (0)1235 201904. Email: [email protected].
Journal of Semantics 15: 115-162 © Oxford University Press 1998
Lexical Pragmatics
REINHARD BLUTNER
Humboldt University, Berlin
Abstract

Lexical Pragmatics is a research field that tries to give a systematic and explanatory account of pragmatic phenomena that are connected with the semantic underspecification of lexical items. Cases in point are the pragmatics of adjectives, systematic polysemy, the distribution of lexical and productive causatives, blocking phenomena, the interpretation of compounds, and many phenomena presently discussed within the framework of Cognitive Semantics. The approach combines a constraint-based semantics with a general mechanism of conversational implicature. The basic pragmatic mechanism rests on conditions of updating the common ground and allows us to give a precise explication of notions such as generalized conversational implicature and pragmatic anomaly. The fruitfulness of the basic account is established by its application to a variety of recalcitrant phenomena, among which its precise treatment of Atlas & Levinson's Q- and I-principles and the formalization of the balance between informativeness and efficiency in natural language processing (Horn's division of pragmatic labour) deserve particular mention. The basic mechanism is subsequently extended by an abductive reasoning system which is guided by subjective probability. The extended mechanism turns out to be capable of giving a principled account of lexical blocking, the pragmatics of adjectives, and systematic polysemy.
1 INTRODUCTION
The aim of linguistic pragmatics is to provide an explicit account of utterance interpretation. More specifically, such an account has to clarify how disambiguation is achieved, how anaphoric and cataphoric relationships are resolved, how deictic expressions are used, how presuppositions are projected, what role accommodation phenomena play, how conversational implicatures are worked out, how sentence fragments and ungrammatical utterances are interpreted, how contextual and encyclopedic knowledge is brought to bear, and so on (cf., for example, Sperber & Wilson 1986). In many cases the interest in pragmatics arises through concern with the problems of semantics. For example, it has been the evident divergence between the formal devices ∧, ∨, ⊃, (∀x), (∃x), (ιx) (in their standard two-valued interpretation) and their natural language counterparts that has formed the starting point of Grice's logic and conversation (Grice 1968). And it has been the products of Carnap's and Montague's model theoretic
semantics, leaving basic theoretical issues regarding commonplace linguistic phenomena largely unresolved (conditionals, generic plurals, aspect, etc.), that have led Thomason to his thoughts about the fundamentals of pragmatics (Thomason 1990). It is tempting to characterize Lexical Pragmatics as that area of pragmatics that arises as a reaction to specific problems of Lexical Semantics. Here Lexical Semantics has to be understood in its 'classical' sense as a truth-functional, static semantics of lexical items. From a Gricean perspective, two different ideas of how to overcome the divergences between the classical theory and the demands of natural language come to mind. The first uses conventional implicatures as an enlargement of the classical information entries (one standard example is to describe the difference between and vs. but by means of this idea). The second idea uses conversational implicatures as a method to overcome the divergences between (formal) meaning and natural language interpretation. Whereas I believe that modern semantic theories (which usually are characterized as dynamic, epistemic, nonmonotonic) make the conception of conventional implicature superfluous as an addendum to the semantic component, I do not think so of conversational implicature. In fact, in this paper I will argue that the proper use of conversational implicature will resolve some of the problems of lexical interpretation that otherwise remain unsolved. The conceptual core of Lexical Pragmatics demands a straight formulation of conversational implicature. Paired with the idea of (radical) semantic underspecification in the lexicon and an appropriate representation of contextual and encyclopedic knowledge, this conception avoids both unmotivated lexical ambiguities and the need for expansive reinterpretation and coercion mechanisms.
Furthermore, I hope to illustrate how an appropriate formulation of the mechanism of conversational implicature explains the restrictions on interpretations that can be observed with regard to the traditional areas of polysemy, metonymy, and adjectival modification. The main goals of this paper are to develop a sensitive feeling for what kind of problems can be approached within Lexical Pragmatics, to explain some characteristics that a proper theory of Lexical Pragmatics has to take account of, and to demonstrate how (a fragment of) the theory works as a restrictive, explanatory account. The discussion proceeds as follows. In the next section, I give some examples illustrating the kind of problems Lexical Pragmatics has to deal with. In section 3 a straightforward formulation of conversational implicature is provided that rests on pragmatic conditions of updating the common ground. Furthermore, some consequences of the basic mechanism are discussed, demonstrating its integrating and unifying character. Finally, four theses are considered that
are designed to characterize Lexical Pragmatics as a proper theoretical program. In section 4 the basic mechanism is extended by including an abductive reasoning system guided by subjective probability, and it is shown how this specific model solves some of the problems stated before.
2 THE RANGE OF LEXICAL PRAGMATICS: SOME EXAMPLES
The examples selected in this chapter are not intended to give a complete survey of the aims and problems of Lexical Pragmatics. Admitting that the selection is rather accidental, I hope that the two classes of examples chosen give a reliable body of evidence at least about some core problems. Since it is not always straightforward to distinguish problems about the semantics of lexical units from problems of Lexical Pragmatics, some care has been taken in diagnosing the latter kind of problems. Needless to say, to a certain degree certain theoretical pre-decisions are unavoidable in performing this diagnosis. I believe, however, that these pre-decisions aren't superficial and don't lead to perverse problems.

2.1 The pragmatics of adjectives

In a very simplistic view the meanings of adjectives like red, interesting, or straight are taken as one-place predicates. This assumption about the meaning of adjectives leads to straightforward consequences when it comes to exploiting the compositional character of our linguistic system. In a rough approximation that is sufficient for the present purpose, the principle of compositionality states that 'a lexical item must make approximately the same semantic contribution to each expression in which it occurs' (Fodor & Pylyshyn 1988). Examples like this cow is brown suggest that in this case the one-place predicate standing for the meaning of the adjective brown directly applies to the meaning of the subject term, yielding the truth value of the whole sentence. In adjective-noun combinations, on the other hand, it is the intersection operation that forms the meaning of the compound expression. As a consequence of this view we get a nice explanation of certain entailments. Fodor & Pylyshyn put this point as follows (p. 42):

Consider predicates like ... is a brown cow. This expression bears a straightforward semantical relation to the predicates ... is a cow and ... is brown; viz., that the first predicate is true of a thing if and only if both of the others are. That is, ... is a brown cow severally entails ... is brown and ... is a cow and is entailed by their conjunction. Moreover-and this is
important-this semantical pattern is not peculiar to the cases cited. On the contrary, it holds for a very large range of predicates (see ... is a red square, ... is a funny old German soldier, ... is a child prodigy, and so forth).
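The entailment pattern described in this quotation can be made concrete with a small sketch of my own (not the authors'): adjective and noun meanings are one-place predicates over a toy domain, and adjective-noun combination is predicate conjunction, i.e. set intersection on extensions. All names and individuals below are invented for illustration.

```python
# Toy sketch of the simplistic intersective view: adjective and noun meanings
# are one-place predicates; combining them is predicate conjunction.

DOMAIN = ["cow1", "cow2", "horse1", "stone1"]   # hypothetical individuals

def BROWN(x):
    return x in {"cow1", "horse1"}

def COW(x):
    return x in {"cow1", "cow2"}

def intersective(adj, noun):
    """Combine an adjective and a noun meaning by intersection."""
    return lambda x: adj(x) and noun(x)

BROWN_COW = intersective(BROWN, COW)

# 'is a brown cow' is true of a thing iff 'is brown' and 'is a cow' both are:
assert all(BROWN_COW(x) == (BROWN(x) and COW(x)) for x in DOMAIN)
```

On this picture systematicity comes for free as well: once the one-place predicates are in the lexicon, every adjective-noun combination is available through the same `intersective` operation.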
The systematicity of linguistic competence is another phenomenon that can be accounted for by means of the simple ideas developed so far. According to Fodor & Pylyshyn (pp. 41-2), the systematicity of linguistic competence consists in the fact that the ability to understand and produce some expressions is intrinsically connected to the speaker's ability to produce and understand other expressions that are semantically related. When a speaker understands the expressions brown cow and black horse, he understands the expressions brown horse and black cow as well. Again, it is the use of the intersection operation that explains the phenomenon.

Unfortunately, the view that a large range of adjectives behaves intersectively has been shown to be questionable. For example, Quine (1960) notes the contrast between red apple (red on the outside) and pink grapefruit (pink on the inside), and between the different colours denoted by red in red apple and red hair. In a similar vein, Lahav (1989, 1993) argues that an adjective such as brown doesn't make a simple and fixed contribution to any composite expression in which it appears:

In order for a cow to be brown most of its body's surface should be brown, though not its udders, eyes, or internal organs. A brown crystal, on the other hand, needs to be brown both inside and outside. A brown book is brown if its cover, but not necessarily its inner pages, are mostly brown, while a newspaper is brown only if all its pages are brown. For a potato to be brown it needs to be brown only outside ... Furthermore, in order for a cow or a bird to be brown the brown color should be the animal's natural color, since it is not regarded as being 'really' brown even if it is painted brown all over. A table, on the other hand, is brown even if it is only painted brown and its 'natural' color underneath the paint is, say, yellow. But while a table or a bird are not brown if covered with brown sugar, a cookie is. In short, what it is to be brown is different for different types of objects. To be sure, brown objects do have something in common: a salient part that is wholly brownish. But this hardly suffices for an object to count as brown. A significant component of the applicability condition of the predicate 'brown' varies from one linguistic context to another (Lahav 1993: 76).

Some authors, for example Keenan (1974), Partee (1984), and Lahav (1989, 1993), conclude from facts of this kind that the simplistic view mentioned above must be abolished. As suggested by Montague (1970), Keenan (1974), Kamp (1975), and others, there is a simple solution that addresses such facts in a descriptive way and obeys the principle of compositionality. This solution considers adjectives essentially to be adnominal functors. Such functors, for example, turn the properties expressed by apple into those expressed by red apple. Of course, such functors have to be defined disjunctively in the manner illustrated in (1):
(1) RED(X) means roughly the property
    (a) of having a red inner volume if X denotes fruits only the inside of which is edible
    (b) of having a red surface if X denotes fruits with edible outside
    (c) of having a functional part that is red if X denotes tools
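Read as a recipe, (1) amounts to a case analysis on the sort of the noun. The following sketch (my illustration; the sort labels and the tiny lexicon are assumptions, not the paper's formalization) makes the enumerative character of such an adnominal functor explicit:

```python
# Hypothetical sketch of the functional view: RED is a functor from noun
# meanings to noun meanings, defined disjunctively over noun sorts as in (1).
# The sort inventory is invented for illustration.

NOUN_SORT = {
    "grapefruit": "fruit-edible-inside",
    "apple": "fruit-edible-outside",
    "screwdriver": "tool",
}

def RED(noun):
    """Return the applicability condition of 'red N', clause by clause."""
    sort = NOUN_SORT[noun]
    if sort == "fruit-edible-inside":
        return "has a red inner volume"        # clause (1a)
    if sort == "fruit-edible-outside":
        return "has a red surface"             # clause (1b)
    if sort == "tool":
        return "has a red functional part"     # clause (1c)
    raise ValueError(f"no clause for sort {sort!r}")

assert RED("apple") == "has a red surface"
```

The final `raise` branch is the telling detail: nothing in the functor predicts what red means for a noun sort that has not been enumerated in advance.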
Let us call this view the functional view. It should be stressed that the functional view describes the facts mentioned above only by enumeration. Consequently, it doesn't account in any interesting way for the systematicity of our competence to deal with adjective-noun combinations. Another (notorious) problem of this view has to do with the treatment of predicatively used adjectives. In that case the adjectives must at least implicitly be supplemented by a noun. Various artificial assumptions are necessary, which make such a theory inappropriate (cf. Bierwisch 1989 for more discussion of this point). Before I come to a more systematic discussion of these and some further problems, let me introduce a third view of treating the meanings of adjectives, which I call the free variable view. In a certain sense, this view can be seen as preserving the advantages of both the simplistic and the functional view, while overcoming their shortcomings. The free variable view has been developed in considerable detail in the case of gradable adjectives (see, for example, Bierwisch 1989, and the references given therein). It is well known that the applicability conditions of restricting adjectives that denote gradable properties, such as tall, high, long, short, quick, intelligent, vary depending upon the type of object to which they apply. What is high for a chair is not high for a tower, and what is clever for a young child is not clever for an adult. Oversimplifying, I can state the free variable view as follows. As in the first view, the meanings of adjectives are taken as one-place predicates. But now we assume that these predicates are complex expressions that contain a free variable.
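As a rough computational gloss (mine, not the paper's; the toy lengths and class norms are invented), a gradable adjective like long can be modelled as a relation between an object and a comparison class that the lexicon leaves open:

```python
# Sketch of the free variable view for a gradable adjective: LONG(x, X) holds
# iff x exceeds the norm of the comparison class X. The lexicon supplies only
# the relation; context supplies X. All figures below are invented.

LENGTH = {"train1": 350.0, "pencil1": 0.19}      # metres
CLASS_NORM = {"trains": 200.0, "pencils": 0.17}  # average length per class

def LONG(x, X):
    """x counts as long relative to the comparison class X."""
    return LENGTH[x] > CLASS_NORM[X]

# Predicative use: 'The train is long' ~ LONG(t, X), with X fixed by context.
assert LONG("train1", "trains")

# The same object may count as long against one class and not another:
assert LONG("pencil1", "pencils") and not LONG("pencil1", "trains")
```

The point of the sketch is only that the adjective's entry is fixed while the value of X varies with context; how X gets instantiated is exactly the selection problem discussed below in the text.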
Using an extensional language allowing λ-abstraction, we can represent the adjective long (in its contrastive interpretation), for example, as λx LONG(x, X), denoting the class of objects that are long with regard to a comparison class, which is indicated by the free variable X. At least on the representational level the predicative and the attributive use of adjectives can be treated as in the first view: The train is long translates to (after λ-conversion) LONG(t, X), and long train translates to λx [LONG(x, X) ∧ T(x)]. In these formulas t is a term denoting a specific train and T refers to the predicate of being a train. In the following, let us envisage the free variable account from the point of view of a (representational) semantics where meanings are identified with patterns of some sort ('semantic representations'). As stressed by Partee (1984: 284), 'the compositional principle can still be formulated for such
(2) (a) a fast car [one that moves quickly]
    (b) a fast typist [a person that performs the act of typing quickly]
systems, but it then becomes a much more syntactic notion, basically a constraint on the translation rules that map syntactic representations onto semantic ones.' Free variables now simply have the status of place-holders for more elaborated subpatterns, and expressions containing free variables should be understood as representational schemes. Free variables stand not only as place-holders for a comparison class X, as just indicated. The view can be generalized to include other types of free variables as well, for example a type of variable connected with the specification of the dimension of evaluation in the case of adjectives such as good and bad, or a type of variable connected with the determination of the object-dependent spatial dimensions in the case of spatial adjectives like wide and deep. In what follows, a variety of other kinds of variables will be considered, leading to rather complex types of lexical underspecification. Of course, it is not sufficient to postulate underspecified lexical representations and to indicate what the sets of semantically possible specifications of the variables are. In order to grasp natural language interpretation it is also required to provide a restrictive account explaining how the free variables are instantiated in the appropriate way. Obviously, such a mechanism has to take into consideration various aspects of world and discourse knowledge. We are presented here with a kind of selection task: how to select an appropriate item from a set of possibilities, where (weak) restrictions are given in the form of world and discourse knowledge. The main idea of Lexical Pragmatics now suggests that it is the mechanism of conversational implicature, explicated and formalized in an appropriate way, that fills the gap of selecting the 'right' specification from the set of semantically possible ones in the cases under discussion. In principle, this mechanism is based on all kinds of world and discourse knowledge.

However, a proper formulation of the mechanism should make it possible to extract which aspects of general world knowledge and discourse knowledge are really relevant in specifying a certain variable that results from a specific type of lexical underspecification. Let us now state three general problems which are intended to give a clear impression of what kind of questions should be approached within Lexical Pragmatics and of the challenges for Lexical Pragmatics as a proper theoretical program. The first problem can be stated as the problem of (pragmatic) compositionality. The problem can best be explained by way of an example. In (2) I have adopted an example discussed by Pustejovsky & Boguraev (1993) showing the context dependence of the adjective fast, where the interpretation of the predicate varies depending on the noun being modified.
    (c) a fast book [one that can be read in a short time]
    (d) a fast driver [one who drives quickly]
(3) (a) car: λx [CAR(x) ∧ TELIC(x, s) ∧ MOVE(s) ∧ ...]
    (b) fast: λx [TELIC(x, s') ∧ FAST(s')]
    (c) fast car: λx [CAR(x) ∧ TELIC(x, s) ∧ MOVE(s) ∧ TELIC(x, s') ∧ FAST(s') ∧ ...]
    (d) unification: λx [CAR(x) ∧ TELIC(x, s) ∧ MOVE(s) ∧ FAST(s) ∧ ...]
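The step from (3c) to (3d) can be mimicked with a small term-rewriting function (a toy of my own, not the paper's machinery): the two TELIC conjuncts force the situation variables s and s' to be identified, after which FAST applies to the moving event.

```python
# Toy unification step for (3): a formula is a list of (predicate, args)
# conjuncts. When two TELIC conjuncts share their first argument, their
# situation variables are identified and the duplicate conjunct is dropped.

def unify_telic(conjuncts):
    subst, telic_var, out = {}, None, []
    for pred, args in conjuncts:
        if pred == "TELIC":
            if telic_var is None:
                telic_var = args[1]            # remember s
            else:
                subst[args[1]] = telic_var     # map s' onto s
                continue                       # drop the duplicate conjunct
        out.append((pred, args))
    return [(p, tuple(subst.get(a, a) for a in args)) for p, args in out]

fast_car = [("CAR", ("x",)), ("TELIC", ("x", "s")), ("MOVE", ("s",)),
            ("TELIC", ("x", "s'")), ("FAST", ("s'",))]

assert unify_telic(fast_car) == [("CAR", ("x",)), ("TELIC", ("x", "s")),
                                 ("MOVE", ("s",)), ("FAST", ("s",))]
# i.e. CAR(x) ∧ TELIC(x, s) ∧ MOVE(s) ∧ FAST(s), as in (3d)
```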
It is straightforward how the analysis can be extended to the other cases given in (2). It seems to me that this kind of analysis works well in cases like (2). The problems of the account become visible, however, when we consider other types of adjectives, for example colour and taste adjectives. Suppose that we
Examples of this kind suggest that the adjective modifies a specific conceptual component connected with the noun, namely its purpose or function. With regard to this component the adjective seems to make a unitary contribution; it qualifies this component (the act of moving, typing, reading or driving) in a specific and unitary way. In the general case the principle of pragmatic compositionality says that it is possible to decompose lexical items into conceptual components and that these components determine the conceptual interpretation of the whole expression. In some cases it seems that the conceptual components of a lexical item are associated with it semantically, i.e. those components determine the meaning of the lexical item. In other cases the association of the conceptual components is via general world knowledge, and this information is detachable from the word meaning. The former case is exemplified by the information concerning the purpose or function of artefact terms. The latter case may be illustrated by the information determining the designation of the spatial properties of spatial object terms (e.g. Lang 1989). As an exercise that demonstrates a typical case of the notion of compositionality just under discussion, let us calculate the 'conceptual interpretation' of the expression fast car. In (3a) the semantic analysis of the noun car is sketched in some relevant respects. The analysis states that the concept related to cars is characterized (besides other things) by a telic role (purpose or function) that qualifies a situation type s associated with cars as a moving process. The semantic analysis of the adjective fast is given in (3b), expressing that this adjective affects the telic role only. In (3c) the expressions given in (a) and (b) are combined by the intersection operation, and in (3d) the resulting interpretation (a car that moves quickly) is obtained by unifying the free variables.
(4) (a) red: λx [SAL-PART_COLOUR(x, y') ∧ RED(y')]
    (b) sweet: λx [SAL-PART_TASTE(x, y') ∧ SWEET(y')]
    (c) apple: λx [APPLE(x) ∧ SAL-PART_TASTE(x, y1) ∧ PULP(y1) ∧ SAL-PART_COLOUR(x, y2) ∧ PEEL(y2) ∧ ...]
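The aspect-relativized entries in (4) can be rendered as a lookup indexed by object and aspect. This is a sketch under invented assumptions (the part inventory and fact set are mine), and it deliberately inherits the weakness discussed in the surrounding text: the tables merely enumerate the facts.

```python
# Sketch of (4): SAL-PART is indexed by an aspect (colour, taste, ...), so the
# salient part of an apple differs for RED and for SWEET.

SAL_PART = {
    ("apple", "colour"): "peel",
    ("apple", "taste"): "pulp",
    ("grapefruit", "colour"): "inner volume",
}

FACTS = {("RED", "peel"), ("SWEET", "pulp"), ("RED", "inner volume")}

def holds(pred, obj, aspect):
    """pred applies to obj iff pred holds of obj's aspect-salient part."""
    return (pred, SAL_PART[(obj, aspect)]) in FACTS

assert holds("RED", "apple", "colour")        # red apple: red peel
assert holds("SWEET", "apple", "taste")       # sweet apple: sweet pulp
assert holds("RED", "grapefruit", "colour")   # red grapefruit: red inside
```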
Although such an 'analysis' gives the right input for satisfying pragmatic compositionality, it doesn't describe the data more systematically than the functional view sketched before. Obviously, both views describe the data only by enumerating them. What can be learned from this observation is simply that compositionality can always be retained, but at the expense of systematicity. As a matter of fact, in the former case the notion of semantic compositionality is saved by enumerating the adjectives' applicability conditions for different objects. In the second case, on the other hand, the notion of pragmatic compositionality is saved by enumerating the salient parts of objects with regard to every aspect (colour, taste, ...). In this case, the difficulty has to do with the relational notion of salience, which obviously cannot be analysed in a way that is systematic and compositional at the same time. Another issue which conflicts with a systematic compositional treatment concerns the fact that the colour vocabulary gets reduced when applied to things that come in a limited range of colours (e.g. red/white wine). A curious observation is that the same colour sometimes gets expressed in different ways, depending on what it contrasts with. For example, in Japanese, aka-zatoo 'brown sugar' (lit. 'red sugar') comes in the same range of colours as shiro-miso, lit. 'white bean paste'. Next, let us consider a problem which I will call the problem of pragmatic anomaly. In (5) the examples (a-d) seem to make correct (but
want to describe that a red apple is one whose surface is red (but not necessarily its inside), and a red grapefruit is one having a red inner volume (but not necessarily a red surface). According to the account just sketched we can try to describe this by assuming an application condition for red saying that a salient part of the object is wholly reddish: λx [SAL-PART(x, y') ∧ RED(y')]. Furthermore, we have to postulate that the salient part of an apple is its surface and the salient part of a grapefruit is its inner volume. However, what counts as a salient part with regard to colour is not necessarily salient with regard to other aspects. What counts as the salient part of an apple with regard to taste, for example, seems to be the inner volume and not the peel. Thus, according to the view under discussion we have to change the semantic entry for red to something like that shown in (4a), whereas the entry for sweet would look like (4b). Furthermore, in order to make the procedure pragmatically compositional, we would have to postulate a conceptual analysis for apple as sketched out in (4c).
not necessarily true) statements about a conceivable ·state of affairs. The examples (e-h), on the other hand, are somehow defective, but they are defective for different reasons: (g) and (h) represent true category violations, while (e) and (f) represent the phenomenon of pragmatic anomaly:
(s) (a) The tractor is red
The tractor is defective The tractor is loud The tractor is gassed up ?The tractor is pumped up ?The tractor is sweet *The tractor is . pregnant *The tractor is bald-headed
The importance of this distinction has been noted, for example by Keil ( I 979). Category mistakes can be explained on grounds of an ontological category violation as described by Sommers (1959, 1963) and Keil (1979). Pragmatic anomaly, on the other hand, has only indirectly to do with the so-called ontological level describing the basic categories of existence in terms of which we conceptualize our everyday world.That sweet is not an appropriate attribute of tractors can't be explained on grounds of an ontological category violation. A tractor can be sweet, by the way. Taste one: it might surprise you. The problem of pragmatic anomaly has to do (i) with the distinction between category mistakes and pragmatic anomaly and (ii) with the formal treatment of the latter kind of deviance. Both the phenomena of category violations and pragmatic anomaly are intimately based upon the conceptual information associated with lexical items. As a consequence, both kinds of phenomena should find a proper place within Lexical Pragmatics. Since almost nothing is known about the proper treatment of pragmatic anomaly, this phenomenon seems to be an especially exciting challenge for Lexical Pragmatics. A third problem that Lexical Pragmatics has to deal with concerns the phenomenon called lexical blocking. This phenomenon has been demon strated by a number of examples where the appropriate use of a given expression formed by a relatively productive process is restricted by the existence of a more 'lexicalized' alternative to this expression. One case in point is due to Householder (1971). The adjective pale can be combined with a great many colour words: pale green, pale blue, pale yellow. However, the combination pale red is limited in a way that the other combinations are not. For some speakers pale red is simply anomalous, for others it picks up whatever part of the pale domain of red pink has not pre-empted. This
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
124 Lexical Pragmatics
2.2 Systematic polysemy
Systematic polysemy refers to the phenomenon that one lexical unit may be
associated with a whole range of senses which are related to each other in a systematic way. The phenomenon has traditionally been thought intractable, and in fact it is intractable when considered as a problem of Lexical Semantics in the traditional sense. In the following I want to suggest that a considerable part of the phenomenon can be accounted for by using the conception of Lexical Pragmatics: pairing (radical) semantic underspecification in the lexicon with a pragmatic mechanism of contextual enrichment.
Unfortunately, the term systematic polysemy indicates a whole family of empirically different subphenomena for which no unified terminology is available. Expressions such as open and closed polysemy (Deane 1988), conceptual specification and conceptual shift (Bierwisch 1983), sense modulation and sense change (Cruse 1986), and constructional polysemy and sense extension (Copestake & Briscoe 1995) may be convenient to indicate a rough outline of the classification.
Early work on polysemy concentrated on the analysis of the various senses of singular lexical units, relating these senses via certain semantic rules. From the structure of these rules certain conclusions about
suggests that the collocation of pale is fully or partially blocked by the lexical alternative pink.
The phenomenon of lexical blocking has been approached theoretically in different ways. In the context of language acquisition, for example, Clark (1990, and the references therein) has formulated a principle of contrast that, in its most succinct form, goes as follows: 'Every two forms contrast in meaning' (Clark 1990: 417). In the context of word formation Kiparsky (1982) has formulated a general condition which he calls Avoid Synonymy: 'The output of a lexical rule may not be synonymous with an existing lexical item.' While there is something right about these principles, they are still too strong, as has been repeatedly argued. For example, Horn (1984) has observed that words like fridge, icebox, and refrigerator can coexist within a single idiolect despite their referential equivalence; furthermore he notices the doublets synonymy and synonymity which seem to be perfect synonyms. To handle these and other cases, starting with McCawley (1978), another line of research has been formulated which rests on a reformulation of Grice's theory of conversational implicature (see, for example, Atlas & Levinson 1981; Horn 1984; Levinson 1987; Matsumoto 1995). Lexical Pragmatics accounts for the phenomenon of lexical blocking in a closely related way.
Reinhard Blutner 125
The lexicologist must show the relationships between all the senses of a lexical item (a task which conventional dictionaries have done well) and also the relationships of related senses of different lexical items (a task which recently linguists have begun doing), but few studies attempt to do both of these tasks (Lehrer 1978: 95).
Following this methodological insight, Lehrer has investigated cooking words (Lehrer 1968), temperature words (Lehrer 1970), and sensory words for taste, smell, and feel (Lehrer 1978), and she has suggested the following hypothesis: If there is a set of words that have semantic relationships in a semantic field (where such relationships are described in terms of synonymy, antonymy, hyponymy, etc.), and if one or more items pattern in another semantic field, then the other items in the first field are available for extension to the second semantic field. Perceived similarity is not necessary (Lehrer 1978: 96).
As a necessary condition on semantic transfer - the transfer of words with a meaning in one domain to another - this principle is certainly interesting. However, as Lehrer suggested herself, not each potential transfer of
the structure of the (mental) lexicon may be derived. As an example of this kind of research I refer to the seminal work of Caramazza & Grober (1977). These authors investigated twenty-six senses of the word line. Clustering and scaling analyses revealed five major groupings of these senses for which clear descriptions can be given. Distinguishing a core meaning level from the level of conceptually salient senses, Caramazza & Grober have proposed EXTENSION as the underlying abstract core meaning of line and they have presumed 'instruction rules' of the form REALIZE X AS Y (where X and Y correspond to semantic representations) in order to produce the abstract meanings of the five clusters. Thereafter, the application of subsequent instructions produces the senses realized in specific contexts. As an example of how the instruction rules might work, Caramazza & Grober (1977) consider the derivation of the sense of line in draw a line under the title of the book. The 'linguistic dictionary' correlates the sound part of line with its core meaning EXTENSION. Using the instruction REALIZE EXTENSION AS UNIDIMENSIONAL EXTENSION, an output corresponding to a certain cluster of concrete senses would be realized. Applying a further instruction, say REALIZE UNIDIMENSIONAL EXTENSION AS VISUALLY PERCEPTIBLE, would help to isolate the intended surface sense.
Though investigations of that kind may be extremely instructive and interesting from a psycholinguistic point of view, they leave an important point out of consideration: the isolated investigation of singular lexical units closes one's eyes to certain regularities and restricting conditions that may arise solely from investigating the semantic relations between different but in some respects similar lexical units.
(6) (a) I ate pork/?pig
(b) I like beef/?cow
(c) The table is made of wood/?tree
meaning that satisfies the principle is realized and becomes permanent in the usage of the speech community. As an example let us consider the English touch word sharp, which transfers to taste, smell, and sound. However, not each touch word related to sharp in the touch domain may transfer to the other domains. Blunt, for example, may transfer neither to taste nor to smell. Consequently, the quoted hypothesis cannot be the whole story. There must be some kind(s) of extra conditions which determine which kind of potential transfers of meaning remain in the language and which disappear. Cast in Caramazza & Grober's (1977) framework of instruction rules, it is the question of restrictions on rules that becomes important. Why is the application of an instruction rule possible in the case of sharp but impossible in the case of blunt? And what is the theoretical status of these restrictions? Do they reflect only idiosyncratic properties of the lexicon? Or are they more systematic and perhaps related to properties of the conceptual domains under discussion? The complex of questions raised in this connection I call the restriction problem of polysemy. Lexical Pragmatics suggests that part of the restriction problem can be treated in terms of pragmatic restrictions.
In the previous subsection I discussed two further problems in order to demonstrate some challenges for Lexical Pragmatics: the fallacy of (pragmatic) compositionality and the problem of handling lexical blocking. Both problems have interesting consequences within the domain of systematic polysemy as well. Let's consider first lexical blocking. Take the well-known phenomenon of 'conceptual grinding', whereby ordinary count nouns get a mass noun reading denoting the stuff the individual objects are made of, as in There is fish on the table or There is dog all over the street.
There are several factors that determine whether 'grinding' may apply, and, more specifically, what kind of 'grinding' (meat grinding, fur grinding, universe grinding, ...) may apply. Some of these factors have to do with the conceptual system, while others are language-dependent (cf. Nunberg & Zaenen 1992; Copestake & Briscoe 1995). Lexical blocking is a language-dependent factor. It refers to the fact that the existence of a specialized item can block a general/regular process that would lead to the formation of an otherwise expected interpretation equivalent with it. For example, in English the specialized mass terms pork, beef, and wood usually block the grinding mechanism in connection with the count nouns pig, cow, and tree. This explains the contrasts given in (6).
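The blocking pattern behind the contrasts in (6) can be sketched in a few lines of Python. The lexicon table below is an illustrative assumption for the demo, not part of the text's formal apparatus:

```python
# Illustrative sketch: a specialized mass term pre-empts (blocks) the output
# of the regular grinding process; where no such term exists, the count noun
# itself supplies the mass reading (as in "There is dog all over the street").
lexicalized_mass = {"pig": "pork", "cow": "beef", "tree": "wood"}

def grind(count_noun):
    """Return the mass-noun form for a count noun, respecting blocking."""
    if count_noun in lexicalized_mass:
        return lexicalized_mass[count_noun]  # blocked: use the lexicalized item
    return count_noun                        # regular grinding applies

print(grind("pig"))  # -> pork
print(grind("dog"))  # -> dog
```

A hard-coded table like this cannot capture the fact that blocking is defeasible; the cost-based machinery of section 3 is meant to do better.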
It is important to note that blocking is not absolute, but may be cancelled under special contextual conditions. Nunberg & Zaenen (1992) consider the following example:
(7) Hindus are forbidden to eat cow/?beef2
(8) The Coercion View
(a) Every lexical unit determines a primary conceptual variant which can be grasped as its (literal) meaning.
(b) The combinatorial system of language determines how the lexical units are combined into larger units (phrases, sentences).
(c) There is a system of type and sortal restrictions which determines whether the resulting structures are well formed.
(d) There is a generative device (called type/sort coercion) that tries to overcome type or sortal conflicts that may arise by strict application of the combinatorial system of language. The coercion device is triggered (only) by type or sort violations.
A widely known system that follows these assumptions has been presented
by Pustejovsky (1989, 1991, 1993, 1995). Copestake & Briscoe (1995), Fodor & Lepore (to appear) and others point out various problems with Pustejovsky's theory.3 Taken together, these problems may suggest that it is more promising to favour an alternative view. This view is more radically founded on underspecified representations and makes use of a straightforward mechanism of contextual enrichment.
They argue that 'what makes beef odd here is that the interdiction concerns the status of the animal as a whole, and not simply its meat. That is, Hindus are forbidden to eat beef only because it is cow-stuff' (Nunberg & Zaenen 1992: 391). Examples of this kind strongly suggest that the blocking phenomenon is pragmatic in nature and may be explicable on the basis of Gricean principles. In section 3.1 I will present an explicit account of conversational implicature that aims to explain the blocking phenomenon.
Many researchers agree in seeing compositionality as a principle satisfied at the semantic level of representation but violated at the level of utterance interpretation. Respecting in principle the non-compositionality of utterance interpretation, most of these researchers seem to consider it virtuous and advantageous to deviate from compositionality in a minimalist way. A typical approach following this path of virtue is the so-called coercion view. Generally, it can be characterized by four assumptions:
The mechanism of contextual enrichment carries the main burden in explaining restrictions on interpretation. Because of its inferential character, this mechanism is structured non-compositionally. Inferential processing must be controlled by cost factors. Such cost factors may reflect non-representational means such as salience and relevance.4 As we shall see in section 4, the idea of (radical) underspecification and contextual enrichment nicely fits in the picture of monotonic processing (Alshawi & Crouch 1992; van Deemter & Peters 1996). Moreover, it is this feature of processing which is crucially involved in explaining pragmatic anomaly.
3 LEXICAL PRAGMATICS AND THE THEORY OF CONVERSATIONAL IMPLICATURE
Based on the material presented in the last section, the task of this section is mainly to evolve and propose a set of guidelines for realizing Lexical Pragmatics as a proper theoretic setting. As we shall see, these guidelines aim at a straightforward formulation of the notion of conversational implicature as a necessary prerequisite to develop Lexical Pragmatics. Before we are ready to speculate on what a proper treatment of conversational implicatures might be, let me first make some remarks concerning present accounts of conversational implicatures. For Griceans, conversational implicatures are those non-truth-functional aspects of utterance interpretation which are conveyed by virtue of the assumption that the speaker and the hearer are obeying the cooperative principle of conversation, and, more specifically, various conversational maxims: maxims of quantity, quality, relation, and manner. While the notion of conversational implicature doesn't seem hard to
(9) The Radical Underspecification View
(a) Every lexical unit determines an underspecified representation (i.e. a representation that may contain, for example, place holders and restrictions for individual and relational concepts).
(b) The combinatorial system of language determines how lexical units are combined into larger units (phrases, sentences).
(c) There is a system of type and sortal restrictions which determines whether structures of a certain degree of (under)specification are well formed.
(d) There is a mechanism of contextual enrichment (pragmatic strengthening based on contextual and encyclopedic knowledge). This inferential mechanism is controlled by cost factors and doesn't need triggering by type or sort violations.
grasp intuitively, it has proven difficult to define precisely. The generality of the cooperative principle and the conversational maxims makes it difficult to specify just which maxims are involved in particular implicatures. Essential concepts mentioned in the maxims are left undefined (what is relevance, adequate evidence, etc.). However, before we can start to 'flesh out' something like the maxims, Grice's view of implicature raises even more basic questions. Are there just the maxims Grice mentioned, or might others be needed (as he suggested himself)? Or could the number of maxims be reduced? Sperber & Wilson (1986) are an extreme case in suggesting one only, the maxim of relevance. And what is the rationale behind the cooperative principle and the maxims? Are they norms which speakers and hearers must know in order to communicate adequately (as Grice and most followers suggest)? Or are they generalizations about certain forms of inferential behavior which speakers and hearers need no more to know to communicate than they need to know the principles of digestion to digest (Sperber & Wilson's position, a position which is also adopted in the present account)?
An important step in reducing and explicating the Gricean framework has been made by Atlas & Levinson (1981) and Horn (1984). Taking Quantity as a starting point they distinguish between two principles, the Q-principle and the I-principle (termed R-principle by Horn 1984).5 Simple but informal formulations of these principles are as follows:
(10) Q-principle: Say as much as you can (given I) (Horn 1984: 13).
Make your contribution as informative (strong) as possible (Matsumoto 1995: 23).
Do not provide a statement that is informationally weaker than your knowledge of the world allows, unless providing a stronger statement would contravene the I-principle (Levinson 1987: 401).

I-principle: Say no more than you must (given Q) (Horn 1984: 13).
Say as little as necessary, i.e. produce the minimal linguistic information sufficient to achieve your communicational ends (bearing the Q-principle in mind) (Levinson 1987: 402).
Read as much into an utterance as is consistent with what you know about the world (Levinson 1983: 146-7).
Obviously, the Q-principle corresponds to the first part of Grice's quantity maxim (make your contribution as informative as required), while it can be argued that the countervailing I-principle collects the second part of the quantity maxim (do not make your contribution more informative than is required), the maxim of relation, and possibly all the manner maxims. As
Horn (1984) seeks to demonstrate, the two principles can be seen as representing two competing forces, one force of unification minimizing the Speaker's effort (I-principle), and one force of diversification minimizing the Auditor's effort (Q-principle).
Conversational implicatures which are derivable essentially by appeal to the Q-principle are called Q-based implicatures. A standard example is given in (11). As a general characteristic, 'these implicata limit what is said by shrinking the range of possible states of affairs associated with what is said to a smaller range of those states of affairs associated with what is communicated. What is communicated is MORE DEFINITE than what is said' (Atlas & Levinson 1981: 35).

(11) Some of the boys are at the party
→ Not all of the boys are at the party

Conversational implicatures which are derivable essentially by appeal to the I-principle are called I-based implicatures. These implicatures can be generally characterized as 'enriching what is said by reshaping the range of the possible states of affairs associated with what is communicated. What is communicated is MORE PRECISE than what is said' (Atlas & Levinson 1981: 36). A standard example is given in (12).

(12) John said 'Hello' to the secretary and then he smiled
→ John said 'Hello' to the female secretary and then he smiled

3.1 Towards the proper treatment of conversational implicature

I believe that the proper treatment of conversational implicatures crucially depends on the proper formulation of the Q- and the I-principle. As I will demonstrate subsequently, such a formulation also has to account for the interplay of these two principles and their interaction with the two quality maxims.
Let me start by giving a more explicit formulation of the Q- and I-principle. In order to do that, we need a distinction between those aspects of an utterance that can be described in purely linguistic (conventional) terms and a description of the intended content of the utterance.6 Let us abbreviate the linguistic part of an utterance by a, where a includes phonological, syntactic, and semantic information, indicated by phon(a), syn(a), and sem(a), respectively. For the sake of explicitness, we identify the intended content of an utterance (better: a possible interpretative hypothesis about the content of the utterance) with a (partial) state description and denote it with m.7 Furthermore, we have the general
idea that the semantic description of the utterance, sem(a), is a kind of underspecified representation determining a whole range of possible specifications or refinements, one of which is the intended content m_intended. Let us express this idea by assuming a general constraint C defining the set of the possible pairs [sem(a), m]. In the simplest case sem(a) is a first-order formula, m a state description in the sense of Carnap (1947), and C is realized as Carnap's holds in: [sem(a), m] ∈ C iff sem(a) holds in m (usually written as m ⊨ sem(a)). Obviously, the set C of pairs is defined context-independently in this case. However, it also is possible and more appropriate for natural language applications to use Cs that are dependent on general world and discourse knowledge (see section 4). To reduce our notational apparatus we simply write [a, m] ∈ C instead of [sem(a), m] ∈ C. Furthermore, we use the abbreviation C(a) for the set {m: [a, m] ∈ C}.
Let us assume now that the conversational maxims (or their explicanda) are conceptualized as further constraints on [sem(a), m]-pairs. This idea is not so surprising and can already be traced back to Grice (1989: 86):

I have suggested a provisional account of a kind of nonconventional implicature, namely a conversational implicature; what is implicated is what is required that one assume a speaker to think in order to preserve the assumption that he is observing the Cooperative Principle (and perhaps some conversational maxims as well), if not at the level of what is said, at least at the level of what is implicated.

The wording of the original formulation of the maxims seems to suggest that some concern primarily what is said, that is, they concern sem(a) (e.g. the maxims under 'Manner'), while others concern primarily what is meant (e.g. 'be relevant'). So it is certainly an appropriate picture to explicate the maxims as constraining [sem(a), m]-pairs. In order to capture notions such as linguistic complexity and informativeness, let us assume a global cost function c(a, m). This function combines the complexity compl(a) of the linguistic aspects with a cost function c(sem(a), m) expressing the cost to correlate the linguistic meaning sem(a) with the (partial) state description m. As a first approximation, we will assume the latter cost to be inversely related to the (subjective) probability of associating some m with a given sem(a). That means, the more probable the realization of a certain (partial) state description m, given the range of possible models provided by sem(a), the less surprising this m should be and the less it should cost to assume. For the sake of concreteness let us follow information theory in seeking a numerical measure of the surprise associated with a given information state. Within information theory8 surprise is expressed as the negative logarithm of probability. Thus, information theory leads us to the following assumption: c(sem(a), m) = -log2 pr(m|sem(a)), where
pr(m|sem(a)) is the conditional probability that an instance of the proposition sem(a) is an instance of the (partial) state description m in a given space of eventualities or possible worlds. Using a simple factorial analysis for the total cost function c(a, m), we get the following ansatz:

(13) c(a, m) = compl(a) · c(sem(a), m), where compl(a) is a positive real number and c(sem(a), m) = -log2 pr(m|sem(a))

Now let me suggest the following first approximation to the Q- and the I-principle, respectively:
(14) (a) [a, m] satisfies the Q-principle iff there is no [a', m] ∈ C such that c(a', m) < c(a, m)
(b) [a, m] satisfies the I-principle iff there is no [a, m'] ∈ C such that c(a, m') < c(a, m)

In this formulation, the Q- and the I-principle constrain the set C of possible [sem(a), m]-pairs in two different ways. The I-principle constrains the set by selecting the minimally surprising state descriptions with respect to a given semantic content sem(a)9 and the Q-principle constrains the set by blocking those state descriptions which can be grasped more economically by an alternative linguistic input a'. I should also add that I have tried to formulate the Q- and the I-principle from the perspective of language comprehension. Due to the very symmetric formulation of the principles, switching to the production perspective may be realized simply by switching Q and I.
Before we come to a closer inspection of the formulation (14) we have to introduce the notion of common ground in our theoretic framework and we have to investigate the effects of the maxims of quality. According to conventional wisdom, a common ground cg is an information state that contains all the propositions that are shared by several participants (for example, S and H). In more formal terms this means that an information state cg (a non-empty set of possible worlds) counts as common ground iff for each proposition φ it holds: cg ⊨ φ ⇔ cg ⊨ K_S(φ) ∧ K_H(φ).10 Let us write cg[a] for the common ground that results from cg by updating it with a. Intuitively, the notion cg[a] aims to realize the strengthening of cg by adding an appropriate state description m of sem(a). There are two conditions that are suggested by this intuition: (i) each appropriate state description m is consistent with the common ground cg[a]; and (ii) the informational content of the disjunction of the possible state descriptions m is contained in the common ground cg[a]. I guess the first condition may be seen as related to the first maxim of quality (Do not say what you believe is false). And under certain conditions the second
condition may be seen as related to the second maxim of quality (Do not say what you lack evidence for). If that is right, then crucial consequences of the maxims of quality and their very special status within the overall theory can be formulated in terms of conditions on updating the common ground.
Let us use the abbreviation ρ_Q(a) for the set of possible state descriptions that are constrained both truth-conditionally (by means of C) and by means of the Q-principle, i.e. ρ_Q(a) =def {m: [a, m] ∈ C and [a, m] satisfies the Q-principle}. Analogously we have the definition ρ_I(a) =def {m: [a, m] ∈ C and [a, m] satisfies the I-principle}. We simply write ρ(a) for the intersection of both conditions: ρ(a) =def ρ_Q(a) ∩ ρ_I(a). Using this notation we can state the two conditions related to the quality maxims as follows:

(15) (a) Quality 1: for each m ∈ ρ(a): m is consistent with cg[a]
(b) Quality 2: ρ(a) is a non-empty set and ⋁ρ(a) holds in cg[a]11

Let us call an update pragmatically licensed iff it satisfies the conditions (15)(a, b). Now we call an utterance a pragmatically anomalous iff there is no pragmatically licensed update for it. Furthermore, a proposition φ is called a conversational implicature of a iff cg[a] ⊨ φ for each pragmatically licensed update. If this relationship holds for each common ground cg we may speak of generalized implicatures. Restricting the corresponding notions to specific classes of common grounds, we may define implicatures of the particularized variety.
Let us now consider some simple examples to see how the proposed mechanism is working. First consider Moore's paradox, exemplified by the contrast between (16a) and (16b).

(16) (a) The cat is on the mat, but John doesn't know it.
(b) ?The cat is on the mat, but I don't know it.

The absurdity of (16b) falls out straight away as a case of pragmatic anomaly. The explanation immediately results from the formulation of the quality maxims in (15) and the conditions on common grounds. To see the crucial point, we have to show first that whenever a has a pragmatically licensed update cg[a], then the proposition K_S(sem(a)) must be logically consistent. This assertion follows from the fact that the proposition ⋁ρ(a) logically entails the proposition sem(a) and the fact that ⋁ρ(a) is contained in the common ground cg[a]. The latter directly results from the condition (15b). As a consequence sem(a) must be contained in cg[a]. From our condition on common grounds it follows that K_S(sem(a)) must also be contained in cg[a] and therefore must be consistent. Hintikka (1962) calls a proposition φ epistemically indefensible just in case
(17) (a) a: (/S1 or S2/, p ∨ q); C(a) = {m1, m2, m3}, where m1 = (p, q), m2 = (p, ¬q), m3 = (¬p, q)
(b) a': (/S1 and S2/, p ∧ q); C(a') = {m1}
(c) ρ_I(a) = {m1, m2, m3}, since c(a, m1) = c(a, m2) = c(a, m3)
ρ_Q(a) = {m2, m3}, since c(a', m1) < c(a, m1)
ρ(a) = ρ_Q(a) ∩ ρ_I(a) = {m2, m3}
(d) if cg[a] is a pragmatically licensed update, then12
cg[a] ⊨ P_S p, P_S ¬p, P_S q, P_S ¬q, ... (Quality 1: clausal implicatures)
cg[a] ⊨ K_S ¬(p ∧ q) (Quality 2: scalar implicature)
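The cost comparison behind (17) can be sketched in a few lines of Python. The probability values are illustrative assumptions (the three state descriptions compatible with S1 or S2 are taken to be equiprobable); the expression names are only labels:

```python
import math

def cost(compl, pr):
    # (13): c(a, m) = compl(a) * -log2 pr(m | sem(a))
    return compl * -math.log2(pr)

# Candidate pairs [a, m] with assumed conditional probabilities pr(m | sem(a)):
# "or" is compatible with m1=(p,q), m2=(p,~q), m3=(~p,q); "and" with m1 only.
C = {
    ("or", "m1"): 1/3, ("or", "m2"): 1/3, ("or", "m3"): 1/3,
    ("and", "m1"): 1.0,
}
compl = {"or": 1.0, "and": 1.0}  # equal linguistic complexity, as in the text

def c(a, m):
    return cost(compl[a], C[(a, m)])

def satisfies_Q(a, m):
    # (14a): no alternative expression a' realizes m more cheaply
    return not any(m2 == m and c(a2, m2) < c(a, m) for (a2, m2) in C)

def satisfies_I(a, m):
    # (14b): no alternative interpretation m' of a is cheaper
    return not any(a2 == a and c(a2, m2) < c(a, m) for (a2, m2) in C)

rho = {m for (a, m) in C if a == "or" and satisfies_Q(a, m) and satisfies_I(a, m)}
print(sorted(rho))  # -> ['m2', 'm3']
```

This reproduces ρ(a) = {m2, m3} from (17c): the state description verifying both disjuncts is blocked by the stronger expression, yielding the exclusive reading.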
The derivation crucially rests on the assumption that the logically stronger expression S1 and S2 realizes the state description m1 with higher probability than the logically weaker expression S1 or S2 and therefore can block this state description for the interpretation of S1 or S2. It is worth noting that the present approach to 'scalar implicatures' has some advantages over the traditional approach based on Horn-scales (cf. Gazdar 1979). In an exercise in his logic book McCawley (1993: 324) points out that the derivation of the exclusive interpretation by means of Horn-scales breaks down as soon as we consider disjunctions having more than two arguments. Consider the connectives AND and OR where both are construed as n-place operators, AND yielding truth when all n arguments are true and OR yielding truth when at least one argument is
the proposition K_S φ is inconsistent (with regard to his epistemic logic system). Using this notion, we can summarize our argumentation as follows: there can be no pragmatically licensed update for a in case the proposition sem(a) is epistemically indefensible. Utterances with epistemically indefensible sem(a) come out as pragmatically anomalous utterances according to the definition given above. It is a simple exercise to show that expressions of the form φ ∧ ¬K_S φ are epistemically indefensible (i.e. K_S(φ ∧ ¬K_S φ) is inconsistent). Consequently, the absurdity of (16b) comes out as a case of pragmatic anomaly. Other cases of epistemically indefensible statements are discussed by Karttunen (1972) and Gazdar (1979) and demonstrate the importance of this subcase of pragmatic anomalies.
Next, consider a simple example showing the generation of scalar and clausal implicatures. We consider the expression a = S1 or S2 and the competing expression a' = S1 and S2, and we assume that both expressions are of the same linguistic complexity: compl(a) = compl(a'). The derivation of the clausal and scalar implicatures of a is schematized in (17).
true. Clearly, as in the binary case we get for any number of arguments (AND, OR) as a Horn-scale, which predicts that (18a) implies (18b).

(18) (a) OR(S1, S2, ..., Sn)
(b) NOT AND(S1, S2, ..., Sn)
(19) With the salmon you can have fries, rice or a baked potato.

It is easy to check that the current account yields the right result. As an example consider the case of three disjuncts a = OR(S1, S2, S3). The derivation of the exclusive interpretation runs as above, but now based on the following alternatives: a'1 = AND(S1, S2, S3), a'2 = AND(S1, S2), a'3 = AND(S1, S3), a'4 = AND(S2, S3). Again the central point is that the stronger expressions realize the relevant state descriptions with higher probability than the weaker expressions, thereby blocking them for the interpretation of OR(S1, S2, S3). It should be noted that we did not include the single disjuncts among the alternatives. This is motivated by the independent requirement (which any theory of Q-based implicatures has to make, but which is notoriously difficult to formalize) that the alternatives must contrast in view of an element which is qualitatively similar in a relevant sense. This is a general phenomenon. In spite of the entailment relation licensed by existential generalization, a proper name like 'John' does not form a contrast class with a quantifier like 'some'. 'All', being a quantifier itself, does.13
The next class of examples deals with the phenomenon of (partial) lexical blocking. Aronoff (1976) has shown that the existence of a simple lexical item can block the formation of an otherwise expected affixally derived form synonymous with it. In particular, the existence of a simple abstract nominal underlying a given -ous adjective blocks its nominalization with -ity:

(20) (a) curious-curiosity, tenacious-tenacity
(b) furious-*furiosity-fury, fallacious-*fallacity-fallacy

While Aronoff's formulation of blocking has been limited to derivational processes, Kiparsky (1982) notes that blocking may also extend to inflectional
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Unfortunately this prediction is too weak. The conjunction of (18a) and (18b) yields a formula which is true if any number of disjuncts smaller than n is true. This is correct for n = 2, but wrong for more arguments, since a general account of the exclusive interpretation would have to predict the interpretation according to which it is true in case one (and only one) disjunct is. The utterance of (19) certainly does not invite you to take either one or two of the items mentioned.
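The point can be verified by brute force (a quick sanity check, not part of the original text): for n = 3 the conjunction of (18a) and (18b) admits six truth assignments, while the exclusive reading admits only three.

```python
from itertools import product

n = 3

# Assignments where "OR and not AND" holds (the Horn-scale prediction):
# at least one disjunct true, but not all of them.
predicted = [v for v in product([False, True], repeat=n)
             if any(v) and not all(v)]

# Assignments the exclusive reading should license: exactly one true.
exclusive = [v for v in product([False, True], repeat=n) if sum(v) == 1]

print(len(predicted))  # 6: one or two disjuncts true
print(len(exclusive))  # 3: exactly one disjunct true
```

For n = 2 the two sets coincide, which is why the weakness of the prediction only shows up with three or more disjuncts.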
136 Lexical Pragmatics
(21) (a) Black Bart killed the sheriff
     (b) Black Bart caused the sheriff to die
Let me now demonstrate how the theory developed so far accounts for total blocking. Consider two expressions α and α′ that are semantically equivalent, i.e. C(α) = C(α′). In case the expression α is less complex linguistically than the expression α′, i.e. compl(α) < compl(α′), it results (from (13)) that c(α, m) < c(α′, m) for each m ∈ C(α). Consequently, we obtain pQ(α′) = ∅. The latter implicates p(α′) = ∅, and we can conclude that there is no pragmatically licensed update for α′. In other words, the existence of a linguistically simpler (less marked) expression α equivalent to α′ has totally blocked the more complex one. When the expressions α and α′ are of comparable linguistic complexity, i.e. compl(α) = compl(α′), the result is c(α, m) = c(α′, m) for each m ∈ C(α). From this we get pQ(α) = pQ(α′) = C(α), and the expressions α and α′ may coexist, selecting the same state descriptions. The latter result predicts that synonymous expressions are possible when their linguistic complexities are the same. This prediction contradicts Kiparsky's (1982) principle Avoid Synonymy. However, as mentioned by Horn (1984), Kiparsky's principle seems to be too strong: pairs of
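The blocking computation can be sketched in a few lines of Python. The forms, meanings, and cost figures below are invented for illustration; only the comparison structure follows the text: a form survives for a state description just in case no semantically equivalent form is strictly cheaper there.

```python
# Illustrative sketch of total blocking (names and numbers invented).
# Two semantically equivalent forms cover the same state descriptions,
# and cost is assumed to grow with linguistic complexity.
C = {"kill": {"m1", "m2"}, "cause_to_die": {"m1", "m2"}}  # C(a) = C(a')
compl = {"kill": 1, "cause_to_die": 3}

def cost(form, m):
    return compl[form]  # stand-in for c(a, m), monotone in complexity

def p_Q(form):
    """State descriptions m for which no equivalent form is strictly cheaper."""
    rivals = [f for f in C if C[f] == C[form]]
    return {m for m in C[form]
            if not any(cost(f, m) < cost(form, m) for f in rivals)}

print(sorted(p_Q("kill")))     # ['m1', 'm2'] -> survives
print(p_Q("cause_to_die"))     # set() -> totally blocked
```

With equal complexities both calls would return the full set, mirroring the prediction that equally complex synonyms may coexist.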
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
processes, and he suggests a reformulation of Aronoff's blocking as a subcase of the Elsewhere Condition (special rules block general rules in their shared domain). However, Kiparsky cites examples of partial blocking in order to show that this formulation is too strong. According to Kiparsky, partial blocking corresponds to the phenomenon that the special (less productive) affix occurs in some restricted meaning and the general (more productive) affix picks up the remaining meaning (consider examples like refrigerant–refrigerator, informant–informer, contestant–contester). To handle these and other cases Kiparsky (1982) formulates his general condition Avoid Synonymy cited above. Working independently of the Aronoff–Kiparsky line, McCawley (1978) collects a number of further examples demonstrating the phenomenon of partial blocking outside the domain of derivational and inflectional processes. For example, he observes that the distribution of productive causatives (in English, Japanese, German, and other languages) is restricted by the existence of a corresponding lexical causative. Whereas lexical causatives (e.g. (21a)) tend to be restricted in their distribution to the stereotypic causative situation (direct, unmediated causation through physical action), productive (periphrastic) causatives tend to pick up more marked situations of mediated, indirect causation. For example, (21b) could have been used appropriately when Black Bart caused the sheriff's gun to backfire by stuffing it with cotton.
Reinhard Blutner 137
(22) (a) [α, m] satisfies the Q-principle iff there is no [α′, m] ∈ C satisfying the I-principle such that c(α′, m) < c(α, m).
     (b) [α, m] satisfies the I-principle iff there is no [α, m′] ∈ C satisfying the Q-principle such that c(α, m′) < c(α, m).

Let me now demonstrate how this explication of the Q- and I-principle explains Horn's 'division of pragmatic labor'. Let me keep the previous notations but give them a slightly changed content by referring to the principles in (22) instead of those in (14). Consider again two expressions α
expressions like icebox–refrigerator, synonymy–synonymity, persuade not to–dissuade from can coexist within a single idiolect despite their referential equivalence.14 Furthermore, Horn argues that in these cases the inherent complexities (as demonstrated by psycholinguistic evidence) are approximately of the same size. Though it is not completely clear which factors influence the inherent complexity of a linguistic expression, Horn's counterexamples suggest that our prediction is of the right kind. The theory developed so far has been shown to predict that the more complex one of two semantically equivalent expressions must be blocked in all its interpretations. However, this prediction seems too strong and would conflict with examples like (21). There are several possibilities of avoiding this conclusion. First, we could stipulate that α and α′ are not semantically equivalent in such cases. Essentially we would have to stipulate that the less complex expression applies semantically to the stereotypic causative situation only, whereas the more complex expression is not restricted semantically in a related way but gets its restriction (to the indirect causative situation) by way of the Q-principle. Another way out of the dilemma of total blocking would accept the semantic equivalence of expressions like (21a, b) but would claim that the principles (14a, b) have default character only, with a preference for the Q-principle in the case of conflicts. I think that both 'solutions' are unsatisfactory for conceptual reasons and would not really explain the general tendency that 'unmarked forms tend to be used for unmarked situations and marked forms for marked situations' (Horn 1984: 26), a tendency that Horn (1984: 22) calls 'the division of pragmatic labor'. I think a better solution to this problem and a real explanation of 'the division of pragmatic labor' has to start with a reformulation of the I- and Q-principle.
The informal formulation of these principles as documented in (10) involves a kind of partial circularity: in expressing the Q-principle reference to the I-principle has been made, and vice versa. I think we have to live with this kind of partial circularity, but at the same time we must give a precise formulation of it in order to see its consequences. The following is an attempt in this direction:
(23)
'Q-principle' wins:
I slept on a boat yesterday → The boat was not mine
I slept in a car yesterday → The car was not mine

'I-principle' wins:
I lost a book yesterday → The book was mine
I broke a finger yesterday → The finger was mine
In the examples on the left-hand side the Q-principle (earlier formulation) seems to win, but in the examples on the right-hand side the facts suggest the I-principle (earlier formulation) as the winner. But there is a crucial difference between the two kinds of examples that may resolve the matter. While it is plausible to assume that I have only one boat or one car, it is
and α′ that are semantically equivalent, i.e. C(α) = C(α′), and let us assume furthermore that C(α) and C(α′), respectively, contain exactly two elements m_dir and m_indir of different complexity, say c(sem(α), m_dir) = c(sem(α′), m_dir) < c(sem(α), m_indir) = c(sem(α′), m_indir). If the expression α is less complex linguistically than the expression α′, i.e. compl(α) < compl(α′), we can calculate the set pQ(α) if it is assumed that there is no expression α″ that expresses the content of α less costly than α itself. The application of (22a) simply yields pQ(α) = {m_dir, m_indir}. With this result at hand we can apply (22b) and get pI(α) = {m_dir} (since c(α, m_dir) < c(α, m_indir)). Consequently, we obtain p(α) = {m_dir}; i.e. the unmarked form selects the unmarked situation. Now consider the marked expression α′. In this case the application of (22a) yields pQ(α′) = {m_indir}. This result contrasts with the outcome of total blocking under the earlier formulation of the Q-principle (14a). The difference, of course, is due to the fact that a pair [α′, m_i] can be blocked only by a less complex pair [α, m_i] if the latter satisfies the I-principle; thus, only [α′, m_dir] is blocked but not [α′, m_indir]. Furthermore, it is a simple exercise to show that pI(α′) = {m_dir, m_indir}. Consequently, we obtain p(α′) = {m_indir}; i.e. the marked form selects the marked situation. It is important to see that this explanation of the 'division of pragmatic labor' doesn't rest on specific lexical stipulations or stipulations with regard to the costs, but is a general consequence of our formulation of the Q- and I-principle as presented in (22). According to an earlier formulation (e.g. Atlas & Levinson 1981; Horn 1984), the Q- and I- (R-)based principles often directly collide, and a general preference for the Q-principle has been stipulated.
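One way to operationalize the mutually recursive definitions in (22) is to iterate them to a fixpoint. The following Python sketch does this for a two-form, two-meaning scenario; the concrete forms, meanings, and additive cost function are invented stand-ins, not part of the paper's formal apparatus.

```python
# Sketch of the bidirectional Q/I evaluation in (22), with invented costs.
# Forms: 'kill' (unmarked) vs 'cause to die' (marked); meanings: direct
# vs indirect causation. c(a, m) adds linguistic and conceptual markedness.
forms = ["kill", "cause to die"]
meanings = ["m_dir", "m_indir"]
compl = {"kill": 1, "cause to die": 2}
concept = {"m_dir": 1, "m_indir": 2}

def c(a, m):
    return compl[a] + concept[m]

pairs = [(a, m) for a in forms for m in meanings]

def satisfies_Q(a, m, I_pairs):
    # (22a): no cheaper form for the same meaning that satisfies the I-principle
    return not any(c(a2, m) < c(a, m) for (a2, m2) in I_pairs if m2 == m)

def satisfies_I(a, m, Q_pairs):
    # (22b): no cheaper meaning for the same form that satisfies the Q-principle
    return not any(c(a, m2) < c(a, m) for (a2, m2) in Q_pairs if a2 == a)

# Iterate the mutual reference to a fixpoint (reached after two rounds here).
Q = I = pairs
for _ in range(5):
    Q = [(a, m) for (a, m) in pairs if satisfies_Q(a, m, I)]
    I = [(a, m) for (a, m) in pairs if satisfies_I(a, m, Q)]

optimal = [p for p in Q if p in I]
print(optimal)  # [('kill', 'm_dir'), ('cause to die', 'm_indir')]
```

The surviving pairs link the unmarked form to the unmarked meaning and the marked form to the marked meaning, which is exactly the division of pragmatic labor derived in the text.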
The present reformulation of the Q- and I-principle avoids this stipulation and predicts that in the 'conflicting cases' the Q-principle yields a more restricting output than the I-principle. In the literature, cases of Q- versus I-clashes have been discussed that seem to contradict this general pattern. Consider the material presented in (23).
3.2 Four theses of Lexical Pragmatics
Lexical Pragmatics is a research field that tries to give a systematic and explanatory account of pragmatic phenomena that are intimately connected with the semantic underspecification of lexical items. The approach combines a compositional semantics with a general mechanism of conversational implicature. Starting off from an underspecified semantic representation, a mechanism of information enrichment (abduction) is invoked to yield the appropriate specification with regard to the common ground. In section 2 the range of Lexical Pragmatics has been characterized by several examples, and some general, typical problems have been discussed. The present subsection tries to sharpen the rather impressionistic picture that has evolved and attempts to illuminate it from a methodological perspective.

Thesis 1: Lexical Pragmatics is non-compositional
In section 2 I argued against the principle of pragmatic compositionality. This principle says that it is possible to decompose the lexical items of a compound expression into conceptual components which, combined together, determine the conceptual interpretation of the whole expression.
implausible to assume that I have only one book or one finger. Only in the former case can the more precise genitive form (I slept on my boat; I slept in my car) block the corresponding interpretation. In treating the examples on the right-hand side, we have to take into account also those interpretations where more than one book or one finger is involved; in this case there is no alternative expression that may block the ego-centred interpretations, and they may be selected by means of the I-principle. Summarizing, the present account of conversational implicature tries to give a real unification of the two competing 'forces' expressed by the Q-principle and the I-principle, respectively. This approach contrasts with most recent theoretical accounts (e.g. Hirschberg 1991; Matsumoto 1995) that have focused on single classes of conversational implicatures only. In contrast, the present account tries to address the integration of different kinds of conversational implicatures. The main problems addressed in this paper are problems of Lexical Pragmatics. Before I come to a detailed treatment of some typical examples, I want to discuss four theses that are designed to characterize Lexical Pragmatics from a methodological point of view.
Thesis 2: Lexical Pragmatics crucially involves non-representational means
Besides the question 'Is Lexical Pragmatics compositional?' we have the related question 'Is Lexical Pragmatics combinatorial?' The basic intuition underlying the combinatorial approach is that a cognitive activity is a process of manipulating representations, typically a sequential procedure consisting of discrete steps in accordance with definite criteria. The combinatorial approach contrasts with the connectionist approach (e.g. Rumelhart, McClelland, & the PDP Research Group 1986), which views the cognitive system as a network of units connected to each other through links of various strengths. The cognitive activity in these systems consists of a parallel spread of activation instead of a combinatorial sequential procedure. Perhaps there are arguments against the view that Lexical Pragmatics is combinatorial (cf. the discussion in Lahav 1993 with regard to the pragmatics of adjectives). However, I want to be cautious with regard to this issue, which is difficult to decide, and I will ask a different but related question instead, namely the question of whether Lexical Pragmatics involves non-representational means of manipulating representations. The
I have tried to make it clear that neither the influence of salience nor the phenomenon of lexical blocking can be approached in this way, and that the relevant data should not be described only by enumeration but in a more systematic way. The problem behind this has to do with the relational nature of salience; the existence of blocking effects, i.e. the influence of items that don't occur in the expression under discussion but nevertheless are important in determining its interpretation; and the whole idea of inferential reasoning. The main doctrine of Lexical Pragmatics aims at the combination of a compositional semantics with a general mechanism of conversational implicature. It is the second part of this doctrine that accounts for the non-compositional character of Lexical Pragmatics. Almost everything in the formulation of conversational implicature has this non-compositional character: both the formulation of the Q-principle and the I-principle are 'holistic' in addressing a whole range of alternative expressions; the conceptions of informativeness, surprise (measured in terms of conditional probability), and linguistic complexity are non-combinatorial and cannot be reduced to the corresponding properties of the parts of an expression; the mechanism of information enrichment C as based on abductive inference (see section 4) is non-compositional.
answer to this question, I claim, is clearly affirmative. Notions such as salience, cue validity, diagnostic value, informativeness, surprise, relevance, frequency of use, and so on are candidates for such non-representational means. It would be a fallacy to assume, on a superficial reflection of the concept of mental representation, that each quantity that is involved in determining our mental behaviour must be mentally represented in order to become effective. Parameters like salience and cue validity need not be represented mentally in order to exist and to determine our cognitive activity. Instead, such parameters involve the non-representational dimension of our computational system. With regard to the theoretical framework of Lexical Pragmatics, the whole conception of the cost function is interspersed with non-representational means. This poses a problem for ordinary symbolism: we need a way of manipulating non-representational elements.15
Thesis 3: Lexical Pragmatics crucially involves Economy principles
Economy principles are crucially involved in determining how non-representational parameters control the selection and suppression of representations. With Zipf (1949) as a forerunner, we have to acknowledge two basic and competing forces, one force of unification, or Speaker's economy (I-principle), and the antithetical force of diversification, or Auditor's economy (Q-principle). The two opposing economies are in extreme conflict, and we have reformulated this conflict in a way that makes it possible to obtain definite outcomes with regard to the selection of interpretative hypotheses. However, I feel that there must be an independent justification of this kind of Economy principle, perhaps one derived from the general economics involved in defining connectionist network behaviour (e.g. Rumelhart, McClelland, & the PDP Research Group 1986).

Thesis 4: Lexical Pragmatics has to explain when conversational implicatures are cancellable and when not
Does cancellability constitute a necessary condition for conversational implicatures? Grice himself notes that cancellability doesn't hold for all kinds of conversational implicatures and mentions implicatures based on the Quality maxim as an exceptional case. Our discussion of Moore's paradox has demonstrated this case in the context of the present theory. Another type of conversational implicature constituting a counterexample
against the claim that cancellability is necessary for conversational implicatures has been pointed out by Sadock (1978) in his seminal paper attacking the usefulness of the cancellation test for conversational implicature:

Grice states explicitly that generalized conversational implicatures, those that have little to do with context, are cancelable. But is it not possible that some conversational implicatures are so little dependent on context that cancellation of them will result in something approaching invariable infelicity? In a paper in preparation, I argue that sentences of the form almost P only conversationally entail not P, contrary to the claim made by Karttunen and Peters (1979). The implicature is straightforwardly calculable and highly nondetachable but, unfortunately for my thesis, just about uncancelable. The sentence Gertrude not only almost swam the English Channel, in fact she swam it is, I admit, pretty strange. (Sadock 1978: 293)
Langendoen's (1978) analysis of reciprocals gives rise to another kind of example, suggesting that cancellability is not necessary for conversational implicatures. Langendoen assumes that the reciprocal makes a uniform semantic contribution on every occasion of its use. This semantic contribution, he assumes, must be reflected by truth conditions included in every instance of use of the reciprocals. His analysis, then, rules out all but the weakest meaning he discusses, Weak Reciprocity, as the correct meaning of the reciprocal. However, as discussed in Dalrymple et al. (1994), Langendoen wouldn't deny that expressions like (24) appear to express Strong Reciprocity.

(24) Willow School's fifth-graders know each other

An advocate of this position needs to explain why such examples appear to mean something stronger than Weak Reciprocity. Doing so will presumably involve appeal to pragmatic strengthening of the proposition that according to them is the sentence's literal meaning. One way such strengthening might occur is through conversational implicature (Dalrymple et al. 1994: 76).
As observed by Dalrymple et al., the added strength of (24) over Weak Reciprocity does not seem to be cancellable, as evidenced by the infelicity of the following cancelling expression:
(25) *Willow School's fifth-graders know each other, but the oldest one doesn't know the youngest.

Hirschberg (1991), Dalrymple et al. (1994), and others insist on taking cancellability as a hallmark of conversational implicature. They would reject an analysis of these examples in terms of conversational implicature. It seems, however, that this claim is not so much based on their treatment of conversational implicature but is rather a consequence of the old dictum semantics is strong and pragmatics is weak. The present account of
conversational implicature suggests that the borderline between semantics and pragmatics (conversational implicature) cannot be drawn by the condition of cancellability. The discussion of blocking has shown that pragmatic anomaly isn't necessarily connected with inconsistency. In a similar vein, non-cancellable implicatures can arise in spite of the consistency of the corresponding cancelling expressions. To take a typical example from the pragmatics of adjectives, (26a) would suggest (26b) as a conversational implicature.
The corresponding cancelling expression (26c), however, can be shown to be pragmatically anomalous for very general classes of common grounds (in spite of its semantic consistency). The theoretic treatment of such examples (see section 4) strongly suggests that a proper treatment of conversational implicatures may explain when an implicature is cancellable and when not. In this sense, the dictum that semantics is strong (leading to non-cancellable inferences in each case) and pragmatics is weak (justifying cancellable inferences only) must be abandoned. Instead, it is an important task for Lexical Pragmatics to explain when conversational implicatures are cancellable and when not.16
4 UNDERSPECIFICATION AND ABDUCTION
In the previous section we introduced the general constraint C defining the range of possible refinements of an underspecified semantic representation. However, we have only considered a rather provisional explication of this constraint. In this section we consider a realization of C that seems refined enough to analyse some specific phenomena of Lexical Pragmatics. The main idea is to consider C(α) as the set of abductive variants that can be generated from sem(α) by means of a specific common ground that includes crucial aspects of world and discourse knowledge. It is shown how the incorporation of the abductive component in the general pragmatic framework solves some of the problems in connection with the pragmatics of adjectives and the phenomenon of systematic polysemy.
(26) (a) This apple is sweet
     (b) Its pulp is sweet
     (c) ?This apple is sweet, but its pulp is not (perhaps, its peel is)
4.1 Cost-based abduction: an extension of the basic mechanism
For the sake of explicitness, let us consider sem(α) as a conjunction of positive literals, and let us propose weighted abduction (Stickel 1989; Hobbs et al. 1993) as a general method to specify sem(α) by exploiting Horn clause knowledge bases. The use of weighted abduction allows us to pair the abduced variants m_i with their proof costs. The earlier measure of the global costs c(α, m_i) (formula (13)) should then be replaced by an explicit account of those proof costs. For the present purpose, we adopt Stickel's (1989) PROLOG-like inference system for generating abductive specifications and his mechanism for computing proof costs in a slightly simplified way. It is taken for granted that every literal in the initial formula is annotated with (non-negative) assumption costs c_i: q1^c1, ..., qn^cn. The knowledge base is assumed to provide formulas of the form p1^w1 ∧ ... ∧ pn^wn → q, where the literals p_j in the antecedent are annotated with weights w_j. There are four inference rules that constitute abductive proofs and determine the assignment of concrete proof costs (for details, see Stickel 1989):

Resolution with a fact: If a current goal clause contains a literal that is unifiable with a fact in the knowledge base, then this literal is marked as proved. (The retention of a proved literal allows its use in future factoring.)

Resolution with a rule: Let the current goal clause be ... q^c ... and let there be an axiom p1^w1 ∧ ... ∧ pn^wn → q′ in the knowledge base. If q and q′ are unifiable with most general unifier σ, then the goal clause ... p1^(w1·c)σ, ..., pn^(wn·c)σ, q′σ ... can be derived (where q′σ is marked as proved). Obviously, the new assumption costs are calculated by multiplying the corresponding weight factors with the assumption cost c of the literal q in the old goal clause.

Making an assumption: Any unproved literal in a goal clause can be marked as assumed.
Factoring: If a literal q occurs repeatedly in a proof, each time with different costs, the occurrences of q are unified and the lowest cost is taken.

An abductive proof is complete when all literals are either proved or assumed. The cost measure for an abductive proof is the sum of all costs of the axioms involved in the proof plus the costs for the assumption of literals that are not proved. For the following, we will assume that all axiom costs are zero. Furthermore, we aim to bring our system as close as possible to a Bayesian network. As Charniak & Shimony (1990) have shown, this can be achieved when costs are interpreted as negative logarithms of certain conditional probabilities and when, besides other simplifying assumptions, no factoring occurs in the abductive proof. In the following, we will adopt
(27) cg: C^2 → S
         D^1 ∧ A^0.5 → C
         D^1 ∧ B^0.5 → A
         → D

The diagram (28) shows the abductive inference graph when (27) is taken as the common ground and sem(α) = S is taken as the starting clause.

(28) [abductive inference graph; the annotated assumption costs are:
     Assuming C: 2 · $10 = $20
     Assuming A: 0.5 · $20 = $10
     Assuming B: 0.5 · $10 = $5]
The resulting set of abductive variants is presented in (29a) and the costs associated with these variants are given in (29b).

(29) (a) C(S) = {A, B, C}
     (b) c(S, A) = $10, c(S, B) = $5, c(S, C) = $20
     (c) p(S) = {B}
     (d) cg[S] = cg ∪ {B}
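A minimal Python sketch of the cost propagation reproduces the figures in (28) and (29), assuming the reconstructed knowledge base: weight 2 on C in the rule for S, weight 0.5 on A and B in the chained rules, D a fact, and a $10 assumption cost for the starting clause S.

```python
# Minimal sketch of weighted abduction (after Stickel 1989) for the
# chain-shaped knowledge base reconstructed from (27). All axiom costs
# are zero; the dollar figures follow the annotations in (28).
facts = {"D"}
# rules: head -> list of (body_literal, weight)
rules = {
    "S": [("C", 2.0)],
    "C": [("D", 1.0), ("A", 0.5)],
    "A": [("D", 1.0), ("B", 0.5)],
}

def variants(goal, cost):
    """Enumerate (assumed_literal, proof_cost) abductive variants of goal."""
    results = [(goal, cost)]          # option 1: assume the goal itself
    if goal in rules:
        # option 2: resolve with a rule; body literals that are facts are
        # proved for free, the unproved literal is pursued with its
        # weight-scaled assumption cost (resolution with a rule)
        for lit, w in rules[goal]:
            if lit in facts:
                continue              # resolution with a fact: proved
            results += variants(lit, w * cost)
    return results

vs = dict(variants("S", 10.0))
del vs["S"]                # keep proper specifications only
print(vs)                  # {'C': 20.0, 'A': 10.0, 'B': 5.0}
best = min(vs, key=vs.get)
print(best)                # 'B' -- the cost-minimal abductive variant
```

To model the anomaly with cg′ from (30), one would additionally check the cost-minimal variant against the common ground: since B is minimal, adding ¬B leaves no pragmatically licensed update, whereas a Hobbs-Stickel style system would fall back to the consistent variant A.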
the probabilistic interpretation of costs, but we will not refrain from using factoring. Factoring some literals obtained by backward chaining can be proved to be a very useful operation in natural language interpretation (cf. Stickel 1989). It is now possible to incorporate the abductive component in the general pragmatic framework, viewing natural language interpretation as inference to pragmatically licensed updates. For simplicity's sake let me illustrate the incorporation of abduction by way of an elementary example. This gives me the opportunity to discuss some crucial differences between the present approach and the Hobbs-Stickel account, where natural language interpretation is viewed as abductive inference to the best explanation. In order to simplify matters, I will exclude effects of blocking via the Q-principle. That means I will assume that there are no expression alternatives α′ that may block any interpretation of α. Let us assume a knowledge base as presented in (27) and let us accept that all axiom costs are zero.
(30) cg′: C^2 → S
          D^1 ∧ A^0.5 → C
          D^1 ∧ B^0.5 → A
          → D
          ¬B
In this case we have the same abductive inference graph as shown before in (28), and we get the same abductive variants and the same costs associated with them. But now the cost-minimal variant B is inconsistent with cg′. From this fact it follows that there is no pragmatically licensed update for S with regard to cg′. In other words, S becomes pragmatically anomalous with regard to cg′.

(31) [abductive inference graph for the Hobbs-Stickel account]
Now look at the Hobbs-Stickel account. It gives A as the minimal explanation (cf. the diagram (31)). This leads to the postulation of cg′ ∪ {A} as update. Consequently, there is an important difference between the Hobbs-Stickel account and the present one. On the Hobbs-Stickel view
Since we have assumed that there are no blocking alternatives, the condition (22a) becomes vacuous and the set p(S) is the set of cost-minimal variants, given in (29c). Since the proposition B is consistent with cg, a pragmatically licensed update exists (satisfying the Quality conditions (15a, b)). It is given in (29d). The Hobbs-Stickel account looks for minimal explanations; that means it selects the cost-minimal variants from the set of the consistent abductive variants. This contrasts with the former view, which first selects the cost-minimal variants from the set of all abductive variants and then checks them with regard to consistency. However, in the present case this makes no difference, since the minimal variant B is consistent with cg, and consequently it is at the same time the minimal explanation of S. The updating of cg by the minimal explanation gives the same result as already presented in (29d). Now consider the common ground cg′ given in (30), which is cg augmented by the clause ¬B.
there is an update in each case when the starting clause sem(α) is consistent with cg. The present account, on the other hand, yields a much more restricted notion of update. There is a pragmatically licensed update only when one of the cost-minimal abduced variants is consistent with cg. If all cost-minimal variants are inconsistent with cg, they can be seen as 'blocking' any interpretation of the starting clause. As shown in the next section, this device is appropriate to capture cases of pragmatic anomaly in natural language interpretation. From a computational point of view, the present approach looks good if it is assumed that the abductive machine generates the abductive variants in the order of their (estimated) costs. In this case, we have to assume simply that the abductive system stops when it has completed its first abductive proof. The result is then given to the consistency checker. If the result is consistent, the system has found an interpretation. If not, the system may indicate that it doesn't understand: the only interpretation it can find is a faulty one. Perhaps there is a mechanism of accommodating the knowledge base that restores interpretability after all, but even then there is no possibility to access variants other than the cost-minimal ones. The overall architecture of the Hobbs-Stickel account sets out to access non-minimal variants when the minimal ones do not provide explanations. This feature makes processing less efficient, and it makes it difficult to discriminate between 'good' and 'bad' interpretations. In contrast, the present view of interpretation connects an efficient processing architecture with the possibility of providing an explanation of pragmatic anomalies. Straightforwardly, this way of realizing efficient processing conforms with the realization of the monotonicity property of language processing (e.g. Alshawi & Crouch 1992; cf. section 2.2), and it is this idea that helps us to explain pragmatic anomaly.
There is yet another important feature distinguishing the present account from the Hobbs-Stickel approach: the possibility of having non-cancellable implicatures. Let us call a conversational implicature φ of an utterance α in cg contextually cancellable iff there is a strengthening cg′ of cg such that α is interpretable in cg′ but
4.2 Abduction and the pragmatics of adjectives
One part of speech is especially suited to demonstrating the phenomenon of semantic underspecification: the adjective. In section 2.1 we considered the free variable view as an especially promising account of treating the meaning of adjectives. It had been stressed that the specification of free variables is necessary for a full interpretation of an utterance. I will now demonstrate how the current theory yields an appropriate mechanism of (contextual) specification by applying it to the kind of examples discussed by Quine (1960) and Lahav (1993) (cf. section 2.1).

(32) (a) The apple is red
     (b) Its peel is red
     (c) Its pulp is red
     (d) APPLE(d) ∧ PART(d, x) ∧ COLOUR(x, u) ∧ u = red
     (e) APPLE(d) ∧ PART(d, x) ∧ PEEL(x) ∧ COLOUR(x, u) ∧ u = red
My claim is that (32b) but not (32c) can be construed as a conversational implicature of (32a). Input to the analysis is the underspecified semantic representation given in (32d). One of the abductive specifications of this semantic input specifies x as the peel part of the apple (see (32e)). For the calculation of the corresponding costs we start with assumption costs as given in the first line of (33). Note that we take the assumption costs for the 'slots' PART(d, x) and COLOUR(x, u) as negligible with regard to the costs of the more 'specific' elements of the representation.17 This contrasts with corresponding stipulations by Hobbs et al. (1993), but it agrees with the general picture that specificity is the primary determinant of the assumption costs. Furthermore, we refer to axioms of the form p1^w1 ∧ p2^w2 → q, where the weights w1, w2 are monotonic functions of
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
proposition B. For example, if we strengthen cg by adding •B (as in (3o), S will be pragmatically anomalous in the new context. This shows that on the current account cancellability is not a necessary feature of conversational implicature;· some implicatures may be non-cancellable. The Hobbs-Stickel approach, on the other hand, is in agreement with the standard view (resting on the highly defeasible notion of minimal explanation). The usefulness of cancellation as a test for conversational implicature has been challenged in connection with the thesis 4 of section 3 .2. The present account of conversational implicature suggests that a proper treatment of conversational implicature may explain when an implicature is cancellable and when not. The next subsection provides an analysis of the pragmatics of adjectives that gives further evidence for this view.
Reinhard Blutner 149
certain conditional probabilities: W1 ex prob ( qlpi) (c£ Hobbs et al. 1993). If the Pi are necessary conditions for q, then we have W1 + W2 I, and the weights wi can be interpreted to estimate the saliences of the feature complexes Pi with regard to p. t\ COLOUR{x, u)so t\ u = reds1 (3 3) APPLE{d)Sl t\ PART{d, x)$0 =
i I
APPLE{d)
+-
II II
PART{d, x)"1 t\ PEEL(x)"('-"Yl t\ etc '-o
r
PEEL{x)
+
COLOUR{x, u)/3 t\ etc' -/3
total costs: $ 2 - cry - a,B( I - 1)
::::::
$ 2 - a,B
The diagram (3 3) shows that part of the abductive inference graph that is relevant for abducing the red peel-interpretation {pe) starting with (pd). The axiom in the second line of (3 3) can be seen as decomposing the concept of an apple into a peel part (salience a) and a residue, where the peel part is taken as a kind of slot-ftller structure; a may be interpreted as the salience of the part-relation for apples (a « I ) . In a similar vein, ,B may be interpreted as the salience of the colour slot for the peels of apples. Given the assumption that the colour of the peel is more diagnostic for classifying apples than the colour of other apple parts, for example, the colour of the pulp, the red peel-specification is arguably the cost-minimal specification. To make this point explicit, let us consider the calculation performed in (3 3). It crucially rests on the factoring operation which unifies the part- and colour-slots of the predicate complex of the utterance with the corresponding slots that emerge while conceptually decomposing the subject term of the utterance. The red peel-specification comes out as the cost-minimal specification if its total costs are smaller than the costs of any other specification. This corresponds to the condition a,B > a',B', where a' and ,B' are the parameters for any other apple part (e.g. for the pulp). Suppose that, as is rather plausible, this condition is satisfied, then the !-principle selects the red peel-interpretation and blocks the red pulp interpretation. Consequently, we get (ph) as an conversational implicature, but not (pc). Note that the non-existence of the implicature (32c) doesn't forbid a discourse as (34) but rather licenses it. (34) This apple is red. But not only its peel is red. Its pulp also is red. In the case of(3sa) analogous considerations give (35b) but not (35c) as a conversational implicature.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Red Peel-Variant
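The cost comparison behind (33) is simple arithmetic, and it can be sketched in a few lines of code. The following Python fragment is only an illustration: the salience values are hypothetical placeholders (the text itself only requires αβ > α′β′), and the function and variable names are mine, not part of the formal apparatus.

```python
# Illustrative sketch of the weighted-abduction cost comparison in (33).
# All salience values are hypothetical; the account only requires that
# alpha*beta for the peel exceeds alpha'*beta' for any other apple part.

def total_cost(alpha, beta, gamma):
    """Total assumption cost $2 - alpha*gamma - alpha*beta*(1 - gamma)
    for specifying 'the apple is red' via a part with part-salience alpha
    and colour-slot salience beta (gamma = salience of the part slot)."""
    return 2 - alpha * gamma - alpha * beta * (1 - gamma)

GAMMA = 0.1                                # part-slot salience (gamma << 1)
candidates = {
    "peel": {"alpha": 0.8, "beta": 0.9},   # peel colour is highly diagnostic
    "pulp": {"alpha": 0.4, "beta": 0.3},   # pulp colour much less so
}

costs = {part: total_cost(v["alpha"], v["beta"], GAMMA)
         for part, v in candidates.items()}

# The I-principle selects the cost-minimal specification.
winner = min(costs, key=costs.get)
print(winner, {p: round(c, 3) for p, c in costs.items()})
```

With these assumed numbers the peel-specification costs $1.272 against $1.852 for the pulp, so the red peel-interpretation is selected, mirroring the selection of (32e).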
1 so Lexical Pragmatics
(35) (a) The apple is sweet
(b) Its pulp is sweet
(c) Its peel is sweet

It should be added that the present account evaluates utterances such as (36) as pragmatically anomalous (assuming the former axioms and weights):

(36) ?This apple is red, but its peel is not (perhaps, its pulp is)

This qualifies implicatures like (32b) and (35b) as non-cancellable (under normal circumstances, neglecting the possibility of genetic engineering). Finally, consider the contrast between (37a) and (37b):

(37) (a) ?The tractor is pumped up
(b) The tyres of the tractor are pumped up
(c) ?The coachwork of the tractor is pumped up
(d) TRACTOR(d) ∧ PART(d, x) ∧ PRESSURE(x, u) ∧ u = pumped up
(e) TRACTOR(d) ∧ PART(d, x) ∧ TYRES(x) ∧ P-STATE(x, u) ∧ u = pumped up
(f) TRACTOR(d) ∧ PART(d, x) ∧ MOTOR(x) ∧ P-STATE(x, u) ∧ u = pumped up

The present account predicts (37a) as pragmatically anomalous. This prediction results from the fact that those parts of tractors that may be pumped up (the tyres) are only marginally diagnostic for identifying tractors, and therefore the corresponding interpretation (37b) can be blocked by specifications that refer to more salient parts, for example as shown in (37c). However, the latter specifications suffer from sort conflicts and therefore violate the condition (15). To make the argument explicit, let us start with (37d) as the underspecified representation of (37a), and let us compare the costs for calculating the two enrichments (37e) and (37f) (related to (37b) and (37c)). The diagram (38) presents the corresponding abductive inference graph that is relevant for abducing the pumped up tyres-interpretation (37e).

(38) TRACTOR(d)^{$1} ∧ PART(d, x)^{$0} ∧ P-STATE(x, u)^{$0} ∧ u = up^{$1}
     TRACTOR(d) ← PART(d, x)^{αγ} ∧ TYRES(x)^{α(1−γ)} ∧ etc^{1−α}
     TYRES(x) ← P-STATE(x, u)^{β} ∧ etc^{1−β}
     Pumped up Tyres-Variant
     total costs: $2 − αγ − αβ(1−γ) ≈ $2 − αβ
Reinhard Blutner 151
Note that this graph has practically the same structure as that given in (33). In the present case the factoring operation unifies the part- and pressure-state slots arising from the predicate complex of the utterance (37a) with those that emerge while conceptually decomposing the subject term the tractor. Next let us consider an abductive inference that corresponds to an enrichment referring to more salient parts than the tyres, say the motor of the tractor, as given in (39).

(39) TRACTOR(d)^{$1} ∧ PART(d, x)^{$0} ∧ P-STATE(x, u)^{$0} ∧ u = up^{$1}
     TRACTOR(d) ← PART(d, x)^{α′γ} ∧ MOTOR(x)^{α′(1−γ)} ∧ etc^{1−α′}
     ?Pumped up Motor-Variant
     total costs: $2 − α′γ

In this case only the part slot of the initial representation (first line of (39)) can be unified with a corresponding slot arising from the conceptual decomposition of the subject term. The pressure-state slot, on the other hand, cannot be used in factoring because the decomposition of the concept of a tractor's motor doesn't involve a pressure-state slot in the intended sense. The cost calculations for the two enrichments (37e) and (37f) are as given in (38) and (39). It is obvious that the pumped up motor-variant wins over the pumped up tyres-variant when the condition α′γ > αβ holds, i.e. α′/α > β/γ. Here α may be interpreted as the salience of the tyre parts of the tractor, α′ as the salience of the motor part of the tractor, β as the salience of the p(ressure)-state slot for the tyres of the tractor, and γ as the salience of the part slot for tractors. Let us suppose that the condition α′/α > β/γ is satisfied, which is quite plausible if we assume that the saliences for the different slots are approximately the same. But the saliences of the various fillers of a slot may vary considerably; for example, it appears that the salience of the motor as part of the tractor is much higher than the salience of the tyres. Then the I-principle selects the pumped up motor-variant and blocks the pumped up tyres-interpretation. Intuitively, the winning variant (pumped up motor) suffers from sort conflicts. I refrain from expressing this formally by a corresponding axiom. The existence of this sort conflict leads to a violation of the Quality 1 condition. Consequently, under plausible context conditions, there is no pragmatically licensed update for an utterance of (37a), and it comes out as pragmatically anomalous. Let us consider an utterance like (40):

(40) The bicycle is pumped up
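In the same spirit, the competition between (38) and (39) can be sketched numerically. The numbers below are again hypothetical; what matters is that the configuration realizes α′/α > β/γ, together with a flag recording that the cost-minimal motor-variant suffers from a sort conflict and therefore licenses no update.

```python
# Hypothetical sketch of the tractor case (37)-(39): the motor-variant
# wins on cost (alpha'/alpha > beta/gamma), but it carries a sort
# conflict, so Quality 1 is violated and (37a) comes out as anomalous.

GAMMA = 0.1    # salience of the part slot for tractors (assumed)
BETA = 0.1     # salience of the pressure-state slot for tyres (assumed,
               # comparable to GAMMA, as the text suggests for slots)
ALPHA = 0.2    # salience of the tyres as a part of the tractor (assumed)
ALPHA_P = 0.8  # salience of the motor as a part (assumed much higher)

variants = {
    "tyres": {"cost": 2 - ALPHA * GAMMA - ALPHA * BETA * (1 - GAMMA),
              "sort_conflict": False},
    "motor": {"cost": 2 - ALPHA_P * GAMMA,
              "sort_conflict": True},  # a motor has no pressure-state slot
}

winner = min(variants, key=lambda v: variants[v]["cost"])
anomalous = variants[winner]["sort_conflict"]  # Quality 1 violated if True
print(winner, anomalous)
```

With these placeholder values the motor-variant costs $1.92 against $1.962 for the tyres-variant, so the inconsistent variant wins and the utterance is marked anomalous; lowering ALPHA_P below ALPHA would instead let the consistent tyres-reading through, as in the bicycle case (40).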
This utterance does not have the highly marked status of (37a). The present account explains this by making the plausible assumption that the tyres of bicycles are one of the most salient parts of bicycles. Consequently, in this case the pumped up tyres-interpretation comes out as a cost-minimal one, and it doesn't suffer from sort conflict. Needless to say, the present considerations regarding the amounts of the parameters have to be supported by careful empirical studies. However, as a first step considerations of this kind may be valuable. They may demonstrate at least which kinds of influence are conceivable, and this again may be tested empirically.
4.3 Abduction and systematic polysemy
In this section I will demonstrate how the ideas put forward in section 4.1 may provide a mechanism for generating the range of the conceptually salient senses of institute-type words, a mechanism that solves the restriction problem of polysemy. Adopting the radical underspecification view (section 2.2), I will show how the extended mechanism of conversational implicature is capable of giving a principled account. The general idea that leads us to underspecified representations in the case of institute-type words is as follows. Suppose there are certain entities which can be understood as conceptual frames or schemata and can be classified according to the variety of institute-types (government, school, parliament, etc.). Suppose further that these entities can be considered under different perspectives. These perspectives are assumed to provide more concrete realizations of the rather abstract concept of a certain institute-type e, perhaps realized as building, process, or institution proper. However, the particular perspective adopted and, consequently, the concrete realization of the intended institute-type remains semantically open. In a first approximation, the semantic representation of institute-type nominals may look like (41a, b).

(41) (a) λx∃e[SCHOOL(e) ∧ REALIZE(e, x)]
(b) λx∃e[GOVERNMENT(e) ∧ REALIZE(e, x)]
Note that the realization of x as building, process, or institution proper has not been specified in the lexicon. That means that the variety of different interpretations has not been treated by stipulating semantic ambiguities. Note furthermore that the different restrictions on interpretative variants, for example for school and government, are no longer treated semantically. As a consequence, the restriction problem of polysemy has to be analysed pragmatically.
In the previous subsection we used an axiom of the form q ← p₁^{w₁} ∧ p₂^{w₂} to abduce, for instance, the existence of peel parts of an assumed apple from the existence of the apple. In a similar vein, we now use axioms of this form in order to abduce, for instance, the existence of a building and/or an institution realization from the existence of an entity of type school or government. Weighted abduction rules that provide the corresponding decompositions are presented in (42a, b) for the case of school:

(42) (a) SCHOOL(e) ← REALIZE(e, x)^{αγ} ∧ BUILD(x)^{α(1−γ)} ∧ etc^{1−α}
(b) SCHOOL(e) ← REALIZE(e, x)^{α′γ} ∧ INSTIT(x)^{α′(1−γ)} ∧ etc^{1−α′}

Analogously to the case discussed before, the parameters may be interpreted as follows: α as the salience of the building realization of a school, α′ as the salience of the institution realization, and γ as the salience of the realization slot (γ < 1). It is important to note that the assumptions about the weights in (42) are correct only when the condition on the left side of ← is conceptually necessary for the conditions on the right side. In the case of (42a) that means that every institution of type school must be realized in a (single) building. This certainly is highly plausible in the case of school. In other cases, for instance for government, the corresponding supposition is plausible only with a certain (very small) probability. In these cases we have to introduce an extra factor δ (δ < 1) into the exponent of the corresponding rule (as we will do subsequently in a simple example).

Let me now demonstrate by an example (i) how the abductive machinery can be used to generate the possible interpretations as conversational implicatures, and (ii) how the mechanism excludes the impossible interpretations as cases of pragmatic anomaly. More concretely, I will illustrate how the content of (43b) may be construed as a conversational implicature of (43a).

(43) (a) The school has a flat roof
(b) The school building has a flat roof

Moreover, I will demonstrate why the utterance of (44a) appears as a pragmatic anomaly (under normal circumstances) and, consequently, why the interpretation of (44b) is suppressed as a conversational implicature of (44a).

(44) (a) ?The government has a flat roof
(b) The government building has a flat roof

To simplify, the underspecified semantic representations of the sentences (43a) and (44a) are as indicated in (45a, b).
(45) (a) ∃e[SCHOOL(e) ∧ REALIZE(e, x) ∧ BUILDING(x) ∧ HAS_A_FLAT_ROOF(x)]
(b) ∃e[GOV(e) ∧ REALIZE(e, x) ∧ BUILDING(x) ∧ HAS_A_FLAT_ROOF(x)]
In both cases, the first two conjuncts result from the lexical inputs of the institute-type nominals, and the remaining ones correspond to the predicate complex. The expression BUILDING(x) is due to the assumed sort restriction provided by the predicate complex, and it is singled out as an important representational element in the present analysis. The diagram (46) shows the part of the abductive inference graph that is relevant for abducing the building-interpretation (43b) starting with (43a) (in its pre-analysed form (45a)). Note that there is no real abduction in this very crude and simplifying analysis. The graph shows a 'conceptual decomposition' of the subject term and a factoring operation that unifies the occurrence of BUILD(x) resulting from this decomposition with its occurrence resulting from the predicate complex. This effects a saving in assumption costs (by an amount of α).

(46) SCHOOL(e)^{$1} ∧ REALIZE(e, x)^{$0} ∧ BUILD(x)^{$0} ∧ HAS_A_FLAT_ROOF(x)^{$1}
     SCHOOL(e) ← REALIZE(e, x)^{αγ} ∧ BUILD(x)^{α(1−γ)} ∧ etc^{1−α}
     (Consistent) Building-Variant
     total costs: $2 − α

By using the alternative rule (42b) for decomposing the subject term, the inference graph (47) results.

(47) SCHOOL(e)^{$1} ∧ REALIZE(e, x)^{$0} ∧ BUILD(x)^{$0} ∧ HAS_A_FLAT_ROOF(x)^{$1}
     SCHOOL(e) ← REALIZE(e, x)^{α′γ} ∧ INSTIT(x)^{α′(1−γ)} ∧ etc^{1−α′}
     (Inconsistent) Institution-Variant
     total costs: $2 − α′γ

In this case, an inconsistent institution reading is generated. Since factoring doesn't apply as before, we get only a reduced rate of saving (by α′γ). From the two considered variants, the I-principle selects the building-variant when the condition α/α′ > γ is satisfied, and the (inconsistent) institution-variant is suppressed in this case. It is plausible to assume that α and α′ are
of comparable amount, since the building and institution readings of school can be seen as realizing concepts of both the basic level of buildings and that of institutions. Consequently, the condition α/α′ > γ (with γ ≪ 1) may be assumed to hold, and the I-principle selects the building-variant. We can conclude that (43b) comes out as a conversational implicature of (43a). Now I want to argue that (44a) comes out as pragmatically anomalous. As before, we have to contrast two abductive inference graphs. They are shown in (48) and (49).

(48) GOV(e)^{$1} ∧ REALIZE(e, x)^{$0} ∧ BUILD(x)^{$0} ∧ HAS_A_FLAT_ROOF(x)^{$1}
     GOV(e) ← REALIZE(e, x)^{δαγ} ∧ BUILD(x)^{δα(1−γ)} ∧ etc^{1−δα}
     (Consistent) Building-Variant
     total costs: $2 − δα

(49) GOV(e)^{$1} ∧ REALIZE(e, x)^{$0} ∧ BUILD(x)^{$0} ∧ HAS_A_FLAT_ROOF(x)^{$1}
     GOV(e) ← REALIZE(e, x)^{α′γ} ∧ INSTIT(x)^{α′(1−γ)} ∧ etc^{1−α′}
     (Inconsistent) Institution-Variant
     total costs: $2 − α′γ

The parameter δ (δ ≪ 1) corresponds to the probability of assuming that a government is realized in a single building. Now, the condition δα/α′ > γ becomes relevant, and the I-principle selects the (consistent) building-variant when this condition is satisfied. In the other case, when the converse condition δα/α′ < γ is satisfied, the I-principle selects the (inconsistent) institution-variant. Of course, the latter possibility is actually realized. This follows from the assumption that it is very implausible that a government is realized in a single building (δ ≪ 1). Furthermore, government buildings are certainly not basic-level buildings. Consequently α ≪ α′. Both factors make it highly plausible to assume that δα/α′ < γ. Therefore, the inconsistent institution-variant wins over the consistent building-variant. The inconsistency of the selected variant leads to a violation of the Quality 1 condition. Consequently, under plausible context conditions, (44a) comes out as pragmatically anomalous.¹⁸

Again, it should be stressed that these considerations regarding the amounts of parameters are rather provisional and should be supported by careful empirical studies. Nevertheless, the present view sheds some light
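The role of the extra factor δ can likewise be sketched with assumed values. The sketch below encodes only the two cost formulas $2 − δα and $2 − α′γ; the parameter settings for school and government are hypothetical, chosen merely to realize the conditions argued for in the text.

```python
# Hypothetical sketch of (46)-(49): delta is the probability that the
# institute is realized in a single building; the building-variant costs
# $2 - delta*alpha, the institution-variant $2 - alpha'*gamma.

GAMMA = 0.1  # salience of the realization slot (assumed, gamma << 1)

def select(delta, alpha, alpha_p):
    """Return the reading the I-principle selects (cost-minimal one)."""
    building = 2 - delta * alpha       # consistent with BUILDING(x)
    institution = 2 - alpha_p * GAMMA  # inconsistent with BUILDING(x)
    return "building" if building < institution else "institution"

# school: building and institution realizations roughly equally salient,
# and every school is realized in a (single) building (delta = 1).
school = select(delta=1.0, alpha=0.5, alpha_p=0.5)

# government: rarely realized in a single building (delta << 1), and
# government buildings are not basic-level buildings (alpha << alpha').
government = select(delta=0.05, alpha=0.2, alpha_p=0.8)

# For school the consistent building-reading wins (the implicature (43b));
# for government the inconsistent institution-reading wins, Quality 1 is
# violated, and (44a) comes out as pragmatically anomalous.
print(school, government)
```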
on the way in which the restriction problem of polysemy may be solved by considering the probabilistic nature of conceptual knowledge.

5 CONCLUSION
One aim of this paper was to collect some general problems that have a prima facie claim on the attention of linguists interested in Lexical Semantics. These problems had to do with the utterance of words within concrete conceptual and contextual settings and went beyond the aspects of meaning typically investigated by a contrastive analysis of lexemes within the Katz-Fodor tradition. Three groups of problems were considered: (i) pragmatic compositionality, (ii) blocking, and (iii) pragmatic anomaly. The problems came to the fore in connection with the pragmatics of adjectives and the phenomenon of systematic polysemy. The same points can be made with regard to word formation in general (e.g. Aronoff 1976; Bauer 1983) and the interpretation of compounds in particular (e.g. Meyer 1993; Wu 1990). Moreover, the investigation of kinds of polysemy other than those found with institute-type words may be helpful in order to see the ubiquity of these problems (cf., for instance, Lakoff's (1987) study on English prepositions and Sweetser's (1990) investigation of English perception verbs). Furthermore, Fabricius-Hansen's (1993) research on how the interpretation of noun-noun compounds is affected by a genitive attribute may raise the same problems in a more complex area.

The second aim of this paper was to sketch a new approach called Lexical Pragmatics that deals with these problems in an explanatory way and tries to give a systematic account of the phenomena under discussion. The paradigm is based on two simple principles: (i) an adequate representation of lexical items has to be given in a semantically underspecified format, and (ii) their actual interpretation has to be derived by a pragmatic strengthening mechanism. The basic pragmatic mechanism rests on conditions of updating the common ground and allows us to give a precise explication of notions such as generalized conversational implicature and pragmatic anomaly.
The fruitfulness of the basic account was initially established by its application to a variety of recalcitrant phenomena, among which its precise treatment of Atlas & Levinson's Q- and I-principles and the formalization of the balance between informativeness and efficiency in natural language processing (Horn's division of pragmatic labour) deserve particular mention. The basic mechanism was subsequently extended by an abductive reasoning system that is guided by subjective probability. The extended mechanism turned out to be capable of giving a principled account of lexical blocking, the pragmatics of adjectives, and certain types of systematic polysemy.
I find it important to apply the ideas to other, possibly more complex and more realistic examples than those considered here. Moreover, methods are needed that allow one to measure the values of the probabilistic parameters that control and organize conceptual knowledge.

Seen from a moderately distant viewpoint, the standard accounts of Lexical Semantics may appear as an incoherent research field which is at odds with itself. As an endeavour that has to access Grammar, Semantics, and aspects of utterance interpretation at the same time, it multiplies the diversion of these disciplines. Overstretched by the task of theory formation, it either combines theoretical rigour with descriptive poverty or, more predominantly, it leads to linguistic anecdotalism, collecting pretty and curious observations without theoretic control. I sense that it is the inadequacy or the lack of a genuine pragmatic component that has led to this situation. In so far as Lexical Pragmatics tries to take pragmatics seriously, especially the conception of conversational implicature, and in so far as it is explicit about this component, it may substantiate a division of labour between grammatical and pragmatic aspects of the lexicon. This may broaden the way for overcoming the unfortunate situation just mentioned.

Perhaps most details concerning the main ideas of the present account in concrete terms may prove false in the future. This may concern, first of all, the Economy principles and their interaction. In order really to justify the details of these principles we need more empirical evidence and studies. But it is also crucial to discover the reasons that explain why the principles are just as they are. This brings us to a reductionist programme as is currently pursued in the domain of Integrative Connectionism (e.g. Smolensky 1995). A first attempt at achieving a full reduction of Speaker's economy (I-principle) and Hearer's economy (Q-principle) to connectionist principles is currently under way.

Acknowledgements
I am grateful to Manfred Bierwisch, Paul David Doherty, Bart Geurts, Gerhard Jäger, Thomas Jüngling, Annette Leßmöllmann, Chris Piñón, and Rob van der Sandt for useful comments on earlier versions of this paper. Special thanks go to an anonymous referee of the JS. I do not, however, intend to imply by this that they endorse my approach. In particular, Thomas and Manfred don't believe a word of it.

REINHARD BLUTNER
Humboldt University, Berlin
Jägerstrasse 10-11
10117 Berlin
Germany
e-mail: blutner@german.hu-berlin.de

Received: 18.10.97
Final version received: 05.04.98
NOTES
1 I have to thank an anonymous referee of the JS for this example.

2 An anonymous referee notes that for him beef sounds equally good as cow in (7). Consequently, the relevant difference in acceptability between (6b) and (7) is in cow (which sounds fine) more than in beef. This fact is sufficient to illustrate the phenomenon of deblocking, which is the relevant one in the present context.

3 These problems concern (i) the restrictiveness of the coercion mechanism, (ii) the apparent inflation of shifting operations, (iii) the stipulation of an additional checking mechanism that diminishes the use of monotonic processing, and (iv) problems with the analysis of co-predication in case of logical polysemy. For details and further criticism, see Copestake & Briscoe (1995), Fodor & Lepore (to appear), and Blutner (to appear).

4 In general, cost factors relate to the (estimated) cost of accessing the different interpretations. In section 4.1 an explicit account is provided which brings our system as close as possible to a Bayesian network and takes costs as negative logarithms of certain conditional probabilities.

5 Sperber & Wilson's (1986) extreme position of reducing the maxims to just one, the maxim of relevance, isn't relevant in the present context. As argued by Levinson (1989), Sperber & Wilson try the 'impossibility of reducing countervailing principles to one mega-principle'. They concentrate on the phenomena of classic particularized Relevance implicatures illustrated by Grice, and they fail to account for the whole range of generalized conversational implicatures, the implicatures that are most important for lexical pragmatics.

6 See, for example, Bierwisch (1983) and Jackendoff (1983) for similar distinctions. The important point of this distinction correlates with Grice's proposal in his William James Lectures to make a distinction within the 'total signification' of a linguistic utterance between what a communicator has said (in a certain favoured, and maybe to some degree artificial, sense of said), and what a communicator has meant beyond it (what she has implicated, indicated, suggested).

7 Mental model and conceptual representation are more psychologically coloured terms; information state is the favoured term used in formal semantics.

8 See, for example, McEliece (1977).

9 In section 4 a more refined cost-function is developed which sometimes allows the selection of state descriptions that are not minimally surprising. However, these state descriptions can be characterized as the 'better interpretations' because they are more unifying and, in some sense, more relevant than less surprising ones. This formulation brings us closer to the idea of Atlas & Levinson (1981), where the I-principle is intended as inference to 'the best interpretation' (with 'best interpretation' informally understood as an interpretation which prefers coreferential readings of entities, making use of stereotypical relations between referents or events). However, it should be added that the way in which Atlas & Levinson (1981) try to formalize their Principle of Informativeness seems rather misleading.

10 Here K is the epistemic operator indexed to H and S, respectively. The epistemic logic I assume is Hintikka's (1962). As discussed by Zeevat (1997), this condition on common grounds is only a necessary one. Developing a more refined definition of common ground, Zeevat formulates also an update operation for common grounds. His conception, however, ignores the effects of conversational implicatures, which also influence the common ground. The present account seeks to grasp these effects in a first and rather sketchy manner. Furthermore, it should be noticed that I use the notion of information state sometimes as referring to a set of possible worlds and sometimes as a representational structure (state description or disjunction of state descriptions). It should be added that in the present context we read K as know for sure (this doesn't presuppose the complement proposition). However, there is a problem with this formulation that at least should be mentioned. In the present formulation, the common ground cannot include propositions that some or all of the participants know to be false. However, there are kinds of conversations where this formulation is unsatisfying. For instance, consider a Christmas-time conversation where the proposition that there is a Santa Claus may be common ground, even if some or all of the participants know that there isn't any Santa Claus.

11 Identifying state descriptions with sets of possible worlds and Γ(α) with a family of sets of possible worlds, we can write this condition in the following way: ∪Γ(α) ⊃ cg[α].

12 Beside the epistemic operator K we need its dual P (=def ¬K¬). Hintikka reads Kaφ as a knows that φ. It is important to note that Hintikka is using the verb know in a technical sense without the usual factive presupposition. In this vein, ¬Kaφ can be read as what a knows is not that φ, and Paφ (=def ¬Ka¬φ) can be read as for all a knows, it is possible that φ.

13 The presented arguments showing the advantages of the present approach over the traditional approach based on Horn-scales are due to Rob van der Sandt. I thank him for allowing me to include his considerations in this article.

14 An anonymous referee notes that at least for the pair persuade to/not-dissuade from the equivalence is spurious, since dissuade presupposes that the person previously intended the complement. The referee's example: If John was undecided about whether to vote for Clinton, Mary could persuade John not to vote for Clinton, but she couldn't dissuade him from voting for him.

15 Only under very special conditions is it possible to construct representational pendants to the non-representational elements, for example when we develop intuitions about our cognitive system or about our mental activity. In this sense, the representation of salience, relevance, and so on is possible. Usually, these representations are comparative in character and are not quantitatively scaled.

16 With regard to the first part of the dictum, seeing non-cancellability as a necessary condition of entailment (or seeing cancellability as a sufficient condition of conversational implicature), I agree with Hirschberg (1991) and assume that it is right, at least if it is possible to discriminate cancellation from suspension (the calling into question of an asserted proposition), from contextual disambiguation, and from certain forms of speaking loosely (for careful discussion, cf. Hirschberg 1991: 28 ff.).

17 As assumption cost of the latter units I have stipulated $1. You may see this stipulation as fixing the $-unit.

18 An anonymous referee has suggested more minimal contrasts in order to bring out the relevant difference:
?The government has a flat roof vs. The Ministry of Justice has a flat roof
?The university has a flat roof vs. The college of engineering has a flat roof
REFERENCES

Alshawi, H. & Crouch, R. (1992), 'Monotonic semantic interpretation', in Proceedings of ACL, Delaware, 32-9.
Aronoff, M. (1976), Word Formation in Generative Grammar, MIT Press, Cambridge, MA.
Atlas, J. & Levinson, S. (1981), 'It-clefts, informativeness and logical form', in P. Cole (ed.), Radical Pragmatics, Academic Press, New York, 1-61.
Bauer, L. (1983), English Word-Formation, Cambridge University Press, Cambridge.
Bierwisch, M. (1983), 'Semantische und konzeptuelle Repräsentation lexikalischer Einheiten', in W. Motsch & R. Ruzicka (eds), Untersuchungen zur Semantik, Akademie-Verlag, Berlin, 61-99.
Bierwisch, M. (1989), 'The semantics of gradation', in M. Bierwisch & E. Lang (eds), Dimensional Adjectives, Springer-Verlag, Berlin etc., 71-261.
Blutner, R. (to appear), 'Lexical semantics and pragmatics', in Linguistische Berichte.
Caramazza, A. & Grober, E. (1977), 'Polysemy and the structure of the subjective lexicon', in C. Rameh (ed.), Georgetown University Roundtable on Language and Linguistics, Georgetown University Press, Washington, DC.
Carnap, R. (1947), Meaning and Necessity, University of Chicago Press, Chicago & London.
Charniak, E. & Shimony, E. S. (1990), 'Probabilistic semantics for cost based abduction', Technical Report CS-90-02, Computer Science Department, Brown University.
Clark, E. V. (1990), 'On the pragmatics of contrast', Journal of Child Language, 17, 417-31.
Copestake, A. & Briscoe, T. (1995), 'Semi-productive polysemy and sense extension', Journal of Semantics, 12, 15-67.
Cruse, D. A. (1986), Lexical Semantics, Cambridge University Press, Cambridge.
Dalrymple, M., Kanazawa, M., Mchombo, S., & Peters, S. (1994), 'What do reciprocals mean?', in M. Harvey & L. Santelmann (eds), Proceedings of the Fourth Semantics and Linguistic Theory Conference: SALT IV, Cornell University, Ithaca.
Deane, P. D. (1988), 'Polysemy and cognition', Lingua, 75, 325-61.
van Deemter, K. & Peters, S. (1996), Semantic Ambiguity and Underspecification, CSLI Publications, Stanford, CA.
Fabricius-Hansen, C. (1993), 'Nominalphrasen mit Kompositum als Kern', Beiträge zur Geschichte der deutschen Sprache und Literatur, 115, 193-243.
Fodor, J. A. & Lepore, E. (to appear), 'The emptiness of the lexicon: critical reflections on J. Pustejovsky's The Generative Lexicon'.
Fodor, J. A. & Pylyshyn, Z. W. (1988), 'Connectionism and cognitive architecture: a critical analysis', Cognition, 28, 3-71.
Gazdar, G. (1979), Pragmatics, Academic Press, New York.
Grice, P. (1968), Logic and Conversation, text of Grice's William James lectures at Harvard University, published in Grice (1989).
Grice, P. (1989), Studies in the Way of Words, Harvard University Press, Cambridge, MA.
Hintikka, J. (1962), Knowledge and Belief: An Introduction to the Logic of the Two Notions, Cornell University Press, Ithaca & New York.
Hirschberg, J. (1991), A Theory of Scalar Implicature, Garland Publishing, Inc., New York & London.
Hobbs, J. R., Stickel, M. E., Appelt, D. E., & Martin, P. (1993), 'Interpretation as abduction', Artificial Intelligence, 63, 69-142.
Horn, L. R. (1984), 'Toward a new taxonomy for pragmatic inference: Q-based and R-based implicatures', in D. Schiffrin (ed.), Meaning, Form, and Use in Context, Georgetown University Press, Washington, DC, 11-42.
Householder, F. W. (1971), Linguistic Speculations, Cambridge University Press, London & New York.
Jackendoff, R. (1983), Semantics and Cognition, MIT Press, Cambridge, MA.
Kamp, H. (1975), 'Two theories about adjectives', in E. L. Keenan (ed.), Formal Semantics for Natural Language, Cambridge University Press, Cambridge, 123-55.
Karttunen, L. (1972), 'Possibly and Must', in J. P. Kimball (ed.), Syntax and Semantics 1, Seminar Press, New York, 1-20.
Karttunen, L. & Peters, S. (1979), 'Conventional implicature', in C.-K. Oh & D. A. Dinneen (eds), Syntax and Semantics 11: Presupposition, Academic Press, New York, 1-56.
Keenan, E. L. (1974), 'The functional principle: generalizing the notion of "Subject of"', in Papers from the Tenth Regional Meeting of the Chicago Linguistic Society, Chicago, IL, 298-310.
Keil, F. C. (1979), Semantics and Conceptual Development, Harvard University Press, Cambridge, MA.
Kiparsky, P. (1982), 'Word-formation and the lexicon', in F.
Lehrer, A. (1970), 'Static and dynamic elements in semantics: hot, warm, cool, cold', Papers in Linguistics, 3, 49-74.
Lehrer, A. (1978), 'Structures of the lexicon and transfer of meaning', Lingua, 45, 95-123.
Levinson, S. (1983), Pragmatics, Cambridge University Press, Cambridge.
Levinson, S. (1987), 'Pragmatics and the grammar of anaphora', Journal of Linguistics, 23, 379-434.
Levinson, S. (1989), 'Relevance', Journal of Linguistics, 25, 455-72.
Matsumoto, Y. (1995), 'The conversational condition on Horn scales', Linguistics and Philosophy, 18, 21-60.
McCawley, J. D. (1978), 'Conversational implicature and the lexicon', in P. Cole (ed.), Syntax and Semantics 9: Pragmatics, Academic Press, New York, 245-59.
McCawley, J. D. (1993), Everything that Linguists have Always Wanted to Know about Logic but were Ashamed to Ask, 2nd edn, University of Chicago Press, Chicago.
McEliece, R. (1977), Theory of Information and Coding, Addison-Wesley, Reading, MA.
Ingeman (ed.), Pro ceedings of the 1982 Mid-America Linguistic Meyer, R ( I993), Compound Comprehension in Isolation and in Context, Max Niemeyer Conference. Verlag, Tiibingen. Lahav, R ( I989), 'Against compositionality: the case of adjectives', Philosophical Montague, R ( 1 970), 'Universal Grammar', Theoria, 36, 3 73-98. Studies, 5 5, I I I-29. Lahav, R ( I993 ), 'The combinatorial Nunberg, G. ( I979), 'The non-uniqueness of semantic solutions: Polysemy', connectionist debate and the pragmatics Linguistics and Philosophy, 3, I43-84. of adjectives', Pragmatics and Cognition, I , 7 1 -88. Nunberg; G. (I995). 'Transfers of meaning', Lakoff, G. ( I987), Women, Fire, and journal of Semantics, 12, I09- I 32. Dangerous Things: What Categories Reveal Nunberg, G. & Zaenen, A ( 1 992), 'Sys About the Mind, University of Chicago tematic polysemy . in lexicology and Press, Chicago. . lexicography', in K. Varantola, H. Tommola, T. Salmi-Tolonen, J. Schopp Lang, E. ( I 989), 'The semantics of dimen (eds), Euralex II, Tampere, Finland. sional designation of spatial objects', in M. Bierwisch & E. Lang (eds), Dimen Partee, B. ( 1984), 'Compositionality', in F. sional Adjectives, Springer-Verlag, Berlin, Landman & F. Veltman (eds), Varieties 7 1 -261. of Formal Semantics, Foris, Dordrecht, Langendoen, D. T. { I978), 'The logic of 28I-J I I . reciprocity', Linguistic Inquiry, 9, I 77-97· Pustejovsky, J. (I989), 'Type coercion and selection', paper presented at WCCFL Lehrer, A ( I968), 'Semantic cuisine',journal VIII, April I 989, Vancouver, BC. of Linguistics, 5, 3 9- 5 6. ·
·
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
162
Lexical Pragmatics
Pustejovsky, J. (1991 ), 'The generative lexi con', Computational Linguistics, 17, 4, 409-41.
Pustejovsky, J. ( 1993), 'Type coercion and lexical selection', in J. Pustejovsky (ed.), Semantics · and the Lexicon, Kluwer, Dordrecht, 73-96. Pustejovsky, J. ( 1995 ), The Generative Lexicon, MIT Press, Cambridge, MA. Pustejovsky, J. & Boguraev, B. ( 1993): 'Lexical knowledge representation and narural language processing', Artificial .
Intelligence, 63,
1 93-223.
Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foun dations, MIT Press/Bradford Books,
Cambridge, MA. Sadock, J. M. ( 1.978}, 'On testing for conversational implicature', in P. Cole (ed.), Syntax and Semantics, Vol. g: Pragmatics, Academic Press, New York, 281-97·
Smolensky, P. ( 1995), 'Constituent strucrure and explanation in an iritegrated connectionist/symbolic cognitive archi tecture', in C. Macdonald & G. Macdonald (eds), Connectionism: Debates on Psychological Explanation, Vol. 2, Basil Blackwell, Oxford, 22 1-90.
Philosophical Review, 72,
327-63.
Sperber, D. & Wilson, D. ( 1986), Relevance, Blackwell, Oxford. Stickel, M. E. (1989), 'Rational and methods for abductive reasoning in natural language interpretation', in R Studer (ed.), Natural Language and Logic, Springer-Verlag, Berlin, 2 3 3-52. Sweetser, E. E. ( 1990), From Etymology to Pragmatics, Cambridge University Press, Cambridge. Thomason, R H. (1990), 'Accommodation, meaning, and implicature: interdiscip linary foundations for pragmatics', in P. R Cohen, J. Morgan, & M E. Pollack (eds), Intentions in Communication, MIT Press, Cambridge, MA. Wu, D. ( 1 990), 'Probabilistic unification based integration of syntactic and semantic preferences for nominal com pounds', Proceedings of the 13th Interna ·
tional Conference on Linguistics (COLING 413-18.
Computational
g o),
Helsinki,
Zeevat, H. ( 1997), 'The common ground as a dialogue parameter', in A. Benz & G. Jager (eds), MunDial'97: Proceedings
of the Munich Workshop on Formal Semantics and Pragmatics of Dialogue, Munich.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Quine, W. V. 0. ( 1960), Word and Object, MIT Press, Cambridge, MA. Rumelhart, D. E., McClelland, J. L. and the PDP Research Group ( 1 986), Parallel
Sommers, F. ( 1959), 'The ordinary language tree', Mind, 68, 16o-85. Sommers, F. ( 1963), 'Types and ontology',
© Oxford University Press 1998
Lexical Rules As Hypotheses Generators

ANATOLI STRIGIN
Humboldt University, Berlin
Abstract

Developments in computational linguistics have led to the conception of sense extension rules in the lexicon as a theory of regular polysemy. Lexical rules are defined only on such semantic information as is in the lexicon, with the desired effect of restricting the amount of semantic information in the lexical representation of ambiguous items. The paper presents some examples which indicate difficulties for this approach, argues for pragmatically based rules which use conceptual information, and proposes a programmatic partial formalization of this approach in the framework of abductive interpretation.

1 AN INFORMAL DESCRIPTION OF SENSE EXTENSION

Intuitively, it suffices to characterize sense extension as one kind of regularity in the interpretation of polysemous words. Rules are usually invoked to describe regularities. The discussion of sense extension rules within a formal framework began in the 1960s. McCawley (1968), discussing the semantics of lexical items in the lexicon, suggested that probably all languages have implicational relationships among their lexical items, whereby the existence of one lexical item implies the existence of another lexical item, which then need not be listed in the lexicon. His example of such an implicational relation is the use of words for temperature ranges (warm, cool) to also represent the temperature sensation produced by wearing an article of clothing. To quote McCawley:

. . . the English sentence

(1) This coat is warm.

is ambiguous between the meaning that the coat has a relatively high temperature and the meaning that it makes the wearer feel warm . . . I propose then that English has two lexical items warm, of which only one appears in the lexicon, the other being predictable on the basis of a principle that for each lexical item which is an adjective denoting a temperature range there is a lexical item identical to it save for the fact that it is restricted to articles of clothing and means 'producing the sensation corresponding to the temperature range denoted by the original adjective'.
164 Lexical Rules as Hypotheses Generators
Note that although a rule is a mapping between lexical entries in the lexicon, nothing is said about the nature of the semantic information in the entries. At a second glance the generalization is not quite correct, and McCawley himself notes this in the postscript to the paper reprinted in McCawley (1973). In (2) the restriction to clothing is violated.

(2) The fire is warm.

On the other hand, in the case of articles of clothing other bodily sensations can seemingly be predicated of them, e.g. in (3).

(3) The sweater is itchy.

The domain of the rule could probably be broader than originally suggested, but to find this out a more subtle analysis is required, which was never conducted. Later Green (1974), in her critique of the implicational rule approach, noted that the domain of the rule suggested by McCawley should at the same time probably be more restricted, because the words hot and cold are not used to refer to articles of clothing producing these sensations, i.e. we do not usually say This jacket is cold. The question of how to delimit the domain of a rule is evidently not trivial. As an additional touch one might note with Green that it is not clear why only cool, hot, and warm have extensions to colours, as in hot colours, warm colours, and only warm, cool, and cold may refer to personality characteristics. To quote Green: we say that someone has a warm personality, and that he is warm to the people, but not that he has a hot personality, or that he is hot to the people.

Confronted with these irregularities, Green thinks that the implicational rule approach to the lexicon is too strong, but that (to quote Green again):

it would be possible to think of rules that imply the possibility of specific kinds of 'derived' uses (implicational possibility rules) as defining the notion of 'related lexical entry'. These would bind together as one lexical item lexical entries which have semantic relationships related by these rules. This way, the non-existence of a usage would not have to be seen as an exception to the rule, which has to be learned in addition to the rule. Rather, it may be seen simply as the existence of a gap in the lexicon; if a word or usage should be added to the lexicon to fill that gap, it would be seen as an addition to the lexicon, not as a change in a rule of grammar.

Her statement can be interpreted as indicating a different conception of rules. Here the rules are not mappings between lexical entries, but are used on demand to fix all possible semantic interpretations of a lexical item, where evidently an item is something different from McCawley's item, for which entry is used.
The rules as conceived by Green would probably leave it to the speaker to decide on some regular basis whether there should be two related lexical entries in a lexical item, i.e. whether a rule should be applied. The question arises whether this basis is sufficiently determined by the lexicon data or needs some knowledge from outside the lexicon. In the latter case the rule can be said to use conceptual knowledge. If pragmatics is conceived as an interface between purely linguistic and general conceptual knowledge, the rules suggested by Green may be taken to belong to pragmatics. Though McCawley's example would probably not be considered a case of a sense extension rule now, the idea caught on. The line of thought leading to the pragmatic understanding of this kind of regularity, under the above interpretation of Green's quotation, is elaborated in Nunberg (1979, 1995). The line of thought based on McCawley's suggestion is most prominently expounded in Copestake & Briscoe (1995) and Briscoe, Copestake, & Lascarides (1995). To isolate what I consider to be the essential difference between them, consider the basic problem to be solved in the conceptual analysis of the phenomenon. To provide a model of sense extension it is necessary to find sufficiently general rule domain definitions and to account for the existence of exceptions in the domains. To define a domain, the source class of objects should be defined, as well as the basic transfer relation between that class and the class resulting from the rule application. It sometimes seems that exceptions are an artefact of an imprecise domain description. The two conceptions differ in their predictions as to whether the domain description can be made precise. The pragmatic understanding would imply that this is a matter of conceptual knowledge, dependent on many totally non-linguistic factors and complex conceptual processes, hence difficult.
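The basic transfer relation just mentioned, from a source class to a derived class, can be pictured as a function from concepts to concepts. As a toy illustration (the string representation and all concept names here are my own and belong to neither account under discussion), a few such transfer functions and their application might look like:

```python
# Toy transfer functions from a source concept to a related concept.
# Names and representations are invented for illustration only.
TRANSFERS = {
    "animal->meat": lambda c: f"meat-of({c})",
    "animal->hide": lambda c: f"hide-of({c})",
    "fruit->plant": lambda c: f"plant-bearing({c})",
}

def extend_sense(concept, route):
    """Apply a chain of transfer functions to a base concept."""
    for step in route:
        concept = TRANSFERS[step](concept)
    return concept

print(extend_sense("rabbit", ["animal->meat"]))   # meat-of(rabbit)
print(extend_sense("apricot", ["fruit->plant"]))  # plant-bearing(apricot)
```

The point of the chaining is only that basic transfers can compose into rule-like extensions; which chains are licensed is exactly what the two conceptions disagree about.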
The lexicalist semantic understanding would imply that lexical entries contain only a limited amount of regimented semantic information; hence the domains and transfer relations can be more easily described in terms of the scheme of a lexical entry. This position seems to be more attractive, provided we know how to make the abstract relations of the scheme more specific. But we do not expect a high degree of precision.

In his pragmatic approach Nunberg assumes that a speaker of some particular language like English has at her disposal a set of basic sense-shifting transfer functions reflecting principles of the organisation of conceptual knowledge (rather like the rules of Green). They can be combined in different ways to give rule-like sense extensions. Nunberg (1995) and Nunberg & Zaenen (1992) are cautious regarding their relation to the lexicon. There it is maintained that some of the processes could be viewed as lexical. In such cases of lexical transfer the conceptual relation defining the transfer function is explicitly coded in the relevant lexical entries, so the transfer is licensed on the information in the lexical entry. This accounts for the language-specific character of sense extension. In other cases conceptual information is needed that is not represented in the lexicon. The only hope to set some limit on the amount of such information is to claim that it should be in some sense relevant for the speaker. Nunberg (1995) uses the notions of salience and noteworthiness. The transfer function should be salient, and the property contributed by the new predicate should be noteworthy in the context.

Briscoe et al. (1995) assume that the rules are present in the lexicon and have the status of defaults. That means they should be applied whenever their application is allowed. Predicted, but unattested cases have to do either with the involved default being overridden or with its graded quality. The rule domains are more easily specifiable because lexical entries contain only a few types of semantic relations which define their qualia structure (Pustejovsky 1995), i.e. some essential properties of the corresponding objects. The relations can be made conceptually more precise, but this additional information is not accessible to the rules.

A number of examples should now help to bring the difference between the two conceptions into focus. One lexical predicate transfer in English can be described by 'if an animal has hide such that it is common to process it to be used by people in the English linguistic community, then the word denoting this animal can be used to denote the processed hide'. Another regularity is to use an animal's name for the meat of this animal, cf. (4).

(4) a. We had rabbit for dinner yesterday
    b. She wears rabbit

It could seem that the sense extensions which let words denoting animals denote their edible substance or their hide cover indeed only the animals the indicated parts of which are regularly used in the indicated way in English-speaking communities. The restriction to the community and to regular use of the parts/products in the community seems necessary because names of animals which do not answer the descriptions do not allow the sense extension to proceed quite as easily, cf. an example of Copestake & Briscoe (1995) in (5).

(5) Badger hams are a delicacy in China while mole is eaten in many parts of Africa

Under the pragmatic account we might base the transfer on our factual knowledge about the use of fur or meat of specific animals, then generalize it, if this use is sufficiently important, to obtain a sense extension rule which takes an animal to some related stuff. The ease of transfer in the core set of
cases could then be accounted for if the necessary relations were coded in the lexical entries for words like rabbit. In cases like (5) we would need more general inferences via the conceptual knowledge. For moles we could then assume that they are used for food. This description is compatible with Nunberg's treatment. Treating sense extensions as lexical rules, another explanation of (5) should be sought, because factual knowledge is not available in the lexicon. Copestake & Briscoe (1995) formulate the rule in terms of an abstract qualia relation origin, and its domain description refers to animals. The rule generates a lexical entry for the comestible substance for any word denoting an animal. The actual attested use of the rule by English speakers to derive the meat sense of mole determines the acceptability degree judgement for (5). Since mole is not used in the corresponding way in the English community, the applications of the rule are very rare. This statistics of the rule is registered as the relative frequency of the derived sense in the lexical entry for mole (Copestake & Briscoe 1995; Briscoe & Copestake 1996). Using rules to derive senses with low frequency leads to the deterioration of their acceptability compared to regular cases.

The sense extension from animals to their meat is barred from its usual application to the name of an animal by the existence of a word which is reserved to denote the edible parts of this animal specifically. Thus pork is the usual name of pig meat, and not pig. This part of the phenomenon is known as blocking. An important characteristic of blocking is deblocking. Deblocking happens when the name with the blocked reading is nevertheless sometimes used in this reading instead of the specialized word to denote the entity in question. The examples are of mixed quality, but could make the point, cf. (6, 7), both from Copestake & Briscoe (1995), the latter coming from Terry Pratchett's Guards! Guards!, where the use of the blocked version characterizes one character of the novel, called Throat, because of the pejorative associations with the use of the word pig.

(6) ?? Sam ate pig (instead of pork)
(7) 'Hot sausages, two for a dollar, made of genuine pig, why not buy one for the lady?'
    'Don't you mean pork sir?' said Carrot warily, eyeing the glistening tubes.
    'Manner of speaking, manner of speaking,' said Throat quickly. 'Certainly your actual pig products. Genuine pig.'

A pragmatic explanation would appeal to conversational principles. Briscoe et al. (1995) develop a theory of default interaction to account for blocking and intend to treat deblocking as blocking by a lexical exception, in principle. This account does not easily generalize to the sense extension rules, and is supplanted by one based on a statistical explication of Gricean maxims in later papers. This one leaves deblocking to pragmatics.
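The interaction of the meat rule with blocking and deblocking can be caricatured in a few lines of code. This is a sketch of my own, not a mechanism either account is committed to: the word list and the deblock flag are simplifications.

```python
# A caricature of blocking: the animal -> meat extension is pre-empted by an
# existing specialized word; "deblocking" overrides the pre-emption.
SPECIALIZED_MEAT_WORD = {"pig": "pork", "cow": "beef", "sheep": "mutton"}

def meat_word(animal, deblock=False):
    """Word used for the meat of `animal`. deblock=True models the
    pragmatically licensed use of the blocked form ('genuine pig')."""
    specialized = SPECIALIZED_MEAT_WORD.get(animal)
    if specialized is not None and not deblock:
        return specialized   # blocked: the reserved word wins
    return animal            # extended sense of the animal name itself

print(meat_word("pig"))                # pork
print(meat_word("pig", deblock=True))  # pig
print(meat_word("rabbit"))             # rabbit
```

Everything the paper finds hard is hidden in this table: whether the table lives in the lexicon with graded defaults, or is computed pragmatically from conversational principles, is exactly the point at issue.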
The discussion of sense extension in Nunberg (1979), in Ruhl (1989), and in Copestake & Briscoe (1995) is based mostly on English examples. Some comparison with other languages could throw additional light on the sense extension rules and, I believe, would indicate some difficulties for the lexicon conception of the rules. In Russian the meat of a mammal denoted by a noun is usually referred to by a morphologically regularly related mass term derived from the noun via the suffix -ina. The suffix is of very general application, but sometimes the derivation is blocked, e.g. korova (cow) in this sense is blocked in favour of the word gov'ad-ina. Although the word contains the suffix, the stem is not the name of any animal. Now, if you wanted to convey the idea that the edible parts of a mammal were consumed as a whole, and not in portions, you could use the sense extension device like in English. And it is the only way to use the animal names for mammals in this extended sense, since the uses which call the holistic consumption into question have a very strange ring to them, e.g. (8).

(8) a. My eli korovu
       We were eating the cow
    b. My eli gov'adinu iz serebr'anyx tarelok
       We were eating the veal from silver plates
    c. ?? My eli korovu iz serebr'anyx tarelok
       We were eating the cow from silver plates

It seems that in Russian we have two different sense extension rules for one domain. The conventionalized sense extension is not blocked for korova. It is not clear how to distinguish the result of the two rules in the lexicon. In contrast to this, the English-like 'sense extension to meat' rule is freely applicable to names of fish, where the corresponding morphological derivatives are very rare, available mostly for expensive big fish like salmon. Thus it is perfectly OK to have (9).

(9) My eli sudaka (iz serebr'anyx tarelok)
    We ate pike (from silver plates)

For names of bigger fowl the derivatives often exist, but the difference in meaning between the morphologically derived forms and the forms in the extended sense is barely perceptible. The derivation does not apply to smaller hunted birds like snipe, or to the word ptica (fowl), and sense extension rules function here like in the case of fish. On the assumption that derivation is a lexical rule the interaction of the two rules is difficult to state. The necessary domain definitions also seem to be very difficult in the
lexicon. There does not seem to be a problem of principle for the pragmatic account, since the applicability of the transfer function can be derived on any cultural basis whatsoever. At a first approximation, the dichotomy could run along the lines of how much edible substance is obtained from the animal or whether or not it is eaten as a whole as a rule, leaving ptica (fowl) out for obvious reasons. The generalization can provide estimates for a nonce word or an unknown name of an animal. The probabilistic solution in the lexicon is completely non-predictive in these cases.2

Apresjan (1973) is a compendium of sense extension rules in Russian. It includes, among other regularities, the following two: the sense extension rule from fruit to plants bearing these fruit,3 in case the fruit is used for food (10), and from plants to a kind of food product made from them, as in (11).

(10) a. abrikos vs. abrikos
        apricot
     b. jablon'a vs. jabloko
        apple tree vs. apple
(11) gorcica, xren
     mustard, horseradish

The first makes an exception for apples (10b) and does not easily apply to exotic fruit in Russian. Thus, it is very strange to say There stood a banana/mango/coconut/orange at the corner to refer to the corresponding plants. The second applies only to plants which can reach the kitchen table ground up, so that the form identifying the plant is not preserved. A third regularity, not listed in Apresjan's work, but very common, is to refer to spots made by the juice of some berries with the name of the berry, e.g. in (12).

(12) U teb'a klubnika na shtanax ne otstiralas'
     At you strawberry on trousers not washed-away-reflexive
     The strawberry juice stain on your trousers did not wash away

It is unclear whether three sense extension rules in the sense of lexical mapping are adequate here. If the first one is regarded as such, either there is no origin specification in the lexical entries of exotic fruit, because there is no word blocking the application of the rule similar to (10b), or the probabilistic solution must be adopted. The absence is unnatural; the probabilistic solution simply registers the exceptions. But the upshot of this is that people who do not know the names for the plants will still use the sense extension innovatively. This does not seem to happen; they rather use things like banana tree/bush. On the pragmatic approach one explanation could simply state that the domain of the rule is limited because its results for exotic fruit are culturally not noteworthy.
The second rule would either require a very detailed specification both in lexical entries and in the rule domain of what is usually done to the plant in the domain to exclude cases of calling an apple pie an apple, or must be assimilated to some more general rule with consecutive specification of the result. Carrots cut in pieces are still called carrot, and it could be maintained that it is actually a case of a very general sense extension rule called grinding which allows the shift to the substance obtained from some object. That ground mustard is used in the kitchen only prepared in a specific way could be claimed to be world knowledge. An obvious difficulty with this treatment is the granularity of the rules and their precision.4 We could claim that we have a lexical rule which relates syntactic features and introduces this general relation, and our world knowledge tells us how it is specified in the context, e.g. how the object is ground and what form grinding turns the thing into depending on the context. The only requirement for making the rules more precise is that of compatibility with the derived syntactic properties. The difficulty lies in the requirement to specify a context: mustard seeds can be ground, but not processed to be used, i.e. mixed with vinegar, etc. If we license the rule in the general form for the lexical entry mustard, we still have to qualify its result. This is a patently pragmatic option, so should we first license such general rules in the lexicon to produce excessively general lexical entries and then move them to pragmatics for qualifications?

The third rule, if considered as originating in the lexicon, presupposes such a considerable amount of world knowledge in the lexicon as to make its principled structuring by qualia relations very implausible.

However, the conception of lexical rules as defaults in the lexicon has the merit of a rigorous formalization in Copestake & Briscoe (1995). The pragmatic alternative along the lines of Nunberg has some generality which can be made more precise. The origin of the phenomenon, it might be claimed, is in the way we name things. Some names are just reserved labels; some names use relations between things, invoking the concept of one thing and shifting to a related concept, and sense extension is a way to produce this kind of shift. Before we name things, we probably decide whether the thing is worth being named at all, since the device of description is always available. This noteworthiness is a prerequisite of sense extension, cf. Nunberg (1995). But if noteworthiness or nameworthiness (nameworthiness is probably a more suggestive term) is computed at all, its criteria are far from being clear. The nameworthiness concerns the relation underlying the regularity, and the result. It is usually motivated by the high relevance both of the processes conceptualized by the relation and of the results of these processes to some human sphere of life, and is context dependent. The nameworthy relations are used by transfer functions to give an interpretation of a lexical item in a context. The central problem of the pragmatic account now is a description of transfer functions. This is also the aim of the paper. It attempts to give such a description in a general framework which views semantic interpretation as hypothetical reasoning, lexical interpretation being a part of this activity.
2 LEXICAL INTERPRETATION AS ABDUCTION

If lexical interpretation is hypothetical activity, lexical rules as transfer functions could be based on the reasoning mechanisms underlying this activity, too. Lexical rules determine the hypothesis space of this activity. Since a hypothesis may be adopted, but need not be, such rules will be rather like the 'implicational possibility rules' of Green. Hypothetical reasoning can be considered a case of abduction, following Peirce, e.g. Peirce (1992). In its simplest form abduction is a mode of reasoning from a rule (which establishes a connection between a case and a result) and a possible result of the rule, to the case (a possible reason for the result). The notion 'reason' is deliberately very broad and intuitively vague here, since it must be based on the formal properties of the underlying logic. The technical terms used in the literature are explanation for the reason in question and evidence or observation for the observed result of the rule. If the rule is stated as an implication, abduction is just following modus ponens backwards.

(13) if B is observed, and A ⇒ B is given,
     abduce A, a possible reason for B

All possible reasons are called hypotheses or abducibles. The use of abduction to model interpretation of texts has already been reported by Charniak & McDermott (1985). Hobbs, Stickel, Appelt, & Martin (1993) took it a step further. To find an interpretation of a sentence they explain its logical representation by abducing the best possible explanation for its components, where best means having the least cost. In the case of lexical interpretation they represent ambiguity by providing all possible interpretations and disambiguating the result by searching for the hypothesis which provides the cheapest explanation. Consider their example in (14). The knowledge of the interpretation of the word bank in English is taken to consist of the postulated predicate bank(X), which is
172
Lexical Rules as Hypotheses Generators
implied by two concepts corresponding to the notions of river bank, bankriv�(X), and of financial institution, bankfirumu(X). ( 1 4)
{ bankn�(X)
bank(X) bankfinanu(X) => bank(X) =>
}
Many non-monotonic inferences are abductive by nature, which is to say they provide plausible explanations for some states of affairs . . . The problem, of course, is that not just any explanation will do; it must, in some sense, be a 'best' explanation . . . But if there is a best theory, there must be poor ones; so diagnostic reasoning really consists of two problems: (a) What is the space of possible theories that account for the given evidence? (b) What are the best theories in this space? ·
Poole systems will be used to define the space of all interpretations of a lexical item. Under the approach of Hobbs et al. ( I 993) the· hypothesis space is not limited, and the problem of choice is not separated from the problem of characterizing the hypothesis space. Poole systems (Makinson 1994 introduced the term) or abductive frameworks are a formalization of hypothetical reasoning as theory formation based on first order language L. Let f be a set of sentences in L which are considered to be facts in the sense that we accept them as true and inviolate for the span of the abductive task at hand, and ¢ be the sentence expressing another kind of fact, an observation which we want to explain. Finding an explanation is our abductive task. We also have a set of hypotheses II at our disposal, which we can use to explain ¢, as long as they do not lead to contradiction. For technical reasons (Poole 1987) we need a set P of ground instances of the formulas of II which would together with r imply ¢. An abductive task assumes a very simple version of theory building: if a set of ground instances P is shown not to contradict r, we simply compute the consequences of f and P together, which is a theory in the formal meaning of the word. If ¢ is in the theory, we may say that P explains ¢>. The sentences in P are called abducible sentences or abducibles.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
The antecedents may be t:tken as possible hypotheses, and if some discourse referent d is plugged in for the variable, we can explain the observation bank(d ) in two ways by abducing to one of the readings following (1 3) and (14). If something implying bank,;v�(d) occurs in the text, bankriv�(d). can be explained in its· turn, and since this amounts to using the hypothesis which has already been put to use, this explanation is cheaper than assuming a new hypothesis. So the reading bank,;ver(X) is sort of primed in the context. I will start with this example as the basic idea underlying lexical interpretation, but elaborate it in terms of abductive systems as defined by David Poole (Poole 1988). Talking about AI treatments of diagnosis, Raymond Reiter (Reiter 1987) notes that:
Anatoli Strigin
173
To represent generalizations with potential exceptions we allow the hypotheses in Π to be open first-order formulas. Free variables can be used as place holders for constants. Given all substitution instances of a hypothesis, those substitution instances which are explicitly contradicted may not be used; the others may. The rules in Π are thus sentence generators. We may now introduce the necessary terminology. An abductive framework is a pair (Γ, Π) of sets of possibly open formulae. Let P be a set of ground instances of formulas from Π.

Def. 1: A scenario of an abductive framework (Γ, Π) is a set P of ground instances of elements of Π such that Γ ∪ P is consistent.

Def. 2: If φ is a sentence, an explanation of φ from (Γ, Π) is a scenario P of (Γ, Π) which together with Γ implies φ, i.e. a set P of ground instances of Π is an explanation of φ iff
(i) P ∪ Γ ⊨ φ
(ii) P ∪ Γ is consistent

It is possible to use provability, ⊢, instead of the modelling relation, ⊨, due to the equivalence of the notions for first order languages.5 A theory is explicated as an extension of the abductive framework (Γ, Π) defined below.

Def. 3: An extension of (Γ, Π) is the set of logical consequences of the union of Γ and some maximal (with respect to set inclusion) scenario P of (Γ, Π).

Another name for an extension is maxiconsistent set (Makinson 1994). There is a connection between explainability and theory building, expressed in the following theorem proved in Poole (1988).

Theorem: There is an explanation of φ from (Γ, Π) iff φ is in some extension of (Γ, Π).

The theorem says that φ can be explained iff it follows from a consistent theory based on some maximal set of hypotheses. Consider a simple example of an abductive task. Suppose we observe shoes-are-wet. The facts at our disposal are the implications of the data base in (15).

(15) Γ = {
  rained-last-night ⇒ grass-is-wet
  sprinkler-was-on ⇒ grass-is-wet
  grass-is-wet ⇒ shoes-are-wet
}

The hypothesis set Π may contain all the antecedents of the implications. It is often preferable to have only the basic hypotheses, in the sense that they are not further explainable from other hypotheses, i.e. Π = {rained-last-night, sprinkler-was-on}. We can choose either P₁ = {rained-last-night} or P₂ = {sprinkler-was-on} as the explanation of shoes-are-wet. The maximal scenario is their union. The extension is built on this scenario. The maximal scenario is also an explanation, but it is too presumptive. We would like to assume not all compatible cases at once, but one at a time, i.e. we should use explanations which are minimal in terms of set inclusion. We want to use knowledge in the form of hypotheses only if there is evidence for them. We do not want constantly to hypothesize that it rained last night, or that we must pay for cigarettes, but do this only if our shoes are wet or if there is a cigarette shortage and the vending machine refuses to budge. Since neither φ nor Π are syntactically marked as observation or hypotheses, respectively, they are simply earmarked as such if they are viewed as terms in the relation of explainability. The point is they cannot be simultaneously treated as something else in the same task. And explainability is only one family of relations involved in non-monotonic inferential activity. Another family of relations is prediction by default, i.e. prediction either of what is a convention, or of what is an accepted tendency. Extensions (or an explainability relation) can be used to model default reasoning, too. By default reasoning a kind of hypothetical reasoning is usually meant where the hypotheses are used not to explain observed things, but to predict what may be. In modelling prediction by explainability we must define criteria of what is predicted. There are different possibilities. Usually what is predicted by default is defined as something which is in every extension of an abductive framework, i.e. what is explained by every theory. Default rules have the property which is called conditioning in Poole (1991), i.e. they are used whenever their preconditions are met. Hypotheses are used when there is evidence for them. Formally there is no difference between the two.
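Under the stated simplifying assumptions (facts restricted to definite clauses over ground propositional atoms, so that consistency is trivial), the abductive task over (15) can be sketched in a few lines of Python; the function names and the string encoding of atoms are illustrative only, not part of the formalism:

```python
from itertools import combinations

# Gamma of (15) as definite clauses (body, head); Pi as assumable atoms.
FACTS = [
    ({"rained-last-night"}, "grass-is-wet"),
    ({"sprinkler-was-on"}, "grass-is-wet"),
    ({"grass-is-wet"}, "shoes-are-wet"),
]
HYPOTHESES = ["rained-last-night", "sprinkler-was-on"]  # Pi

def consequences(atoms, rules):
    """Forward-chain to the deductive closure of Gamma together with atoms."""
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= closed and head not in closed:
                closed.add(head)
                changed = True
    return closed

def minimal_explanations(observation, rules, hypotheses):
    """Subset-minimal scenarios P of Pi such that Gamma u P entails the
    observation. No negation occurs here, so every scenario is consistent."""
    found = []
    for k in range(len(hypotheses) + 1):
        for subset in combinations(hypotheses, k):
            if observation in consequences(subset, rules):
                if not any(set(p) <= set(subset) for p in found):
                    found.append(subset)
    return found

print(minimal_explanations("shoes-are-wet", FACTS, HYPOTHESES))
# [('rained-last-night',), ('sprinkler-was-on',)] -- P1 and P2 of the text
```

The enumeration by increasing subset size yields exactly the minimal explanations P₁ and P₂ and discards the presumptive maximal scenario.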
Moreover, whatever is a hypothesis in one abductive task might be a default in another. This conceptual difference can be taken care of by keeping defaults and hypotheses separate, so the set of defaults will be denoted by Δ, but defaults can be used in explanation alongside hypotheses. The modified definitions are given below.

Def. 4: A scenario of an abductive framework (Γ, Δ, Π) is the union of a set P of ground instances of elements of Π and a set D of ground instances of Δ such that Γ ∪ P ∪ D is consistent.

Def. 5: If φ is a sentence, an explanation of φ from (Γ, Δ, Π) is a scenario A = P ∪ D of (Γ, Δ, Π) which together with Γ implies φ, i.e. A = P ∪ D is an explanation of φ iff
(i) P ∪ D ∪ Γ ⊨ φ
(ii) P ∪ D ∪ Γ is consistent

It is essential to understand that explanation is not prediction by default. This point is often overlooked.6
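The contrast between explanation (membership in some extension) and prediction by default (membership in every extension) can be made concrete with a small sketch. The mutual exclusion of the two causes below is a hypothetical extra fact added so that the framework has more than one extension; it is not part of the text's example:

```python
from itertools import combinations

RULES = [
    ({"rained-last-night"}, "grass-is-wet"),
    ({"sprinkler-was-on"}, "grass-is-wet"),
]
# Hypothetical extra fact for illustration: the two causes exclude each other.
MUTEX = [{"rained-last-night", "sprinkler-was-on"}]
DEFAULTS = ["rained-last-night", "sprinkler-was-on"]

def close(atoms):
    """Deductive closure of a set of atoms under RULES."""
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in RULES:
            if body <= closed and head not in closed:
                closed.add(head)
                changed = True
    return closed

def consistent(atoms):
    return not any(bad <= close(atoms) for bad in MUTEX)

def maximal_scenarios():
    """All consistent default sets that are maximal w.r.t. set inclusion."""
    subs = [set(s) for k in range(len(DEFAULTS) + 1)
            for s in combinations(DEFAULTS, k) if consistent(s)]
    return [s for s in subs if not any(s < t for t in subs)]

extensions = [close(s) for s in maximal_scenarios()]
predicted = set.intersection(*extensions)        # atoms in every extension
print("grass-is-wet" in predicted)               # True: predicted by default
print(any("rained-last-night" in e for e in extensions))  # True: explainable
print("rained-last-night" in predicted)          # False: not predicted
```

grass-is-wet follows from every maximal scenario and is thus predicted; each individual cause lies only in one extension and is merely explainable.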
Poole systems may use constraints on inference which serve as a kind of inference control mechanism. The next definition gives the form of Poole systems with constraints.

Def. 6: A scenario of an abductive framework with constraints (Γ, Π, Δ, C) is the union of a set P of ground instances of elements of Π and a set D of ground instances of Δ such that Γ ∪ P ∪ D is consistent. If

Γ_P ∪ D₁ ∪ Γ_Π ⊨ g₁ and Γ_P ∪ D₂ ∪ Γ_Π ⊨ g₂ and Γ_P ∪ D₂ ∪ Γ_Π ⊨ g₁,

then solution ⟨D₁, g₁⟩ is more general than solution ⟨D₂, g₂⟩.

The starting point of this section was the example (14) of Hobbs et al. (1993), which is repeated here as an abductive framework.
(16) Π = { bank_river(X), bank_finance(X) }
     Γ = { bank_river(X) ⇒ bank(X), bank_finance(X) ⇒ bank(X) }
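A ground sketch of (16) for a discourse referent d illustrates the hypothesis space: in isolation there are two minimal explanations of bank(d), and a context that itself explains one hypothesis removes it from the basic ones. The shore-mentioned fact below is a hypothetical stand-in for such a context, not part of the original example:

```python
from itertools import combinations

# Ground instances of (16) for a discourse referent d.
FACTS = [
    ({"bank_river(d)"}, "bank(d)"),
    ({"bank_finance(d)"}, "bank(d)"),
]
PI = ["bank_river(d)", "bank_finance(d)"]

def close(atoms, rules):
    """Deductive closure under definite clauses (body, head)."""
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= closed and head not in closed:
                closed.add(head)
                changed = True
    return closed

def minimal_explanations(obs, rules, hyps):
    found = []
    for k in range(len(hyps) + 1):
        for s in combinations(hyps, k):
            if obs in close(s, rules) and not any(set(p) <= set(s) for p in found):
                found.append(s)
    return found

# In isolation both readings are minimal explanations of bank(d):
print(minimal_explanations("bank(d)", FACTS, PI))

# Embedded in a context where the river reading is itself explained
# (hypothetical fact: the text mentions a shore), bank_river(d) is no
# longer basic -- bank(d) is explained by the empty scenario:
CONTEXT = FACTS + [(set(), "shore-mentioned"),
                   ({"shore-mentioned"}, "bank_river(d)")]
print(minimal_explanations("bank(d)", CONTEXT, PI))  # [()]
```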
This framework models the case when there is no possibility to further explain either of the two hypotheses, so they are basic. In this case the system of Hobbs et al. (1993) must assign assumption costs to the hypotheses because each one must be assumed. The two explanations in the Poole system should be subjected to some choice criterion, if there is one, or treated as equally plausible, if there is none. If a Poole system is embedded in a context where one of its hypotheses becomes explainable itself, this hypothesis would no longer be basic, and would not surface as a potential explanation. The system of Hobbs et al. (1993) would still consider it as a possible choice. The main difference between the systems of Poole and Hobbs is the locality of a Poole system, which is due to the fact that its hypotheses are explicitly listed. We need not consider all possible contexts from the start. And the problem of the choice criteria between the basic minimal hypotheses can be stated separately. The general scheme of a lexical entry of an ambiguous lexical item which can be abstracted from the example from Hobbs et al. (1993) is given by the abductive framework scheme in (17).

(17) LEphform = ( Γ = { cons_i(X) ⇒ sform(X) }, Π = { cons_i(X) } )

The simplest assumption to be made regarding the scheme is that the entities cons_i which correspond to bank_river(X) or bank_finance(X) are names of concepts. The general entity sform(X) which corresponds to bank(X) will be called the semantic form associated with a phonological form.

3 CONCEPTS AND CONTEXTS

It is surely impossible to do justice to the notion of concept here; therefore only features which are relevant to the concerns of the paper will be introduced. It is widely accepted that at some level of abstraction a large part of conceptual knowledge can be conveniently described as being in propositional format. Levesque (1986) has provided a concise formulation of this assumption: 'For the structures to represent knowledge, it must be possible to interpret them propositionally, that is, as expressions in a language with a truth theory. We should be able to point to one of them and say what the world would have to be like for it to be true.' So in this respect concepts are just theories. Abductive frameworks are a reasonably good means of modelling concepts. A substantial amount of work done by psychologists, of which Barsalou (1992) is a sort of synopsis, uses frame-based formalisms as a formal implementation of the notion. But frames can be modelled by abductive frameworks, somewhat modifying the suggestions in Hayes (1980), Reiter (1987) and Russell & Norvig (1995) as to how to interpret frames in logic. Concepts on this account contain descriptive information that people represent cognitively for a category, including definitional information, prototypical information, functionally important information, and probably other types of information as well. For ease of reference to hypotheses they will be written using a naming convention, i.e. where p(X) ⇒ q(X) is a hypothesis, it will be replaced by a new hypothesis consisting of a predicate with exactly those variables free which were free in the original hypothesis, say newh(X), and the set of facts will be extended by the formula newh(X) ⇒ (p(X) ⇒ q(X)) (or newh(X) & p(X) ⇒ q(X)). Consider e.g. the concept rabbit in (18). There are two possible hypotheses here: firstly, rabbits consist of some soft matter which is edible, with an indication of which part of the rabbit this is, presumably. This hypothesis is rendered by getedible(X, Y). Secondly, rabbits have fur as their part.7

(18) Π = { getedible(X, Y), intofur(X, Y) }
     Γ = {
       rabbit(X) & getedible(X, Y) ⇒ consistsof(X, Y) & edible(Y) & stuff(Y)
       rabbit(X) & intofur(X, Y) ⇒ partof(X, Y) & rabbitfur(Y) & ¬edible(Y)
       rabbit(X) & consistsof(X, Y) ⇒ aspof(X, Y)
       rabbit(X) & partof(X, Y) ⇒ aspof(X, Y)
       rabbit(Y) ⇒ ¬consistsof(X, Y)
       rabbit(Y) ⇒ ¬partof(X, Y)
       rabbit(X) ⇒ animal(X)
     }

The predicate aspof(X, Y) marks those properties which are important aspects of the concepts, structuring the theory. The remaining formulas state that rabbits are neither edible substance nor parts of rabbits. Other common knowledge is listed in (19), some linguistic knowledge in (20).
(19) Γ = {
  consistsof(X, Y) ⇒ partof(X, Y) & stuff(Y)
  animal(X) ⇒ object(X)
  stuff(X) ⇒ mass(X)
  object(X) ⇒ ¬(mass(X))
  mass(X) ⇒ ¬(count(X))
  object(X) ⇒ count(X)
}

(20) Γ = { ... ⇒ stuff(Y) }

Since aspects provide structure to the concepts, nameworthy things about concepts will be aspects, formally speaking. It still remains to characterize contexts, since we want to use sense extension rules only in the context of lexical interpretation. A context is an environment in which something happens, informally speaking. The context in which the interpretation of some phonological form is computed is the context in which an abductive task of a particular kind is solved. So we need a definition of context for an abductive task which can fulfil this role. Contexts are thus resource specifications. Following McCarthy (1993), contexts could be formalized as entities which may be used to index pieces of knowledge as specifying some resources in the form of micro-theories. Since interpretation is hypothetical reasoning, this specification should make some distinctions. One and the same piece of knowledge can be used for different purposes in different tasks, i.e. to know a context in which this piece of knowledge is used we must know how it is used in this context, among other things, i.e. as a hypothesis, a default, a constraint, a fact, or an observation. Given that an abductive framework fixes the role of the knowledge pieces, the use of an abductive framework amounts to a partial specialization of the context parameter. Only contexts that are consistent with this role assignment can be considered for use with the framework. To be able to talk about contexts we can index the resources of an abductive framework by a context parameter which should remain the same for the task at hand. If the context is changed, a new context may contain additional resources, and the change of the context may add the resources of the new context. If the facts of the new context do not contradict those of the old context, the resulting context is consistent, though some inferences may become blocked. Another kind of context change, in which the resources of a resulting context are reclassified with regard to the resources of the old one for a new task, will not be used here. Given that a context brings an abductive framework with it, we need the notion of a composition of two consistent abductive frameworks. There is not much work going on in this direction at present, so we will only sketch the necessary operation below. The contexts will be introduced by a special predicate ct(X). To pursue this line of thought, our hypothetical rules will be specified by a context of lexical interpretation which is determined by the concept which is the primary hypothesis.
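The role of the negative facts in (18) and the sortal facts in (19) in ruling out bad substitution instances can be sketched in ground form. The terms r (a rabbit) and y (a candidate filler) are hypothetical, and negation is encoded by a '~' prefix so that explicitly contradicted instances are detected:

```python
# Ground sketch of (18)/(19) for a rabbit r and a candidate filler y.
# A closure containing both p and ~p is inconsistent, so the offending
# substitution instance of the open hypothesis is unusable.
RULES = [
    ({"rabbit(r)", "getedible(r,y)"},
     ["consistsof(r,y)", "edible(y)", "stuff(y)"]),
    ({"stuff(y)"}, ["mass(y)"]),
    ({"mass(y)"}, ["~count(y)"]),
    ({"rabbit(y)"}, ["animal(y)"]),
    ({"animal(y)"}, ["object(y)"]),
    ({"object(y)"}, ["count(y)"]),
    ({"rabbit(y)"}, ["~consistsof(r,y)"]),
]

def close(atoms):
    """Deductive closure under clauses with conjunctive heads."""
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, heads in RULES:
            if body <= closed and not set(heads) <= closed:
                closed |= set(heads)
                changed = True
    return closed

def consistent(atoms):
    closed = close(atoms)
    return not any(("~" + a) in closed for a in closed)

# The instance where the edible stuff y is distinct from the rabbit is fine:
print(consistent({"rabbit(r)", "getedible(r,y)"}))               # True
# The instance where y is itself a rabbit is explicitly contradicted
# (count(y) vs ~count(y), consistsof vs ~consistsof), hence unusable:
print(consistent({"rabbit(r)", "getedible(r,y)", "rabbit(y)"}))  # False
```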
4 THE RULES

Cases of polysemy depend on the existence of several explanations, some of them direct, in terms of a concept, some only related to the concept. The structure of the sense extension rules explored in the paper is due to the following proposal: the extended sense interpretation comes about by hypothesizing that a concept related to a given explanation by some specified relation can be taken as a new hypothesis. This meta-hypothesis is reflected by the introduction of a new predicate shift. To analyse the structure of a rule we modify the structure of a lexical entry (17) to (21).

(21) Π = { cons_i(X) }
     Γ = { cons_i(X) ⇒ sform(X), shift(X, Y) ⇒ sform(Y) }

If nothing else is specified in the context of interpretation, cons_i(X) is a hypothesis schema yielding explanation instances as scenarios. Any such scenario will be a minimal explanation. The predicate shift(X, Y) will allow a reference shift for a semantic form, extending the hypothesis space by using the structure of conceptual knowledge. If there is an aspect of a concept that is nameworthy, this aspect can provide an interpretation of the semantic form which is normally interpreted by the concept. Note that shift(X, Y) needs an explanation from elsewhere, since there is no corresponding hypothesis in the lexical entry. shift is explained via a core lexical rule, (22). The core rule is a default, so it has to be applied whenever possible, but it is not sufficient by itself to let sform be deduced.

(22) Δ = { (ct(C) & aspof(X, Y, C)) ⇒ shift(Y, X) }

It uses a contextualized version of aspof(X, Y), and places no constraints on possible explanation via aspof(X, Y, C). The condition ct(C) is intended to provide an appropriate binding of the context in which the rule is applied to the theory containing the resources of the concept which is the primary hypothesis in the lexical entry. To reflect the dependence of the explanation on the context we index the abductive framework itself by a context in the manner of McCarthy (1993) and use the three-place predicate aspof(X, Y, C) instead of aspof(X, Y), which reads as: the relation holding of Y and X is an aspect of the concept that names X in context C. Technically we could either dispense with aspof(X, Y) and use aspof(X, Y, C), which will be done in the sequel, or (again following McCarthy 1993) decide that in the relevant context the two predicates are equivalent, and use aspof(X, Y, C) only on leaving the context. In each case the deductions will be restricted to the resources of the relevant concept. Any rule can be generated from the core rule by adding additional constraints to the abductive framework. There are a number of possibilities to generate a rule from the core rule. If one and the same aspect relation is used in a number of concepts for which it is nameworthy, it can be taken itself and used to constrain the core rule. Then the domain of the rule is defined by the relation itself, regardless of whether there is a lexical entry for the category which forms the domain, or even whether there is a separate concept for it at all. Thus, it is possible to consider 'things that are processed for the kitchen by grinding'. The second way is to add constraints on the concepts which can be gathered in the domain, e.g. specify them as animals. It can be combined with a more general statement of the transfer relation. And the third way is to create a special category of relation which generalizes over a number of cases of sufficiently similar relations, and use this new category as a constraint. The case of animal meat and fur could be an illustration of the first or the second way, (23). The device of naming the defaults is used to make the reference to defaults in constraints easier.

(23) Δ = { animaltofur(X, Y), animaltomeat(X, Y) }
C = {
  ct(C) & animaltomeat(X, Y, C) & aspof(X, Y, C) ⇒ shift(X, Y)
  ct(C) & animaltofur(X, Y, C) & aspof(X, Y, C) ⇒ shift(X, Y)
  ct(C) & animaltomeat(X, Y, C) ⇒ count(X) & animal(X) & ¬(count(Y)) & edible(Y) & consistsof(X, Y)
  ct(C) & animaltofur(X, Y, C) ⇒ count(X) & animal(X) & ¬(count(Y)) & ¬(edible(Y)) & partof(X, Y)
}

To see how the rules work, assume that the relevant part of the lexical entry for rabbit is as in (25).

(25) Π = { rabbit(X) }
     Γ = { rabbit(X) ⇒ sfrabbit(X), shift(X, Y) ⇒ sfrabbit(Y) }

The context is fixed to the lexical interpretation of rabbit, i.e. ct(rabbit) holds. The semantic form sfrabbit(X) of rabbit can be explained via rabbit(X), via animaltofur(Z, X, rabbit), or via animaltomeat(W, X, rabbit). The predicates consistsof(X, Y) and partof(X, Y) provide the necessary entailments via aspof(X, Y, rabbit), and constrain the rules simultaneously. Consider now the examples from section 1. The case of mole can be explained if consistsof is not nameworthy for moles, presumably because mole stuff is not considered to be edible. It can be assumed to be so. This exceptional hypothesis decreases the acceptability judgements. The case of (11) can be seen as small-scale generalization. The use of abduction allows generalization on the basis of a small number of sufficiently similar cases. A new predicate is introduced which is abductively explained by the cases in question. There is no need for straightforward semantic criteria for this relation, which are sometimes required under the lexical rule approach, cf. Briscoe & Copestake (1996). This is especially useful in the case of (12): the generalization exploits a property of easily squashable juicy plants which is nameworthy because it is very salient in a small, but very important, set of contexts. And lastly, exotic plants are not nameworthy origin properties of exotic fruit, presumably because they are exotic. In the case of animal-to-meat sense extension in Russian, the domain of the morphological derivation rule can easily be restricted to animals which are sold as meat, in portions. The interaction between the two rules, the morphologically marked derivation rule and the sense extension, is the subject of the next section.
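A ground sketch of how (25) interacts with the core rule and the constraints of (23): the terms d, m, f (the rabbit, its meat, its fur) are hypothetical ground instances, and ct(rabbit) is left implicit since the context is held constant throughout:

```python
from itertools import combinations

# Ground propositional rendering of (23)/(25) for a referent d.
RULES = [
    ({"rabbit(d)"}, "sfrabbit(d)"),                       # direct reading
    ({"shift(d,m)"}, "sfrabbit(m)"),                      # shifted readings
    ({"shift(d,f)"}, "sfrabbit(f)"),
    ({"animaltomeat(d,m)", "aspof(d,m)"}, "shift(d,m)"),  # core rule + (23)
    ({"animaltofur(d,f)", "aspof(d,f)"}, "shift(d,f)"),
    ({"animaltomeat(d,m)"}, "consistsof(d,m)"),           # constraints of (23)
    ({"animaltofur(d,f)"}, "partof(d,f)"),
    ({"consistsof(d,m)"}, "aspof(d,m)"),                  # aspects, per (18)
    ({"partof(d,f)"}, "aspof(d,f)"),
]
ASSUMABLES = ["rabbit(d)", "animaltomeat(d,m)", "animaltofur(d,f)"]

def close(atoms):
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in RULES:
            if body <= closed and head not in closed:
                closed.add(head)
                changed = True
    return closed

def explains(obs):
    """Subset-minimal assumable sets entailing obs."""
    out = []
    for k in range(len(ASSUMABLES) + 1):
        for s in combinations(ASSUMABLES, k):
            if obs in close(set(s)) and not any(set(p) <= set(s) for p in out):
                out.append(s)
    return out

print(explains("sfrabbit(d)"))  # direct: via rabbit(d)
print(explains("sfrabbit(m)"))  # meat reading: via animaltomeat(d,m)
print(explains("sfrabbit(f)"))  # fur reading: via animaltofur(d,f)
```

Each reading surfaces as a distinct minimal explanation, with the aspect relations supplying the entailments that license the shift.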
5 BLOCKING AND DEBLOCKING

Presumably these resources must be assessed from the viewpoint of the interlocutors, i.e. (a) from the point of view of the listener: could the speaker intend to name something which does not have a specific name with the required properties with the help of this device? and (b) from the point of view of the speaker: can the listener plausibly find the hypothesis? These are pragmatic constraints, since their justification lies in the fact that there is a primary interpretation of a word, and the laws of successful communication are known to the interlocutors. On the one hand, the listener's constraint would give one part of an account of blocking and deblocking. In (6), the use violates the assumption that there is no appropriate specific name, so the reference shift seems unmotivated. In (7) this shift is justified, since an additional characteristic can be hypothesized, distinguishing the two words. The mechanism is in each case based on something like Gricean Maxims yielding discourse implicatures (Copestake & Briscoe 1995), probably because more specific explanations are more informative. For this explication to go through, the explanation via a name should be relatively more specific according to the specificity-of-explanations criterion. The definition of specificity (7) will be shown below to be applicable in this case. On the other hand there should be a general mechanism to compute the plausibility of some interpretation relative to others which reflects the effort. Computing the plausibility of an interpretation can use different degrees of assumption, e.g. how much inference using world knowledge this computation involves. The extreme case is probably the interpretation of unknown words. If e.g. the context of interpretation strongly suggests the meat-interpretation of an unknown word, then the constraints of the rule (23) can be used as a default characterisation of the aspect relation associated with the hypothesis animaltomeat. Thus, if somebody does not know what a badger is, s/he still can infer that it is an animal, and that the edible part of its stuff is meant, though it is impossible to say which part it is; delicacy, ham, and eaten bias the explanation towards the edible parts. The bias could be accounted for using something like the coherence measure of Ng & Mooney (1990).

To return to the account of explanation interaction in terms of Gricean maxims. One strategy of the listener is to assume that the speaker is as informative as necessary and possible. In other words, s/he is trying to be as specific as possible in the sense of Def. 6 of the specificity of solutions. The solutions in the case of the name vs. extended sense possibilities are solutions to the choice of words. Given that this is known to the speaker, s/he would normally act this way. Acting contrary to it should be justifiable, i.e. explainable in its own right. In (6), the use violates the assumption that there is no specific name with appropriate properties, and there is no evident justification. In (7) this use is justified, since an additional characteristic of the object referred to can be computed on the basis of the contextual information: pig is intended to refer to the generally not edible parts of the pig which are in the sausages. So the use of pig has a somewhat different explanation than in the normal case. This use is justified from the point of view of Throat. The listener is invited to drop the constraint on edibility or to expand this notion: an invitation not understood by the customers of Throat, as a rule, to their disadvantage. This account employs the definition of specificity. To show that Def. 6 is applicable in this case, we must compare two solutions to the choice problem. If the solution with the name turns out to be more specific, the definition (7) is applicable. But since the problem of word choice is not quite the problem of explanation choice, we should compare the resources of explanation in general, i.e. explanation schemata and not their instances. Let (26) be the lexical entry for pork, and (27) the lexical entry for pig.

(26) Π = { pigmeat(X) }
     Γ = { pigmeat(X) ⇒ sfpork(X) }

(27) Π = { pig(X) }
     Γ = { pig(X) ⇒ sfpig(X), shift(X, Y) ⇒ sfpig(Y) }

Suppose our relevant conceptual knowledge is represented in (28), i.e. pigmeat is the meat a pig consists of, and it is a nameworthy aspect of pig that it consists of meat (meat abbreviates edible stuff here).
(28) Π = { usemeat(X, Y), pigtomeat(X, Y) }

(29) Γ = {
  pigmeat(Y) & usemeat(X, Y) ⇒ consistsof(X, Y) & pig(X) & meat(Y)
  pig(X) & pigtomeat(X, Y) ⇒ consistsof(X, Y) & meat(Y)
  pig(X) & consistsof(X, Y) & meat(Y) ⇒ aspof(X, Y, choice)
  pigmeat(Y) & consistsof(X, Y) & pig(X) ⇒ aspof(X, Y, choice)
}

Since the abductive task is that of choosing words, the context is fixed, e.g. to the constant choice by ct(choice). The rule (23) is also available in ct(choice) as (30).

(30) Δ = { animaltomeat(X, Y, choice) }
     C = {
       animaltomeat(X, Y, choice) & aspof(X, Y, choice) & ct(choice) ⇒ shift(X, Y)
       ct(choice) & animaltomeat(X, Y, choice) ⇒ count(X) & animal(X) & ¬(count(Y)) & edible(Y) & consistsof(X, Y)
     }

We have to show that the solution (31) is less general than the solution (32).

(31) ⟨ pigmeat(X), sfpork(u) ⟩
(32) ⟨ { pigtomeat(X, Y), pig(X), animaltomeat(X, Y, choice) }, sfpig(u) ⟩

To check this, assume that the discourse referent u is provided in the choice context choice. Consider the case where the choice context contains only the grammatical information that u is a discourse referent, discoursereferent(u), and we know nothing about its identity. In this case both solution (31) and solution (32) are applicable. But if we modify this contingent fact and assume that we already know that pig(u), (31) is no longer applicable, since pigs are not pork. Solution (32) is still applicable, because if X is instantiated to u, the hypothesis pig(u) suffices to explain sfpig(u). Thus, the account of blocking/deblocking sketched above can use Def. 6 with open formulas.
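The applicability check behind (31) and (32) can be sketched in ground form. The shift chain of (29)/(30) is collapsed into a single clause, v is a hypothetical meat referent, and 'pigs are not pork' is encoded as an explicit inconsistency:

```python
from itertools import combinations

# Ground sketch of the blocking check on (31)/(32) for a referent u.
RULES = [
    ({"pigmeat(u)"}, "sfpork(u)"),                  # (26)
    ({"pig(u)"}, "sfpig(u)"),                       # (27), direct reading
    ({"pig(u)", "pigtomeat(u,v)"}, "shift(u,v)"),   # via (29)/(30), collapsed
    ({"shift(u,v)"}, "sfpig(v)"),
]
MUTEX = [{"pig(u)", "pigmeat(u)"}]                  # pigs are not pork
ASSUMABLES = ["pigmeat(u)", "pig(u)", "pigtomeat(u,v)"]

def close(atoms):
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in RULES:
            if body <= closed and head not in closed:
                closed.add(head)
                changed = True
    return closed

def consistent(atoms):
    closed = close(atoms)
    return not any(bad <= closed for bad in MUTEX)

def applicable(obs, context):
    """Is some scenario, consistent with the context, an explanation of obs?"""
    for k in range(len(ASSUMABLES) + 1):
        for s in combinations(ASSUMABLES, k):
            atoms = set(s) | set(context)
            if consistent(atoms) and obs in close(atoms):
                return True
    return False

print(applicable("sfpork(u)", []))            # True: neutral context
print(applicable("sfpig(u)", []))             # True
print(applicable("sfpork(u)", ["pig(u)"]))    # False: (31) blocked by pig(u)
print(applicable("sfpig(u)", ["pig(u)"]))     # True: (32) still applicable
```

With no information about u, both solutions apply; once pig(u) is known, the pigmeat(u) hypothesis is inconsistent and only the pig-based solution survives, matching the generality comparison in the text.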
Anatoli Strigin I 8 5
·
.
. 6 A COMPARIS O N O F S OME RULE PROPERTIES UNDER THE TWO APPROAC HES A short comparison of the two approaches is now in order. Though the paper is largely programmatic, and merely sketches a formalization of the notion of a transfer function of Nunberg (1979) in the context of an abductive theory of natural language interpretation, there are two points clearly relevant to the proposal which should be discussed, however briefly,
·
·
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Consider now the example of this sense extension in Russian. A number of people suggested that morphologically marked semantics-changing derivation rules do not really differ from sense extension rules. They can thus be compared for specificity. The restriction of the sense extension rule to the holistic meaning is easily explainable if assumptions made to satisfy Gricean maxims can be conventionalized. Since the derivation rule is restricted to portion-wise sold animals, solutions using it would have been more specific than those using sense extension. The sense extension rule must aquire an additional condition, to be used nevertheless. The contrary to the restriction is the simplest conventionalizable addition, in a sense. For small animals the additional restriction is often not distinguish able from the holistic version; hence there is no violation of the maxims if both rules are used. This account of rule deblocking refers to pragmatic inferences which may be ad hoc, for single words, or conventionalized for. a class of words. But it is not applicable in a straightforward way to syntactic rules because of . that. Specificity ranking can be defined on different kinds of rules. Inasmuch as such ranking is used in syntax, the similarity is very interesting and suggestive, and may be a manifestation of some deeper information processing property of human intelligence, but the solutions to the problems of interpretation should not be automatically transferred to syntax, contrary to what I think is the position of Briscoe et al. (1995). In particular, the morphological phenomenon when different tense-formation rules {e.g. dreamed/dreamt) coexist is not deblocking in the sense explicated here, since deblocking involves pragmatic inferences extending some attributes of � concept, as the discussion of (7) indicated, and is not available in the -syntax. Another feature of the account is that it presupposes contextual variation of concepts. 
Principles of such conceptual modification in a context are postulated by psychologists, but are not very dear (Barsalou 1 992 reports on such effects).
186
Lexical Rules
as
Hypotheses Generators
i.e. contextual and language-particular licensing of the transfer. I will not be able to provide a theoretical contribution to these problems and have to restrict myself to some remarks.

Since any aspect of a concept is in principle available as a source of a hypotheses space extension for rule generation, but not all potential extensions are observed in every language, and if they are, then not in all contexts, such extensions must be licensed by some contextually relevant and language-relevant factors. As far as language-particular preferences for sense extensions are concerned, the position of Nunberg seems adequate: cultural salience can lead to a sense extension rule. Unfortunately, no experiment in rule generation is possible here, since sense extension rules are learned when learning a language, and not created anew. Once a rule is highly conventionalized, it can be placed in the lexicon in the sense of being more readily accessible. The objection of Lascarides & Copestake (1998), that a non-trivial interface is required between the sort of formalism necessary to implement open-ended inference of the kind proposed here and the syntactic representation, is based on the assumption that some reasoning takes place in the lexicon, and that it is better to keep this reasoning simple. Programmatically, Poole systems and the context theory can achieve exactly this, limiting the depth of abduction.

Contextual dependence of a rule is another important matter. Under the lexicon rule approach the use of context is limited to overriding default information. Arguably, there is a better use, for which the abductive pragmatic approach is better suited. Consider the case of juice stains in (12). It could be assumed that all the aspects of a concept, taken in isolation, have logically equal chances either to be an interpretation for the s-form or not, but acquire different preferences depending on the interpretation context. The rating of a hypothesis in this context should depend on its salience there. If some interpretation hypotheses provide minimal explanations of an observation in the interpretation context, they should become more salient. Thus, in the context of washing, the juice stain interpretation of strawberry should be very salient. Then the corresponding rule will have a high probability of being chosen. The question is how to measure salience to achieve the kind of reasoning described. Intuitively, the interpretation becomes more likely because it explains part of the context. If you wash your trousers, this might be related to a stain, though other explanations are possible. Suppose we consider the salience of the stain hypothesis to be proportionate to a probability estimate of washing the trousers as a result of a stain on them. The new conditional probability of the hypothesis might be taken to reflect its chances of entering into an explanation of the word strawberry (its interpretation). But if its salience is only proportionate to the probability, it is actually a likelihood function (Edwards 1992). Thus, there are two roads to compute the saliences of the hypotheses in the context: either to treat them as probabilities directly, or to take their negative logarithms (log-likelihoods) and treat them as cost assignments. Probabilistic abduction as proposed in Poole (1993) is a way to compute the conditional probabilities of hypotheses in a given context. Cost-based abduction with probabilistic cost semantics, proposed by Charniak & Shimony (1994), could implement the log-likelihood-based version, although a flexible reassignment of costs in a context is needed.

Lexicon rules can easily accommodate all kinds of syntactic and morphological effects accompanying sense extension. It is then possible to claim a basic similarity of sense extensions and derivation. This is not a problem for the pragmatic rules, either, since rules can occur in lexical entries of affixes. Blocking and deblocking can be modelled under both approaches, but the interpretation of deblocked items can be naturally handled under the pragmatic rules approach, whereas Briscoe & Copestake (1996), where a frequency-of-occurrence-based account of blocking is proposed, relegate the problem of the extra implicatures of the use of a blocked form, and the generation of this form itself, to the interface with pragmatics. Briscoe & Copestake (1996) also propose the use of statistical data to grade the rules relative to each lexical entry in the domain, and to use the statistical information to guide rule application. While this is of great potential interest for computational linguistics, the approach does not cover the contextual dependence of the rules discussed above. Another recent attempt to integrate probabilities, pragmatics, and a lexicon which is screened off from pragmatics is Copestake & Lascarides (1997). But the approach proposed there still offers no possibility of assigning different probabilities to senses in different contexts.8 The dependence on frequency of observed readings can be reflected under the cost-assignment extension of abduction indicated above, too, so that rule probability for each lexical entry can, in principle, be registered on the aspof predicate of the corresponding aspect of the concept. However, this discussion shows that an abductive treatment along the lines of this paper is still more like a research programme, the work of Hobbs and his associates notwithstanding.

ANATOLI STRIGIN
Humboldt University, Berlin
Jägerstrasse 10-11
10117 Berlin
Germany
e-mail: [email protected]

Received: 28.08.97
Final version received: 18.06.98
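The two roads for scoring hypotheses described above can be made concrete in a small sketch. The hypothesis names and probability values below are invented for illustration; the point is only the formal equivalence: maximizing probability and minimizing negative-log cost select the same hypothesis, and multiplying the probabilities of a composite explanation corresponds to summing its costs.

```python
import math

# Toy hypothesis space with invented probabilities for the readings of
# "strawberry" in a washing context.
hypotheses = {
    "juice-stain reading": 0.6,
    "fruit reading": 0.3,
    "plant reading": 0.1,
}

# Road 2: negative log-likelihoods treated as cost assignments.
costs = {h: -math.log(p) for h, p in hypotheses.items()}

best_by_prob = max(hypotheses, key=hypotheses.get)
best_by_cost = min(costs, key=costs.get)
assert best_by_prob == best_by_cost  # the two roads agree

# Composite explanations: a product of probabilities corresponds to a
# sum of costs, which is what cost-based abduction exploits.
p_joint = hypotheses["juice-stain reading"] * hypotheses["fruit reading"]
cost_joint = costs["juice-stain reading"] + costs["fruit reading"]
assert abs(-math.log(p_joint) - cost_joint) < 1e-12

print(best_by_prob)
```

The flexible contextual reassignment of costs that the text calls for would amount to recomputing the `costs` table whenever the conditional probabilities change with the interpretation context.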
NOTES

1 Nunberg & Zaenen (1992) appeal to Gricean maxims to justify their opinion that a specific description is to be preferred to a vague one 'where no ulterior motives intrude'. They do not specify how these motives are figured out and taken into account by the listener.

2 Though cultural salience is a vague notion, it is fairly clear in prototypical cases. They register a restricted number of cases. If these cases serve as the basis of a generalization, the pragmatic approach allows for predictions about how new nouns, i.e. names of unknown birds, new loans, etc., are going to be treated, depending on how well they fit the generalization.

3 The example is based on the hypothesis that this is the direction of the rule application. Some evidence for the hypothesis are forms like lemon-tree in English or Apfelbaum (apple-tree) in German. This is also the direction adopted in Copestake & Briscoe (1995); Apresjan assumes the reverse.

4 The term is used by Nunberg & Zaenen (1992) to describe specifications of a general relation in a context depending on world knowledge.

5 Note that the hypotheses in TI are not statements about the real world, but assumptions about what can be a possible description of the world.

6 A reviewer noted that the default logic in Poole systems is insufficient. Indeed, the relation of sceptical default consequence in Poole systems with constraints is equivalent to the prerequisite-free default logic of Reiter, cf. Reiter (1980), Dix (1992). However, the relation computed here is not that of sceptical default consequence, but of explanation. The objection does not apply. Pros and cons of abductive explanation vs. default logic should be evaluated in each application case.

7 See Chaffin (1992) for the polysemy of the part-of relation.

8 In an attempt to provide an interpretation for the notoriously difficult domain of compounds, Copestake & Lascarides (1997) suggest that interpreting a compound should be controlled by the associated probabilities. The relations interpreting a compound figure in the lexicon as interpretation schemata. These are ordered by a specificity hierarchy. If no specific interpretation is possible, the most general relation is considered and treated as an anaphor to be resolved pragmatically from the context. The choice between several possible compatible readings is guided by the principle that words are assigned the most probable sense that produces a well-defined discourse update. The distinctive feature of the proposal is the method of computing probabilities. The probability of a sense of a word depends on the frequency of this sense in the corpus used to compute the frequency distributions. The interpretation schemata defining the senses of the compounds are assigned weights which reflect their productivity. For a potential compound which does not occur in the corpus, the probabilities of the senses will be proportional to the productivity rankings. If a compound is observed in the corpus, the probabilities of its senses are computed via their frequency in the corpus, with a residual probability distributed between those senses which are not observed, again in proportion to the productivity rankings. Thus, there is no way to take into account the influences of the context on the probability of a reading. Hence if two readings are compatible, it is invariably the more frequent that will be chosen. The closest we can get to reflecting the contextual influence is to index probabilities by the type of corpus used to compute them.
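The frequency-plus-productivity scheme attributed to Copestake & Lascarides (1997) in note 8 can be sketched as follows. The sense labels, productivity weights, counts, and the size of the residual mass are all invented for illustration; the original paper should be consulted for the actual proposal.

```python
# Sketch of the scheme in note 8: an unseen compound gets sense
# probabilities proportional to the productivity weights of its
# interpretation schemata; an observed compound gets corpus relative
# frequencies, with a residual mass spread over the unobserved senses
# in proportion to their productivity.

def sense_probs(productivity, counts=None, residual=0.1):
    senses = list(productivity)
    if not counts:
        # Compound unseen in the corpus: productivity rankings alone.
        total = sum(productivity.values())
        return {s: productivity[s] / total for s in senses}
    n = sum(counts.values())
    unseen = [s for s in senses if s not in counts]
    if not unseen:
        residual = 0.0  # nothing to reserve the residual mass for
    unseen_total = sum(productivity[s] for s in unseen)
    probs = {}
    for s in senses:
        if s in counts:
            probs[s] = (1 - residual) * counts[s] / n
        else:
            probs[s] = residual * productivity[s] / unseen_total
    return probs

# Hypothetical interpretation schemata with productivity weights.
prod = {"MADE-OF": 3.0, "LOCATED-IN": 1.0}
print(sense_probs(prod))                   # unseen compound: 0.75 / 0.25
print(sense_probs(prod, {"MADE-OF": 8}))   # observed: residual to LOCATED-IN
```

As the note observes, nothing in this computation is sensitive to the interpretation context: the closest one gets is to recompute the tables over a different type of corpus.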
REFERENCES

Apresjan, J. (1973), 'Regular polysemy', Linguistics, 5-32.
Barsalou, L. W. (1992), 'Frames, concepts and conceptual fields', in A. Lehrer & E. F. Kittay (eds), Frames, Fields, and Contrasts, Lawrence Erlbaum, Hillsdale, NJ, 21-74.
Briscoe, T. & Copestake, A. (1996), 'Controlling the application of lexical rules', Proceedings of the ACL SIGLEX Workshop on Breadth and Depth of Semantic Lexicons, Santa Cruz, CA.
Briscoe, T., Copestake, A., & Lascarides, A. (1995), 'Blocking', in P. Saint-Dizier & E. Viegas (eds), Computational Lexical Semantics, Cambridge University Press, Cambridge.
Chaffin, R. (1992), 'The concept of a semantic relation', in A. Lehrer & E. F. Kittay (eds), Frames, Fields, and Contrasts, Lawrence Erlbaum, Hillsdale, NJ, 253-88.
Charniak, E. & McDermott, D. (1985), Introduction to Artificial Intelligence, Addison-Wesley, Reading, MA.
Charniak, E. & Shimony, S. E. (1994), 'Cost-based abduction and MAP explanation', Artificial Intelligence, 66, 345-74.
Copestake, A. & Briscoe, T. (1995), 'Semi-productive polysemy and sense extension', Journal of Semantics, 12, 15-67.
Copestake, A. & Lascarides, A. (1997), 'Integrating symbolic and statistical representations: the lexicon pragmatics interface', Proceedings of the Association for Computational Linguistics 1997, Madrid.
Dix, J. (1992), 'Default theories of Poole type and a method for constructing cumulative versions of default logic', Proceedings of the 10th ECAI, Wiley & Sons, New York, 289-93.
Edwards, A. W. F. (1992), Likelihood, Johns Hopkins University Press, Baltimore, MD.
Green, G. M. (1974), Semantics and Syntactic Regularity, Indiana University Press, Bloomington, IN.
Hayes, P. J. (1980), 'The logic of frames', in D. Metzing (ed.), Frame Conceptions and Text Understanding, Walter de Gruyter, Berlin.
Hobbs, J., Stickel, M., Appelt, D., & Martin, P. (1993), 'Interpretation as abduction', Artificial Intelligence, 63, 69-142.
Lascarides, A. & Copestake, A. (1998), 'Pragmatics and word meaning', MS.
Levesque, H. J. (1986), 'Knowledge representation and reasoning', Annual Review of Computer Science, 1, 225-87.
McCarthy, J. (1993), 'Notes on formalizing context', Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence.
McCawley, J. D. (1968), 'The role of semantics in a grammar', in E. Bach & R. T. Harms (eds), Universals in Linguistic Theory, Holt, Rinehart & Winston, New York, 124-69.
McCawley, J. D. (1973), Grammar and Meaning, Taishukan Publishing Company, Tokyo.
Makinson, D. (1994), 'General patterns in nonmonotonic reasoning', in D. M. Gabbay, C. Hogger, J. Robinson, & D. Nute (eds), Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 3, Clarendon Press, Oxford.
Ng, H. T. & Mooney, R. J. (1990), 'On the role of coherence in abductive explanation', Proceedings of the Conference of the American Association of Artificial Intelligence, 337-42.
Nunberg, G. (1979), 'The non-uniqueness of semantic solutions: polysemy', Linguistics and Philosophy, 3, 2, 143-84.
Nunberg, G. (1995), 'Transfers of meaning', Journal of Semantics, 12, 109-32.
Nunberg, G. & Zaenen, A. (1992), 'Systematic polysemy in lexicology and lexicography', in K. Hannu Tommola, T. Salmi-Tolonen, & J. Schopp (eds), EURALEX 1992 Proceedings, Tampere, Finland, 387-96.
Peirce, C. S. (1992), Reasoning and the Logic of Things, ed. Kenneth Laine Ketner, Harvard University Press, Cambridge, MA.
Poole, D. (1985), 'On the comparison of theories: preferring the most specific explanation', Proceedings of the Ninth International Joint Conference on Artificial Intelligence, Los Angeles, CA, 144-7.
Poole, D. (1987), 'Variables in hypotheses', Proceedings of the Tenth International Joint Conference on Artificial Intelligence, Milan, Italy, 905-8.
Poole, D. (1988), 'A logical framework for default reasoning', Artificial Intelligence, 36, 27-47.
Poole, D. (1991), 'The effect of knowledge on belief: conditioning, specificity and the lottery paradox in default reasoning', Artificial Intelligence, 49, 281-307.
Poole, D. (1993), 'Probabilistic Horn abduction and Bayesian networks', Artificial Intelligence, 64, 81-129.
Pustejovsky, J. (1995), The Generative Lexicon, MIT Press, Cambridge, MA.
Reiter, R. (1980), 'A logic for default reasoning', Artificial Intelligence, 13, 81-132.
Reiter, R. (1987), 'Nonmonotonic reasoning', Annual Review of Computer Science, 2, 147-86.
Ruhl, C. (1989), On Monosemy: A Study in Linguistic Semantics, State University of New York Press, Albany, NY.
Russell, S. & Norvig, P. (1995), Artificial Intelligence: A Modern Approach, Prentice Hall, Englewood Cliffs, NJ.