Text Representation
Human Cognitive Processing is a forum for interdisciplinary research on the nature and organization of the cognitive systems and processes involved in speaking and understanding natural language (including sign language), and their relationship to other domains of human cognition, including general conceptual or knowledge systems and processes (the language and thought issue), and other perceptual or behavioral systems such as vision and nonverbal behavior (e.g. gesture). 'Cognition' should be taken broadly, not only including the domain of rationality, but also dimensions such as emotion and the unconscious. The series is open to any type of approach to the above questions (methodologically and theoretically) and to research from any discipline, including (but not restricted to) different branches of psychology, artificial intelligence and computer science, cognitive anthropology, linguistics, philosophy and neuroscience. It takes a special interest in research crossing the boundaries of these disciplines.
Editors
Marcelo Dascal, Tel Aviv University
Raymond W. Gibbs, University of California at Santa Cruz
Jan Nuyts, University of Antwerp
Editorial address
Jan Nuyts, University of Antwerp, Dept. of Linguistics (GER), Universiteitsplein 1, B 2610 Wilrijk, Belgium. E-mail: [email protected]
Editorial Advisory Board
Melissa Bowerman, Nijmegen; Wallace Chafe, Santa Barbara, CA; Philip R. Cohen, Portland, OR; Antonio Damasio, Iowa City, IA; Morton Ann Gernsbacher, Madison, WI; David McNeill, Chicago, IL; Eric Pederson, Eugene, OR; François Recanati, Paris; Sally Rice, Edmonton, Alberta; Benny Shanon, Jerusalem; Lokendra Shastri, Berkeley, CA; Dan Slobin, Berkeley, CA; Paul Thagard, Waterloo, Ontario
Volume 8
Text Representation: Linguistic and psycholinguistic aspects
Edited by Ted Sanders, Joost Schilperoord and Wilbert Spooren
Text Representation
Linguistic and psycholinguistic aspects
Edited by
Ted Sanders, University of Utrecht
Joost Schilperoord, University of Tilburg
Wilbert Spooren, Free University, Amsterdam
John Benjamins Publishing Company
Amsterdam / Philadelphia
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences - Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data
Text Representation : Linguistic and psycholinguistic aspects / edited by Ted Sanders, Joost Schilperoord, Wilbert Spooren.
p. cm. (Human Cognitive Processing, ISSN 1387-6724; v. 8)
Based on papers presented at a conference held July 1997, Utrecht University.
Includes bibliographical references and index.
1. Discourse analysis--Psychological aspects--Congresses. I. Sanders, Ted. II. Schilperoord, Joost. III. Spooren, Wilbert. IV. Series.
P302.8.T49 2001
401'.41--dc21 2001035506
ISBN 90 272 2360 2 (Eur.) / 1 58811 077 X (US) (Hb; alk. paper)
© 2001 - John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.
John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands
John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA
Table of Contents
Preface
1. Text representation as an interface between language and its users
   Ted Sanders & Wilbert Spooren
SECTION 1. ACCESSIBILITY IN TEXT AND TEXT PROCESSING
2. Accessibility theory: an overview
   Mira Ariel
3. The influence of text cues on the allocation of attention during reading
   Michelle L. Gaddy, Paul van den Broek & Yung-Chi Sung
4. Lexical access in text production: On the role of salience in metaphor resonance
   Rachel Giora & Noga Balaban
SECTION 2. RELATIONAL COHERENCE IN TEXT AND TEXT PROCESSING
5. Semantic and pragmatic relations and their intended effects
   Alistair Knott
6. On the production of causal-contrastive although sentences in context
   Leo G. M. Noordman
7. Beyond elaboration: The interaction of relations and focus in coherent text
   Alistair Knott, Jon Oberlander, Michael O'Donnell, Chris Mellish
8. Unstressed en / and as a marker of joint relevance
   Henk Pander Maat
9. Argumentation, explanation and causality: an exploration of current linguistic approaches to textual relations
   Francisca Snoeck Henkemans
SECTION 3. FROM TEXT REPRESENTATION TO KNOWLEDGE REPRESENTATION
10. Constructing inferences and relations during text comprehension
   Arthur C. Graesser, Peter Wiemer-Hastings & Katja Wiemer-Hastings
11. Thinking about bodies of knowledge: Tests of a model for predicting thoughts
   Bruce K. Britton, Peter Schaeffer, Michael Bryan, Stacy Silverman & Robert Sorrells
SECTION 4. SEGMENTATION IN TEXT AND TEXT REPRESENTATION
12. Conceptual and linguistic processes in text production: interactive or autonomous?
   Joost Schilperoord
13. Subordination and discourse segmentation revisited, or: Why matrix clauses may be more dependent than complements
   Arie Verhagen
Subject index
Preface
The chapters of this volume are all based on papers presented at the International workshop on text representation: Linguistic and psycholinguistic aspects, held at Utrecht University in July 1997. We are indebted to the Centre for Language and Communication (CLC) and the Faculty of Arts of Utrecht University, the Center for Language Studies (CLS) and the Discourse Studies Group of Tilburg University, the Netherlands Organization for Scientific Research NWO and the Royal Academy of Sciences KNAW for their financial support of this workshop and to the Utrecht Institute of Linguistics UIL OTS (which now includes CLC) for logistic and financial support in the preparation of the manuscript. We are very grateful to Gerben Mulder, Annelous van Rijn and Martien Kroon for their invaluable help in editing this manuscript. We would also like to thank the many colleagues who acted as reviewers of the chapters:
Yves Bestgen, Diane Blakemore, Lucille Chanquoy, Liesbeth Degand, Jack DuBois, Charles Fletcher, Bruce Fraser, David Galbraith, Morton Ann Gernsbacher, Paul van den Hoven, Ed Hovy, Jan van Kuppevelt, Fons Maes, Bonny Meyer, Joanna Moore, Gisela Redeker, Manfred Stede, Mark Torrance, Carel Van Wijk, Luuk Van Waes, Jos van Berkum
Finally, we want to thank the editors of the Benjamins series Human Cognitive Processing, especially Jan Nuyts, for publishing this volume in their series.
Ted Sanders, Joost Schilperoord, Wilbert Spooren
's-Hertogenbosch / Utrecht, December 2000
CHAPTER 1
Text representation as an interface between language and its users
Ted Sanders & Wilbert Spooren
University of Utrecht / Free University, Amsterdam
1. From meaning to processes, from sentence to discourse
The theme of this volume is text representation, or more specifically the linguistic and psycholinguistic aspects thereof. In our view, a text representation is a cognitive entity: a mental construct that plays a crucial role in both text production and text understanding. In text production it is the basis for lexical retrieval and for producing and combining the discourse units. In text understanding it is the result of the decoding of the linguistic information in a discourse.
This book characterizes a field of study in which the two disciplines, linguistics and psycholinguistics, are growing together. Traditionally, the linguists' task was to connect linguistic forms with meanings. This was usually done within a simplex model of communication, in which isolated sentences were connected to interpretations. Gradually the insight grew that such an enterprise cannot be undertaken without concern for the language user who produces and interprets an utterance. Consequently, the cognitive representations and actions of producers and interpreters entered the linguistic model. This development was stimulated by the rise of cognitive psychology in the early 1960s, at the expense of behaviorism. Cognitivists regarded the cognitive processes proper as their object to be explained, rather than the product of such processes, that is, the behavior in which these processes resulted. And since the end of the 1970s the view on linguistic communication has been extended in such a way that not only sentences, but also extended discourse is the object of study.
We can describe the present situation in the following schematic view on communication through language, although we realize that it is an oversimplified model that in fact reflects the 'conduit metaphor of communication' (Reddy 1979).
Figure 1. Communication through written text: From cognitive representation to text, to cognitive representation (as seen by Laura van Beek, 9 years old)
In this view, there is a producer who has a cognitive representation of what she intends to communicate; this is formulated in a linguistic code, called the text, and this text is decoded by the interpreter, who can be said to understand a text once he has made a coherent representation of it. This view fits theories that describe the link between the structure of a text as a linguistic object, its cognitive representation and the processes of text production and understanding.
This schematic view also shows very clearly that research on the representation of language and text should by its very nature be an interdisciplinary enterprise. As such, one might expect a strong connection between linguistic analyses and psycholinguistic research. However, we believe it is fair to say that researchers have not always appreciated this interdisciplinary aspect, and have often worked in isolation, thereby maintaining the traditional borders between the two disciplines: linguists describe language structures, psycholinguists study mental representations and processes.
In recent years, this situation has improved significantly (see, among others, Gernsbacher & Givón 1995). Topics like interclausal and inter-sentence relations (for instance, the contributions to Fayol & Costermans 1997), information distribution (Chafe 1994) and the structure of complete texts (Mann & Thompson 1992) have received serious attention in descriptive studies and, to a lesser extent, in studies of text processing. Furthermore, linguists pay more attention to the cognitive aspects of language use (compare the emergence of cognitive linguistics), and psycholinguists give more serious consideration to the linguistic complexity of the object under study. This volume is intended to contribute to the information exchange among researchers from these disciplines.
Before giving an overview of the content of the different contributions to this volume, we first highlight themes we believe to be central to the cognitive and linguistic study of text representation. We start with three general tendencies in research on text representation. Then we introduce two linguistic characteristics that constitute a text, which will be the object of study in this volume.
2. Three major themes in research on text representation
2.1 Multiple representations
So far, we have been using the term 'the text representation'. This term is in fact an oversimplification. Both in linguistics and in psychology text representations are taken to be composite. In psycholinguistics we find this idea very explicitly present in the work of Kintsch and associates (Kintsch 1998, but also in Kintsch & van Dijk 1978 and in Van Dijk & Kintsch 1983; for an overview see also Singer 1990). Their research focuses on the receptive side of communication and states that readers make multiple representations of the sentences of a text: a surface code (a short-lived representation of the exact linguistic material in the sentences), a text base (containing the propositions expressed by the sentences and their interrelations), and a situation model (in which the linguistic material is integrated with the background knowledge of the reader).
In linguistics, the concept of multiple representations has also been developed: in formal semantics in the work of Kamp (see Kamp 1981; Heim 1982; Kamp & Reyle 1993), and in cognitive linguistics most explicitly in the work by Fauconnier (1985, 1994) on mental spaces. In general, the idea is that linguistic expressions are instructions to update the current mental representation, which is based on previous discourse, background knowledge and inferencing. Thus, expressions are considered to have a procedural meaning. Or, as Fauconnier (1994, p. xviii) has put it:
Language does not itself do the cognitive building - it "just" gives us minimal, but sufficient, clues for finding the domains and principles appropriate for building in a given situation. Once these clues are combined with already existing configurations, available cognitive principles, and background framing, the appropriate construction can take place, and the result far exceeds any overt explicit information.
2.2 Underspecification of mental representations
The quote from Fauconnier brings us to the second important theme of recent research on text representations, the underspecification of mental representations. Contrary to what is maintained in the standard coding theory of meaning, it is fairly generally accepted in many branches of linguistics and psychology that what an utterance means cannot in any easy, transparent and compositional way be connected to the meaning of the individual elements in the utterance and their interrelation. An utterance codes only part of its meaning into explicit linguistic material, the rest having to be provided by inferencing. In linguistics this point has been formulated forcefully by Sperber and Wilson (1992, Chapter 1) and Fauconnier (1994, Introduction), but it is in fact prominent in most descriptive accounts of coherence (see Bublitz, Lenk & Ventola 1999, for a recent overview). It is even one of the major tenets, in a much more radical¹ form, in, e.g., conversation analysis (cf. Pomerantz & Fehr 1997). The same linguistic items will be interpreted differently in different situations and contexts, and hence any adequate theory of meaning will have to allow for rich inferential mechanisms. The Gricean program (e.g., Grice 1975) is one such attempt to provide the necessary inferential power: Under the assumption of cooperativeness, participants in a conversation will generate implicatures in order to extend the literal meaning of an utterance and be able to make a coherent representation of the discourse. This inferential mechanism has been explored systematically by Levinson (1991), for instance to account for the binding properties of anaphors. In the field of text linguistics, Spooren (1997) uses a similar mechanism to account for so-called underspecified uses of connectives (cases where coherence relations are not (completely) matched by the meaning of the connectives that occur in the text).
In psychology there are numerous findings demonstrating the underspecification of linguistic material. For instance, Graesser, Singer, and Trabasso (1994) have suggested that in narrative texts causal connections function as a default, thus allowing for coherence relations to remain unmarked. Noordman and Vonk (1998) and Sanders and Noordman (2000) suggest that such a view is promising for expository texts as well. In the latter study it is found that causal relations lead to faster processing of the connected information, but still lead to a more integrated representation. Taking the role of non-linguistic factors, such as reader's characteristics, into account, Noordman and Vonk (Noordman, Vonk & Kempff 1992; Noordman & Vonk 1992) have shown that depth of processing of causal relations is dependent on the reader's goals and knowledge.
2.3 Dynamic representations
A third recurrent theme is that text representations are constructed dynamically: The effect of a language element on a representation is dependent on the current state of that representation, which is updated incrementally.² Most of the current formal semantic systems (Situation Semantics, File Semantics, Discourse Representation Theory, etc.) specifically incorporate this aspect. In psycholinguistic text production and reception systems, incrementation and dynamicity are omnipresent (see Andriessen, De Smedt & Zock 1996; Garnham 1996; Sanders & Van Wijk 1996; Schilperoord 1996). The vocabulary used to capture this aspect of text representation is usually that of connectionism. Although the status of connectionist models as theories of natural language production and interpretation is under dispute (cf. Levelt 1989, p. 20), they promise to capture quite easily the flexibility needed to model the temporal course of discourse comprehension and production. The insight that the cognitive processes of text production and interpretation can be modeled as dynamic processes in which activation fluctuates, and that this process is influenced, or even to a large extent determined, by the linguistic characteristics of the text, raises one of the most important challenges at the intersection of linguistic and psycholinguistic studies of discourse. Fortunately, there is also an ongoing use of sophisticated empirical methods which enable researchers to ask and find answers to quite precise questions. For instance, a dynamic view on the process of discourse comprehension leads to the expectation that while a reader proceeds through the text, the activation of concepts, facts and events as parts of a discourse representation fluctuates constantly. So, hypotheses concerning activation patterns can be tested with on-line methods like reading time registration, naming tasks or eye movement registration (see Haberlandt 1994, for an overview of such methods).
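The idea of incrementally updated, fluctuating activation can be made concrete with a deliberately simple sketch. It is a toy illustration under invented assumptions, not an implementation of any of the comprehension models cited in this chapter: the function names, the decay rate and the boost size are all hypothetical. Each sentence is reduced to the set of concepts it mentions; activation of every known concept decays at each step, and mentioned concepts receive a boost.

```python
# Toy illustration only: DECAY and BOOST are invented values, not parameters
# taken from any published model of discourse comprehension.
DECAY = 0.5   # hypothetical fraction of activation retained per sentence
BOOST = 1.0   # hypothetical activation added when a concept is mentioned


def update_activation(activation, mentioned):
    """Return the new activation state after reading one sentence.

    The result depends on the current state (decayed old activation)
    and on the incoming sentence (boost for mentioned concepts).
    """
    concepts = set(activation) | set(mentioned)
    return {
        c: activation.get(c, 0.0) * DECAY + (BOOST if c in mentioned else 0.0)
        for c in concepts
    }


def read_text(sentences):
    """Process sentences in order and record the state after each one."""
    state = {}
    history = []
    for mentioned in sentences:
        state = update_activation(state, mentioned)
        history.append(dict(state))
    return history


if __name__ == "__main__":
    # Each sentence is represented only by the concepts it mentions.
    text = [{"guard", "prisoner"}, {"prisoner"}, {"shop"}]
    for step, state in enumerate(read_text(text), start=1):
        print(step, sorted(state.items()))
```

In this sketch a concept that is mentioned again gains activation while unmentioned concepts fade, which is the qualitative pattern that on-line measures are meant to probe; the numbers themselves carry no empirical weight.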
Eventually, the fluctuating activation patterns settle into a relatively stable memory representation of the text. Several discourse comprehension models are based on these insights and empirical findings, such as the Structure Building Framework (Gernsbacher 1990), the Landscape model of Reading (Van den Broek, Young, Tzeng & Linderholm 1998) and the Construction-Integration model (Kintsch 1998). Questions of how exactly this activation fluctuates, and how the activation is influenced by the linguistic characteristics of the text, are currently major research questions that are partly addressed in contributions to this volume, among others by Gaddy, Van den Broek & Sung in Chapter 3.
It is important to note that there is a similar tendency in research on production, even though there is in general less attention to production studies, as Levelt (1989) notices in his handbook Speaking, because of the bias in psycholinguistics towards perception research, at the cost of production research. Along the same lines, Kintsch, presenting an overview of discourse psychological work (1994, p. 728), remarks that "many psychological studies have concerned themselves with this problem in the past few years, although overwhelmingly with the comprehension rather than the production side." Where empirical findings such as longer or shorter reading times of segments are taken to indicate the level of activation of a concept being processed during text understanding, the on-line registration of pause times opens a promising route to gain further insight into the on-line processes of text production. Schilperoord (1996) has used the method of analyzing the location and duration of pauses during written discourse production in an attempt to open up the 'black box' of a discourse producer's cognitive representation. He found that text producers tend to pause longer before segments located high in a structural hierarchy of the text under production than before segments located low in such a hierarchy.
If we assume that differences in pause time reflect differences in cognitive effort needed to retrieve information from Long Term Memory, then it can be hypothesized that the hierarchical structure of discourse is a crucial factor in determining the on-line level of accessibility of information (Schilperoord & Sanders 1997, 1999). This line of work, in which a cognitively inspired text-analysis (Sanders & Van Wijk 1996) is combined with on-line psycholinguistic research methods, is an example of how the combination of linguistic and psycholinguistic methods contributes to the development of integrated theories of language structure and language processes. Similar tendencies can be found in research in Cognitive Linguistics, a research paradigm not specifically aimed at the discourse level, on issues like the polysemy of prepositions (Sandra & Rice 1995), epistemic modality (J. Sanders & Spooren 1996) or the study of metaphor and figurative speech (see among many others, Gibbs 1994, 1996). In this volume, Chapter 4 by Giora and Balaban is another example of such a study on metaphor.
3. Two constituting principles of text: referential and relational coherence
Now that we have discussed general properties and tendencies in research on text representation, an obvious question emerges: What are the exact characteristics of the text, the object of representation? We discuss the ones we believe to be most important, i.e. those that determine a set of sentences to be a text rather than a loose set of sentences.
3.1 What makes a text a text?
A constituting characteristic of texts is that they show connectedness. The question of how to characterize this connectedness is generally considered crucial in discourse studies. A dominant stance is that coherence best accounts for this connectedness, where coherence is a characteristic of the representation rather than of the text itself. In other words, coherence is considered a mental phenomenon; it is not an inherent property of a text under consideration. Language users establish coherence by relating the different information units in the text. The notion of coherence has a prominent place in both (text-)linguistic and psycholinguistic theories of text and discourse. Although this is not a particularly new view of coherence - it is a dominant thesis in most recent work (see, among many others, Van Dijk & Kintsch 1983; Garnham & Oakhill 1992; Hobbs 1990; Noordman & Vonk 1997; Sanders, Spooren & Noordman 1992) - it is a crucial starting point for theories that aim at describing the link between the structure of a text as a linguistic object, its cognitive representations and the processes of text production and understanding. And in our view it is this type of theory, located at the intersection of linguistics and psycholinguistics, that could lead to significant progress in the field of discourse studies. Generally speaking, there are two respects in which texts can cohere:
1. Referential coherence: units are connected by repeated reference to the same object;
2. Relational coherence: text segments are connected by establishing coherence relations like CAUSE-CONSEQUENCE between them.
In this volume the role of both types of coherence in text processing is discussed. Now that we have identified the types of representation and the constituting principle of text, we can state a central theme of this volume in different terms: A major issue is the relationship between the linguistic surface code (what Givón 1995 calls 'grammar as a processing instructor') and the meaning representations. Both coherence phenomena under investigation - referential and relational coherence - have clear linguistic indicators that can be taken as processing instructions, which will typically affect the surface representation. For referential coherence these are anaphoric devices such as pronouns, and for relational coherence these are connectives and (other) lexical markers of relations.
3.2 Referential coherence
The relevant linguistic indicators for referential coherence are pronouns and other devices for anaphoric reference. Ever since the seminal work of linguists like Chafe (1976) and Prince (1981), both functional and cognitive linguists have argued that the grammar of referential coherence can be shown to play an important role in the mental operations of connecting incoming information to the existing mental representations. For instance, referent NPs are identified as either those that will be important and topical, or as those that will be unimportant and non-topical. Hence, topical referents are persistent in the mental representation of subsequent discourse, whereas the non-topical ones are non-persistent. Recently, more and more empirical data from corpus studies have become available which underpin this cognitive interpretation of referential phenomena, following a route guided by functional linguists such as DuBois (1980). In a distributional study, Givón (1995), for instance, shows that in English, the indefinite article a(n) is typically used to introduce non-topical referents, whereas topical referents are introduced by this. In addition, there is a clear interaction between grammatical subjecthood and the indefinite article this: most this-marked NPs also appear as grammatical subjects in a sentence, while a majority of a(n)-marked NPs occur as non-subjects. Across languages there appears to be a topic persistence of referents; in active-transitive clauses the topic persistence of subject NPs is systematically larger than that of object NPs.
In several publications Ariel (1988, 1990) has argued that regularities in grammatical coding should indeed be understood to guide processing. She has studied the distribution of anaphoric devices and she has suggested that zero anaphora and unstressed pronouns co-occur with high accessibility of referents, whereas stressed pronouns and full lexical nouns signal low accessibility. This co-occurrence can easily be understood in terms of cognitive processes of activation: high accessibility markers signal the default choice of continued activation of the current topical referent. Low accessibility anaphoric devices such as full NPs or indefinite articles signal the terminated activation of the current topical referent, and the activation of another topic. Ariel (1988, 1990) has argued that binding conditions on the distribution and interpretation of pronominal and anaphoric expressions actually are the 'grammaticalized version' of cognitive processes of attention and accessibility of concepts that are referred to linguistically. Speakers encode the degree of accessibility of mental structures in several ways. Each referential expression is a kind of retrieval device for the listener. In fact, Ariel proposes to define referring expressions in terms of a processing procedure: zero-anaphors and pronominal expressions encode highly accessible concepts, whereas lexical anaphors refer to less accessible referents (see also Chafe 1994, for a processing view on linguistic structures of 'given' and 'new').
In experimental research on text processing, quite some work has been done which can be taken to demonstrate the 'psychological reality' of linguistic indicators of referential coherence. On-line studies of pronominal reference have resulted in the formulation of cognitive parsing principles for anaphoric reference (cf. Garrod & Sanford 1994; Sanford & Garrod 1994). For instance, it is easier to resolve a pronoun with only one possible referent, and it is easier to resolve pronouns with proximal referents than distant ones. As for the time course, eye fixation studies have repeatedly shown that anaphoric expressions are resolved immediately (e.g., Carpenter & Just 1977; Ehrlich & Rayner 1983).
Consider an example like (1).
(1) a. The guard mocked one of the prisoners in the machine shop.
    b. He had been at the prison for only one week.
When readers came upon ambiguous pronouns such as he in (1b), they frequently looked back in the text. More than 50% of these regressive fixations were to one of the two nouns in the text preceding the pronoun, suggesting that readers indeed attempted to resolve the pronoun immediately. As for the meaning representation, it has been shown that readers have difficulty understanding the text correctly when the antecedent and referent are too far apart and reference takes the form of a pronoun.
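The view of referring expressions as processing instructions, described above for accessibility marking, can be sketched as a small lookup. This is only an illustration: the two-way split into 'high' and 'low' accessibility is a simplification of what is presented in the text as a matter of degree, and the category labels, dictionary and function name are hypothetical choices made for this sketch.

```python
# Illustrative sketch only. The marker categories follow the description in
# the text; the labels and the binary high/low split are simplifying assumptions.
ACCESSIBILITY = {
    "zero_anaphor": "high",
    "unstressed_pronoun": "high",
    "stressed_pronoun": "low",
    "full_np": "low",
}


def processing_instruction(marker_type):
    """Map a referring-expression type to the activation instruction it signals."""
    level = ACCESSIBILITY[marker_type]
    if level == "high":
        # Default case: keep the current topical referent active.
        return "continue activation of the current topical referent"
    # Low-accessibility markers signal a change of topic.
    return "terminate activation of the current topic and activate another referent"


if __name__ == "__main__":
    for marker in ACCESSIBILITY:
        print(f"{marker}: {processing_instruction(marker)}")
```

A lookup of this kind captures only the direction of the claimed correspondence; it says nothing about how distance, competition between referents, or thematic structure modulate the choice of expression.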
On a more global text level, research on the exact working of accessibility markers as processing instructions is rare, but a good example is Vonk, Hustinx and Simons (1992). They show the relevance of discourse context for the interpretation of referential expressions. Sometimes anaphors are more specific than would be necessary for their identificational function (for instance, full NPs are used rather than pronominal expressions). The authors convincingly argue that this phenomenon can be explained in terms of the thematic development of discourse: if a character is referred to by a proper name after a run of pronominal references, then the name itself serves to indicate that a shift in topic is occurring. Readers process the referential expressions differently, as becomes apparent from reading times. Whereas anaphoric reference modulates the availability of previously mentioned concepts, cataphoric devices change the availability of concepts for the text that follows. Gernsbacher (1990) and her colleagues have demonstrated the reader's sensitivity to this type of linguistic indicator of reference. They contrasted cataphoric reference by way of the indefinite a(n) versus definite this to refer to a newly introduced referent in a story. So the new referent egg was introduced either as 'an egg' or as 'this egg'. It was hypothesized that the cataphor this would signal that a concept is likely to be mentioned again in the following story and that therefore the this-cataphor results in a higher activation. Subjects listened to texts and were then asked to continue the text after the critical concept. They appeared to refer sooner and more often to a concept introduced by this than by an.
These and other results show that concepts that were marked as a potential discourse topic by this are more strongly activated, more resistant to being suppressed in activation, as well as more effective in suppressing the activation of other concepts (Gernsbacher 1990; Gernsbacher & Shroyer 1989). It is this type of finding that provides the psycholinguistic underpinning for the idea of 'grammar as a processing instructor'.
3.3 Relational coherence
So far, we have discussed examples of the way in which linguistic signals of referential coherence affect text processing. We now move to signals of relational coherence. In many approaches to discourse connectedness, coherence relations are taken to account for the coherence in readers' cognitive text representation (cf. Hobbs 1979; Mann & Thompson 1986; Sanders et al. 1992; Sanders, Spooren & Noordman 1993). Coherence relations are meaning relations which connect two text segments (i.e. minimally clauses). Examples are relations like CAUSE-CONSEQUENCE, LIST and PROBLEM-SOLUTION. These relations are conceptual and they can, but need not, be made explicit by linguistic markers. Below, we will first focus on the second aspect of relational coherence, that on the level of the surface code: linguistic markers such as connectives and signaling phrases. After that we will move to the level of the meaning representation: the nature of the relations themselves.
Ever since Ducrot (1980) and Lang (1984), there have been linguistic accounts of connectives as operating instructions. The basic idea is that a connective has the function of relating the content of the connected segments in a specific type of relationship. Anscombre and Ducrot (1977), for instance, analyze but as setting up an argumentative scale (for instance, the (un)desirability of John), with one segment tending towards the negative side of the scale and the other towards the positive side:
(2) John is rich, but dumb.
In Chapter 9, Snoeck Henkemans explores the relationship between this type of argumentative approach to connectives and the text-linguistic ones that were considered above. In his influential work on Mental Spaces, Fauconnier (1985, 1994) treats connectives as one of the so-called space-builders, linguistic expressions that typically establish new mental spaces. Mental spaces are mental constructs set up to interpret utterances, "structured, incremental sets [...] and relations holding between them [...], such that new elements can be added to them and new relations established between their elements" (Fauconnier 1994, p. 16). Other examples of space builders are prepositional phrases (In 1929, From her point of view) and adverbs (really, probably). An example of a connective acting as a space-builder is the if-then conditional, as in 'If I were a millionaire, my VW would be a Rolls' or 'If he had listened to his mother, this criminal would be a Saint'.
Expressions of the form if p then q set up a new mental space H in which p and q hold. So, if p is the space-builder, and my VW from the initial space is identified with the Rolls in the new space (for the detailed analyses see Fauconnier 1994, Chapters 3 and 4 and Sweetser 1996). Fauconnier argues that the solution to some of the problems of traditional semantics, such as opacity, presupposition and the like, falls out naturally from these mechanisms. In the same vein, Spooren (1989) has argued that but-coordinations typically function to contrast conflicting information coming from different perspectives, and that this may even affect the truth-conditional level. For instance, (3a) is possible, but (3b) is a contradiction (Spooren 1989, p. 69).
(3) a. Cassius Clay was shy, but Muhammed Ali wasn't.
    b. Muhammed Ali was shy, but Muhammed Ali wasn't.
This type of dynamic approach to connectives 'as processing instructors' is becoming more and more important, not least because of the rise of Cognitive Linguistics as a new branch on the linguistic tree.
Is there any psycholinguistic work showing the relevance of ideas like this? Indeed, in various on-line processing studies the function of linguistic markers is examined. These studies have primarily aimed at investigating the processing role of the signals per se, rather than more sophisticated ideas like the exact working of 'space building'. The experimental work typically includes the comparison of reading times of identical textual fragments with different linguistic signals preceding them. Recent studies on the role of connectives and signaling phrases show that these linguistic signals affect the construction of the text representation (cf. Deaton & Gernsbacher in press; Millis & Just 1994; Noordman & Vonk 1997; Sanders & Noordman 2000). Millis and Just (1994), for instance, investigated the influence of connectives like because immediately after reading a sentence. When participants had read two clauses that were either linked or not linked by a connective, they judged whether a probe word had been mentioned in one of the clauses. Recognition of probes from the first clause was consistently faster when the clauses were linked by a connective. The presence of the connective also led to faster and more accurate responses to comprehension questions. These results suggest that the connective does influence the representation immediately after reading. Deaton and Gernsbacher (in press) combined on-line and off-line measures to investigate readers' use of because. They found that two causally related clauses connected by because were read more rapidly than when they were presented without the conjunction. When the clauses were conjoined by because, the second clauses were also recalled more frequently in a prompted recall test.
Generally speaking, studies on the influence of linguistic markers on text representation show a rather inconsistent pattern. Sometimes linguistic markers give rise to better structure in free recall (Meyer, Brandt & Bluth 1980), to faster and more accurate reactions on a probe task, to faster and more accurate responses to comprehension questions (Degand, Lefevre & Bestgen 1999; Millis & Just 1994), and to better recall in a prompted recall task (Deaton & Gernsbacher, in press), but they do not lead to more information recalled (Britton, Glynn, Meyer & Penland 1982; Meyer 1975). At the same time, on-line data suggest that the presence of linguistic markers facilitates processing (Britton et al. 1982; Deaton & Gernsbacher, in press).
Thus far, we have discussed the role of connectives and signaling phrases in discourse processing. A preliminary conclusion might be that they can be treated as linguistic markers which instruct readers in how to connect the new discourse segment with the previous one (Britton 1994). In the absence of such instructions readers have to determine for themselves what coherence relation connects the incoming segment to the previous discourse. Such an inference process requires additional cognitive energy and results in longer processing times. If this idea has any validity, it implies that the coherence relations themselves would have a major influence on discourse processing as well. One might expect that the type of relation that connects two discourse segments, be it causal, additive, contrastive etc., affects the discourse representation.
Here we move into another area where the combination of text-linguistic and discourse-psychological insights has led to significant progress: the categorization of coherence relations. In the last decade, a significant part of research on coherence relations has focused on the question of how the many different sets of relations should be organized (Hovy 1990; Knott & Dale 1994; Pander Maat 1998; Redeker 1990; Sanders 1997a). Sanders et al. (1992, 1993) have started from the properties common to all relations, in order to define the 'relations among the relations', relying on the intuition that some coherence relations are more alike than others. For instance, the relations in (4), (5) and (6) all express (a certain type of) causality, whereas the ones in (7) and (8) do not. Furthermore, a negative relation is expressed in (7), as opposed to all other examples, and (8) expresses an enumeration or addition.
(4) The buzzard was looking for prey. The bird was soaring in the air for hours.
(5) The bird has been soaring in the air for hours now. It must be a buzzard.
(6) The buzzard has been soaring in the air for hours now. Let's finally go home!
(7) The buzzard was soaring in the air for hours. Yesterday we did not see it all day.
(8) The buzzard was soaring in the air for hours. There was a peregrine falcon in the area, too.
A dominant distinction in existing classification proposals is that between so-called content, ideational, external or semantic relations on the one hand, and presentational, internal and pragmatic relations, on the other hand. In the first type of relations, segments are related because of their propositional content, i.e. the locutionary meaning of the segments. They describe events that cohere in the world. The relation in (9) can be interpreted as semantic because it connects two events in the world; our knowledge allows us to relate the segments as coherent in the world. A relation like (9) could be paraphrased as "the cause in the first segment (S1) leads to the fact reported in the second segment (S2)" (Sanders 1997a).
(9) The neighbours suddenly left for Paris last Friday. So they are not at home.
(10) The lights in their living room are out. So the neighbours are not at home.
In (10), however, the two discourse segments are related because we understand the second part as a conclusion from evidence in the first, and not because there is a causal relation between two states of affairs in the world: it is not because the lights are out that the neighbours are not at home. The causal relation in (10) could be paraphrased as "the description in S1 gives rise to the conclusion or claim formulated in S2." Hence, in the second type of relation the discourse segments are related because of the illocutionary meaning of one or both of the segments. The coherence relation concerns the speech act status of the segments. If this distinction is applied to the set of examples above, the causal relation in (4) is semantic, whereas those in (5) and (6) are pragmatic. This systematic difference between types of relations is noted by many students of discourse coherence.
Still, there is quite a lot of discussion about the exact definition of a distinction like this (see e.g., Bateman & Rondhuis 1997; Degand 1996; Hovy 1990; Knott & Dale 1994; Knott 1996; Knott & Sanders 1998; Martin 1992; Moore & Pollack 1992; Oversteegen 1997; Pander Maat 1998; Sanders 1997a; Sanders & Spooren 1999 and several contributions to Couper-Kuhlen and Kortmann (2000)). At the same time, several researchers have come up with highly similar distinctions, and there seems to be basic agreement on the characteristics of the prototypical relations (Sanders 1997a). Moreover, very similar distinctions have been shown useful in describing the differential meaning of conjunctions (Sweetser 1990). Also, the saliency of categorizations like these has been shown in experiments in which, among other tasks, language users were asked to judge the similarity of relations. Still, the discussion on this issue is clearly continued in this volume, especially in Section 2 (Chapters 6 to 9).
One of the emerging tendencies in recent linguistic research on the classification of coherence relations is the relevance of the notions perspective and subjectification. In several influential publications, Ducrot has stressed the diaphonic nature of discourse. Even in monological texts, traces can be found of other 'voices', information that is not presented as fact-like, but from a particular point of view, either the current speaker's (subjectified information, in the terminology of J. Sanders & Spooren 1997) or another cognizer's (perspectivized information). Langacker has contributed much to the study of the linguistic effects of notions like subjectification (e.g., Langacker 1990) and this notion also seems valid for the study of coherence relations and connectives (Pander Maat & Sanders 1999; Verhagen 1995, and some other contributions to Stein & Wright 1995). Although perspective remains an elusive notion for linguistics and psycholinguistics alike, Fauconnier's mental space framework seems adequate in capturing this intriguing aspect of language (J. Sanders 1994; J. Sanders & Redeker 1996; Verhagen 2000).
If categorizations of coherence relations and connectives indeed have cognitive significance, they should prove relevant in areas like language development, both diachronic (grammaticalization, cf. Sweetser 1990; Traugott 1988; Traugott & Heine 1991) and synchronic (language acquisition), and discourse processing. In all three areas, substantial studies are under way (Evers-Vermeul 2000; Spooren & Sanders in prep.), and there already exists suggestive evidence. Research on first language acquisition shows that the order in which children acquire connectives reflects increasing complexity, which can be accounted for in terms of the relational categories mentioned above: ADDITIVES before CAUSALS, POSITIVES before NEGATIVES (see Spooren, Sanders & Visser 1994; Spooren 1997).
And in text processing, there is work on the role of (different types of) coherence relations in the construction of a meaning representation. However, at first sight, results of experimental studies on that issue provide a less clear picture than the one for the role of linguistic markers, especially when it concerns expository rather than narrative texts. Perhaps this situation is largely due to the fact that it is difficult to design reading experiments in which coherence relations or text structures are varied in a successful and independent way, while at the same time avoiding the use of ill-formed texts (what Graesser, Millis, and Zwaan 1997 have called "textoids"). Nevertheless, the idea that coherence relations affect text processing does get support from results of processing studies. Several studies suggest a processing difference between CAUSAL and non-CAUSAL relations. For instance, causally related events in short narratives are recalled better (Black & Bern 1981; Trabasso & Van den Broek 1985; Trabasso & Sperry 1985). Keenan, Baillet and Brown (1984) and Myers, Shinjo and Duffy (1987) demonstrated that the effect of causal connectedness on memory for sentences is greatest for moderate levels of causality. Also, causally related sentences are read faster (Haberlandt & Bingham 1978), and the reading time decreases when the causality increases (Keenan et al. 1984; Myers et al. 1987).
Fewer studies exist for expository text. One example is a study by Meyer and Freedle (1984). They claimed that differences exist in the amount of organization of different types of text structure. The better organizing types are COMPARISON, CAUSATION and PROBLEM-SOLUTION, whereas COLLECTION is a weaker organizing type. These structure types are rather similar to the often distinguished types of coherence relations. In a free recall experiment, Meyer and Freedle expected readers to reproduce more information from better organized types than from less organized ones. The results show that recall of the CAUSATION and COMPARISON passages was indeed superior to the recall of the COLLECTION passage (see Note 3).
Sanders and Noordman (2000) embedded a similar text segment in two different contexts. In one case it was a Solution to a Problem, in the second case the same segment was part of an addition. It was found that PROBLEM-SOLUTION relations lead to faster processing, better verification and superior recall. The authors conclude that the processing of a text segment depends on the relation it has with preceding segments. Perhaps the most interesting finding in this experiment is the contrast in the effect of the two independent variables: linguistic markers (implicit or explicit) and type of relation (PROBLEM-SOLUTION or LIST). Explicit marking of the relations resulted in faster processing, but did not affect recall.
However, verification data concerning the representation immediately after reading show an effect of the linguistic marker. This finding is quite similar to the effect Millis and Just (1994) found for the influence of because. Hence, it can be concluded that the relational marker has an effect during on-line processing, but that its influence decreases over time. This contrasts with the effect of the coherence relation, which is also manifest in the recall. This contrast is similar to another frequently observed finding in language comprehension: initially a reader or listener constructs the surface representation of a sentence, but after a short time interval only the meaning or gist of the message is retained. Sachs (1967) found this effect for the form and meaning of sentences which participants had to identify as identical or different. Results like these strongly support the idea that coherence relations are an indissoluble part of the cognitive representation itself, whereas linguistic markers like connectives and signaling phrases are merely expressions of these relations, which guide the reader in selecting the right coherence relation. This conclusion is highly compatible with a view on coherence in which linguistic markers, as part of the surface code, 'guide' the reader towards a coherent text representation (cf. Gernsbacher & Givón 1995; Graesser et al. 1997; Noordman & Vonk 1998).
So far, we have presented an overview of text-linguistic and psycholinguistic work on referential and relational coherence. This overview might suggest that discourse processing depends entirely on text characteristics such as the linguistic markers of referential and relational coherence. That is not the case. It is rather plausible that the role of coherence and its linguistic markers interacts with 'reader factors' like interestingness of materials (Spooren, Mulder & Hoeken 1998), domain knowledge (Birkmire 1985; McNamara & Kintsch 1996), topic complexity (Spyridakis & Standal 1987), reader's goals (Noordman et al. 1992) and verbal ability (Meyer, Young & Bartlett 1989). We are convinced that the interaction of such reader characteristics with text-structural properties is a prime issue for further research on text processing (see also Kintsch 1998). However, the focus of this chapter, and in fact of this entire volume, is on gaining further insight into the role of coherence and text structure itself. In many discourse-psychological studies on the interaction of textual and reader factors, the coherence of passages is varied by manipulating many different textual aspects of coherence at the same time, such as adding elaborative information, identification of anaphoric references, and even supplying background information.
As a result, this type of research has often conflated coherence per se and various other textual aspects that potentially influence coherence. As we have argued above, we think it is crucial for the further progress of the field that we get a better grip on the linguistic factors that determine the cognitive discourse representation. This can only be achieved by a further cross-fertilization of the fields of text linguistics and discourse psychology. A good illustration of this point concerns the role of relational markers of text structure. It has been argued above that signaling phrases and connectives make existing relations explicit. This also implies that the use of the markers is bound to restrictions: not every connective can express every relation. In recent text-linguistic work we are beginning to understand what these restrictions are and how they interplay with the meaning expressed by the connected segments (cf. Knott & Dale 1994; Pander Maat & Sanders 2000 and several contributions to Risselada & Spooren 1998). It is this type of insight that underlines the importance of further cooperation of text linguists and psycholinguists working on discourse (Sanders 1997b).
4. The coherence of this volume
What can readers expect of this volume? The collected chapters typically present a cross-disciplinary account of text representation, by both linguists and psycholinguists. This implies that linguistic analyses of textual characteristics ultimately aim at accounting for the cognitive interpretation they can receive. At the same time, psycholinguistic studies focus on the relevance of text characteristics for theories of text processing, where text processing concerns both production and interpretation. An important benefit of this combination of text linguistics and psycholinguistics, and of production and understanding, is that we will encounter various methodologies, which are complementary: linguistic analysis, text analysis, corpus linguistics, computational linguistics, argumentation analysis, and the experimental psycholinguistic study of text processing. A final focus of this book is the comparison and further testing of linguistic and processing theories of text representation.
The following 12 chapters are divided into four sections. Section 1 deals with referential coherence in text and text representation, and especially with accessibility: how can the notion of varying accessibility account for different referential forms, and what is the evidence for such a dynamic account of the cognitive representations language users have? In Section 2 the focus shifts from referential to relational coherence in text and text representation, when the classification of coherence relations and connectives is discussed in a closely connected cluster of chapters, combining various theoretical approaches (from Relevance Theory to Argumentation Theory and cognitive accounts of coherence relations) and different empirical methods (from text analysis to reading experiments). Section 3 focuses on the cognitive representations of discourse and their relation to knowledge representations: how are they related? How large is the role of linguistic factors?
Finally, Section 4 discusses an issue typically neglected in the previous sections: when coherence is said to exist, it exists between something, for instance, discourse units or text segments. But how are these segments defined? And when we distinguish between different linguistic levels of representation (word, clause, sentence, paragraph), do we know that these levels have any psychological validity? Together, the chapters in these four sections present an overview of a growing field of interest at the intersection of linguistics and psychology: the study of a phenomenon that is crucial to our behavior because it is the most used vehicle of communication, that of text and its cognitive representation.
Notes
1. More radical in that radical conversation analysts deny the possibility of attributing any meaning to an utterance without regard to its context.
2. Strictly speaking there is no logical connection between dynamic systems and incremental systems, in that dynamic systems can be constructed that are not incremental and incremental systems that are not dynamic. Yet in every serious language interpretation system that we know of the two go together.
3. However, there are some problems with the Meyer and Freedle study, see Horowitz (1987) and Sanders and Noordman (2000) for further details.
References
Andriessen, J., De Smedt, K., & Zock, M. (1996). Discourse planning: Empirical research and computer models. In T. Dijkstra, & K. De Smedt (Eds.), Computational linguistics. AI and connectionist models of human language processing (pp. 247-278). London etc.: Taylor & Francis.
Anscombre, J., & Ducrot, O. (1977). Deux MAIS en Français? Lingua, 43, 23-40.
Ariel, M. (1988). Referring and accessibility. Journal of Linguistics, 24, 65-87.
Ariel, M. (1990). Accessing noun-phrase antecedents. London: Routledge.
Bateman, J. A., & Rondhuis, K. J. (1997). Coherence relations: Towards a general specification. Discourse Processes, 24, 3-49.
Birkmire, D. P. (1985). Text processing: The influence of text structure, background knowledge and purpose. Reading Research Quarterly, 20, 314-326.
Black, J. B., & Bern, H. (1981). Causal coherence and memory for events in narratives. Journal of Verbal Learning and Verbal Behavior, 20, 267-275.
Britton, B. K., Glynn, S. M., Meyer, B. J. F., & Penland, M. J. (1982). Effects of text structure on use of cognitive capacity during reading. Journal of Educational Psychology, 74, 51-61.
Britton, B. K. (1994). Understanding expository text: Building mental structures to induce insights. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 640-674). San Diego etc.: Academic Press.
Bublitz, W., Lenk, U., & Ventola, E. (Eds.). (1999). Coherence in spoken and written discourse. Amsterdam etc.: John Benjamins.
Chafe, W. L. (1976). Givenness, contrastiveness, definiteness, subjects, topics and point of view. In C. N. Li (Ed.), Subject and topic (pp. 25-55). New York: Academic Press.
Chafe, W. L. (1994). Discourse, consciousness, and time. The flow and displacement of conscious experience in speaking and writing. Chicago: Chicago University Press.
Couper-Kuhlen, E., & Kortmann, B. (Eds.). (2000). Cause, Condition, Concession, Contrast. Cognitive and discourse perspectives. Berlin: Mouton de Gruyter.
Deaton, J. A., & Gernsbacher, M. A. (in press). Causal conjunctions: Cue mapping in sentence comprehension. Journal of Memory and Language.
Degand, L. (1996). A situation-based approach to causation in Dutch with some implications for text generation. Dissertation, Université Catholique de Louvain, Belgium.
Degand, L., Lefevre, N., & Bestgen, Y. (1999). The impact of connectives and anaphoric expressions on expository discourse comprehension. Document Design, 1, 39-51.
DuBois, J. W. (1980). Beyond definiteness: The trace of identity in discourse. In W. L. Chafe (Ed.), The Pear Stories: Cognitive, cultural and linguistic aspects of narrative production (pp. 203-274). Norwood, NJ: Ablex.
Ducrot, O. (1980). Essai d'application: MAIS - les allusions à l'énonciation - délocutifs, performatifs, discours indirect. In H. Parret (Ed.), Le langage en contexte: Études philosophiques et linguistiques de pragmatique (pp. 487-575). Amsterdam: Benjamins.
Ehrlich, K., & Rayner, K. (1983). Pronoun assignment and semantic integration during reading: Eye movements and immediacy of processing. Journal of Verbal Learning and Verbal Behavior, 22, 75-87.
Evers-Vermeul, J. (2000). De complexiteit van connectief-verwerving [The complexity of connective acquisition]. Nederlandse Taalkunde.
Fayol, M., & Costermans, J. (Eds.) (1997). Processing interclausal relationships in production and comprehension of text. Hillsdale, NJ: Erlbaum.
Fauconnier, G. (1994). Mental spaces: Aspects of meaning construction in natural language (2nd ed.). Cambridge, MA: Bradford.
Garnham, A., & Oakhill, J. (Eds.). (1992). Discourse representation and text processing. A special issue of Language and Cognitive Processes. Hove, UK: Lawrence Erlbaum Associates.
Garnham, A. (1996). Discourse comprehension models. In T. Dijkstra, & K. De Smedt (Eds.), Computational linguistics. AI and connectionist models of human language processing (pp. 221-246). London etc.: Taylor & Francis.
Garrod, S. C., & Sanford, A. J. (1994). Resolving sentences in a discourse context: How discourse representation affects language understanding. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 675-698). San Diego etc.: Academic Press.
Gernsbacher, M. A., & Givón, T. (Eds.) (1995). Coherence in spontaneous text. Amsterdam etc.: John Benjamins.
Gernsbacher, M. A., & Shroyer, S. (1989). The cataphoric use of the indefinite this in spoken narratives. Memory & Cognition, 17, 536-540.
Gernsbacher, M. A. (1990). Language comprehension as structure-building. Hillsdale, NJ: Erlbaum.
Gibbs, R. W. (1994). Figurative thought and figurative language. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 411-446). San Diego etc.: Academic Press.
Gibbs, R. W. (1996). What's cognitive about cognitive linguistics? In E. Casad (Ed.), Cognitive Linguistics in the Redwoods (pp. 27-54). Berlin: Mouton de Gruyter.
Givón, T. (1995). Coherence in text vs. coherence in mind. In M. A. Gernsbacher, & T. Givón (Eds.), Coherence in spontaneous text (pp. 59-115). Amsterdam etc.: John Benjamins. Typological Studies in Language, 31.
Graesser, A. C., Millis, K. K., & Zwaan, R. (1997). Discourse comprehension. Annual Review of Psychology, 48, 163-189.
Graesser, A. C., Singer, M., & Trabasso, T. (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101, 371-395.
Grice, H. P. (1975). Logic and conversation. In P. Cole, & P. L. Morgan (Eds.), Syntax and semantics 3: Speech acts (pp. 41-58). New York: Academic Press.
Haberlandt, K., & Bingham, G. (1978). Verbs contribute to the coherence of brief narratives: Reading related and unrelated sentence triples. Journal of Verbal Learning and Verbal Behavior, 17, 419-425.
Haberlandt, K. (1994). Methods in reading research. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 1-31). San Diego etc.: Academic Press.
Heim, I. (1982). The semantics of definite and indefinite noun phrases. Unpublished Thesis, Univ. of Massachusetts.
Hobbs, J. R. (1990). Literature and cognition. Menlo Park, CA: CSLI.
Hobbs, J. R. (1979). Coherence and coreference. Cognitive Science, 3, 67-90.
Horowitz, R. (1987). Rhetorical structure in discourse processing. In R. Horowitz, & S. J. Samuels (Eds.), Comprehending oral and written language (pp. 117-160). San Diego, CA: Academic Press.
Hovy, E. H. (1990). Parsimonious and profligate approaches to the question of discourse structure relations. Proceedings of the 5th International Workshop on Natural Language Generation.
Just, M., & Carpenter, P. (Eds.), Cognitive processes in comprehension. Hillsdale, NJ: Erlbaum.
Kamp, H., & Reyle, U. (1993). From discourse to logic. Introduction to model-theoretic semantics of natural language, formal logic and discourse representation theory. Dordrecht: Kluwer.
Kamp, H. (1981). A theory of truth and semantic representation. In J. Groenendijk, T. Janssen, & M. Stokhof (Eds.), Truth, interpretation, and information (pp. 1-41). Dordrecht: Foris.
Keenan, J. M., Baillet, S. D., & Brown, P. (1984). The effects of causal cohesion on comprehension and memory. Journal of Verbal Learning and Verbal Behavior, 23, 115-126.
Kintsch, W., & van Dijk, T. A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363-394.
Kintsch, W. (1994). The psychology of discourse processing. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 721-739). San Diego etc.: Academic Press.
Kintsch, W. (1998). Comprehension. A paradigm for cognition. Cambridge: Cambridge University Press.
Knott, A. (1996). A data-driven methodology for motivating a set of coherence relations. Doctoral dissertation. University of Edinburgh, Edinburgh, Scotland.
Knott, A., & Dale, R. (1994). Using linguistic phenomena to motivate a set of coherence relations. Discourse Processes, 18, 35-62.
Knott, A., & Sanders, T. (1998). The classification of coherence relations and their linguistic markers: An exploration of two languages. Journal of Pragmatics, 30, 135-175.
Lang, E. (1984). The semantics of coordination. Amsterdam: Benjamins.
Langacker, R. W. (1990). Subjectification. Cognitive Linguistics, 1, 5-38.
Levelt, W. J. M. (1989). Speaking. From intention to articulation. Cambridge (Ma) etc.: Bradford/MIT Press.
Levinson, S. C. (1991). Pragmatic reduction of the binding condition revisited. Journal of Linguistics, 27, 107-161.
Mann, W. C., & Thompson, S. A. (Eds.). (1992). Discourse description. Diverse analyses of a fund-raising text. Amsterdam: Benjamins.
Mann, W. C., & Thompson, S. A. (1986). Relational propositions in discourse. Discourse Processes, 9, 57-90.
Martin, J. R. (1992). English text. System and structure. Philadelphia: John Benjamins.
McNamara, D. S., & Kintsch, W. (1996). Learning from texts: Effects of prior knowledge and text coherence. Discourse Processes, 22, 247-288.
Meyer, B. J. F. (1975). The organization of prose and its effects on memory. Amsterdam: North-Holland.
Meyer, B. J. F., & Freedle, R. O. (1984). Effects of discourse type on recall. American Educational Research Journal, 21, 121-143.
Meyer, B. J. F., Brandt, D. M., & Bluth, G. J. (1980). Use of top-level structure in text: Key for reading comprehension of ninth-grade students. Reading Research Quarterly, 16, 72-103.
Meyer, B. J. F., Young, C. J., & Bartlett, B. J. (1989). Memory improved: Enhanced reading comprehension and memory across the life span through strategic text structure. Hillsdale, NJ: Erlbaum.
Millis, K. K., & Just, M. A. (1994). The influence of connectives on sentence comprehension. Journal of Memory and Language, 33, 128-147.
Moore, J. D., & Pollack, M. E. (1992). A problem for RST: The need for multi-level discourse analysis. Computational Linguistics, 18, 537-544.
Myers, J. L., Shinjo, M., & Duffy, S. (1987). Degree of causal relatedness and memory. Journal of Memory and Language, 26, 453-465.
Noordman, L. G. M., & Vonk, W. (1997). The different functions of a conjunction in constructing a representation of the discourse. In M. Fayol, & J. Costermans (Eds.), Processing interclausal relationships in production and comprehension of text (pp. 75-93). Hillsdale, NJ: Erlbaum.
Noordman, L. G. M., & Vonk, W. (1998). Memory-based processing in understanding causal information. Discourse Processes, 26, 191-212.
Noordman, L. G. M., & Vonk, W. (1992). Readers' knowledge and the control of inferences in reading. Language and Cognitive Processes, 7, 373-391.
Noordman, L. G. M., Vonk, W., & Kempff, H. J. (1992). Causal inferences during the reading of expository texts. Journal of Memory and Language, 31, 573-590.
Oversteegen, L. E. (1997). On the pragmatic nature of causal and contrastive connectives. Discourse Processes, 24, 51-85.
Pander Maat, H. (1998). The classification of negative coherence relations and connectives. Journal of Pragmatics, 30, 177-204.
Pander Maat, H., & Sanders, T. (2000). Domains of use and subjectivity. On the distribution of three Dutch causal connectives. In E. Couper-Kuhlen & B. Kortmann (Eds.), Cause, Condition, Concession, Contrast. Cognitive and discourse perspectives (pp. 57-83). Berlin: Mouton de Gruyter.
Pomerantz, A., & Fehr, B. J. (1997). Conversation analysis: An approach to the study of social action as sense making practices. In T. A. Van Dijk (Ed.), Discourse as social interaction (pp. 64-91). Thousand Oaks, CA: Sage.
Prince, E. (1981). Toward a taxonomy of Given-New information. In P. Cole (Ed.), Radical Pragmatics. New York: Academic Press.
Reddy, M. (1979). The conduit metaphor. In A. Ortony (Ed.), Metaphor and thought. Cambridge: Cambridge University Press.
Redeker, G. (1990). Ideational and pragmatic markers of discourse structure. Journal of Pragmatics, 14, 305-319.
Risselada, R., & Spooren, W. (Eds.). (1998). Discourse Markers. Special issue of Journal of Pragmatics, 30.
Sachs, J. D. (1967). Recognition memory for syntactic and semantic aspects of connected discourse. Perception and Psychophysics, 2, 437-442.
Sanders, J. M. (1994). Perspective in narrative discourse. Dissertation KU Brabant, Tilburg.
Sanders, J. M., & Redeker, G. (1996). Perspective and the representation of speech and thought in narrative discourse. In G. Fauconnier & E. Sweetser (Eds.), Spaces, worlds and grammars (pp. 290-317). Chicago: University of Chicago Press.
Sanders, J., & Spooren, W. (1996). Subjectivity and certainty in epistemic modality: A study of Dutch epistemic modifiers. Cognitive Linguistics, 7, 241-264.
Sanders, J., & Spooren, W. (1997). Perspective, subjectivity and modality from a cognitive linguistic point of view. In W. A. Liebert, G. Redeker, & L. Waugh (Eds.), Discourse and perspective in cognitive linguistics (pp. 85-112). Amsterdam: Benjamins.
Sanders, T. (1997a). Semantic and pragmatic sources of coherence. On the categorization of coherence relations in context. Discourse Processes, 24, 119-147.
Sanders, T. (1997b). Psycholinguistics and the discourse level: Challenges for cognitive linguistics. Cognitive Linguistics, 8, 243-265.
Sanders, T. J. M., & Noordman, L. G. M. (2000). The role of coherence relations and their linguistic markers in text processing. Discourse Processes, 29, 1, 37-60.
Sanders, T., & Spooren, W. (1999). Communicative intentions and coherence relations.
23
24
Ted Sanders and Wilbert Spooren
In W. Bublitz, U. Lenk, & E. Ventola (Eds.), Coherence in text and discourse (pp. 235-250). Amsterdam etc.: John Benjamins. Sanders, T. J. M., Spooren, W. P.M., & Noordman, L. G. M. (1992). Toward a taxonomy of coherence relations. Discourse Processes, 15, 1-35. Sanders, T. }. M., Spoorcn, W. P.M., & Noordman, L. G. M. (1993). Coherence relations in a cognitive theory of discourse representation. Cognitive Linguistics, 4, 93133. Sanders, T ., & van Wijk, C. ( 1996). PISA- A procedure for analyzing the structure of explanatory texts. Text, 16,91-132. Sandra, D., & Rice, S. (1995). Network analyses of prepositional meaning: Mirroring whose mind- the linguist's or the language user's? Cognitive Linguistics, 6, 89130. Sanford, A.}., & Garrod, S.C. (1994). Selective processes in text understanding. In M.A. Gemsbacher (Ed.), Handbook ofpsycholinguistics (pp. 699-720}. San Diego etc.: Academic Press. Schilperoord, }. (1996). It's about time. Temporal aspects of cognitive processes in text production. Amsterdam: Rodopi. Schilperoord, }., & Sanders, T. ( 1997). Pauses, cognitive rhythms and discourse structure. An empirical study of discourse production. In A. Liebert, G. Redeker, & L. Waugh (Eds.}, Disourse and Perspective in Cognitive Linguistics (pp. 247-249). Amsterdam: Bcnjamins. Schilperoord, }., & Sanders, T. (1999). How hierarchical structure affects retrieval processes; implications of pause and text analysis. In D. Galbraith, & M. Torrance (Eds.), Conceptual processes in writing (pp. 13-30). Amsterdam: Amsterdam University Press. Singer, M. (1990). Psychology of language: An introduction to sentence and discourse processes. Hillsdale, NJ: Erlbaum. Sperber, D., & Wilson, D. (1992}. Relevance: Communication and cognition (2nd ed.). Oxford: Blackwell. Spooren, W. P.M. S. ( 1989). Some aspects of the form and interpretation ofglobal contrastive coherence relations. Ph.D. Dissertation Nijmegen University. Spooren, W. (1997). 
The processing of underspecified coherence relations. Discourse Processes, 24, 149-168. Spooren, W., Mulder, M., & Hoeken, H. (1998). The role of interest and text structure in professional reading. ]oumal ofResearch in Reading, 21, 109-120. Spooren, W., & Sanders, T. (in prep.). What does children's discourse tell us about the nature ofcoherence relations? Paper VU Amsterdam I UiL OTS UtrechL Spooren, W., Sanders, T., & Visser, J. (1994). Taxonomic van cohcrcnce-rclaties: Evidentie uit taalverwervingsonderzoek? (Taxonomy of coherence relations: Evidence from language acquisition research?) GrammafiTI, 3, 33-54. Spyridakis, f. H., & Standal, T. C. ( 1987). Signals in expository prose. Reading Research Quarterly, 22, 3, 285-298. Stein, D., & Wright, S. (Eds.). (1995). Subjectivity and subjectivisation. Cambridge, Cambridge University Press.
Text representation as an interface between language and its users
Sweetser, E. E. (1990). From etymology to pragmatics. Cambridge: Cambridge University Press. Sweetser, E. ( 1996). Mental spaces and the grammar of conditional constructions. In G. Fauconnier, & E. Sweetser (Eds.), Spaces, worlds and grammar (pp. 318-333). Chicago: The University of Chicago Press. Trabasso, T., & Sperry, L L ( 1985). Causal relatedness and importance of story events.
Journal ofMemory and Language, 24, 595-611. Trabasso, T., & Van den Broelt, P. (1985). Causal thinking and the representation of narrative events. Journal ofMemory and Language, 24, 612-630. Traugott, E. C. ( 1988 ). Pragmatic strengthening and grammaticalization. Proceedings of
the Berkeley Linguistics Society, 14,406-416. Traugott, E. C., & Heine, B. (Eds.). (1991). Approaches to grammaticalization. Amsterdam: Benjamins. Van den Broek, P., Young. M., Tzeng, Y., & Linderholm, T. (1998). The landscape model of reading: Inferences and on-line construction ofa memory representation. In H. Van Oostendorp, & S. R. Goldman (Eds.}, The construction of mental representations during reading (pp. 71-98). Mahwah, NJ: Erlbaum. Van Dijk, T. A., &Kintsch, W. (1983). Strategiesofdiscoursecomprehension. New York: Academic Press. Verhagen, A. (1995). Subjectification, syntax and communication. In D. Stein, & S. Wright (Eds.), Subjectivity and subjectivisation. Cambridge, Cambridge University Press. Verhagen, A. (2000). Concession implies causality, though in some other space. In Couper-Kuhlen & Kortmann (Eds.), Cause, Condition, Concession, Contrast. Cognitive and Discourse perspectives. 361-380. Berlin: Mouton de Gruyter. Vonk, W., Hustinx, L. G. M. M., & Simons, W. H. G. (1992). The use of referential expressions in structuring discourse. Language and Cognitive Processes, 7, 301-333.
2.5
SECTION 1
Accessibility in text and text processing
Both in modeling discourse structure and in processing studies, accessing referents and the linguistic coding of accessibility are central issues. The three chapters in this section show that these phenomena need to be studied from both a linguistic and a psycholinguistic angle.
In Chapter 2, Ariel discusses the linguistic means of reference to discourse entities. Her central claim, based, among other things, on corpus analysis, is that language users do not switch arbitrarily between different referential forms, such as pronouns and full NPs, but show a systematic pattern. As in her earlier work, Ariel argues that the form of referential expressions can be explained by means of accessibility theory: the less accessible a referent is, the more elaborate the referential marker used by the language user. She gives an overview of her own Accessibility Theory, re-explaining aspects that have sometimes been misunderstood, and elaborating it with new findings. Her theory is then compared to other accounts of reference. She explicitly addresses the issue of the cognitive motivation behind the theory, and discusses the relationship with psycholinguistic work on anaphoric reference.
In Chapter 3, Gaddy, Van den Broek and Sung use a typically psychological framework to model allocation of attention in what they call the Landscape Model of Reading. The model addresses the question of how various text characteristics (linguistic, discourse-structural) guide the reader's attention during reading and how they affect the mental representations readers construct of the discourse. The referential forms studied by Ariel (see Chapter 2) are among the textual devices determining the workings of the model. The authors claim that theirs is an adequate model of the on-line reading process. The chapter once again underlines the importance of the notion of 'activation' as an explanatory concept in understanding the reading process and its result: a coherent mental representation of the information expressed in the text.
Activation is also a key concept in Chapter 4, by Giora and Balaban. This chapter deals with accessing literal and non-literal (or metaphorical) lexical meaning in text production, such as the boys' fight in the schoolyard (literal) vs. the union's fight against the government (non-literal). On the basis of experimental research, the authors defend a modified version of the view that lexical processes are autonomous, namely that salient meanings of a word (i.e., those that are coded into a language) are always activated, regardless of whether the activated meaning is contextually appropriate. Empirical evidence for the so-called 'graded saliency hypothesis' comes from a rating experiment in which participants were asked to indicate how they had understood a particular word. It is shown that even if the surrounding discourse strongly evoked a 'figurative' meaning, participants activated the coded meaning of the target word as well. The chapter relates to Gaddy et al.'s chapter in its dynamic conception of linguistic meaning. Furthermore, it relates to Schilperoord's claim (Chapter 12) that in text production the planning takes place in a modular way.
CHAPTER 2
Accessibility theory: an overview
Mira Ariel
Tel Aviv University
Accessibility theory (Ariel 1985a, 1990 and onwards) describes how human language, specifically the referential system, is responsive to facts about human memory, where memory nodes are not equally activated at any given time. Some are highly activated, others are only mildly activated, and in between, the range of activation is infinite in principle. Most memory nodes are of course not activated at all. Yet speakers may wish to refer to Given (i.e., familiar) pieces of information, regardless of their current degree of activation for the addressees. Accessibility theory offers a procedural analysis of referring expressions as marking varying degrees of mental accessibility. The basic idea is that referring expressions instruct the addressee to retrieve a certain piece of Given information from his memory by indicating to him how accessible this piece of information is to him at the current stage of the discourse.¹
To be sure, most referring expressions simultaneously also contain some conceptual content which contributes to the retrieval process. For example, she simultaneously means 'highly accessible' and 'female and singular', and the friend implies that the entity is 'of a relatively low degree of accessibility' because it is a definite description, and also that it is a 'close acquaintance', etc. But some linguistic entities (e.g., zeroes) are purely procedural, namely lacking any content, only marking a specific interpretative procedure.² Yet others do carry a conceptual meaning, but are indistinguishable from other expressions in terms of the concept they convey (e.g., it and this/that). These, I have argued, are indistinguishable with respect to the description they provide for the intended referent (an inanimate object). They can only be distinguished from each other in terms of the processing instruction they mark: personal pronouns mark a higher degree of accessibility than demonstrative pronouns.
Here is an example where the speaker repairs an it to a that, not because that better describes or identifies the entity referred to, and not because that is accompanied by some deictic gesture. Rather, it codes too high a degree of accessibility for the word awakened:³
(1) Melissa: Well, I'll say awakened_i, cause that_i's what I have written down.
    Ron:     ... (Sniff)
    Frank:   ... Just watch, He'll put a note by it_i.. note by that_i. ... I really like that word_i Melissa (Household).
Although speakers mark as Given information units packaged as NPs, as whole propositions, as VPs, and as verbs (see Ariel 1985b, 1988b), and although all of these Given pieces of information may vary in degree of accessibility, I have concentrated on the intricate retrieval process involved in referential acts performed by NPs (Ariel 1985a, 1988a, 1990 and onwards).⁴
The structure of this paper is as follows. Section 1 presents the basic claims and findings of accessibility theory, emphasizing aspects of the theory which have sometimes been misunderstood (1.2, 1.3). Section 2 sums up recent research which corroborates and further develops accessibility theory. In Section 3, I argue that while accessibility theory is cognitively motivated, accessibility marking constitutes a properly linguistic phenomenon. Section 4 compares accessibility theory with other theories of reference, and Section 5 lists open questions for linguistic and psycholinguistic research on anaphora.
1. Accessibility theory: Basic claims and findings
1.1 Introducing accessibility theory
Accessibility theory argues that context retrievals of pieces of information from memory are guided by signaling to the addressee the degree of accessibility with which the mental representation to be retrieved is held. This assumption entails that speakers do not guide addressees' retrievals by referring them to the correct "geographic" source which serves as the basis for assuming that the information is Given. In other words, languages do not provide us with conventional codes specialized for (1) information retrievable from our general encyclopedic knowledge (e.g., there existed an entity by the name of Simone de Beauvoir), for (2) information extractable from the immediately available physical context (e.g., there exists a table between us), or for (3) information
previously mentioned in the discourse (e.g., that the speaker has a dear friend). I have argued against Clark and Marshall (1981) that proper names (e.g., Simone de Beauvoir) are not specialized for retrieving general encyclopedic information, that demonstrative pronouns (e.g., this table) are not specialized for retrieving physically salient objects, and that personal pronouns (e.g., she) are not specialized for retrieving from the preceding linguistic context. All of these referring expressions can and do retrieve from all three "geographic" contexts (see Ariel 1988a, 1998b). Instead, each referring expression codes a specific (and different) degree of mental accessibility (Ariel 1988a), and referring expressions are actually accessibility markers, i.e., expressions cueing the addressee on how to retrieve the appropriate mental representation in terms of degree of mental accessibility. Based on distributional findings regarding such distinctions, I have suggested the following accessibility marking scale (see Ariel 1990, p. 73), which proceeds from low accessibility markers to high accessibility markers:
(2) Full name + modifier > full name > long definite description > short definite description⁵ > last name > first name > distal demonstrative + modifier > proximate demonstrative + modifier > distal demonstrative + NP > proximate demonstrative + NP > distal demonstrative (-NP) > proximate demonstrative (-NP) > stressed pronoun + gesture > stressed pronoun > unstressed pronoun > cliticized pronoun⁶ > verbal person inflections > zero
A point that needs clarification is the relevant domain in which degree of accessibility is assessed. What is the basis for our determining that a specific mental representation is of a high or of a low degree of accessibility (in the absence of direct tapping of the brain)? One potential source which determines degree of accessibility is the physical context of the speech situation.
Another is the discourse world, where the discourse topics and other entities mentioned or reliably predicted to be relevant to the discourse at hand can receive high or low degrees of accessibility according to their discourse role. I have argued that it is the discoursal rather than the physical salience of the entities involved which determines the degree of accessibility assigned to particular mental representations (Ariel 1998b; see also Webber 1991). Although the physical context does affect the discourse model of the speakers, mental representations are a direct product of our discourse model only. One piece of evidence for this claim comes from references to the speaker. Although a (Hebrew) speaker can (almost) always refer to herself by a personal pronoun, she can cliticize the
pronoun, or she can use verbal person agreement (with a zero subject) or even zero subject alone (all are higher accessibility markers than a full pronoun), provided she is maximally accessible within the current discourse. In other words, whereas the physical accessibility of the speaker in the real world does not change in the course of the conversation, her discourse role and prominence in it may. It is the latter which determines whether the speaker can be referred to by higher accessibility markers than a pronoun. Indeed, Rieder and Mulokandov's (1998) analysis of two television interviews corroborates my initial findings (Ariel 1990, 1998b): even turn-initial position shows an accessibility distinction regarding the speaker. The fuller form is preferred turn-initially, whereas the shorter forms are preferred when immediately preceded by at least two previous mentions.
I have argued that the form-function correlations on the accessibility marking scale (namely, which referring expressions code which degree of accessibility) are not arbitrary. Three partially overlapping criteria are involved: informativity (the amount of lexical information); rigidity (the ability to pick a unique referent, based on the form); and attenuation (phonological size). The prediction is that the more informative, rigid and unattenuated an expression is, the lower the degree of accessibility it codes, and vice versa: the less informative and rigid and the more attenuated the form is, the higher the accessibility it codes. Thus, "true" zero subjects (as in Chinese), verbal person agreement (as in Italian and Hebrew), cliticized pronouns (as in Hebrew and English), pronouns, stressed pronouns, demonstrative and definite NPs and proper names (of all kinds) are each specialized for (slightly) different degrees of accessibility, which accounts for their different discourse distributions.
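As a rough illustration only, scale (2) can be written down as an ordered list, so that comparing two markers reduces to comparing their positions. The sketch below is in Python. The label strings and the function names rank and codes_higher_accessibility are assumptions of the sketch, not part of accessibility theory, and treating the scale as a strict linear order sets aside the language-specific and contextual variation discussed in the text.

```python
# Scale (2), ordered from the lowest to the highest accessibility marker.
# The label strings are shorthand chosen for this sketch.
ACCESSIBILITY_SCALE = [
    "full name + modifier",
    "full name",
    "long definite description",
    "short definite description",
    "last name",
    "first name",
    "distal demonstrative + modifier",
    "proximate demonstrative + modifier",
    "distal demonstrative + NP",
    "proximate demonstrative + NP",
    "distal demonstrative (-NP)",
    "proximate demonstrative (-NP)",
    "stressed pronoun + gesture",
    "stressed pronoun",
    "unstressed pronoun",
    "cliticized pronoun",
    "verbal person inflection",
    "zero",
]


def rank(marker: str) -> int:
    """Position on the scale: 0 is the lowest accessibility marker."""
    return ACCESSIBILITY_SCALE.index(marker)


def codes_higher_accessibility(a: str, b: str) -> bool:
    """True if marker `a` codes a higher degree of accessibility than `b`."""
    return rank(a) > rank(b)
```

Under this encoding, and taking the it in example (1) to be an unstressed pronoun, the repair from it to that (a distal demonstrative without an NP) is a move to a marker lower on the scale: codes_higher_accessibility("unstressed pronoun", "distal demonstrative (-NP)") holds.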
Based on previous work (most notably Sanford & Garrod 1981; Givón 1983) and my own, I have suggested that we can tap the degree of accessibility associated with a given mental representation at a given moment by considering properties of the mental representation/antecedent (not necessarily a linguistic one), as well as the relationship between the antecedent and the anaphor (the accessibility marker). Thus, the more salient the antecedent, the more highly accessible it is deemed. I distinguished between global discourse topics (highest degree of accessibility), local discourse topics (relatively high degree of accessibility) and non-topics (relatively low degree of accessibility) in this connection, as well as between the speaker and the addressee (high degree of accessibility) versus a referent which is neither (a 3rd person, with a relatively low degree of accessibility). Another salience distinction depends on the automaticity/stereotypy of the inference required in generating a Givenness status
for an entity. Inferred entities come in different degrees of accessibility (see Sanford & Garrod 1981; Ariel 1985a, 1990, 1996; Oakhill, Garnham, Gernsbacher & Cain 1992; Gundel, Hedberg & Zacharski 1993; Garnham, Traxler, Oakhill & Gernsbacher 1996; Matsui 1998). Frame-induced entities (e.g., waiters in restaurants) are more accessible than inferable entities which are not salient or necessary in a specific frame (e.g., umbrellas in restaurants). In fact, some inferential information is indistinguishable from explicitly mentioned pieces of information (see Beeman & Gernsbacher, Ms., and references cited therein). This accounts for the difference between initially referring to the waiter without any anchoring, versus to Maya's umbrella (the necessity to anchor the umbrella to a Given entity) in the context of a restaurant story. Another factor influencing the relative degree of accessibility of an antecedent is competition for the role of antecedent (see Clancy 1980; and see O'Brien & Albrecht 1991, for experiments establishing that we initially access multiple antecedents). The more potential antecedents there are, the lower the degree of accessibility each is entertained with.
The relationship between the antecedent and the anaphor, the degree of their unity, or cohesion (Ariel 1990), can be tight, in which case the degree of accessibility of the relevant mental representation is higher, or it can be loose, in which case the degree of accessibility is lower. Such a relationship exists primarily between linguistic units (an antecedent and an anaphor). The distance between a previous mention of the same referent and the current mention is an obvious measure of an accessibility distinction. The larger the distance separating different mentions of the same mental entity, the lower the degree of accessibility with which the mental representation is entertained. But distance is not necessarily measured by words.
Paragraph and episode boundaries create a distance, despite the linear continuity (see Ariel 1990; Clancy 1980; Sanford & Garrod 1981; Tomlin 1987). At episode boundaries, people have difficulties accessing prior information (Beeman & Gernsbacher, manuscript; Sanford & Garrod 1981). Similarly, units (clauses) more cohesively linked entail more dependency in their interpretation, so that material from one clause is more readily available for the interpretation of another. Such constructions create higher degrees of unity and hence, accessibility. Looser connections, on the other hand, entail more independence in the processing of each clause, in which case there is less availability (accessibility) of material of one clause for the interpretation of the other. Such differences account for the different anaphoric patterns observed for subordinations (higher degree of accessibility: repeated proper names are clearly dispreferred) versus coordinations (a lower degree of
accessibility: repeated proper names are at least possible), nontensed (infinitival) versus tensed clauses (the former show a preference for zero subjects), and restrictive versus nonrestrictive relative clauses (the latter favor resumptive pronouns more often than the former).
In sum, I have argued that referring expressions are chosen according to the assessed degree of accessibility of the mental entities corresponding to them. Degree of accessibility depends on factors related to the inherent salience of the entity and on the unity between the antecedent and the anaphor. In addition, the conventional degree of accessibility coded by referring expressions is motivated by their relative informativity, rigidity and attenuation.
1.2 Accessibility as a complex concept
I have tried to emphasize that assessing degree of accessibility is a complex matter, since multiple factors are involved. It is the complex concept of accessibility which determines referential form, and not any single factor. This is why, when we examine any one factor of accessibility, the results are significant, but far from absolute (see Ariel 1999, on resumptive pronouns, and Garcia 1996, on reflexives and pronouns in Spanish). Accessibility factors may converge on pointing to a high (or low) degree of accessibility, as when the speaker or the addressee (highly accessible) is also the global discourse topic (highly accessible), or when the discourse topic has been recently mentioned (high accessibility), and/or has been mentioned numerous times (high accessibility). However, although distance, for instance, is a crucial factor determining degree of accessibility, it cannot be taken to perfectly represent the overall degree of accessibility involved. For instance, pronouns (high accessibility markers) can sometimes (over 25% in my data; see Ariel 1990, p. 18 for sources) refer to mental entities last mentioned in a previous rather than a current paragraph (entailing a lower degree of accessibility in terms of distance). The reason is that these distant references are mostly references to the global discourse topic (92%). Discourse topics can maintain a relatively high degree of accessibility despite the larger distance. The same clash would explain why it is that when two entities are introduced (e.g., Maya_i kissed Rachel_j), the first-mentioned, topical but relatively distant NP (Maya) is later coded as an unstressed pronoun (a higher accessibility marker), whereas the more recent, nontopical NP (Rachel) is coded by a stressed pronoun (a lower accessibility marker) (e.g., and then she_i/SHE_j ...). Similarly, Brennan (1995) found that nonprominent entities (less salient) were referred to by full NPs (low accessibility markers) rather than by
pronouns (high accessibility markers), despite the recency of their mention (high accessibility).
The more previous mentions an antecedent has enjoyed, the higher its accessibility. Still, discourse topics can usually be referred to by high accessibility markers despite a low count of previous mentions. Perhaps this is due to the fact that some entities, discourse topics more than others, are inferred to be present even when explicit mention is lacking (see also Grosz, Joshi & Weinstein 1995; O'Brien & Albrecht 1991).⁷
Maes and Noordman (1995) find that 54.2% (682) of their demonstrative + NP expressions (1259, in Dutch) are a second-mention reference to an entity (or proposition) just mentioned (in the same or previous sentence, for the most part). Since a modified demonstrative pronoun is an intermediate accessibility marker, but the distance factor points to a high degree of accessibility, this finding appears to be a counter-example to accessibility theory. However, a newly introduced discourse entity (and even more so when the antecedent is complex: a proposition) is not instantaneously accessible enough for further references by high accessibility markers (as they note themselves in note 15; see Ariel 1990; Clancy 1980; Du Bois 1980; Du Bois & Thompson 1991; Tao 1996).⁸ Similarly, turn-initial positions are expected to contain lower accessibility markers (they form a discourse break). Rieder and Mulokandov (1998) then explain the surprisingly high occurrence of zeroes and cliticized pronouns for first person references in initial turn position (in Hebrew) by noting that the preceding turn was a question addressed to the speaker in an overwhelming majority of these cases. We must not therefore mistake the individual factors contributing to degree of accessibility (e.g., turn-initial position) for criterial conditions on linguistic usage.
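The point that no single factor is criterial can be made concrete with a toy calculation. The Python sketch below is purely illustrative: the Mention record, the additive form of assessed_accessibility, and every numeric weight are invented for this sketch and are not proposed anywhere in the text, which states only that several factors jointly determine the assessed degree of accessibility. It shows how a distant global topic can still come out as more accessible than a recently mentioned non-topic.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    """Hypothetical bundle of factors discussed in the text."""
    is_global_topic: bool    # salience of the antecedent
    same_paragraph: bool     # distance from the previous mention
    previous_mentions: int   # number of earlier mentions
    competitors: int         # other candidate antecedents


def assessed_accessibility(m: Mention) -> float:
    """Toy additive score. All weights are invented for illustration."""
    score = 0.0
    score += 2.0 if m.is_global_topic else 0.0
    score += 1.0 if m.same_paragraph else 0.0
    score += 0.5 * min(m.previous_mentions, 3)
    score -= 0.5 * min(m.competitors, 3)
    return score


# A global topic last mentioned in an earlier paragraph ...
distant_topic = Mention(is_global_topic=True, same_paragraph=False,
                        previous_mentions=3, competitors=0)
# ... versus a non-topic mentioned once, just now, with one competitor.
recent_nontopic = Mention(is_global_topic=False, same_paragraph=True,
                          previous_mentions=1, competitors=1)
```

With these made-up weights the distant topic scores higher than the recent non-topic, so distance alone would mispredict the relative ranking. Nothing hinges on the particular numbers, and the theory itself does not commit to an additive combination.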
A study of resumptive pronoun versus gap usage in conversational Hebrew relative clauses (Ariel 1999) revealed that nonrestrictive relative clauses (relatively lower degree of accessibility of the head when the relativized position is processed) contain many more resumptive pronouns than restrictive relative clauses (relatively higher degree of accessibility). Still, two thirds of the nonrestrictive relative clauses contained gaps rather than resumptive pronouns. This might again seem a counter-example to accessibility theory, since an extremely high accessibility marker (the gap) is used when the degree of accessibility hypothesized between the head and the relativized position is relatively low. However, in close to three fourths of these gapped cases, the relativized position is a subject (rather than an indirect object, for example). Subject position is assigned to prominent entities, ones which are of a higher degree of accessibility.
In general, I found that a combination of accessibility factors (head complexity, distance, grammatical role of the relativized position, and restrictiveness) predicts the occurrence of gaps and resumptive pronouns better than any one of the above factors. This is so because any particular instance may involve values for both high and low accessibility, and it is only the assessed combination of these factors (as well as others) which determines the overall degree of accessibility dictating the form chosen by the speaker. Note also that it is only the general concept of degree of accessibility which can account for why a variety of factors which seem unrelated to each other (e.g., the restrictiveness of the relative clause, grammatical role, whether the head is long or short, etc.) all encourage or discourage the usage of a resumptive pronoun. Garcia (1983, p. 203) similarly shows how the higher accessibility marker is chosen for cases where the antecedent is a nearby grammatical subject, as well as for cases in which the antecedent is contextually salient, although grammatically speaking, there is no similarity between these two conditions.
The form-function correlation for reflexive pronouns (e.g., herself) also demonstrates how degree of accessibility cannot automatically be determined. I had probably mistakenly classified reflexive pronouns as higher accessibility markers than pronouns (Ariel 1990), because their antecedents are (for the most part) very local (within their C-command domain). Note, however, that reflexive pronouns are less attenuated than pronouns (in English), and should therefore have been relatively lower accessibility markers under accessibility theory.
Now, this marking exception could be explained by reference to the high frequency of pronouns versus reflexives (unmarked forms tend to be short), but I think an alternative explanation, one based on the historical development of (English) reflexives (see Faltz 1977; Keenan 1994), shows them to be lower accessibility markers than pronouns, despite the fact that they are locally bound, whereas pronouns are locally free. Reflexive pronouns within the C-command domain are basically pronouns referring to unexpected entities. Their antecedents are quite accessible in some absolute sense, but they are not expected (and therefore not accessible) in the specific role they actually occur in (which is coded by the reflexive). The same applies to the contrastive pronouns in Dutch, as analyzed by Comrie (1994). It is illuminating to compare locally bound coarguments of the same verb (accusatives and objects of prepositions) with locally bound nonarguments in Old English (Keenan 1994). The latter are referentially dependent on the subject, and they mark a high involvement of the referent in the event (e.g., The king walked him to London). These high involvement nonarguments are coded
by pronouns. But coarguments which are objects of verbs of serious personal harm (e.g., threaten, kill) are invariably coded by pronouns + self. The reason is that coarguments of the same verb (especially of the above kind) are expected to be disjoint in reference, since we are expected not to hit or threaten ourselves (Faltz 1977; Farmer & Harnish 1987). Hence, the argument is expected not to be coreferent with the subject. The referent of the subject in such contexts is then of a lower degree of accessibility in the object role, and a lower (less attenuated) accessibility marker (a pronoun + self) is used. We see that it is indeed the degree of expectation (accessibility) for subject coreference that matters (rather than argument versus nonargument role) when we compare the accusatives of bodily harm with those of verbs of grooming (e.g., dress). Unlike the former, the latter do create a high expectation for subject coreference (we are expected to dress ourselves), hence the accessibility of subject coreference for the object is high, and indeed they take regular pronouns, rather than reflexives, in Old English (e.g., She dressed her). What is crucial for accessibility theory at this point is that we realize that a relatively lower accessibility marker (the reflexive) can grammaticize for a syntactic context that, other things being equal, is considered to be a very high accessibility context, namely the C-command domain. When the entity (although highly accessible) is not predicted to appear in a certain role, its degree of accessibility is (relatively) low, despite the short distance from the previous mention, and despite the fact that its previous coding marks it as highly accessible (subject).
For the same reason, namely the complexity involved in accessibility assessment, I believe Givón (1992) was too hasty in his conclusion that accessibility is reducible to a binary distinction in language.
Givón finds that the definite descriptions in his data retrieve antecedents which occurred at a variety of distances, unlike zeroes and pronouns, which retrieve discourse entities mentioned 1-2 clauses back for the most part. We should remember, however, that accessibility cannot be established on the basis of one factor (distance in this case) and that definite descriptions do not actually constitute a homogeneous category of referring expressions in terms of degree of accessibility (see Almor 1999; Ariel 1990, 1996). First, the fact that definite descriptions retrieve antecedents from many distances can be explained by reference to other factors involved in assessing degree of accessibility: grammatical role (i.e. subject versus nonsubject), degree of discourse salience (topicality), paragraph and frame boundaries, the number of previous mentions, etc. In other words, one should examine the "exceptional" short-distance definite descriptions and establish that they do not actually code a low degree of accessibility despite the
short distance. Only highly accessible entities (as measured by a comprehensive assessment of degree of accessibility) which are coded by low accessibility markers nonetheless constitute counterexamples to the theory. Toole (1992, 1996) checked such cases and found that the majority of these can be explained within a more complex assessment of degree of accessibility (plus intended divergences produced in order to generate special implicatures - see 2.1 below). The accessibility factors considered by Toole were: (1) Distance (whether anaphoric expressions corefer with antecedents in the same or in the last proposition, in the same or in a previous episode); (2) Topicality (how many times the antecedent was mentioned in the last four propositions); (3) Competition (how many matching intervening entities there were between the last mention of the antecedent and the anaphor). Second, when I divided definite descriptions into lower and higher accessibility markers according to their degree of informativity, consistent distributional differences were discernible. In Ariel (1990, p. 44), I presented data which showed that whereas the majority (78.2%) of definite descriptions composed of 1-2 content words were discourse anaphoric (higher accessibility), the majority (65.3%) of the definite descriptions which contain 3+ content words were first mentions (lower accessibility). In Ariel (1996) I distinguished between definite descriptions of 1, 2, and 3+ content words, as well as definite NPs + relative clauses, all as introducers of new discourse entities. Indeed, the lower the accessibility of the entity introduced, the more informative the referring expression was. 9
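As an illustration only, the three factors attributed to Toole above (distance, topicality, competition) can be tallied mechanically for a toy discourse. The sketch below is not Toole's coding procedure: the `Mention` record, the function name, the reading of "the last four propositions" as the four propositions preceding the anaphor's own, and the treatment of "matching" as feature compatibility are all assumptions made for the example. No weighting of the factors into a single score is attempted, since none is given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    entity: str                        # hypothetical label for the discourse entity
    proposition: int                   # index of the proposition containing the mention
    episode: int                       # index of the episode containing the mention
    features: frozenset = frozenset()  # e.g. gender/number; used to decide "matching"


def toole_style_factors(mentions, anaphor_index):
    """Tally distance, topicality and competition for one anaphoric mention.

    Returns None when the mention has no earlier coreferent mention
    (i.e. it is a first mention, not an anaphor).
    """
    anaphor = mentions[anaphor_index]
    earlier = mentions[:anaphor_index]

    # Position of the most recent earlier mention of the same entity.
    last_pos = max(
        (i for i, m in enumerate(earlier) if m.entity == anaphor.entity),
        default=None,
    )
    if last_pos is None:
        return None
    antecedent = earlier[last_pos]

    # (1) Distance: same/last proposition, same/previous episode.
    gap = anaphor.proposition - antecedent.proposition
    distance = {
        "same_proposition": gap == 0,
        "last_proposition": gap == 1,
        "same_episode": anaphor.episode == antecedent.episode,
    }

    # (2) Topicality: mentions of the entity in the four preceding propositions
    # (assumption: the anaphor's own proposition is excluded).
    window = range(anaphor.proposition - 4, anaphor.proposition)
    topicality = sum(
        1 for m in earlier
        if m.entity == anaphor.entity and m.proposition in window
    )

    # (3) Competition: distinct other entities mentioned between the antecedent
    # and the anaphor whose features are compatible with the anaphor's
    # (assumption: compatible = the anaphor's features are a subset).
    competition = len({
        m.entity for m in earlier[last_pos + 1:]
        if m.entity != anaphor.entity and anaphor.features <= m.features
    })

    return {"distance": distance, "topicality": topicality, "competition": competition}


# Toy discourse (invented for the example): the last mention is a pronoun "she".
FEM = frozenset({"female", "singular"})
MASC = frozenset({"male", "singular"})
toy = [
    Mention("Ann", 1, 1, FEM),
    Mention("Bea", 2, 1, FEM),
    Mention("Tom", 2, 1, MASC),
    Mention("Ann", 3, 1, FEM),
]
factors = toole_style_factors(toy, 3)
print(factors)
```

In this toy case the antecedent lies two propositions back within the same episode, it was mentioned once in the window, and one matching competitor (Bea) intervenes, so the tally gives a topicality of 1 and a competition of 1. How such values should be combined into an overall degree of accessibility is exactly the open, multi-factor question the text discusses.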
1.3 On the nonexclusivity of accessibility considerations
While accessibility considerations are a central aspect of referential choices, they by no means exhaust the selection process. Contextual assumptions must be relied upon in order to ascertain that a referential (rather than attributive or generic) use has been made (see Mueller-Lust & Gibbs 1991). Relevance-based considerations help select among equally accessible potential referring expressions, such as my neighbor, the mayor, Mark and the idiot. Inferential processes are also crucially involved in determining reference, as in Horrified, she snatched the meat_i from the dog_j and threw it_i into the fire (from McEnery & Thomas 1992; see Ariel 1990, Part III; Comrie 1988b; Gundel et al. 1993; Gundel & Mulkern 1998; Matsui 1998; Sanford & Garrod 1981, inter alia). Such considerations may sometimes even dictate violations of accessibility theory for special effects (see Ariel 1990, Part III). 10 Thus, I have argued that women and minorities are consistently referred to by higher accessibility markers than are called for given
the specific (relatively low) accessibility assessment (initial references), because speakers are not careful enough in making sure their addressees actually identify the referents intended. The clearest example of this phenomenon is the widespread use of first names for women and minorities (for data and analysis of references to the 'other', see Ariel 1990, 9.2; and see Mulkern 1996, p. 247). Such violations are very much socially and culturally bound. Indeed, there is a rich anthropological linguistics literature on naming patterns in different societies, which shows how the universal accessibility marking scale is embedded in social norms. I here mention one such example of a cultural difference, from the Nayaka, a hunter-gatherer group in India. Bird-David (1995) finds that names are not the rigid designators we usually think of. They function quite differently, and are rarely used in this society. Children are referred to as 'girl' or 'boy', or in relational terms, e.g., 'daughter', even by non-parents. Adults are mostly referred to by kin terms (rather than by their names), which is a mark of intimacy. It is mostly adolescents that are referred to by nicknames. But names are not even necessarily constant in a person's life. People may have a number of names simultaneously. Bird-David notes that her informants sometimes asked people what they are called these days, even though members of the community live in extreme proximity to each other, and are in constant contact, even in their home-huts. 11 In contrast, Downing (1996, p. 130) argues that bare proper names are co-recognitionals, and are used when "present in the territory of information of both participants" (of the conversation). Only when the referents do not meet this condition are other referring expressions used (e.g., this + proper name). It remains to be seen how general this principle is (I had independently made similar claims for Hebrew - Ariel 1990, pp. 203-206), in view of the Nayaka pattern.
Note that although accessibility theory defines the relevant degree of accessibility to be that of the addressee as assessed by the speaker, a speaker may pretend to speak for another, and she then has to assume the degree of accessibility of the entity as it would have been assessed by that speaker. This is what happens in example (1) above when Frank is assuming the identity of the teacher who will be reading Melissa's paper. He refers to awakened by a lower accessibility marker, that word, since the teacher will have no basis for assuming that the word is highly accessible to Melissa when reading his comments. Following Kuno (1987), I have also claimed that higher accessibility markers are used to code the character whose point of view is reflected in the discourse (Ariel 1990, pp. 203-204). The examples below demonstrate this
clearly. (3) and (4) are excerpts describing the same rape by two Hebrew newspapers (Haaretz and Maariv). Both newspapers relied on the same source, the police record, hence the extraordinary similarity. Note that Haaretz and Maariv differ in that only Haaretz clearly adopts the victim's point of view. This can be seen from: a. the choice of subject role in describing the meeting between the rapist and the victim (Haaretz chose the victim, Maariv chose both the rapist and the victim), b. the choice of verb for the rapists' expressing interest in having sex with the victim (demand in Haaretz, asked in Maariv), c. the addition of an 'as a result' adverbial in Maariv, making the rapes appear to be the result of the victim's refusal:
(3) i. In the complaint the woman_j claimed that on May 2, Ø_j met Roter_i ...
ii. Then the two_{i+k} demanded from her_j to have sex with them_{i+k}. According to her_j, when Ø_j refused, Roter_i started punching her_j ... (Haaretz, 5.17.1995).
(4) i. According to the police, Roter_i and the rape victim_j met in the beginning of May ...
ii. ... and at a certain point Ø_{i+k} asked the rape victim_j to have sex with them_{i+k}. This one_j refused, and as a result, the two_{i+k} cruelly raped her_j ... (Maariv, 5.17.1995).
Note that in both papers the victim and the rapist are initially introduced (in this narrative) by a low accessibility marker, as is appropriate. It is in (ii) that the difference in point of view clearly shows itself in referential forms as well: In Haaretz the victim is coded by Ø, and the rapist by a proper name (a low accessibility marker); in Maariv the rapists are coded by Ø, and the victim is referred to by a demonstrative pronoun (a mid accessibility marker). When comparing all the zero versus pronoun ratios in the two papers, Haaretz has 6 zeroes versus 2 pronouns for the victim, but Maariv has precisely the opposite ratio: 6 pronouns versus 2 zeroes for her (pronouns here include demonstrative pronouns). The newspapers do not differ with respect to the zero/pronoun ratio for the rapists (1.4 and 1.3 times more zeroes respectively). Similarly, Givón (1998) finds that in his own unpublished novel, for which he has two versions, one from the perspective of one character and one from the other's, all the full NPs were reserved for references to the character whose point of view was not being represented. 12 A few additional examples of why accessibility theory can only account for default referential choices follow. Du Bois (1991) discusses what he terms
analogue reference, namely, cases where the speaker refers to X, but intends the addressee to derive from it also a reference to Y as a conversational implicature. Such references may not only violate the requirement to select referring expressions according to the degree of accessibility of the mental representation at hand, they may ignore the accessible/inaccessible distinction, referring by an indefinite NP to a referent previously identified by a definite NP (one of his examples has the cook_i followed by a cook_i). Hakulinen (1987) argues that Finnish speakers avoid personal references, and thus, generic zero references are by now almost completely conventionalized as first person references. The next, originally Hebrew, example shows the speaker oscillating between 1st and 3rd person references to himself:
(5) But I_i insisted then. A person_{j?i?} devoted two months and a half, Ø_{j?i?} built a whole program, Ø_{j?i?} took care of a budget, it is not as if Minister Katzav gave me_i, I took care, I went ... (Mudai, TV interview, 2.11.1998, from Mulokandov and Rieder 1998).
The speaker here is clearly understood to be speaking of himself, but he is trying to create the impression that he refers to himself from an objective, "other", rather than "self" point of view. Hence the 3rd person "inappropriate" generic references to an indefinite person, which combine with predicates which unequivocally describe his and only his actions. Sanford and Moxey (1995) show that despite the theoretical (high) accessibility of some discourse entities, they are not easily referred to:
(6) In the garden, I saw a young girl_i kicking a tree_j. ?I looked at them_{i+j} for a while (Sanford & Moxey's example 17, 17').
The ability to refer to even highly accessible entities is relevance-based. In a different context, the above example is perfectly acceptable, as they show. Indeed, accessibility theory takes it for granted that speakers have already decided on who to refer to, even though it is not at all a cut and dry decision, simply depending on "who did what to whom". This is clearly seen in the following example, where the speaker switches from his initial we to a he. It is certainly not because he has suddenly realized that the string is Yo Yo Ma's, rather than the whole dance group's: (7) Morris: We broke a STRING, Or HE broke a string (A TV interview with Yo Yo Ma and Mark Morris, a choreographer, Israeli TV, 7.9.1998).
Such choices of referential forms are made independent of accessibility theory, and they generate a whole array of conversational implicatures (in this case, how Mark Morris sees Yo Yo Ma as integral to his show). Maes and Noordman (1995) argue that a combination of a demonstrative pronoun and a noun phrase is used when the NP serves a predicational rather than an identification function. Such expressions are actually used in order to modify the addressee's representation of the intended entity. A lower accessibility marker is then used for such a marked purpose. The marked accessibility marker (there is a mismatch between the high degree of accessibility and the accessibility marker chosen) conveys information which the speaker directs the addressee to access in connection with the referent. For example, when the expression This Reagan follows the sentence Ronald Reagan is clearly suffering from memory failure, it is interpreted as 'the Reagan suffering from memory failure' (see their example 17). Taken together, Sections 1.2 and 1.3 argue that accessibility theory is not reducible to any one linguistic principle, because degree of accessibility is a complex psychological concept, and at the same time, that accessibility theory cannot exclusively account for referential choice and interpretation.
2. Corroborating and enriching accessibility theory
I have presented many pieces of evidence for the applicability of accessibility theory in Ariel (1985a, 1988a, 1990). The reader is referred to those sources for original analyses of mine and for extensive references to other works which support the accessibility claim. More recent research has corroborated and enriched the applicability of accessibility theory. I here restrict myself to citing works I did not have access to when initially presenting accessibility theory (even though some had been published before). Much of the work to be mentioned was conducted independently of accessibility theory, some is a direct product of it.
2.1 General accessibility predictions
This section is dedicated to supporting general accessibility theory points: I quote works which argue for the replacement of formal conventions with what I would term degree of accessibility codings (Comrie, Garcia). I present distributional analyses of a variety of referring expressions (Saadi, Toole, Dolman),
all of which indicate that accessibility considerations (e.g., distance, competition, antecedent prominence etc.) are at work. Some also argue that genre differences do not refute the validity of the accessibility proposal (Toole, Dolman, Saadi, Kronrod and Engel). I present findings which confirm that degree of accessibility is a dynamic and complex notion which cannot be reduced to single factors (Gernsbacher et al., McKoon et al., Toole, Kibrik), and that it is not the only factor determining referential form (Kronrod and Engel, Almor). Finally, I present research which develops accessibility theory beyond my original proposals (Almor, Epstein). Comrie (1994) shows how Dutch contrastive pronouns refer to the less expected antecedent (lower accessibility on my account). While in most cases this means a nonsubject, it is not invariably so. Comrie argues that it actually has to be the nontopic. Garcia (1983, 1996) argues that what seems to be a difference between subject versus nonsubject antecedent (for si versus el in Spanish) is a difference in "contextual obviousness", what I would term degree of accessibility. Indeed, in many cases, antecedents of high accessibility are also subjects, but Garcia demonstrates how nonsubject antecedents can take si anaphors, provided they are highly accessible (e.g., discourse topics). Assuming an accessibility distinction between si and el can also explain the higher frequency of si (the higher accessibility marker) with human versus nonhuman antecedents. In addition, Garcia (1996) specifically relates the fact that si does not distinguish for gender to its marking a higher degree of accessibility (my terminology) than the pronominal forms, which do. This corresponds to my Informativity criterion for accessibility marking. She also examines the role of competing antecedents and determines that the more salient the nonantecedent competitor, the lower the accessibility marker required for the antecedent.
Finally, Garcia finds that the higher accessibility marker is preferred when the argument is governed. I have suggested the same for gap versus resumptive pronoun usage (Ariel 1999). Governed arguments are more predictable and hence more accessible. Saadi (1997) examined the English and Hebrew versions of one children's story and one adult story. Her findings support the accessibility predictions. All four sources shifted from a predominant use of zeroes and pronouns to lexical NPs as the distance from the last mention of the same entity grew larger. The same applies to the factor of competing (intervening) referents: the more intervening characters mentioned, the higher the likelihood for a lexical NP to occur. All sources also distinguished between the main character (more salient, mentioned more times) and a secondary character, so that the main character
was referred to by high accessibility markers much more often than the secondary character. In both languages there was also a difference between the adult and the children's stories (more so in Hebrew), in that the children's stories contained more lexical NPs. Saadi suggests that this difference is due to the fact that adults writing for children assume that children's short term memories are not fully developed. McKoon, Ward, Ratcliff, and Sproat (1993) testify to the fact that degree of accessibility is a complex concept, the components of which may work in opposite directions, as when the antecedent is part of a compound (accessibility is lower) but the entity is topical (higher accessibility). Kibrik (1996) shows that degrees of activation dictate referential forms in Russian narratives. He also underscores the importance of the multiplicity of factors involved in determining degree of accessibility. Toole (1992, 1996) has convincingly shown how degree of accessibility, when measured by a few criteria (see 1.2), can explain the distribution of referring expressions in four discourse genres. Her conclusion is that despite the statistically reliable differences in referring expressions in different genres (see Fox 1987), accessibility theory can account for referential choices in all the written and spoken genres she examined. The statistical differences found stem from contextual factors which determine what types of discourse entities (in terms of degree of accessibility) tend to occur in discourses of different genres. In other words, a case by case analysis of the referring expressions used in all the genres revealed the same accessibility form-function correlations. Toole found that accessibility marking violations are only performed in order to achieve special objectives, e.g., low accessibility markers to clarify at the addressee's request, to define a term etc. (see also Ariel 1990: Part III; Maes & Noordman 1995; Vonk, Hustinx & Simons 1992).
Dolman (1998) too found no differences in referential choice between children from high and low socioeconomic backgrounds. 13 Both groups complied with accessibility theory (degree of accessibility measured as a combined function of distance from last mention and the importance of the character to the story). Kronrod and Engel (1998) reached similar conclusions in their examination of referential forms used in newspaper headlines (see also Jucker 1996). They found no genre differences between the highbrow subscription paper and the news stand popular papers, and between the different sections within the papers (front page, other news items, stories and sports). All the headlines showed a clear preference for intermediate accessibility markers (first names, last names and short definite descriptions). The fact that intermediate accessibility markers predominate, despite the initial retrieval status, where low accessibility markers are expected, is explained by reference to Du Bois' (1985) notion of competing motivations. Headlines must be short and vague (in order to save space and arouse curiosity). High accessibility markers would have served that function best. But because the referents are also initial retrievals, and hence of a rather low degree of accessibility, a compromise is struck, and most of the referential forms are of an intermediate degree of accessibility. Here is an illustrative example (my own, originally Hebrew). Compare the referring expressions in the headline (a) with their counterparts in the opening sentence of the article (b):
(8) a. Arafat_i invited Kadafi_j to pray in Jerusalem_k when Ø_k will be the Palestinian capital
b. The Palestinian authority chair, Yassir Arafat_i, invited the Libyan leader, Muamar Kadafi_j, to pray in Eastern Jerusalem_k, when this one_k will become the capital of the Palestinian state (Haaretz, 7.14.1998).
Gernsbacher, Hargreaves, and Beeman (1989) show how and why degree of accessibility of concepts shifts in the course of discourse. Sentential first mention entities are later entertained at a relatively high degree of accessibility due to comprehenders' assumption that first mentions are the discourse topic. 14 But mention in the last clause also facilitates retrieval, due to the high accessibility associated with the last clause processed (Clark & Sengul 1979). As Gernsbacher et al. (1989) noted, these two facilitating conditions sometimes contradict each other. In a series of experiments measuring accessing speed at different processing stages, Gernsbacher et al. were able to establish that degree of accessibility is a dynamic phenomenon. Thus, an entity mentioned clause-initially is less accessible than a more recently mentioned entity at first, but later, it gains in accessibility, as the units in which the two entities appear are integrated into one whole. In other words, recency is a short-term accessibility booster, whereas sentence-initial mention is a long-term accessibility booster. That the time in which we measure degree of accessibility is of the essence can also be seen in an experiment by Gernsbacher (1989). Almor (1999) embeds my initial proposal that referring expressions are "price tags" on processing effort in a more comprehensive system of processing assessment. Other things being equal, low accessibility markers take longer to process than high accessibility markers. Anaphors with a high informational load (roughly low accessibility markers) are easier to process when the antecedent is of a relatively low degree of accessibility (a nonfocussed antecedent).
The same low accessibility anaphors are harder to process if the antecedent is highly accessible. This seems to echo the "repeated name penalty" (see Gordon, Grosz & Gilliom 1993; Gordon & Scearce 1995). At the same time, I have argued (Ariel 1990, chap. 9) that intended divergences from appropriate accessibility marking are possible, but limited to cases where specific conversational implicatures are sought, above the referential function. I reasoned that the extra contextual implications justify the extra processing cost. Almor formulates this intuition into a principle whereby "additional cost must serve some additional discourse function" (p. 5), such as adding some new information about the referent (and this is contra the "repeated name penalty"). In this way, Almor integrates the cognitive approach with the pragmatic approach. Thus, high accessibility contexts can accommodate relatively low accessibility markers, provided increased contextual effects result. Almor then underscores the fact (convincingly illustrated also by Maes & Noordman 1995) that we cannot really account for the distribution of referring expressions by reference to the referential function of NPs alone. In fact, his experiments demonstrate that low accessibility markers are relatively easily processed, despite the high accessibility of the antecedents, provided they add some new information about the referent. Almor (1999) is mainly interested in processing effort: He wants to calculate the ease of processing anaphors as an interaction of three factors: discourse focus (i.e., degree of accessibility), the amount of new information contributed by the anaphor, and the information load differential between the antecedent and the anaphor. Informational load is not equivalent to my Informativity criterion. It is calculated as the conceptual distance between the anaphor and the antecedent.
In order to assess this difference, Almor draws distinctions among antecedents and anaphors not previously made by accessibility theory, such as between more general (and less informative) terms (e.g., the bird, a creature) and more specific (and informative) ones (e.g., the robin, an ostrich), and between more versus less typical instances of a category (e.g., 'robin' vs. 'ostrich'). His experiments show different response times to different pairs of antecedent/anaphor according to these distinctions, some of them even somewhat counter-intuitive (e.g., that the same anaphoric expression, e.g., the bird, will be processed faster when anaphoric to an antecedent which is a less typical member of its category (e.g., an ostrich) than when it is typical (e.g., a robin), when both are focused). Almor thus adds another dimension to the antecedent-anaphor relation that I did not discuss, that of conceptual difference. Finally, Epstein (1998b) extends the concept of (low) accessibility to include the accessibility of new discourse entities as well. His claim is that low accessibility characterizes the appropriate use of definite descriptions referring to entities which lack a previously stored mental representation, so that the addressee is instructed to construct a new representation, the definite article marking that the knowledge required for the construction is accessible. Such accessibility for non-Given entities can derive from the high prominence of the entity, from the fact that it is a frame-appropriate inferred role, or from the accessibility it has for a noncanonical narrator. 15
2.2 Accessibility predictions pertaining to the type of antecedent
Recall that accessibility theory predicts that the higher the accessibility with which the mental representation is entertained, the higher the accessibility marker used to retrieve it (and vice versa for low accessibility). I present below recent findings corroborating this claim. Gernsbacher (1990) proposes the structure building framework, according to which when comprehenders are engaged in constructing mental representations for incoming information, their strategy is to build coherent structures, by first laying a foundation and then incorporating information that coheres with the foundation into it. Less coherent information makes comprehenders shift to a newly constructed structure. According to Gernsbacher, two very basic cognitive processes are enhancement and suppression. These bear direct relevance for accessibility theory. Enhancement mechanisms elevate the degree of accessibility of memory nodes, suppression mechanisms reduce it. Enhanced entities "overshadow" and suppress the activation of other discourse entities. They are also more resistant to being suppressed by other discourse entities (see Gernsbacher and Jescheniak ms). One example of an enhanced entity is a "cataphoric" NP. Gernsbacher and Shroyer (1989) distinguish between NP forms as to degree of "cataphoricity", namely how marked they are for potential further references (see also Downing 1996; Sanford, Moore & Garrod 1988). 16 The assumption is that the way in which discourse entities are introduced (e.g., by an indefinite article versus by an indefinite this, in English) gives rise to different expectations re further mentions (see also Givón 1992, about the interaction of grammatical role with marking by an indefinite this; Mueller-Lust and Gibbs 1991, on proper names; and Sanford et al. 1996, and Paterson et al. 1998, about quantified NPs as antecedents).
Translated into accessibility terminology, "cataphorically" marked discourse entities become relatively more salient antecedents, because they occupy a privileged position among mental representations. They should therefore be
referred to by relatively higher accessibility markers, and they are. 17 However, NPs are not simply classified into + versus - "cataphoric", i.e., as + versus - self-enhancing and other-suppressing. Some (e.g., contrastive stress) are more "cataphoric" than others (indefinite this), that is, they trigger a higher degree of activation, so their antecedents are more highly accessible. Gernsbacher (1989) also shows how the introduction of different, even new discourse entities suppresses the accessibility of current discourse entities, even if these have been established as topics before. I have referred to this phenomenon as competition, which, I argued, lowers the accessibility of all discourse entities (see Garcia 1983, p. 200, for why pronouns rather than short reflexives are sometimes used for the accessible discourse topic for this reason; Halmari 1996, p. 172, and Keysar et al. 1998, for why some competition is or is not a problem after all). Other nominal forms may be distinguished as to discourse prominence and hence to degree of cataphoricity. Halmari (1996) shows how zeroes, pronouns, demonstrative pronouns, proper names and definite descriptions signal different degrees of accessibility, by examining the grammatical role of their antecedents. The assumption is that subjects are used for more highly accessible entities than other grammatical roles are. And indeed, 98% of the zeroes she found had subject antecedents. The same applied to 72.5% of the pronouns, but the antecedents for demonstrative pronouns, for example, were evenly distributed among all grammatical roles. About 30% of the proper names and the definite descriptions refer to genitive antecedents (as opposed to 1% of the zeroes and 13% of the pronouns). Indeed, Gordon et al. (1993) and Gordon and Chan (1995) found that the "repeated name penalty" (using too low an accessibility marker) applies to subjects, but not to other syntactic statuses.
Stebbins (1997) shows how some languages use number marking cataphorically (my term) only or preferentially for establishing new discourse entities, linking this usage to the high Informativity involved in nouns marked for number. Such languages may omit number marking in subsequent mentions. The same goes for noun classifiers and noun particles (see my interpretation of Hinds' (1983) findings re Japanese in Ariel 1990, p. 90). Sproat and Ward (1987) and McKoon et al. (1993) (see also Greene, Gerrig, McKoon & Ratcliff 1994; Ward, Sproat & McKoon 1991) present similar findings. Sproat and Ward and McKoon et al. show how the way we present a concept in the discourse affects its degree of accessibility, even if it is not actually introduced as a discourse entity. This in turn affects referential options and ease of processing, as measured by reading times. For example, McKoon et al. compare anaphoric references to the nonreferential 'deer' in
deer hunting versus the referential 'deer' in hunting deer. Indeed, when the discourse creates a high degree of accessibility, an "illicit" (nonreferential) antecedent is properly referred to even by the high accessibility pronoun. McKoon et al. then conclude that syntactic factors contribute to the determination of degree of accessibility (and the same could be claimed for the subject position shown to be crucial by Halmari (1996)). I tend to think the other way round, namely that it is degree of discourse prominence which influences both syntactic role and degree of accessibility (see also Gundel et al. 1993). In other words, more important entities will be introduced as referential, rather than as nonreferential, and as subjects rather than as nonsubjects. Oakhill et al. (1992) show how, depending on the antecedent, conceptual anaphors (e.g., I need a plate. Where do you keep them?) are appropriate, though at some processing cost. In general, they show that depending on the degree of the accessibility of the antecedent, different referring expressions are appropriate. Garrod and Sanford (1982) and Albrecht and Clifton (1998) find that an entity coded as an NP conjunct constitutes an inferior antecedent (less accessible), so references to it take longer to process. Almor (1999) demonstrates the role that focus plays in raising the degree of accessibility of an antecedent. Referents coded by focussed NPs and later referred to by anaphoric expressions were read faster than referents coded by nonfocussed NPs. Conversely, Alzheimer's disease damages working memory. Almor (in press) then explains why Alzheimer patients prefer references by lower accessibility markers (repetitive definite descriptions) over the more context-appropriate high accessibility markers (pronouns). Arnold (1997, to appear) corroborates Almor's (1999) findings, but then seeks to explain the apparent puzzle of why topic (old information) and focus (new information) both facilitate reference interpretation.
The reason is that despite the differences in the nature of the information they themselves convey, both elevate the degree of accessibility of the entity they are associated with. Arnold also finds that the global topic has a stronger effect than focus or local topic. This is important in that it shows that we cannot substitute the complex concept of degree of accessibility with simple rules such as "if anaphoric with a subject, or with a focussed NP, or if a sentence topic, then the mental representation intended is to be coded by a high accessibility marker" (see also Arnold to appear).
Mira Ariel
2.3
Accessibility predictions pertaining to the type of anaphor
Accessibility theory predicts that accessibility markers which are relatively uninformative, nonrigid and attenuated retrieve highly accessible mental representations (the opposite holds for low accessibility markers). The researchers here mentioned support this claim by pointing to the correspondence between degree of antecedent accessibility and the informativity, rigidity or attenuation of the anaphor. Fowler, Levy, and Brown (1997) note that the same conditions which encourage the usage of pronouns (high accessibility markers) also encourage the shortening of the pronunciation of proper names (thereby making them signal a higher degree of accessibility). Brennan (1995) found that subjects lengthened their pronunciation of pronouns (thereby turning them into slightly lower accessibility markers) when the antecedents were nonsubjects (a lower degree of accessibility). Downing (1986) argues that Japanese classifiers are used as anaphoric expressions, the degree of accessibility (my terminology) they mark being in between pronouns and lexical NPs. Because of their high informativity, classifiers can refer to relatively distant antecedents, and in contexts where there are intervening antecedents. Both contexts are indications of an intermediate degree of accessibility. 18 Garnham et al. (1994), Rinck and Bower (1995), and Cacciari et al. (1997) present evidence for the importance of gender marking (even if it is arbitrary gender) - my informativity criterion. 19 Mithun (1996) shows how prosodic cues affect the degree of accessibility coded by the same accessibility marker, a definite NP.
She distinguishes between lexical NPs which occur in separate intonation units, those that do not, and those that occur in the more Given syntactic position (postverbally in Central Pomo) with a specific intonation. 20 Baker (1995) presents data showing that discourse prominence and contrast determine the appropriate usage of English free reflexive forms (i.e., unbound reflexives). Although he does not note this, a superficial count of the data he quotes reveals a difference between bare reflexives (relatively attenuated) and reflexives combining with pronouns and lexical NPs (less attenuated expressions). The former mark a higher degree of accessibility than the latter. 21 Thus, languages can utilize many more formal markings than I originally listed (see Ariel 1990, pp. 69-93, on the universality of the accessibility marking scale). Obviation, logophoric forms and switch reference systems come to mind (on the latter see Ariel 1990). Mulkern (1996, p. 245) notes how full names function differently from partial names (and see Ariel 1990, pp. 36-46). The latter mark a higher degree
of accessibility because they are less informative. Lichtenberk's (1996) data can be adduced in support of my claim that proximate and distal demonstratives show an accessibility distinction, and not just a deictic distinction. When tracking discourse entities, the distal demonstrative + NP retrieved entities mentioned more than twice the distance of the antecedents of the proximate demonstrative + NP. Brizuela (1997) shows that a demonstrative NP codes a higher degree of accessibility than a demonstrative pronoun + a definite marker. Interestingly, the same distinction in Hebrew is merely a register difference. Once again, we see that length of expression, not necessarily accompanied by additional content, determines a lower degree of accessibility. Onishi and Murphy (1993) note that metaphoric references (too low accessibility markers) to the topic slow subjects down, even though the same metaphors do not slow them down when they do not serve as referring expressions. Beun and Cremers (1998) find that speakers use redundant information (making their expressions code a lower degree of accessibility) when referring to physically available objects, especially when the objects are out of focus (of a lower degree of accessibility). Mehudar (1996) analyzes the differences between proximate and obviative references in terms of degree of accessibility (see also Arnold, to appear). She corroborates my proposal (Ariel 1990, pp. 7~91) that all languages distinguish between some degrees of accessibility in their referential system, although the distinctions need not be uniform. Thus, in some languages, the proximate is reserved for humans only; in Fox, the entity even has to be a human with a high social status. In some languages, the sentence is the relevant unit for determining the choice between the obviative and the proximate (proximates refer intra-sententially, obviatives extra-sententially).
Crucially, what remains constant across languages is that the proximate refers to a more highly accessible entity than the obviative, with different researchers referring to it alternatively as the one in the focus of attention, the central focus of the discourse, the focus of consciousness, etc.
2.4
Accessibility predictions pertaining to the antecedent-anaphor relationship (unity)
Accessibility theory predicts that higher accessibility markers should be used when the connection between the antecedent (unit) and the anaphor (unit) is tight (and vice versa for a loose connection). Recent work has supported this claim. Halmari (1996) presents data showing how paragraph-initial position
creates a lowering of the accessibility marking for continuing discourse entities. The 90 cases where she found too low accessibility markers were also paragraph-initial. In a psycholinguistic experiment, Fowler et al. (1997) found that episode boundary was crucial for choosing longer anaphoric expressions (of a lower degree of accessibility). Khan (1999, p. 330) finds that in conversational Jewish Neo-Aramaic (of Arbel), the use of the grammatically optional subject pronoun marks the "clause as being separated from the preceding context by some kind of discontinuity or disjunction". It is then relatively more frequent when there is a change of subject or in grounding, when the events are perceived as separate, and at the beginning of speech. I have argued that the way we refer to initial retrieval entities (loose connection to an antecedent) is also crucially dependent on degree of accessibility (Ariel 1996). Even initial retrievals, which are brand new to the discourse, can be more or less accessible. For example, frame-induced entities are highly accessible. They are coded by relatively higher accessibility markers, then. Chafe (1996, pp. 42-46) distinguishes between two types of inferred entities. He mentions in this connection a contrast between a stressed definite description and an unstressed one. The latter was used when the inferred entity was more automatically accessible. Ziv (1996) shows how when the inferred entity is stereotypically accessible (i.e., highly accessible) even pronouns can be used for initial retrievals. Maes and Noordman (1995) find that Dutch definite NPs refer to more remote antecedents than demonstrative NPs, initial retrievals for the most part.
Section 2 has described recent research which supports the main tenets of accessibility theory, namely that referential choice is made by assessing the degree of accessibility of the mental representation retrieved, by considering the salience of the antecedent and the degree of unity between the antecedent and the anaphor.
3.
Accessibility theory and the grammar-pragmatics division of labor
3.1
The grammatical status of the accessibility principles
Accessibility theory correlates between specific referring expressions and their usage by reference to a cognitively motivated principle. In this respect, accessibility theory resembles recent attempts to reduce some anaphora phenomena to pragmatic principles, such as Reinhart (1983), Kempson (1984), Levinson (1987, 1991), Huang (1994), Gundel et al. (1993), and Ward et al. (1991). One
could then suggest that accessibility theory should be formulated as a set of extralinguistic inferences, connecting between linguistic forms and proper contexts on the basis of common sense inferences from their semantic meanings, rather than based on conventional form-function correlations (see Reboul 1997; Bach 1998). Such a move would minimize the contribution of accessibility theory to predicting referential form usage. Alternatively, one could maximize the role of accessibility theory, by arguing that the accessibility principles actually replace formal rules. Thus, while I made no attempt to replace the C-command domain by a cognitive concept (although I view it as the grammaticization of a highly accessible context), van Hoek (1995, 1997) uses accessibility theory to reduce C-command to a discourse concept which is sensitive to the prominence of the antecedent and the degree of unity between the antecedent and the anaphor. She thus reformulates Reinhart's C-command restrictions against full NPs being in the C-command domain of pronouns coindexed with them as an accessibility marking violation, where a low accessibility marker is used in a high accessibility context. It remains to be seen whether van Hoek's accessibility restrictions can actually replace the grammatical principle. For example, for the most part, the subject is indeed the most highly accessible entity, discourse-wise as well, which explains why the entities under its domain can be dependent on it for interpretation but not vice versa. However, what if a nondiscourse topic happens to be the grammatical subject (as in 11 below)? Can such a subject, since it is not so salient, be pronominal and coindexed with a full NP in its domain? I doubt that. I therefore see van Hoek's intriguing development of accessibility theory within the sentence (see Ariel 1990, Part II originally) more as testifying to a plausible grammaticization path of accessibility considerations into grammatical rules.
I myself have opted for a nonminimal nonmaximal position (Ariel 1990, 1994). In general, I have argued that the linguistic-extralinguistic division of labor does not neatly divide utterance interpretation according to the topics identified by linguists (see Ariel 1998c, Ms). Such a division would posit that all aspects of reference interpretation belong either in the grammar or in pragmatics. Instead, most probably each and every linguistic form undergoes a dual interpretation procedure, whereby some aspects of its interpretation are linguistically derived, and others are associated with it extra-linguistically (i.e., inferentially). This is certainly the case for referring expressions, where a pragmatic theory (such as Sperber and Wilson's (1986) Relevance theory) has a major role to play (Ariel 1990). Moreover, I have argued that while the form-function correlations established by accessibility theory are cognitively well
motivated for the most part (by the criteria of informativity, rigidity and attenuation mentioned above), some aspects of the accessibility scale (which expressions code which degree of accessibility) need to be grammatically stipulated nonetheless (see Ariel 1990, pp. 76-87). Reference interpretation then is modularized between a linguistic (formal rules and accessibility degree lexically specified) and an extralinguistic inferential competence (see also Farmer and Harnish 1987). 22 This much is perhaps obvious. What is less obvious is that the linguistic-extralinguistic division does not coincide with the sentential-extrasentential division either, nor with the obligatory/optional dichotomy. Garcia (1983) and Ariel (1987, 1990) have emphasized that imposing on grammatical principles a sentential domain misses generalizations that hold both within and across sentences. Aissen (1997) confirms that the same principles account for obviation within and outside the clause. The span within which one third person referent must be proximate and all others obviative can be indefinitely large. The same applies to logophoric markers (marking the character whose point of view is conveyed) (see Hyman and Comrie 1981). In fact, in Plains Cree, the constraint that there must at least be one proximate marker is imposed on a stretch of discourse and not on the sentence, which may well not contain any (Comrie 1994). A switch reference system can also involve a relationship between nonadjacent clauses (Comrie 1994). Degree of accessibility, I have argued, is crucial both within and across sentences, and this is why when extremely high accessibility obtains, a zero can be used, whether its antecedent is sentential (e.g., a matrix antecedent in a control context) or extra-sentential (the discourse topic, for the most part).
This is why Spanish si refers to the subject for the most part (a sentential highly accessible antecedent), but when it does not, it refers to the discourse topic (an extrasentential highly accessible antecedent). I also suggest that the grammarians' division into a grammatical (i.e., obligatory) versus a pragmatic (i.e., optional) "avoid pronoun principle" (for different languages - see Bouchard 1983; Hermon 1985) is unnecessary. Pronoun "avoidance" corresponds straightforwardly to avoiding too low an accessibility marker when the antecedent is extremely highly accessible. Precisely such variability between languages is expected if we assume that cognitive principles apply in all languages, but they grammaticize only in some of the cases (see also Comrie 1994). 23 The position I have adopted is that while there is a universal cognitive basis for referential form and usage, specific grammars translate the cognitive generalization somewhat differently (see Levinson 1987, 1991 for a similar point re a
pragmatic universal). There is then a role for the specific grammar of the language in determining referential forms and interpretations (see also Gundel et al. 1993). This division of labor between extragrammatical and grammatical principles explains the differences among languages (cf. the use of zero subjects in English and Chinese, high accessibility markers in both languages) despite my assumption that mental representations are similarly accessible to speakers of different languages. Since each language only draws a certain number of accessibility distinctions, the choice of actual forms (to have or not to have a definite article, for example) and the precise accessibility domain carved for each referring expression (e.g., what to count as extremely high accessibility licensing a zero in Chinese and in English) may vary. Many languages allow (or dictate) zero for second person references in imperatives (i.e., where the entity referred to is highly accessible), as well as in control verb contexts, where depending on the type of verb, a high cohesion between the clauses creates a high degree of accessibility for the matrix antecedents (as in I didn't want to see him, or Like he wanted me to look at him - Jury). But these are grammaticized conventions, rather than directly motivated tendencies, in that they do not absolutely have to occur in each language. Greek and Sakapulteko Maya, for instance, do not have zero subjects in control contexts, and the latter does not force a zero in imperatives (Du Bois, personal communication). Indeed, accessibility markers even show dialectal variability (see Garcia 1996, and Cameron 1997, on the variability of Spanish referential forms). Similarly, when we examine the usage of accessibility markers, we can see how formal and cognitive factors work in tandem in conditioning their occurrence.
Hyman and Comrie (1981) argue that Gokana logophoric suffixes can always be anaphoric to subjects (a formal condition), but they can be anaphoric to an object, provided it is the source of the information (a pragmatic condition). Aissen (1997) claims that the (obligatory) choice of the argument to be coded as proximate depends on grammatical function, semantic properties and discourse salience (a mixture of formal and pragmatic conditions). In Ariel (1987) I proposed a scale of accessibility contexts, showing that formally defined contexts (e.g., where there is an obligatorily and uniquely determined antecedent, as in obligatory control contexts, or in wh-extractions) are on a par with cognitively defined contexts (e.g., the discourse topic) in that both may require or encourage the use of the same referring expression. Indeed, accessibility markers can be properly used by either fulfilling a formal criterion, or by fulfilling a pragmatic condition. For example, reflexive pronouns in English have an obligatorily formally defined condition: they have to be bound within
their C-command domain. But they can also be used without a sentential antecedent at all, when they are the subject of consciousness. Are these syntactic and pragmatic contexts really different? Note that within the sentence, reflexive pronouns have some contrastive residue. Not so outside the sentence. I suggest that what these uses have in common is an intermediate degree of accessibility. In the minimal (C-command) domain (high accessibility), only a contrastive (relatively low accessibility) entity is of an overall intermediate degree of accessibility. Across the sentence boundary (low accessibility), the subject of consciousness (high accessibility) is also entertained at an overall intermediate degree of accessibility (see Zribi-Hertz 1989 re long distance reflexives). Perhaps we can say that at some deep level these two contexts are cognitively the same. This will allow us to distinguish between potential grammaticizations (where similar degrees of accessibility get coded by the same accessibility marker) versus impossible ones (where different degrees of accessibility get coded by the same referring expression). Note, however, that languages may differ with respect to these two contexts. There may be languages which allow their reflexive pronouns in one but not in the other context. While the degree of accessibility associated with (long) reflexives may be intermediate for all languages, we need to specify for each language what mid accessibility translates into for the specific marker. 24
3.2
The grammaticization of accessibility markers
Grammaticization often entails a transition from a pragmatic, extralinguistic tendency to a grammatical, often obligatory rule. In Ariel (1998a, 2000) I have outlined such a historical path of change, leading from free pronouns to verbal person agreement inflections (more attenuated than pronouns, hence marking a higher degree of accessibility), arguing that such a change occurs for the forms referring to highly accessible discourse entities. Since speakers tend to shorten the forms referring to highly accessible entities (the criterion of attenuation), and since the speaker and the addressee (but not 3rd persons) are consistently highly accessible, it would be first/second person pronouns which consistently get shortened (as a pragmatic tendency). Shortening may lead to cliticizations and eventually to obligatory inflection (a grammatical rule). This is why most of the languages which manifest verbal person agreement markers restrict them to 1st and 2nd persons. Person agreement development is a case where accessibility theory directly motivates bona fide grammatical morphemes (i.e., person agreement markers which are shortened free pronouns). I believe
that the creation of reflexive pronouns from independent pronouns and independent adjectival self in English (see Keenan 1994) can be similarly motivated. Pronouns and modifier self were independent forms which consistently cooccurred in Old English in contexts where subject coreference was unexpected. A bare pronoun was then modified by self in order to mark the relatively lower degree of accessibility of the subject by a longer referring expression. In fact, the same process can be seen in the current example:
(9) Frankly, I'm torn my own self as to which way to raise hell (Clark Reed, as quoted in The International Herald Tribune, Jan 2-3, 1999).
In other cases, accessibility theory can motivate grammatical, even obligatory constraints on the distribution of various referring expressions. I have mentioned in this connection the binding conditions (Ariel 1987, 1990; see Keenan 1994; Levinson 1987, 1991). In Ariel (1999), I argue that whereas the distribution of zeroes and resumptive pronouns in relative clauses seems quite diverse among the languages of the world (e.g., some languages make zeroes obligatory with subject relativized positions, some allow or encourage resumptive pronouns only with nonrestrictive relative clauses etc.), accessibility theory can motivate the variability in grammatical patterns we actually find. These stem from frequent discoursal patterns which reflect the usage of zeroes and resumptive pronouns according to the degree of accessibility of the antecedent (the relative clause head) when the relativized position is processed. Zeroes are an option, or preferred, or grammaticized for extremely high accessibility contexts, and resumptive pronouns for relatively low accessibility contexts (e.g., syntactic islands). The precise use conditions are language-dependent, of course.
It is important to note that while grammaticizations are often merely the freezing of specific realizations of accessibility distinctions into obligatory linguistic rules (e.g., for gaps and resumptive pronouns in relative clauses), once some rule is part of the language, it may interact with other linguistic facts, and generalizations of patterns may then even obliterate the originally pragmatically motivated distribution (see Comrie 1983, 1988a). This is how I interpret Keenan's (1994) explanation for why English lost its high involvement pronouns (as in the king walked him to London). In Old English there were two contexts where anaphors were locally bound to their subjects: nonarguments of the high involvement type and contrastive coarguments of the verb. As I have mentioned before, the latter are less accessible than the former, even though both pick the subject as antecedent. A situation where (locally
bound) pronouns are used for high involvement subjects and reflexive pronouns are used for (locally bound) contrastive arguments is quite compatible with accessibility theory (see 1.2 above). However, once local binding becomes a characterizing feature of distribution, it is harder (less general) to have two types of referring expressions in the same, by now grammatically defined context (local binding with the subject). Perhaps this is why English dropped the "exceptional" use of high involvement pronouns. Hebrew did not. The same explanation applies to the less even spread of reflexive forms to objects of prepositions (as in you making positive choices for yourself in your life - Death, versus that's his way of drawing your attention to him - Jury - see Bouchard 1985; Faltz 1977; Zribi-Hertz 1980, 1989). For objects of prepositions, especially ungoverned ones, coreference is not so unlikely as for accusatives. Hence, a pronoun could have been acceptable (indeed it was in Old English, and still is in some cases, as in Do you have any sharp objects on you? - Risk; cf. with they brought it upon themselves - Cutiepie). 25 Note also that whereas genitives are pronominal in many languages, they are reflexive in some (e.g., Swedish and Turkish). Accessibility expectations allow them to be pronouns (due to the high accessibility of their referents), but a formally defined generalization may force a reflexive in this context. Arnold (to appear) argues that Mapudungun subjects code the most accessible entity of the clause. However, this choice has been frozen into an animacy scale, whereby first/second person references are automatically higher on the scale than third person references. The result is that on the rare occasions when third person referents are more accessible, it is still the first/second persons which are selected for subject position. This is another case where a formal rigid distinction replaces the more variable, cognitive one.
Rieder and Mulokandov (1998) find a surprising fact: Hebrew first person plural pronouns (anaxnu) contract more often (2.5 times more) than singular first persons (ani). 'We' is also coded as zero more often than 'I' (1.5 times more). These are seemingly unexpected under accessibility theory, since surely the degree to which speakers are accessible to their addressees exceeds the accessibility of the speaker plus another or others (the referents of 'we'). However, once we take into consideration that Hebrew 'we' is three syllables long, the findings are no longer unexplained. Recall that long forms (i.e., the least attenuated) code a relatively low degree of accessibility. Since speakers must choose between zero and a three syllable NP for 'we' (in modern Hebrew), they would tend to opt for the high accessibility forms more often. Such findings demonstrate the interaction of accessibility theory with specific facts of particular languages, in this case the
lexical options available. Indeed, in Saadi (1997), the number of high accessibility markers (pronouns and zeroes calculated together) was identical for the adult story in Hebrew and English, but Hebrew showed more zeroes than English, and English showed more pronouns than Hebrew. These differences are obviously motivated by the freer zero options available in Hebrew. The grammaticization of specific anaphoric expressions in certain syntactic structures can also be motivated by accessibility considerations. Ziv (1994) is explicit about it. She argues that one should not simply treat left and right dislocations as whole syntactic constructions used in specific (and different) pragmatic contexts. Rather, she shows that the facts of their pragmatic distribution match the referential forms they employ (an initial NP for left dislocations, an initial pronoun for right dislocations), which, in turn, are governed by the degree of accessibility associated with the entity coded by the dislocated NP. In other words, the frozen referential forms in left and right dislocated sentences are no different from their free occurring counterparts. Montgomery (1989) discusses it versus that left dislocations. He finds that that dislocations occur with the more complex (and clausal) NPs, they establish a contrastive focus in 26% of the cases (as opposed to 7% of the it dislocations), and they initiate an "oral paragraph". Note that these are all features which characterize entities of a relatively lower degree of accessibility, and in this respect the findings for it and that left dislocations are parallel to the ones presented in Linde (1979), Grosz (1981), and Schiffman (1984) for anaphoric it and that (all quoted in Ariel 1990). Finally, Heller (1998) argues that the Hebrew demonstrative ze 'this', when functioning as a copula, forces an extended reference interpretation for the subject.
Again, this directly mirrors the referential properties of the intermediate accessibility marker ze (see Ariel 1998b). Giora and Lee (1996) also show that an initially motivated accessibility finding can develop into a (partially) grammaticized fact of a somewhat different functional nature (see also Marslen-Wilson et al. 1982). Giora and Lee argue that while accessibility theory can account for the fact that paragraph-initially, accessibility markers tend to be lower (pronouns instead of zeroes in Chinese), it cannot account for the fact that paragraph-final referring expressions also tend to be lower accessibility markers. It is possible that this distribution is due to the fact that lower accessibility markers are better cataphoric devices. Alternatively, since lower accessibility markers naturally occur paragraph-initially, they may be reanalyzed (also) as discourse segmentation markers. This is what Giora and Lee argue for. 26 In a similar vein but more radically, Vonk et al. (1993) argue that overspecified referring expressions (too low
accessibility markers) affect discourse structure, rather than merely reflect it, as I originally argued. Lower accessibility markers instruct the addressee to shift from the current global discourse topic, even if the protagonist herself remains the same. In their experiments, the decision to shift to a new unit of information was determined by the choice of a too low accessibility marker, rather than the other way round. I believe that the high correlation between segment-initial position and low accessibility markers is originally motivated by the default strategy of emptying short term memory at the end of segments. However, this correlation may then be used in the other direction, namely to aid addressees in segmenting the discourse, especially when other means (such as time and place shifting expressions - see Gernsbacher 1991) are not available. In Section 3 I have argued that referential choice and interpretation is partly governed by grammatical principles and partly by extragrammatical accessibility considerations. However, because of grammaticization processes, the grammar-internal/external division of labor is not rigid across languages, nor within languages.
4.
Competing theories of reference
Accessibility theory is not the only theory which seeks to anchor referential forms in a broader, less than fully linguistic system. Chafe (1976 and onwards), Givón (1983), Levinson (1987, 1991) (and Huang 1994), Gundel et al. (1993) and Centering theorists (Grosz et al. 1986, 1995) have also offered such theories. 27 It is important to note, however, that these theories do not clash with accessibility theory on a few important points: all theories offer some version of a scale on which referring expressions are arranged; all agree that additional, pragmatic factors can override the principles they propose. Crucially, the theories converge on the predictions re gross distinctions between zeroes, pronouns and lexical NPs 28; indeed, no counter-examples to accessibility theory have been shown to be better accounted for by these theories. Still, there are conceptual as well as empirical differences between these theories, and I will here briefly mention why I think that accessibility theory provides a better account for referential form use and interpretation. 29 Chafe (1976, 1994, 1996) was the first to argue for a direct connection between referential forms and cognitive statuses. In fact, accessibility theory can be seen as an extension of his (and later Givón's 1983) basic insight. Chafe recognizes that activation states are not categorical (discrete), but for language,
he distinguishes between three types of activation states only: activated, semiactive and inactive. Referential forms are chosen according to the estimated cognitive status of the referent: unstressed pronouns retrieve activated referents, and stressed nouns and noun phrases retrieve semiactive and inactive referents. Chafe then has to attribute many distinctions that I attribute to degree of accessibility to other distinctions which are partially orthogonal to degree of activation (identifiability, familiarity, contrastiveness). Although I believe that identifiability and contrastiveness are orthogonal to degree of accessibility, I think that Chafe is attributing distinctions to these concepts that are better treated as accessibility distinctions. For example, one wonders why stressed forms are consistently used for both lower accessibility and contrastiveness in many languages. I have argued (Ariel 1990) that a contrastive form is used for an entity not predicted to occur (in the particular role). Hence the connection between contrastiveness and a relatively lower degree of accessibility. Also, Chafe claims that demonstrative pronouns identify better than pronouns (he contrasts it with this, see also Vonk et al. 1992, p. 303), in order to distinguish between them, since the three-way activation division is not enough for that. It is not clear how this identifies anything better than it (except by marking a lower degree of accessibility). Although it is widely believed that this is more informative than it (presumably because of its deictic component), the actual distribution of it versus this shows that spatial deixis is very marginal in discourse. And this does not provide more information about the intended antecedent than it (see ex. 1 again). The main problem with Chafe's proposal is that a three-way distinction cannot account for the range of data I have referred to.
In fact, Chafe himself presents a counterexample to his own three-way distinction: stressed pronouns which are not contrastive. They are an intermediate category. Finally, Chafe (1996, p. 40) anticipates that more degrees of activation may need to be recognized.

Levinson's basic intuition is that coreference is preferred over noncoreference, and that minimal forms (e.g., zero, pronoun) should be used, unless fuller forms (e.g., lexical NPs) are specifically required (i.e., if the grammar does not allow the use of a minimal form). If, however, a fuller form is found where a more minimal form is licensed by the grammar, the addressee draws an implicature that the speaker did NOT intend a coreference reading (see originally Reinhart 1983). I have argued against Levinson's (1987, 1991) theory at length (see Ariel 1994, 1996, and see also Blackwell 2000). I have presented many counterexamples to his predictions, most of which stem from his insistence on the (grammarian's) coreference-disjoint reference distinction. Thus,
Levinson can indeed motivate why certain anaphoric expressions are disjoint in reference from certain antecedents (when a low accessibility marker is used instead of a high accessibility marker). But he cannot explain why the same low accessibility markers must be interpreted as coreferent with other antecedents, which are undistinguishable from the "illegitimate" antecedents on his account. I will here mention one example (His and the zero refer to the sexual abuser, who is the discourse topic):

(10) REBECCA: .. put the newspaper_i on his lap,
     RICKIE:  Y[eah],
     REBECCA: Ø [mas]turbated, and then lifted the paper_i [up],
     RICKIE:  [Yeah],
     REBECCA: .. for her to see (Jury).
Note that grammatically, 'the newspaper' could have been referred to by an it in the second mention. Cross-sentential pronouns are quite frequent in discourse (in fact, see the use of his and her in this example). But it was not. Still, no disjoint reading is generated, and we understand the two expressions as coreferent NPs. The reason is, I have argued, that the mental representation of 'the newspaper' is not highly accessible enough to merit a pronoun, but that does not at all rule out a coreference reading. Levinson seems to equate high accessibility marking with coreference and low accessibility marking with disjointness. I have argued that (non)coreference and degree of accessibility are orthogonal to each other. Another insensitivity of Levinson's (and others') is manifest in this example, namely the lack of attention paid to the difference between types of full NPs, here the newspaper and the paper. The latter is a shorter referring expression, therefore marking a higher degree of accessibility. It is a full lexical NP on Levinson's account, therefore undistinguishable from the longer alternative. But the shorter low accessibility form is not accidentally used. One of the most important claims of accessibility theory is that accessibility comes in a rich array of degrees, and any attempt to reduce it to a binary (coreference/disjointness) distinction is doomed to fail.

Next, consider Gundel et al. (1993). A superficial look at Gundel et al.'s theory reveals an important advantage over accessibility theory. Whereas accessibility theory claims that degree of accessibility is responsible for the distribution of referring expressions, no attempt is made to specify a one-to-one cognitive correlate for each referring expression beyond the claim that a representation is supposed to be relatively more or relatively less accessible given a specific referring expression. No cognitive status is described in the absolute. Gundel et al.'s Givenness hierarchy proposes precisely that. Their theory maps mental representations referred to onto six implicationally related cognitive statuses (each status implies that the statuses to its right hold as well):

(11)
In focus > activated > familiar > uniquely identifiable > referential > type identifiable.
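
The implicational reading of (11) can be made concrete with a small sketch. The Python below is illustrative only: the list simply transcribes the six statuses in (11), and the function name `entailed_statuses` is a hypothetical helper introduced here, not part of Gundel et al.'s proposal. It shows that, on an implicational scale, assigning a status to a referent commits one to every status to its right.

```python
# Statuses of the Givenness hierarchy, ordered left to right as in (11).
GIVENNESS_HIERARCHY = [
    "in focus",
    "activated",
    "familiar",
    "uniquely identifiable",
    "referential",
    "type identifiable",
]


def entailed_statuses(status):
    """Return the given status plus every status to its right on the scale.

    This encodes only the implicational relation stated in the text:
    each status implies that the statuses to its right hold as well.
    """
    index = GIVENNESS_HIERARCHY.index(status)
    return GIVENNESS_HIERARCHY[index:]


# A referent that is 'familiar' is thereby also 'uniquely identifiable',
# 'referential' and 'type identifiable'.
print(entailed_statuses("familiar"))
```

On this reading, 'in focus' entails all five remaining statuses, including 'uniquely identifiable'.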
Unfortunately, the list of statuses specified looks suspiciously compatible with the distribution of just those referring expressions linguists have tended to focus on (i.e., some but not even all the referential forms in English + zero). Now, I agree that linguists must absolutely set their goal at explicating linguistic forms, but the result in this case is that the cognitive aspect of the explanation is severely compromised. The cognitive basis of referential forms is drastically reduced if cognitive statuses are actually defined as a disjunction of statuses. Consider the status of 'uniquely identifiable'. This status actually comprises two rather different cognitive activities: the addressee is either to retrieve an existing representation for a specific entity, or else to immediately generate such a representation. Now, I am not denying that definite descriptions (most prominently) trigger both of these cognitive processes. But are we really justified in claiming that these two are one and the same cognitively? The status of 'referential' is also a bi-cognitive status according to Gundel et al.'s definition: "the addressee must either retrieve an existing representation of the speaker's intended referent or construct a new representation" (p. 276). In fact, it is hard to see how the characterization of 'referential' differs from that of 'uniquely identifiable'.30 In addition, it is not clear how the first disjunct of 'uniquely identifiable' and of 'referential' differs from the status of 'familiar', i.e., "The addressee is able to uniquely identify the intended referent because he already has a representation of it in memory" (p. 278). Moreover, not only are cognitive statuses disjunctive; so is the relationship between referring expressions and cognitive statuses.
That, this, IT (stressed) and this N are all said to mark one and the same 'activated' status in English (Japanese has 6 expressions for this cognitive category); Russian and Spanish have two expressions for the 'familiar' status. These show a many forms-one function relationship. Mulkern (1996), using Gundel et al.'s (1993) theory, finds that partial proper names are either 'familiar' or 'activated', whereas full names are either 'uniquely identifiable' or 'familiar'. These show a
one form-many functions relationship. In other words, there is no one-to-one correspondence between forms and cognitive status in any direction. Another problem with Gundel et al.'s Givenness hierarchy is raised by Ziv (1996): pronouns ('in focus') are predicted to always be 'uniquely identifiable' according to the Givenness hierarchy (because the hierarchy is implicational), but they are definitely not always so. In Ziv's examples they are unidentified inferred role players. This seriously undermines the potential explanatory superiority of the Givenness proposal. Also, while there is psychological evidence for the scalar relationship between 'in focus' and 'activated' (see the references above), there is no psychological evidence for the scalar distinctions between the other four categories on the scale.

Finally, Gundel et al.'s (1993) theory (like the other theories discussed here) is far too restricted as to the referring expression types it recognizes. The problem of a one-to-one correlation between an absolutely defined cognitive status and each referring expression is aggravated once we take into consideration the actual rich array of expression types. For example, how would one distinguish between zeroes and pronouns in a language which uses both as very high accessibility markers (e.g., English, and even more so, Hebrew)? Both must be classified as 'in focus' markers, but they each have a distinct distributional pattern. How can we distinguish between full and cliticized pronouns? Between first/second and third person pronouns, between more and less informative definite descriptions and proper names? Between longer and shorter definite descriptions and names where length does not affect the degree of informativity (cf. the newspaper with the paper above)? I doubt that Gundel et al. can offer as many coherently defined cognitive statuses as there are distinguishable referring expression types.
The only way they (as well as Levinson, Chafe and Centering theorists, see below) can handle such different distributions is by incorporating additional explanations. Accessibility theory handles most of these distributional patterns by one and the same generalization, although it does not completely replace identifiability and contrastiveness.

Centering theory has focussed on an important factor in referential form choice: text coherence and its effect on the prominence of potential antecedents. Centering theorists (Grosz, Joshi & Weinstein 1986, 1995; Walker, Iida & Cote 1990) distinguish between antecedents as to their likelihood of becoming the focus of the next clause: topics, entities empathized with and subjects are expected to be the next clause topic more than non-topics, non-empathized entities and non-subjects respectively. These factors are themselves ordered as above with respect to the potential for future focussing. Centering theorists then correctly predict that the more prominent discourse entity will be coded by zero or pronoun (depending on the language). They also emphasize that the salience of a discourse entity is determined by a combination of syntactic, semantic and pragmatic factors. All this is of course quite compatible with accessibility theory. However, Centering theory cannot be taken as a theory about referring expressions in general. Its proponents cannot even be said to really characterize the usage of pronouns, as they purport to. Their formulation of the 'pronoun rule' (or the zero rule for languages like Japanese) is that if some entity is realized as a pronoun, then so must be the highest ranking entity. This is for the most part the discourse topic, and hence, it will indeed be coded by at least as high an accessibility marker as the less accessible discourse entities.31 But note that this formulation is far from a complete picture of anaphora (Centering theory does not even consider referential nonanaphoric cases). How does one decide whether she can refer to the other, less salient antecedent by a pronoun? And how does one decide on referring expressions other than pronouns? Note that Centering theory predictions are not violated if the highly salient discourse topic is coded by a full NP, provided the other, lower-ranking centers are too (or if they are not mentioned). This is a serious problem in view of the discourse findings presented in the literature. In fact, "the repeated name penalty", which has been presented as support for Centering theory, is not actually predicted by it, precisely because the pronoun rule is formulated in such a way that it does not rule out lexical anaphors. What are the predictions for the highest ranking entity if a lower one is coded by a demonstrative pronoun? Probably because their main interest lies in coherence relations between clauses, Centering theorists do not address these questions.
Their rule determines what the coding should be for the one, most prominent discourse entity, and even this is not stated absolutely, but rather, in comparison to other discourse entities. I suggest that what the pronoun rule boils down to is 'use a high accessibility marker for highly accessible entities, subject to what the selection of high accessibility markers is in the language'. Since some languages can distinguish between zero and pronoun and/or between cliticized and full pronouns, the most highly accessible discourse entity may actually be required to be coded by a higher accessibility marker than the next one in ranking (a stronger requirement than the Centering one). Further research is required.

The following example shows that Centering theorists focus too much on local connections. While in most cases the discourse topic is also the subject of the sentence, in this case it is not. Thus, despite the fact that the higher officer is mentioned as a subject in two
consecutive clauses (not to mention the hair), it is still the discourse topic (officer Feil) which is pronominal, and not the higher officer, in both instances:

(12) It is not as if he_i looks like a hippie really, or anything like that. ... Fell's_i grey-brown hair ... covers his_i collar from behind. But one day in '93 the officer_j in charge of him_i demanded from him_i to have a haircut ... The officer_j accused him_i ... (Haaretz 1.21.99).
In sum, all the theories discussed in Section 4 correctly predict some of the distributional patterns of referring expressions, but none, I believe, can account for the full range of data as well as accessibility theory.
5. Directions for further research
While anaphora has been extensively researched by both linguists and psycholinguists, many questions are still unresolved. I list below a series of open questions pertaining to referential forms and linguistic and psycholinguistic research. I have divided them into linguistic (5.1) and psycholinguistic (5.2) questions.
5.1 Linguistic proper questions
Kirsner (1979, 1990; Kirsner & Van Heuven 1988) has data which contradict accessibility theory predictions regarding proximate and distal demonstratives used anaphorically. Accessibility theory predicts that the demonstrative used for proximate physical pointing will also code a higher degree of accessibility when used anaphorically, as compared with the distal demonstrative. However, in Kirsner's Dutch data, it is the distal one (die) which refers to the less distant antecedents (70% of the demonstratives which had an antecedent in the same sentence were distal); it is the proximate demonstrative (deze) which refers to the more distant antecedents (89% of the demonstratives referring to an antecedent 2-3 sentences away were proximate). However, other languages pattern as predicted by accessibility theory (see Ariel 1998b; Lichtenberk 1996), and 87% of deze's do find their antecedents in the same or the previous sentence. Perhaps because proximate demonstratives are originally used for highly accessible entities which are marked in the specific role they occur in, they are reinterpreted as "greater urging that the hearer find the referent" (Kirsner 1979, p. 358), leading to their reclassification as relatively lower
accessibility markers. Further research is needed.32

Potential complications for accessibility theory are introduced by Arnold (to appear). Arnold first establishes that Mapudungun subjects code the most accessible entity of the clause. This should mean that when the subject is referred to in a subsequent clause, the anaphoric expression used should be a high accessibility marker. This is generally true, but Arnold shows that there is an additional factor at work: parallelism. The probability for an anaphoric object to be nonovert (a high accessibility marker) is higher when the antecedent is an object (i.e. representing an entity of a relatively lower degree of accessibility) than when it is a subject (coding the most highly accessible entity of the previous clause). Arnold attributes this phenomenon to the effect of parallelism. Rosen (1996) discusses a similar phenomenon. Zero subjects are interpreted not necessarily on the basis of the previous clause zero subject, but rather, on the basis of a possibly nonadjacent previous clause where the verb has a similar argument structure. Chambers and Smyth (1998) provide psycholinguistic evidence for the preference for pronouns to be coreferent with antecedents of the same structural status (subjects with subjects, and crucially, objects with objects). It remains to be seen whether parallelism is a separate factor working orthogonally to accessibility theory or whether the findings can be motivated within accessibility theory, by incorporating expected grammatical role as an accessibility factor. Du Bois (1980) had in fact argued that there is a separate tracking mechanism for objects.

Degree of accessibility is a feature that characterizes Given information.
It should then be fruitful to re-examine all those forms argued to code Given information (e.g., presuppositions, ergative nominals), and see whether it is general Givenness which determines their proper use or whether it has to be a specific degree of accessibility. For example, Du Bois (1987) found that the rate of new entities is significantly lower than the rate of lexical NPs. This gap is probably explained by the presence of Given discourse entities which are of a relatively low degree of accessibility, and hence, are coded by lexical NPs despite their Givenness. If the motivation he proposes for ergative and accusative markings is based on the lexical versus nonlexical distinction, then it is probably based on the consistently high degree of accessibility of agents versus the inconsistent degree of accessibility associated with intransitive subjects and objects, rather than on the Given-New distinction between them.33 The same applies perhaps to the pragmatic principle Du Bois proposes as underlying his "preferred argument structure", namely, "Avoid more than one new argument per clause" (p. 826). It should perhaps be replaced by "Avoid more than one
argument of a low degree of accessibility". Assuming that this is true, it remains to be seen what counts as too low a degree of accessibility.34

Gernsbacher (1989) demonstrates how different referring expressions enhance the accessibility of the mental representations associated with them. More explicit expressions (lower accessibility markers, proper names, for example) boost the activation of their mental representations faster and more than higher accessibility markers (pronouns, for example, see also Clifton & Ferreira 1987). In effect, the same accessibility marking scale reflects accessibility enhancing (and suppressing): the lower the accessibility marker used, the more enhanced the discourse entity coded by it will become (and the more suppressed other discourse entities will become). This means that the same accessibility markers code a specific current degree of accessibility (say, low), but at the same time, they contribute (at least partly) to the opposite degree of future accessibility (high). This can explain why speakers shift to lower accessibility markers from time to time, even when they continue to discuss the same discourse entity. These shifts, combined with results obtained by Sanford and Garrod (1981) and Almor (1999), point to conflicting motivations in referential expression choice: "Live for today" versus "Live for tomorrow". Sanford and Garrod's results show that using too low an accessibility marker (e.g., a definite NP when the antecedent is a repeated discourse topic) slows subjects down. Gernsbacher's results, on the other hand, show that lower accessibility markers boost future degree of accessibility.
In other words, in some cases the speaker has to choose whether she wishes to attend to her addressee's needs by choosing her accessibility marker in accordance with the current degree of accessibility (e.g., high), or by "ensuring the future", that is, ensuring that the entity at hand remains at or regains a high degree of accessibility (by choosing an accessibility marker which is relatively too low). Such competing motivations are rampant in natural language (see Du Bois 1985), and further research is called for in order to find out when it is that speakers opt for reflecting current degree of accessibility and when they opt for establishing or maintaining a high degree of accessibility for future references.

An interesting open question concerns the correlation between referentiality and degree of accessibility. It seems that some forms are not only very high accessibility markers (e.g., zeroes), they are also more compatible with nonreferential readings (see Cameron 1997; Doron 1982; Garcia 1996; Sells 1984). However, as antecedents, nonreferential entities (e.g., 'whoever', generic NPs, impersonal 'you') are on the whole less accessible, and hence should have prompted relatively lower accessibility anaphoric
expressions. I have argued against a referential/nonreferential marking dichotomy, showing that even nonreferential arguments in relative clauses are not restricted to gaps and may take resumptive pronouns (Ariel 1990, pp. 153-155). But the fact remains that the preference for gaps is stronger when relative clause heads are nonreferential, despite the relative low accessibility of nonreferential antecedents. Perhaps this is related to the future accessibility marking of NPs. Nonreferentials are typically noncataphoric, hence the avoidance of low accessibility markers.35

Grammaticization raises interesting questions too, not limited to referring expressions. How do we determine that a certain interpretative process is grammaticized, as opposed to being merely a common-sensical choice? (see also Kirsner & Van Heuven 1988). McDonald and MacWhinney (1995) show that when there is a clash between a 1st mention antecedent and an antecedent compatible with the semantics of the verb, the latter wins out. Is this a grammatical fact or only an extragrammatical strategy (because violating the latter would cause incoherence)? When do we say that a discourse pattern has become grammaticized? Is a certain statistical percentage sufficient? Do we require 100%? If we do, we will hardly be able to establish any obligatory grammatical rules. But then, if we do not impose such a high requirement, it is hard to tell the difference between the discourse profile of some form and the linguistic convention dictating its distribution (see also Ariel 1999). For example, most of the antecedents of Spanish sí are subjects. In this case, Garcia (1983) argues against this being a grammatical rule. Instead, sí is taken to demand highly accessible antecedents. But Garcia herself (as all functionalists, in fact) is not committed to "all or nothing" principles. Rather, she presents her theory as a set of principles generating discoursal preferences.
These by definition are not 100% correlations. The question then arises as to when we decide that a certain high percentage represents a formal rule and when we posit an extralinguistic generalization, which, as Garcia says, may not show a 100% correlation either. Perhaps we should after all impose a requirement for (a near) 100% correlation for grammatical principles, provided we recognize their complexity (see 1.2 above), as well as the fact that sometimes competing factors may block full compliance with the generalization.

Dahl and Fraurud (1996) and Fraurud (1996) argue that we need to recognize the importance of animacy in referential choice. In their Swedish data, pronouns retrieved some human but no nonhuman antecedents which were not in the immediately preceding sentence. All the nonhuman referents coded by pronouns had a nearby antecedent. In general, whereas over a third of
the human definite 3rd person NPs were coded by pronouns, only 8% of the nonhuman NPs were coded by pronouns. Now, is this a discourse profile, or a grammatical convention? Should we say that pronouns cannot refer to nonhuman antecedents which do not occur in the same or the previous sentence? It seems that this is true in 100% of the cases, after all. I would rather not impose such a grammatical rule in this case. Fraurud's findings can be seen as reflecting the discourse profile of pronouns with nonhuman antecedents: nonhuman entities are not as salient to us as humans are. If so, we should only expect to find that nonhumans at the same (large) distance as human antecedents are of a lower degree of accessibility. Hence the inability of pronouns to refer to them. In order to distinguish between these two options one should examine cases where nonhumans are very salient, as when they are the continuing discourse topic. If they cannot be referred to by nonimmediate pronouns even in such cases, then Fraurud's findings should be incorporated as a grammatical (semi-arbitrary) convention.

5.2 Questions pertaining to the connection between psycholinguistic research and grammar
The cognitive psychologists' findings so far seem to me to corroborate all existing theories, although they are presented as supporting either Centering theories (e.g., Gordon & Chan 1995; Kennison & Gordon 1997), Givón's (1983) topic continuity theory (Gernsbacher & Jescheniak, Ms.; Gernsbacher & Shroyer 1989) or accessibility theory (Almor 1999, in press; Arnold 1997). The reason is that the psycholinguistic findings support any theory which posits some scale of referential forms (Gundel et al. 1993 included). It would be interesting to think of psycholinguistic experiments which would test the different predictions of these different theories in order to establish whether one is possibly superior to others.36

Recall that Almor (1999) argues very forcefully that when low accessibility markers are justifiably used in high accessibility contexts, processing is not slowed down. This finding contradicts my claim that proper accessibility marking can be, and is, violated for special effects at a processing cost. It is not clear to me that Almor has actually proved that this is not the case. In order to do that, he would have to compare contextually informative low accessibility markers with pronouns when the antecedent is in focus. The comparisons he presents only compare justified versus unjustified low accessibility markers, but not high accessibility markers. I expect pronouns to take less time than
informative low accessibility markers. I would then view his current findings as showing that unjustified low accessibility markers merely slow addressees down more than justified low accessibility markers do, and not as showing that justified low accessibility markers do not slow processing down at all.

Gernsbacher and Faust (1991) explain the problem of less skilled comprehenders by reference to their less efficient suppression mechanism. Their experiments concern ambiguous words, where they find that the less skilled comprehenders have no problem making use of contextual cues for the appropriate interpretation. They also have no problem enhancing contextually appropriate information, but they find it relatively difficult to reject contextually inappropriate meanings that were generated automatically. Two questions come to mind. If suppression mechanisms are crucial for reference interpretation, then these same comprehenders are expected to also have problems in interpreting referring expressions where reliance on suppression is required (i.e., when there are competing antecedents). A more radical research goal is to look into the possibility that comprehenders may have different suppression and enhancement problems in different tasks, specifically, in reference determination versus ambiguity resolution. Gernsbacher assumes that enhancement and suppression are general cognitive skills, and indeed shows that the same less skilled comprehenders have difficulties suppressing non-verbal stimuli. Still, we should ascertain that this generalization holds across different linguistic interpretative processes as well.

Psychologists have worried in the past about the ecological validity of their laboratory experiments, namely about the applicability of their experimental findings to the natural activities of their subjects. This problem still exists, of course, but I would like to point to a related problem.
Suppose we grant that the discourses recently tested in many psycholinguistic experiments are real enough. Based on the psycholinguistic findings, we could easily establish a very rich scale of degrees of mental accessibility for concepts in various contexts. An important research goal then awaits linguists in trying to understand which of these psychologically real distinctions translates into a possible grammatical distinction (i.e., one that occurs at least in some language). Is it just the frequency of the cases in which the accessibility-related processing distinction is crucial for communicative purposes that determines that a linguistic distinction is to be instituted? If so, can we prove that this is the case? Alternatively, it could be that what is universal is not that specific "fundamental" (essential) accessibility distinctions are to be drawn by each grammar, but rather, that some accessibility distinctions be drawn. In other words, perhaps it is not so
important what precise contexts are to be declared as bearing a distinct (high/low/intermediate, etc.) degree of accessibility as it is crucial that each language should have at least a minimal number of linguistically marked accessibility distinctions, which it then maps on to various contexts in a motivated yet somewhat language-dependent way. The universal could also be some combination between these two alternatives, namely, that some specific essential distinctions are "obligatory", and others have to occur, but with no restrictions as to where they are drawn.

Some processing distinctions may simply be uncodable by (semi-)formal rules, and thus can only constitute laboratory results. Some distinctions may be mutually exclusive (obviation and logophoricity perhaps), because they are too similar. For example, Gernsbacher (1989) finds that when pronouns are relatively quite informative (when the gender distinction can distinguish between the intended antecedent and an unintended one), they suppress other discourse entities more than when they are not as informative (because the competing antecedents are of the same gender, see also MacDonald & MacWhinney 1990). No grammaticized consequence is expected in this case, however, since languages do not usually offer a choice between gender-marked versus gender-unmarked pronouns; they have one or the other option. Similarly, Gernsbacher et al. (1989) find that first mentions in conjoined (new) NPs are later more highly accessible than second mention NPs. Nonsubject initial participants are also more accessible (Gernsbacher 1991). Yet, no language is known to have grammaticized the notion of clausal first mentions. It is usually the subject that serves as a locus of grammatical conventions, e.g., the restriction of reflexives in some languages to subject antecedents.
No language, to the best of my knowledge, restricts reflexives to clausal first mentions, even though many languages allow nonsubjects in sentence initial position quite freely. It is still possible, of course, that a clausal first mention discoursal preference will be found, regardless of grammatical role. Karmiloff-Smith (1985) points out that children tend to refer to main characters by pronouns, and to marginal characters by lexical NPs. While this would be a well-motivated grammaticization path, since main characters are consistently more salient than marginal ones, I do not expect it to be an adult grammaticization path: we need to refer to main characters by lower accessibility markers sometimes (e.g. following episode boundaries) and we need to refer to temporarily highly accessible marginal characters by high accessibility markers.37

Gernsbacher's (1989) findings raise an interesting question pertaining to the nonautomatic connection between psycholinguistic and linguistic facts.
Gernsbacher's subjects first read a sentence which introduced two characters. They were then presented with a participial phrase which biased them towards one of the two potential antecedents, followed by a pronoun. Reaction time measurements revealed no difference in the accessibility of the two antecedents at the stage where the pronoun was encountered, despite the biasing adverbial. At the end of the clause, however, the degree of accessibility associated with the appropriate antecedent was higher.38 Such findings raise a question about the processing stage relevant for measuring degree of accessibility. If accessibility theory is correct, we should expect that the relevant time is the stage at which the anaphor is processed (whenever that may be). However, it is not clear that grammaticizations can be sensitive to fine-tuned accessibility fluctuations over very short spans of time, so perhaps we must not expect a perfect fit between degree of accessibility as measured by psycholinguistic experiments and degree of accessibility as it is reflected in linguistic conventions. Further research is needed to settle this question.

Last, some linguistic and psycholinguistic results that we have are actually in conflict with each other. Thus, Clark and Sengul (1979) compare the retrieval of antecedents in previous clauses of the same sentence versus across a sentence boundary. Based on subjects' reaction times, they conclude that it is the clause rather than the sentence that makes the significant difference in the processing time of anaphors. However, Clancy (1980), examining English and Japanese narratives, finds that it is the sentence rather than the clause that better accounts for the distribution of fuller versus attenuated referential forms. We need to find out whether there is one relevant unit (either the clause or the sentence), or whether under different conditions, or for different forms, one of them may be the more relevant unit (a more plausible possibility).39

In this article I have tried to describe the main claims and findings of accessibility theory, emphasizing that the notion of accessibility is complex and that it is by no means the only factor determining referential form. I corroborated the accessibility claims relying on more recent research which also further develops the theory. I have argued that accessibility theory is at least partly linguistic, despite the fact that it is well motivated cognitively, and that it accounts for the rich data better than other theories of discourse reference. Finally, I have raised several open questions regarding discourse references. I hope linguists and psycholinguists will be prompted to explore them.
Mira Ariel
Notes
1. Accessibility theory is very much in tune with many recent psycholinguistic proposals, where mental accessibility of various referents has been experimented with. In fact, these experiments (together with discourse data) form the empirical basis of accessibility theory. However, psycholinguists and linguists approach the accessibility of mental representations with different goals in mind. Whereas the psycholinguists are interested in learning about human memory, the linguists are interested in learning about natural language expressions. Hence, the psycholinguists use pronouns in order to draw conclusions about working memory, and any definite lexical NPs (definite descriptions and first names) to learn about the reinstatement process from memory. In contrast, the linguist must establish a form-function correlation for each referring expression type. The psycholinguists see anaphora in general as a coherence device, and do not pay careful attention to minute differences between different anaphoric devices. They ignore nonanaphoric referential uses. In addition, psycholinguists are interested in how the processing of anaphora is performed, in how speakers assess the degree of accessibility of mental representations to their addressees (Morton Ann Gernsbacher, personal communication). They want to define processing cues, which are different from linguistic codes (see Garnham, Oakhill & Cruttenden 1992; Garrod, Freudenthal & Boyle 1994; MacDonald & MacWhinney 1990; McDonald & MacWhinney 1995; Rinck & Bower 1995, inter alia), and to find when the accessing is performed (e.g., Cacciari, Carreiras & Cionini 1997; Garnham, Traxler, Oakhill & Gernsbacher 1996; Lucas, Tanenhaus & Carlson 1990; McDonald & MacWhinney 1995). None of these is of direct interest to the linguist. Last, many psycholinguists are committed to a dichotomy between working and long term memory, and therefore invariably compare two accessibility contexts or two referring expressions at a time. I find that unacceptable from a linguist's point of view, since the impression created is that language poses a binary decision, parallel to the short/long term memory division, where in reality, a referring expression must be selected from a large variety of referring expressions.
2. Zeroes are empty argument slots, as in 'Ø (=you) wanna go?'.
3. I thank Jack Du Bois (personal communication) for providing me with this example. All the examples in this text are taken from Du Bois (2000), unless otherwise specified.
4. See Schilperoord (1996) for an argument that degree of accessibility (resulting from the hierarchical structure of the text) determines pause lengths.
5. Note that definite descriptions count as quite low accessibility markers here. Other researchers, however, have sometimes had to say that definite descriptions refer to the most "salient" or contextually uniquely identified referents (e.g., Chafe 1994, 1996; McCawley 1979), in order to make sure that addressees interpret the expression as referring to the immediately relevant entity. See Walker and Prince (1996, ex. 1), where the guy is preferentially understood to refer to the non-topic 'guy', rather than to the topical 'guy', and ex. 13, where in a discourse about two sisters, her sister changes its reference to whoever is not the sister in focus.
6. Cliticized pronouns are shortened pronouns, e.g., 'ya'.
7. Note the following example (Jury), where the molester is first referred to as he, then, strictly speaking, he is not referred to for a few intonation units I marked with •. Still, he is later referred to again with a pronoun:
RICKIE: You know like, (H) but he was making,
• I don't know how you describe it,
• you know how you can be like a nuisance to someone?
• REBECCA: [Mhm].
• RICKIE: [Or]_ you may smell or some[thi]ng.
• REBECCA: [Yeah].
• RICKIE: you know like that you [know,
• REBECCA: [Yeah].
• RICKIE: or] moving around, you know like,
... as he wanted her to move.
Indeed, Mauner, Tanenhaus, and Carlson (1995) found that missing agents in agentless passive sentences were processed nonetheless.
8. Independently, Terken and Nooteboom (1988) found that one previous mention was not sufficient for subjects to treat an entity as Given. In fact, Maes and Noordman (1995) also argue for special functions of these second mentions (see below).
9. In addition, in order to reduce all distinctions to one binary distinction, Givón simply ignores many referential devices, e.g., names (and first, last and full names each code a different degree of accessibility), and agreement markers. He also lumps together referential forms which have different distributional patterns, e.g., zero and pronoun.
10. Nurit Assayag (personal communication) brought to my attention the following example (from her originally Hebrew conversational data), where the speaker refers to herself by too low an accessibility marker (a full pronoun rather than zero in the second mention) in order to maintain a syntactic parallelism with the preceding clause: So I began and nobody said anything. So I continued and nobody said anything.
11. But note that Bird-David's research is on naming rather than on referential forms per se. This is true for all the anthropological work on names.
12. However, Givón's examination of another novel which alternates between the perspectives of the two main characters (Cold mountain) revealed a different pattern: The zero/pronoun versus NP distribution for references to the two characters is either similar, or else, there are more full NPs for the other than for the self. At first blush, these findings seem contradictory, but actually, once the author relinquishes the narrator's role to some character, that character's consciousness is at work. Of course, normally, that entails the centrality of the self (Givón's novel), but at other times, the other is so central to the self that the other merits a higher rate of high accessibility markers (Cold mountain). This last point deserves further checking.
13. This finding contradicts Bernstein's (1970) conviction that speakers of lower classes only use the restricted code.
14. Note, however, that first mentions are more highly accessible only when all other factors are equal. When a non-first mention is marked as focus, as in wh-clefts, it is the second mention entity, coded by the focussed NP, that is more highly accessible (see Almor 1999).
15. In fact, I have already briefly argued that the accessibility markers used to access inferred entities manifest accessibility differences (Ariel 1990, pp. 184-190).
16. See Van den Broek (1990) for the importance of predictive inferences in general.
17. We can similarly establish that some NPs are marked for low or no cataphoricity. Quantified NPs, for example, are known to serve as antecedents for intra-sentential anaphora, but not for extra-sentential anaphora.
18. But first, note that referring expressions are not only measurable along one dimension. Thus, as Downing points out, classifiers also exempt speakers from marking the social status of the (human) referent. Also, as Kirsner (1990) argues, while the Dutch definite article and the distal demonstrative are sometimes interchangeable, the latter are used when the entity referred to has been distinguished from others, while the demonstratives (both the distal and the proximate) are used when the speaker needs to alert the addressee to seek out the referent. Similarly, Epstein (1998a) argues that the has functions additional to establishing reference, e.g., marking the referent as prominent. Second, expressions commonly used to refer are not always used referentially, and as such are also otherwise classified (most notably, definite descriptions, which are sometimes used attributively or generically; see Mueller-Lust & Gibbs 1991).
19. But although Cacciari et al. (1997) found that gendered anaphoric expressions speeded up interpretations even when there was no competition over antecedenthood, Garnham et al. (1992) suggest that the gender cue is not always used by subjects.
20. However, we need to also examine the informativity and length of the lexical NPs involved.
21. In fact, C. L. Baker (personal communication) agreed with me on this point.
22. I completely reject Reboul's (1997) assumption of an 'all or none' grammatical/extragrammatical status for reference interpretation. The fact that some aspects of referentiality are better accounted for by a pragmatic theory does not mean that all must be accounted for by pragmatic principles.
23. I actually believe that the 'avoid pronoun' principle is superfluous (see Ariel 1990, pp. 100-105). In this case, then, I suggest replacing a grammatical principle with a functional principle.
24. One should, however, remember to distinguish between long and short reflexives. Accessibility theory predicts that they would be used differently, and indeed they are (see Reinhart & Reuland 1993).
25. I thank Jack Du Bois for giving me the PP reflexive examples.
26. See Mithun (1996, p. 231) for a similar finding.
27. Reboul (1997) argues against accessibility theory, but in effect against all attempts to offer a linguistic theory for extrasentential referential forms. Although she herself does not propose a specific account, she believes that with Relevance theory (Sperber & Wilson 1986) "one can account for the use of referring expressions, if one considers the semantic content of such expressions and the relationship between their semantic content and their referring ability" (p. 91, emphasis added). I had in fact argued against the first part of such a proposal in Ariel (1990, pp. 83-86). I have shown that many referring expressions do not differ with respect to their semantic content, but they signal a different degree of accessibility nonetheless (e.g., it/that; name/shortened name; full pronouns/reduced pronouns/verbal person agreement markers). Degree of accessibility could be seen as the relationship between the semantics of the expression and referring ability, but it is not a transparently inferred relationship. Differences between languages which have the same referential forms (e.g., English, Hebrew and Chinese all have pronouns and zeroes, but they use them quite differently) are also left unaccounted for under an exclusively pragmatic theory.
28. In fact, Tao (1996) is the only one who claims to have different findings, where zero (in Chinese) is used to shift, rather than to maintain reference.
29. The theories also differ in scope of application. Only Levinson has argued that his principles actually replace the binding rules (and see also Garcia, 1996).
30. Gundel et al. (1993) claim that unlike 'referentials', 'uniquely identifiables' are identified based on the referring expression alone without reference to the rest of the sentence. I doubt that context is ever ignored. In any case, it is hard to know how one could check whether or not sentential (or other) context was actually used in the interpretative process.
31. In fact, Chambers and Smyth (1998) point out that Centering theory also cannot account for the acceptability of examples such as: Josh criticized Paul and then Marie insulted him, where the pronoun does not refer to the most prominent forward looking center, nor is it the subject (and topic?) of either clause. For other arguments against centering theory, see Chambers and Smyth (1998).
32. Kirsner argues that deze (+NP) codes HIGH DEIXIS, which often translates to relative low accessibility (in terms of referential distance and antecedent complexity). Note, however, that Kirsner's own attempt to incorporate the higher effort required in HIGH DEIXIS with references to entities physically near, rather than far from the speakers is unconvincing. Also, if important entities require HIGH DEIXIS, does that mean that pronouns coding continuing discourse topics are HIGH DEIXIS too? In other words, Dutch poses a puzzle as to why its proximate demonstrative marks higher accessibility for physical pointings but lower accessibility for discoursal references, when compared with the distal demonstrative (but see Piwek et al., as cited in Beun & Cremers 1998, for a different claim re the deictic usage of the proximate and distal demonstratives in Dutch). I tentatively suggest that this has to do with the markedness of the proximate demonstrative (by far the rarer form in spoken Dutch). Thus, there is a potential conflict between demonstratives (in general) and definite descriptions. In terms of accessibility coding, the demonstrative should be the shorter form, but in terms of frequency it is the definite (or the distal demonstrative) which is predicted to be the shorter form. However, once length is established via markedness (i.e., demonstratives are longer than definites) this formal difference in attenuation may affect the degree of accessibility later attributed to them.
33. If we replace Givenness with degree of accessibility, we can perhaps also explain why proper names pattern with 3rd persons in split ergative systems (both are not extremely highly accessible), rather than with 1st/2nd persons, even though they are (almost) equally Given (see Dixon 1979, p. 87).
34. In fact, Du Bois (to appear) proposes that the deeper generalization behind the distribution of agents versus intransitive subjects and objects is sensitive to low versus high processing costs. It is the highly demanding NPs which are restricted in distribution. Indeed, other things being equal, high accessibility marking entails a low processing cost because the entity is highly accessible, and low accessibility marking entails a high processing cost because the entity is not so easily retrievable. However, pragmatically motivated exceptions to accessibility theory do occur, where highly accessible entities are referred to by relatively low accessibility markers (e.g., epithets), or vice versa (less common), where entities of a relatively low degree of accessibility are referred to by high accessibility markers. Accessibility theory predicts that both cases entail a high cost of processing, and hence, they should pattern as high processing cost entities, rather than according to either their marking or their real cognitive accessibility. This hypothesis requires testing. I thank Jack Du Bois for discussing this point with me.
35. It is also possible that using high accessibility markers (usually zeroes or pronouns) promotes the dependence of the interpretation based on another linguistic marker, which is required for nonreferentials.
36. As Morton Ann Gernsbacher (personal communication) reminds me, a huge task still remains of finding psycholinguistic lab evidence for the continuum of accessibility.
37. In fact, the children tested initially referred to the protagonists with indefinite NPs (and not pronouns), and they did from time to time refer to secondary characters by pronouns. Unfortunately, Karmiloff-Smith does not provide actual numbers. Also, the opportunity to refer to secondary characters by pronouns was quite limited, since they were mentioned twice at most.
38. Unlike McDonald and MacWhinney (1995), Garnham et al. (1996) too find that relevant semantic information takes effect only at the integration stage. Cacciari et al. (1997) suggest that the different findings re when semantic information is used in reference tracking may actually point to differences between different languages.
39. The psycholinguists also have contradicting results sometimes, e.g., Garrod and Sanford (1982) versus Albrecht and Clifton (1998) re anaphoric references to a conjoined NP antecedent when the anaphor is a subject.
References
Aissen, J. (1997). On the syntax of obviation. Language, 73, 705-750.
Albrecht, J. E., & Clifton, C. Jr. (1998). Accessing singular antecedents in conjoined phrases. Memory and Cognition, 26, 599-610.
Almor, A. (1999). Noun-phrase anaphora and focus: The informational load hypothesis. Psychological Review, 106, 748-765.
Almor, A. (in press). Constraints and mechanisms in theories of anaphor processing. In M. Pickering, C. Clifton, & M. Crocker (Eds.), Architectures and mechanisms in sentence comprehension. Cambridge: Cambridge University Press.
Ariel, M. (1985a). Givenness marking. PhD thesis, Tel-Aviv University.
Ariel, M. (1985b). The discourse functions of Given information. Theoretical Linguistics, 12, 99-113.
Ariel, M. (1987). The grammaticalization of accessibility. Ms. Tel Aviv University (A much abridged version of this appeared as Part II of Ariel 1990).
Ariel, M. (1988a). Referring and accessibility. Journal of Linguistics, 24, 65-87.
Ariel, M. (1988b). Retrieving propositions from context: Why and how. Journal of Pragmatics, 12, 567-600.
Ariel, M. (1990). Accessing Noun Phrase antecedents. London: Routledge.
Ariel, M. (1994). Interpreting anaphoric expressions: A cognitive versus a pragmatic approach. Journal of Linguistics, 30, 3-42.
Ariel, M. (1996). Referring expressions and the +/- coreference distinction. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 13-35). Amsterdam: John Benjamins.
Ariel, M. (1998a). Three grammaticalization paths for the development of person verbal agreement in Hebrew. In Jean-Pierre Koenig (Ed.), Discourse and cognition: Bridging the gap (pp. 93-112). Stanford: CSLI/Cambridge University Press.
Ariel, M. (1998b). The linguistic status of the 'here and now'. Cognitive linguistics, 9, 189-237.
Ariel, M. (1998c). Mapping so-called 'pragmatic' phenomena according to a 'linguistic-extralinguistic' distinction: The case of propositions marked 'accessible'. In M. Darnell, E. Moravcsik, F. Newmeyer, M. Noonan, & K. Wheatley (Eds.), Functionalism and formalism in linguistics. Amsterdam: John Benjamins.
Ariel, M. (1999). Cognitive universals and linguistic conventions: The case of resumptive pronouns. Studies in language, 23, 217-264.
Ariel, M. Ms. Linguistic pragmatics.
Ariel, M. (2000). The development of person agreement markers: From pronouns to higher accessibility markers. In M. Barlow and S. Kemmer (Eds.), Usage-based models of language (pp. 197-260). Stanford: CSLI.
Arnold, J. (1997). What is salience?: The role of topic and focus in processing reference. Unpublished Ms. Stanford University.
Arnold, J. (to appear). Multiple constraints on choice of reference: Null, pronominal, and overt reference in Mapudungun. In J. W. Du Bois, L. Kumpf & W. Ashby (Eds.), Preferred argument structure: Grammar as architecture for function. Amsterdam: John Benjamins.
Bach, K. (1998). A review of Thorstein Fretheim and Jeanette K. Gundel (Eds.) Reference and referent accessibility. Pragmatics and cognition, 6, 335-57.
Baker, C. L. (1995). Contrast, discourse prominence, and intensification, with special reference to locally free reflexives in British English. Language, 71, 63-101.
Beeman, M., & Gernsbacher, M. A. Structure building and coherence inferencing during comprehension. University of Oregon manuscript.
Bernstein, B. A. (1970). A sociolinguistic approach to socialization: With some reference to educability. In F. Williams (Ed.), Language and poverty. Chicago: Markham.
Beun, R. J., & Cremers, A. H. M. (1998). Object reference in a shared domain of conversation. Pragmatics and cognition, 6, 21-52.
Blackwell, S. E. (2000). Anaphora interpretations in Spanish utterances and the neo-Gricean pragmatic theory. Journal of Pragmatics, 32, 289-424.
Bouchard, D. (1983). The avoid pronoun principle and the elsewhere principle. In P. Sells, & C. Jones (Eds.), NELS 13. University of Massachusetts, Amherst.
Bouchard, D. (1985). The binding theory and the notion of Accessible SUBJECT. Linguistic inquiry, 16, 117-33.
Brennan, S. E. (1995). Centering attention in discourse. Language and cognitive processes, 10, 137-67.
Brizuela, M. (1997). The selection of definite expressions in Spanish: A note on the notion of processing effort. Paper presented at the International cognitive linguistics conference at Amsterdam.
Broek, P. van den (1990). The causal inference marker: Toward a process model of inference generation in text comprehension. In D. A. Balota, G. B. Flores d'Arcais, & K. Rayner (Eds.), Comprehension processes in reading (pp. 423-445). Hillsdale, NJ: Lawrence Erlbaum.
Cacciari, C., Carreiras, M., & Cionini, C. (1997). When words have two genders: Anaphor resolution for Italian functionally ambiguous words. Journal of memory and language, 37, 517-32.
Cameron, R. (1997). Accessibility theory in a variable syntax of Spanish. Journal of pragmatics, 28, 29-67.
Chafe, W. L. (1976). Givenness, contrastiveness, definiteness, subjects, topics and point of view. In C. N. Li (Ed.), Subject and topic (pp. 25-55). New York: Academic Press.
Chafe, W. L. (1994). Discourse, consciousness, and time. Chicago: The University of Chicago Press.
Chafe, W. L. (1996). Inferring identifiability and accessibility. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 37-46). Amsterdam: John Benjamins.
Chambers, C. G., & Smyth, R. (1998). Structural parallelism and discourse coherence: A test of centering theory. Journal of memory and language, 39, 593-608.
Clancy, P. M. (1980). Referential choice in English and Japanese narrative discourse. In W. L. Chafe (Ed.), The pear stories: cognitive, cultural, and linguistic aspects of narrative production (pp. 127-202). Norwood, NJ: Ablex.
Clark, H. H., & Sengul, C. J. (1979). In search for referents for nouns and pronouns. Memory and cognition, 7, 35-41.
Clark, H. H., & Marshall, C. (1981). Definite reference and mutual knowledge. In A. K. Joshi, B. L. Webber, & I. A. Sag (Eds.), Elements of discourse understanding (pp. 10-63). Cambridge: Cambridge University Press.
Clifton, C., & Ferreira, F. (1987). Discourse structures and anaphora: Some experimental results. In M. Coltheart (Ed.), Attention and performance XII (pp. 635-653). Hove, East Sussex: Lawrence Erlbaum.
Comrie, B. (1983). Form and function in explaining language universals. Linguistics, 21, 87-103.
Comrie, B. (1988a). Topics, grammaticalized topics, and subjects. BLS, 14, 265-79.
Comrie, B. (1988b). Coreference and conjunction reduction in grammar and discourse. In J. A. Hawkins (Ed.), Explaining language universals (pp. 186-208). Oxford: Blackwell.
Comrie, B. (1994). Coreference: Between grammar and discourse. Proceedings of the eighteenth annual meeting of the Kansai Linguistic society (1993), 1-10.
Dahl, Ö., & Fraurud, K. (1996). Animacy in grammar and discourse. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 47-64). Amsterdam: John Benjamins.
Dolman, R. (1998). Comparing referential forms in narratives of children aged 5-6 from low and high socioeconomic classes. A seminar paper, Tel Aviv University.
Doron, E. (1982). On the syntax and semantics of resumptive pronouns. Texas linguistic forum, 19, 1-48.
Downing, P. A. (1986). The anaphoric use of classifiers in Japanese. In C. Craig (Ed.), Noun classes and categorization (pp. 345-375). Amsterdam: John Benjamins.
Downing, P. A. (1996). Proper names as a referential option in English conversation. In B. Fox (Ed.), Studies in anaphora (pp. 95-143). Amsterdam: John Benjamins.
Du Bois, J. W. (1980). Beyond definiteness: The trace of identity in discourse. In W. L. Chafe (Ed.), The pear stories: cognitive, cultural, and linguistic aspects of narrative production (pp. 203-274). Norwood, NJ: Ablex.
Du Bois, J. W. (1985). Competing motivations. In J. Haiman (Ed.), Iconicity in syntax (pp. 343-365). Amsterdam: John Benjamins.
Du Bois, J. W. (1987). The discourse basis of ergativity. Language, 63, 805-855.
Du Bois, J. W. (1991). Definiteness, reference, and analogues. Paper presented at the cognitive linguistics symposium, UC San Diego.
Du Bois, J. W. (2000). Santa Barbara Corpus of Spoken American English. 3 CD-ROMs. Linguistic Data Consortium, University of Pennsylvania.
Du Bois, J. W. (to appear). Discourse and grammar. In M. Tomasello (Ed.), The new psychology of language: Cognitive and functional approaches to language structure, Vol. 2. Erlbaum.
Du Bois, J. W., & Thompson, S. A. (1991). Dimensions of theory of information flow. Unpublished manuscript, University of California, Santa Barbara.
Epstein, R. (1998a). Reference and definite referring expressions. Pragmatics and cognition, 6, 189-207.
Epstein, R. (1998b). Definiteness and the construction of discourse referents. Unpublished MS. Rutgers University-Camden.
Faltz, L. M. (1977). Reflexivization: A study in universal syntax. PhD thesis, University of California, Berkeley.
Farmer, A., & Harnish, H. (1987). Communicative reference with pronouns. In J. Verschueren & M. Bertuccelli-Papi (Eds.), The pragmatic perspective: Proceedings of the international pragmatics conference (pp. 547-566). Amsterdam: John Benjamins.
Fowler, C. A., Levy, E. T., & Brown, J. M. (1997). Journal of memory and language, 37, 24-40.
Fraurud, K. (1996). Cognitive ontology and NP form. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 65-87). Amsterdam: John Benjamins.
Garcia, E. C. (1983). Context dependence of language and linguistic analysis. In F. Klein-Andreu (Ed.), Discourse perspectives on syntax (pp. 181-207). New York: Academic Press.
Garcia, E. C. (1996). What "reflexivity" is really like. Linguistics, 34, 1-51.
Garnham, A., Oakhill, J., & Cruttenden, H. (1992). The role of implicit causality and gender cue in the interpretation of pronouns. Language and cognitive processes, 7, 231-255.
Garnham, A., Oakhill, J., & Ehrlich, M. F. (1995). Representations and processes in the interpretation of pronouns: New evidence from Spanish and French. Journal of memory and language, 34, 41-62.
Garnham, A., Traxler, M., Oakhill, J., & Gernsbacher, M. A. (1996). The locus of implicit causality effects in comprehension. Journal of memory and language, 35, 517-543.
Garrod, S., Freudenthal, D., & Boyle, E. (1994). The role of different types of anaphor in the on-line resolution of sentences in a discourse. Journal of memory and language, 33, 39-68.
Garrod, S., & Sanford, A. J. (1982). The mental representation of discourse in a focused memory system: Implications for the interpretation of anaphoric noun phrases. Journal of semantics, 1, 21-41.
Gernsbacher, M. A. (1989). Mechanisms that improve referential access. Cognition, 32, 99-156.
Gemsbacher, M. A. ( 1990 ). Language comprehension as structure building. Hillsdale, NJ: Erlbaum. Gemsbacher, M. A. ( 1991 ). Cognitive processes and mechanisms on language comprehension: The structure building framework. The psychology oflearning and motivation, 27, 217-263. Gernsbacher, M.. A., & Faust, M. E. (1991). The mechanism ofsuppression: A component of general comprehension skill. Journal of experimental psychology: Learning, memory and cognition, 17, 245-262. Gemsbacher, M.A., Hargreaves, D., & Beeman, M. ( 1989). Building and accessing clausal representations: The advantage of first mention versus the advantage of clause recency. Journal ofmemory and language, 28, 735-755. Gemsbacher, M.A., & Jescheniak, J. D. Ms. Cataphoric devices in spoken discourse. Gemsbacher, M.A., & Shroyer, S. (1989). The cataphoric use of indefinite this in spoken narratives. Memory and cognition, 17, 536-540. Giora, R, & Lee, C. (1996). Written discourse segmentation: The function of unstressed pronouns in Mandarin Chinese. InT. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 113-140). Amsterdam: John Benjamins. Giv6n, T. (1983). Topic continuity in discourse: An introduction. InT. Giv6n (Ed.),
Accessibility theory
Topic continuity in Discourse: A Quantitative Cross-Language Study (pp.1-42). Amsterdam: John Benjamins. Giv6n, T. ( 1992). The grammar of referential coherence as mental processing instructions. Linguistics, 30, 5-55. Giv6n, T. ( 1998). The usual suspects: The grammar of perspective in narrative fiction. U Diversity of Oregon Institute of Cognitive and decision sciences: Technical report no.9H6. Glenberg, A. M., & Kruley, P. ( 1992). Picture and anaphora: Evidence for independent processes. Memory and cognition, 20,461-471. Gordon, P. C., & Davina, D. (1995). Pronouns, passives, and discourse coherence.
Journal of memory and language, 34, 21~231. Gordon, P. C., Grosz, B. J., & Gilliom, L. ( 1993 ). Pronouns, names, and the centering of attention in discourse. Cognitive Science, 17,311-348. Gordon, P. C., & Scearce, K. A. (1995). Pronominalization and discourse coherence, discourse structure and pronoun interpretation. Memory and cognition, 23, 313323. Greene, S. 8., Gerrig, R. J., McKoon, G., & Ratcliff, R. (1994). Unheralded pronouns and management by common ground. Journal of memory and language, 33, 511526. Grosz, B. J. (1981). Focusing and description in natural language dialogues. In A. K. Joshi, B. L. Webber, & I. A. Sag ( Eds. ), Elements ofdiscourse understanding (pp. 84105 ). Cambridge: Cambridge University Press. Grosz, B. J., Joshi, A., & Weinstein, S. (1986). Towards a computational theory of discourse interpretation. Unpublished Ms. Grosz, B. J., Joshi, A., & Weinstein, S. (1995). Centering: A framework for modeling the local coherence ofdiscourse. IRCS Report 95-01. The Institute for research in cognitive science, University of Pennsylvania. Gundel,}. K., Hedberg, N., & Zacharski, R. (1993). Cognitive status and the form of referring expressions in discourse. Language, 69, 27 4-307. Gundel, J. K., & Mulkern, A. E. ( 1998). Quantity implicatures in reference understanding. Pragmatics and cognition, 6, 21-45. Hakulinen, A. (1987). Avoiding personal reference in Finnish. In J. Verschueren & M. Bertuccelli-Papi (Eds ), The pragmatic perspective: Proceedings of the international pragmatics conference (pp. 141-153). Amsterdam: John Benjamins. Halmari, H. (1996). On accessibility and coreference. InT. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 155-178). Amsterdam: John Benjamins. HeUer, D. (1998 ). The demonstrative set-A second pronominal copula in Hebrew. Unpublished Ms. Tel Aviv University. Hermon, G. (1985). Syntactic modularity. Dordrecht: Foris. Hinds, J. (1983). Topic continuity in Japanese. ln T. 
Giv6n (Ed.), Topic continuity in Discourse: A Quantitative Cross-Language Study (pp.43-95). Amsterdam: John Benjarnins. Hoek, K. van (1995). Conceptual reference points: A cognitive grammar account of
83
84
Mira Ariel
pronominal anaphora constraints. Language, 71, 310-40.
Huang, Y. (1994). The syntax and pragmatics of anaphora. Cambridge: Cambridge University Press.
Hyman, L. M., & Comrie, B. (1981). Logophoric reference in Gokana. Journal of African languages and linguistics, 3, 19-37.
Jucker, A. H. (1996). News actor labelling in British newspapers. Text, 16, 373-90.
Karmiloff-Smith, A. (1985). Language and cognitive processes from a developmental perspective. Language and cognitive processes, 1, 61-85.
Keenan, E. L. (1994). Creating anaphors: An historical study of the English reflexive pronouns. Unpublished MS, UCLA.
Kempson, R. M. (1984). Weak crossover, logical form and pragmatics. Paper delivered at GLOW, April 1984.
Keysar, B., Barr, D. J., & Balin, J. A. (1998). Definite reference and mutual knowledge: Process models of common ground in comprehension. Journal of memory and language, 39, 1-20.
Khan, G. (1999). A grammar of neo-Aramaic: The dialect of the Jews in Arbel. Leiden: Brill.
Kibrik, A. A. (1996). Anaphora in Russian narrative prose: A cognitive calculative account. In B. Fox (Ed.), Studies in anaphora (pp. 255-303). Amsterdam: John Benjamins.
Kirsner, R. S. (1979). Deixis in discourse: An exploratory quantitative study of the modern Dutch demonstrative adjectives. In T. Givón (Ed.), Discourse and syntax (pp. 355-375). New York: Academic Press.
Kirsner, R. S. (1990). Grappling with the ill-defined: Problems of theory and data in synchronic grammatical description. In R. Amacker, & R. Engler (Eds.), Présence de Saussure: Actes du Colloque Internationale de Genève (21-23 Mars 1988) (pp. 187-201). Publications du Cercle Ferdinand de Saussure 1. Geneva: Librairie Droz.
Kirsner, R. S., & Heuven, V. J. van (1988). The significance of demonstrative position in modern Dutch. Lingua, 76, 209-48.
Kronrod, A., & Engel, O. (2001). Accessibility theory and referring expressions in newspaper headlines. Journal of Pragmatics, 33, 683-699.
Kuno, S. (1987). Functional syntax. Chicago: The Chicago University Press.
Levinson, S. C. (1987). Pragmatics and the grammar of anaphora: A partial pragmatic reduction of binding and control phenomena. Journal of linguistics, 23, 379-434.
Levinson, S. C. (1991). Pragmatic reduction of the binding conditions revisited. Journal of linguistics, 27, 107-161.
Lichtenberk, F. (1996). Patterns of anaphora in To'aba'ita narrative discourse. In B. Fox (Ed.), Studies in anaphora (pp. 379-411). Amsterdam: John Benjamins.
Linde, C. (1979). Focus of attention and the choice of pronoun in discourse. In T. Givón (Ed.), Syntax and semantics 12: Discourse and syntax (pp. 337-354). New York: Academic Press.
Lucas, M. M., Tanenhaus, M., & Carlson, G. N. (1990). Levels of representation in the interpretation of anaphoric reference and instrument inference. Memory and cognition, 18, 611-631.
MacDonald, M., & MacWhinney, B. (1990). Measuring inhibition and facilitation from pronouns. Journal of memory and language, 29, 469-92.
Maes, A. A., & Noordman, L. G. M. (1995). Demonstrative nominal anaphors: A case of nonidentificational markedness. Linguistics, 33, 255-282.
Marslen-Wilson, W., Levy, E., & Komisarjevsky Tyler, L. (1982). Producing interpretable discourse: The establishment and maintenance of reference. In R. J. Jarvella & W. Klein (Eds.), Speech, place and action (pp. 339-380). Chichester: John Wiley and Sons.
Matsui, T. (1998). Pragmatic criteria for reference assignment: A relevance-theoretic account of the acceptability of bridging. Pragmatics and cognition, 6, 47-97.
Mauner, G., Tanenhaus, M. K., & Carlson, G. N. (1995). Implicit arguments in sentence processing. Journal of memory and language, 34, 357-382.
McCawley, J. D. (1979). Presupposition and discourse structure. In C. K. Oh, & D. A. Dinneen (Eds.), Syntax and semantics vol. 11: Presupposition (pp. 371-388). New York: Academic Press.
McEnery, T., & Thomas, J. (1992). The pragmatic basis of automated pronoun resolution. A paper presented at the 4th IPrA conference, Kobe, Japan.
McDonald, J. L., & MacWhinney, B. (1995). The time course of anaphor resolution: Effects of implicit verb causality and gender. Journal of memory and language, 34, 543-566.
McKoon, G., Ward, G., Ratcliff, R., & Sproat, R. (1993). Morphosyntactic and pragmatic factors affecting the accessibility of discourse entities. Journal of memory and language, 32, 56-75.
Mehudar, O. (1996). The phenomenon of obviation in third person pronouns in Indian languages in light of accessibility theory. A Tel Aviv University seminar paper.
Mithun, M. (1996). Prosodic cues to accessibility. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 223-233). Amsterdam: John Benjamins.
Montgomery, M. (1989). Choosing between that and it. In R. W. Fasold, & D. Schiffrin (Eds.), Language change and variation (pp. 241-254). Amsterdam: John Benjamins.
Mueller-Lust, R. A. G., & Gibbs, R. W. (1991). Inferring the interpretation of attributive and referential definite descriptions. Discourse Processes, 14, 107-31.
Mulkern, A. E. (1996). The game of the name. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 235-250). Amsterdam: John Benjamins.
Oakhill, J., Garnham, A., Gernsbacher, M. A., & Cain, K. (1992). How natural are conceptual anaphors? Language and cognitive processes, 7, 257-80.
O'Brien, E. J., & Albrecht, J. E. (1991). The role of context in accessing antecedents in text. Journal of experimental psychology: Learning, memory and cognition, 17, 94-102.
Onishi, K. H., & Murphy, G. L. (1993). Metaphoric reference: When metaphors are not understood as easily as literal expressions. Memory and cognition, 21, 763-772.
Paterson, K. B., Sanford, A. J., Moxey, L. M., & Dawydiak, Eugene (1998). Quantifier polarity and referential focus during reading. Journal of memory and language, 39, 290-306.
Reboul, A. (1997). What (if anything) is accessibility? A relevance-oriented criticism of Ariel's accessibility theory of referring expressions. In J. H. Connolly, R. M. Vismans, C. S. Butler, & R. A. Gatward (Eds.), Discourse and pragmatics in functional grammar (pp. 91-108). Berlin: Mouton de Gruyter.
Reinhart, T. (1983). Anaphora and semantic interpretation. Chicago: Chicago University Press.
Reinhart, T., & Reuland, E. (1993). Reflexivity. Linguistic inquiry, 24, 657-720.
Rieder, G., & Mulokandov, N. (1998). The cliticization and deletion of first person pronouns in Hebrew. A seminar paper. Tel Aviv University.
Rinck, M., & Bower, G. H. (1995). Anaphora resolution and the focus of attention in situation models. Journal of memory and language, 34, 110-131.
Rosén, V. (1996). The interpretation of empty pronouns in Vietnamese. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 251-261). Amsterdam: John Benjamins.
Saadi, I. (1997). Untitled seminar paper. Tel Aviv University.
Sanford, A. J., & Garrod, S. C. (1981). Understanding written language. Chichester: John Wiley and Sons.
Sanford, A. J., Moore, K., & Garrod, S. C. (1988). Proper names as controllers of discourse focus. Language and Speech, 31, 43-46.
Sanford, A. J., & Moxey, L. M. (1995). Notes on plural reference and the scenario-mapping principle in comprehension. In G. Rickheit, & C. Habel (Eds.), Focus and coherence in discourse processing (pp. 18-43). Berlin: Walter de Gruyter.
Sanford, A. J., Moxey, L. M., & Paterson, K. B. (1996). Attentional focusing with quantifiers in production and comprehension. Memory and cognition, 24, 144-155.
Schiffman, R. J. (1984). The two nominal anaphors it and that. In J. Drogo, V. Mishra, & D. Testen (Eds.), Papers from the twentieth regional meeting of the Chicago Linguistic Society (pp. 344-357). Chicago: Chicago Linguistic Society.
Schilperoord, J. (1996). It's about time: Temporal aspects of cognitive processes in text production. Amsterdam/Atlanta: Rodopi.
Sells, P. (1984). Syntax and semantics of resumptive pronouns. PhD thesis, University of Massachusetts.
Sperber, D., & Wilson, D. (1986). Relevance. Oxford: Blackwell.
Sproat, R., & Ward, G. (1987). Pragmatic considerations in anaphoric island phenomena. CLS, 23, 321-35.
Stebbins, T. (1997). Asymmetrical nominal number marking: A functional account. Sprachtypologie und Universalienforschung, 50, 5-47.
Tao, L. (1996). Topic discontinuity and zero anaphora in Chinese discourse: Cognitive strategies in discourse processing. In B. Fox (Ed.), Studies in anaphora (pp. 487-513). Amsterdam: John Benjamins.
Terken, J., & Nooteboom, S. J. (1988). Opposite effects of accentuation on verification latencies for Given and New information. Language and cognitive processes, 2, 145-163.
Tomlin, R. S. (1987). Linguistic reflections of cognitive events. In R. Tomlin (Ed.), Coherence and grounding in discourse (pp. 455-480). Amsterdam: John Benjamins.
Toole, J. (1992). Local or global: An investigation of the effect of genre on referential choice. An MA thesis submitted to Monash University.
Toole, J. (1996). The effect of genre on referential choice. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 263-290). Amsterdam: John Benjamins.
Vonk, W., Lettica, G., Hustinx, M., & Simons, W. (1992). The use of referential expressions in structuring discourse. Language and cognitive processes, 7, 301-333.
Walker, M. A., Iida, M., & Cote, S. (1990). Centering in Japanese discourse. In COLING 90: Proceedings of the 13th international conference on computational linguistics, 251-261.
Walker, M. A., & Prince, E. F. (1996). A bilateral approach to Givenness: A hearer-status algorithm and a centering algorithm. In T. Fretheim, & J. K. Gundel (Eds.), Reference and referent accessibility (pp. 291-306). Amsterdam: John Benjamins.
Ward, G., Sproat, R., & McKoon, G. (1991). A pragmatic analysis of so-called anaphoric islands. Language, 67, 439-74.
Webber, B. L. (1991). Structure and ostension in the interpretation of deixis. Language and cognitive processes, 6, 107-35.
Ziv, Y. (1994). Left and right dislocations: Discourse functions and anaphora. Journal of pragmatics, 22, 629-645.
Ziv, Y. (1996). Pronominal reference to inferred antecedents. The Belgian journal of linguistics, 10, 55-67.
Zribi-Hertz, A. (1980). Coréférence et pronoms réfléchis: Note sur le contraste lui/lui-même en français. Linguisticae Investigationes, 4, 131-179.
Zribi-Hertz, A. (1989). Anaphor binding and narrative point of view: English reflexive pronouns in sentence and discourse. Language, 65, 695-727.
CHAPTER 3
The influence of text cues on the allocation of attention during reading
Michelle L. Gaddy*
Paul van den Broek
Yung-Chi Sung
University of Minnesota
When attempting to comprehend text, readers must devise ways to condense the wealth of information that is presented to them. Limitations on working memory or attentional resources make this task particularly important, for without the ability to selectively attend to particular textual information, readers would be faced with the burden of trying to retain every piece of information that is presented in text throughout the reading process. Successful comprehension of text therefore involves a reader's ability to appropriately allocate his or her attention to the most important aspects of a text, and at the right times, during reading. The purpose of this chapter is to describe how linguistic, typographical, and text-structure cues guide readers' attention during the process of reading. Linguistic cues are verbal devices that "include short signal phrases that pave the way for important items of information" (Glynn, Britton & Tillman 1982, p. 196). Examples of linguistic cues include function and relevance indicators, anaphors, and cataphors. Typographical and text-structure cues are devices that are used by authors to help readers determine the main points of a text (Leon & Carreteras 1992; Lorch 1989). Examples of typographical cues are boldface type, italic font, and underlining. Examples of text-structure cues are titles and headings. In this chapter we illustrate how linguistic, typographical, and text-structure cues direct the allocation of attention during reading and how, in doing so, they affect the comprehension process as well as its outcome, the mental representation of the text as a whole. In the first section we provide a brief overview of the interplay between attention, comprehension processes, and the construction of a memory representation. In the second section we
discuss the role of the various types of text cues and review relevant findings in cognitive studies of discourse comprehension.
1. The role of attention allocation in reading: The Landscape Model
Cognitive research on reading has made clear the central role that attention plays in the comprehension process (e.g., Hidi 1995; Just & Carpenter 1980, 1992; Kintsch & Van Dijk 1978 ). A reader's allocation of attention to particular elements in a text has a direct impact on the way in which the text is interpreted, as is evident in recent theoretical models of reading (Goldman & Varma 1995; Kintsch 1988; Langston & Trabasso 1998; Van den Broek, Risden, Fletcher & Thurlow 1996; Van den Broek, Young, Tzeng & Linderholm 1998). The importance of attention allocation, and hence of the role that text cues play in reading, is best appreciated when one considers in some detail the cognitive processes that occur during reading. We will use the Landscape model (Van den Broek et al. 1996, 1998) to lay out the most recent theoretical insights into these processes. In the Landscape model, as well as in other recent models, reading is conceptualized as a cyclical process. With each consecutive reading cycle a new text segment and its constituent concepts are processed: 1 New textual information enters the reader's working memory or attentional buffer (i.e., is "activated") and, because working memory capacity is severely limited (e.g., Just & Carpenter 1992), information that was in working memory during the preceding cycle is at least in part erased. Moreover, in the course of processing and interpreting the new information the reader may activate background knowledge or reactivate text information from earlier reading cycles. This results in further competition for the limited attentional resources. Thus, during each reading cycle four potential sources determine the activation values of concepts: the text that is currently being processed, the immediately preceding reading cycle, reading cycles that occurred even earlier than the immediately preceding cycle, and the readers' own background knowledge (Van den Broek et al. 1996, 1998 ). 
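The four sources of activation just listed can be made concrete with a minimal sketch of a single reading cycle. This is an illustration under stated assumptions, not the published implementation of the Landscape model: the function name `run_cycle`, the carry-over rate, the activation level for reinstated and background concepts, and the capacity limit are all hypothetical values chosen only to show the competition for limited resources.

```python
def run_cycle(previous, text_concepts, reinstated=(), background=(),
              carry_over=0.5, secondary_level=0.8, capacity=3.0):
    """Return concept activations for one reading cycle.

    All numeric parameters are illustrative assumptions, not values
    taken from the Landscape model literature.
    """
    activation = {}
    # Source: the immediately preceding cycle, carried over in attenuated form.
    for concept, value in previous.items():
        activation[concept] = value * carry_over
    # Sources: reinstatement from earlier cycles and background knowledge.
    for concept in list(reinstated) + list(background):
        activation[concept] = max(activation.get(concept, 0.0), secondary_level)
    # Source: the text segment that is currently being processed.
    for concept in text_concepts:
        activation[concept] = 1.0
    # Limited capacity: if the total exceeds the pool, scale everything down.
    total = sum(activation.values())
    if total > capacity:
        scale = capacity / total
        activation = {c: v * scale for c, v in activation.items()}
    return activation


cycle1 = run_cycle({}, ["songbirds", "decline"])
cycle2 = run_cycle(cycle1, ["forest", "fragmentation"], background=["habitat"])
```

In this sketch the concepts from the first cycle remain active in the second cycle, but at a reduced level, because the new text concepts and the background concept compete with them for the same limited pool.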
As we will discuss below, various factors, including text cues, influence which of these sources will contribute in a particular cycle. Over the course of reading, individual concepts fluctuate in their activation as the reader proceeds from cycle to cycle: Some concepts come into the focus of attention, others fade, and yet others remain in working memory but fall and rise in the level of their activation.2 Together, the fluctuations in activations of all concepts form a landscape. Figure 1 provides an example of
Table 1. Example expository text fragment (numbers indicate processing cycles)
Why American Songbirds are Vanishing (1)
The steep declines in waterfowl, shoreline birds, and grassland birds over the past several decades generally are well understood (2). What is not as obvious is why forest-dwelling migratory songbirds also are vanishing - especially the so-called Neotropical migrants that breed in northern latitudes but migrate to winter homes in the tropics (3). As decreases in their populations accumulated (4), it was widely noted that the missing species could still be found in large continuous tracts of forests but not in isolated tracts (5). This observation was dubbed the forest fragmentation effect (6). What possible explanations might be given for the forest fragmentation effect (7)? One simple hypothesis to explain the effect is that they generally prefer larger forest plots as nesting sites and so avoid isolated plots because they tend to be small (8). It is important to note that this hypothesis predicts that the density of these birds' nests in a forest will increase as the size of the forest increases (9). However, when they set about documenting the presence or absence of songbird species in forest fragments of different size (10), researchers obtained mixed and - sometimes - contradictory results (11).
(Adapted from Scientific American)
such a landscape, based on the text in Table 1. The horizontal dimensions reflect the reading cycles and the major concepts relevant to the text, respectively, whereas the vertical dimension represents the degree of activation of concepts in each cycle. Activation levels are relative, so any numerical scale could be used. Here, we use a scale of '0' (no activation) to '8' (high activation). Later (also see footnote 3) we will describe the particular patterns of activation in this example, but for now the important point is that concepts fluctuate in their activation across reading cycles. The patterns of activations across reading cycles form the basis for the memory representation that the reader constructs of the text. At each cycle in Figure 1, a cross-section of the landscape shows the different concepts that are activated, to various degrees, in that cycle. When two concepts are activated simultaneously, a connection is built between them in the reader's episodic memory for the text. The strength of this connection is a function of the amount of activation of each of its constituent concepts: The more strongly each is activated, the stronger the ensuing connection. Furthermore, if two concepts repeatedly are co-activated the memory connection between them is updated with each recurrence, again in proportion to the activations of the two
Figure 1. Landscape of activations for the Songbird passage (axes: reading cycles and activation)
concepts. In each cycle, multiple connections are created or updated. In this fashion, the reading process results in the gradual building of a representation of the text. The evolving representation, in turn, affects subsequent reading cycles: Concepts that are activated in the current cycle elicit partial activation of other concepts that have become associated to them in the reader's current memory representation or background knowledge. As a consequence of this process, called cohort activation, the incoming information in each cycle is interpreted against the backdrop of the interpretations of the information in preceding cycles (Van den Broek et al. 1998).3 This overview of the cognitive processes that take place during reading makes clear the central importance of attention allocation. The dynamic distribution of attention over the course of reading determines which concepts enter into the comprehension process and how they connect to each other. A change
in the attentional distribution at any point during reading reverberates throughout the comprehension process and the outcome of this process, the memory representation of the text. Many aspects of the Landscape depiction of the reading process are based on prior research by numerous investigators of discourse processing (for reviews, see Gernsbacher 1994; Balota, Flores d'Arcais & Rayner 1990). However, in the majority of these earlier studies the focus has been on one or two aspects at a time, ignoring or holding constant the others. The Landscape model extends these earlier conceptualizations by including additional components, amongst them notions from models of memory representation and access (e.g., connectionist models such as McClelland & Rumelhart 1988), and by simultaneously capturing all components and their interactions. The depiction of the reading process in all its dynamic detail leads to a considerable improvement in explanatory power and to an increased ability to predict human behavior (Goldman & Varma 1995; Kintsch 1988; Langston & Trabasso 1998; Van den Broek et al. 1996, 1998). At its most general level the model provides a conceptual framework of the reading process, which allows one to understand the complex interplay of factors and to construct well-grounded hypotheses about the impact of a specific factor. At a more detailed level, the model can be used to test empirically a specific theory or hypothesis, by implementing the theory or hypothesis and by comparing the predicted landscape of fluctuating activations and accompanying memory representation to empirical data. A good example is the investigation of the various possible sources of activation. Most investigators agree that concepts in the current reading cycle and, to a lesser extent, from the preceding cycle influence the activation of concepts.
There is less agreement, however, about the contribution of the other two potential sources, reinstatement of concepts from prior cycles and the recruitment of background knowledge (e.g., Graesser, Singer & Trabasso 1994; McKoon & Ratcliff 1992; cf. Van den Broek, Fletcher & Risden 1993). Under what circumstances are these sources accessed, if at all? Competing views about source access during reading of narratives have been implemented in the Landscape model and their predictions for on-line activation and for the eventual memory representation were compared to empirical data (Risden 1996; Van den Broek et al. 1998). The results indicate that readers have standards for coherence, by which they gauge their comprehension and determine whether to engage in further processing (Van den Broek, Risden & Husebye-Hartmann 1995). Although standards of coherence vary depending upon the reader and the circumstances (e.g., reading goal, see Linderholm, Gustafson, Van den Broek & Lorch 1997), two standards of coherence that are almost universally applied by readers in normal reading situations involve establishing referential and causal coherence. If the information in the current sentence and that which is carried over from the preceding sentence do not provide referential and causal coherence, then the reader will access the other two sources, prior cycles and background knowledge.4 This example illustrates how the Landscape model can help us understand and test specific hypotheses about factors that may affect the reading process and the emerging memory representation. In the past two decades, considerable knowledge has been gained about how characteristics of the reader (comprehension processes, attentional capacity, background knowledge, etc.), in interaction with text content, affect activation patterns during reading. In contrast, very little is known about the role of the textlinguistic form of the reading materials (Lorch 1989). In the remainder of this chapter we will explore how textual cues may affect the distribution of readers' attentional resources and how such effects would reverberate throughout the reading process and the construction of the final representation of the text.
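The two representational mechanisms described earlier in this section, connection building through co-activation and cohort activation, can be sketched as follows. The sketch is a simplification under our own assumptions: the multiplicative learning rule, the `rate` and `spread` parameters, and the function names are hypothetical and are not taken from the published model.

```python
from collections import defaultdict
from itertools import combinations


def update_connections(connections, activation, rate=0.1):
    """Strengthen the link between every pair of co-activated concepts.

    The increment grows with the activation of both concepts, so strongly
    co-activated concepts end up more strongly connected.
    """
    for a, b in combinations(sorted(activation), 2):
        connections[(a, b)] += rate * activation[a] * activation[b]
    return connections


def cohort_activation(connections, activation, spread=0.5):
    """Return the partial activation that active concepts pass on to
    associated concepts that are not themselves currently active."""
    cohort = defaultdict(float)
    for (a, b), strength in connections.items():
        if a in activation and b not in activation:
            cohort[b] += spread * strength * activation[a]
        elif b in activation and a not in activation:
            cohort[a] += spread * strength * activation[b]
    return dict(cohort)


connections = defaultdict(float)
# Two cycles in which 'songbirds' and 'forest' are co-activated.
update_connections(connections, {"songbirds": 1.0, "forest": 1.0, "decline": 0.5})
update_connections(connections, {"songbirds": 1.0, "forest": 1.0})
# A later cycle in which only 'forest' is mentioned.
cohort = cohort_activation(connections, {"forest": 1.0})
```

Because 'songbirds' and 'forest' were co-activated repeatedly and strongly, a later mention of 'forest' alone passes more activation to 'songbirds' than to the weakly co-activated 'decline'.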
2. The role of textual cues in attention allocation
2.1 Linguistic cues
Meyer (1975) describes linguistic cues as a class of textual devices that give emphasis to certain aspects of the semantic content of text. These linguistic cues serve as verbal devices that assist readers in deciding where attention should be allocated during the reading process (Glynn 1978). We will focus on three particular types of linguistic cues: function and relevance indicators, anaphors, and cataphors.
Function and relevance indicators
Function and relevance indicators, or pointer words as they are sometimes called, are used in text to "explicitly inform the reader of the author's perspectives of a particular idea" (Meyer 1975, p. 80). These types of cues also can be considered importance indicators, for they indicate to the reader which content in text is most important (Lorch 1989). Examples of function and relevance indicators include phrases such as "it is important to note that ...", "let me stress that ...", "fortunately ...", and "in summary ...". In the Songbird passage, the following sentence segment contains an example of this type of linguistic cue:
(1) It is important to note that this hypothesis ... (cycle 9)
Function and relevance indicators usually precede the content that they signal, but occasionally succeed it (e.g., "The British government in particular has been keen on this idea"). In terms of the Landscape model, concepts signaled by function and relevance indicators will be highly activated as a result of the increased attention that readers pay to textual information that is cued as important. Furthermore, these concepts will not only be more strongly activated in the current reading cycle, they also will be carried over more frequently into subsequent cycles. As a result, their connections to other textual information will be both stronger and more numerous. This, in turn, will result in increased memory for textual content preceded by function and relevance indicators. In example (1), the information following the relevance indicator, the prediction that nest density increases with forest size, is highlighted and will receive more activation. Figure 2 displays the change in activations for the text concepts as a result of function and relevance cues as well as of the cues to be described below.5 As a result of these changes in activation, some concepts will accumulate more activation over the course of the entire text and, also, enter into more or stronger connections with other concepts that are activated in the same cycle or, through carry over, with concepts in immediately following cycles. In example (1), the concepts in the sentence marked by the functional marker (e.g., 'predict', 'density', 'forest size') nearly double in the strength with which they are encoded in the memory representation and more than double in their overall connectedness to other concepts in the text.6 The effects of function and relevance indicators on the eventual representation of a text are fairly well documented.
For example, readers recall text information more often when it is signaled than when it is not signaled (Leon & Carreteras 1992; Loman & Mayer 1983; Lorch & Lorch 1986; Mayer, Dyck & Cook 1984). Less is known, however, about the way in which function and relevance indicators direct readers' attention during the reading process itself. In one of the few studies in which both on-line reading behaviors and off-line memory were investigated, Lorch and Lorch ( 1986) observed that marking text segments with a summary indicator (e.g., "To summarize, ... ") prompted readers to slow their reading significantly. In contrast, marking text segments
Figure 2. Change in landscape of activations for the Songbird passage as a result of the inclusion of textual cues (axes: concepts and reading cycles)
with importance signals (e.g., "One argument is particularly notable") did not result in slower reading. Both types of markers led to significant improvements in recall of the signaled text segments. The absence of reading-time effects for importance signals led Lorch and Lorch to speculate that these signals may prompt readers to tag the information as important as soon as it is encountered, thereby leaving reading time unaffected. These findings, and the possibility that the two types of signals evoke different kinds of processing, are readily understood in the context of the Landscape model. There are two ways in which a concept can receive increased activation: it can be processed for a longer period of time, in essence extending the number of processing cycles, or it can receive more activation within a particular cycle (see Figure 2). A summary signal prompts the
reader to connect the current information with concepts that had been activated in preceding cycles. This requires reinstatement of the earlier concepts, a process which has been shown to be time consuming, resulting in slower reading times (e.g., Albrecht & O'Brien 1993; Van den Broek & Thurlow 1990; cf. Van den Broek 1994). In contrast, an importance signal prompts the reader to allocate more of the available activation in a particular cycle to the signaled concept, a process which does not alter the length of the reading cycle. Thus, both types of signals result in a stronger memory representation of the signaled information. In the case of summary signals it is through stronger connections with concepts from prior cycles, and in the case of importance signals it is through higher activation of the current concepts and hence through stronger connections to concepts in subsequent cycles.
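The contrast between the two kinds of signal can be illustrated with a small sketch. It rests on assumptions of our own: the importance signal is modeled as a redistribution of a fixed activation total within the cycle, and the summary signal as the reinstatement of earlier concepts at a fixed time cost per concept. The function names, the `boost` factor, the reinstatement `level`, and the unit time cost are hypothetical.

```python
def apply_importance_signal(activation, signaled, boost=1.5):
    """Shift activation toward signaled concepts within one cycle.

    The total amount of activation is kept constant, which stands in for
    the claim that the length of the reading cycle is unchanged.
    """
    total = sum(activation.values())
    boosted = {c: (v * boost if c in signaled else v)
               for c, v in activation.items()}
    boosted_total = sum(boosted.values())
    if boosted_total == 0:
        return dict(activation)
    scale = total / boosted_total
    return {c: v * scale for c, v in boosted.items()}


def apply_summary_signal(activation, earlier_concepts, level=0.8,
                         cost_per_concept=1):
    """Reinstate concepts from prior cycles.

    Returns the updated activations and the extra processing time that the
    reinstatements are assumed to require.
    """
    updated = dict(activation)
    extra_time = 0
    for concept in earlier_concepts:
        if updated.get(concept, 0.0) < level:
            updated[concept] = level
            extra_time += cost_per_concept
    return updated, extra_time


cycle = {"hypothesis": 1.0, "density": 1.0, "forest size": 1.0}
important = apply_importance_signal(cycle, {"density", "forest size"})
summarized, extra_time = apply_summary_signal(cycle, ["fragmentation effect"])
```

Under these assumptions the importance signal raises the signaled concepts relative to the others at no time cost, whereas the summary signal adds a concept from an earlier cycle and reports additional processing time.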
Anaphors
Anaphoric devices alert readers to the fact that an object or event mentioned in one sentence is identical to an object or event mentioned in a preceding sentence (Albrecht & O'Brien 1993; Gernsbacher 1990; Van den Broek 1994). The following sentence, taken from the passage in Table 1, contains an example of an anaphor.
(2) One simple hypothesis to explain the forest fragmentation effect is that they generally prefer larger forest plots as nesting sites ... (cycle 8)
The word "they" serves as an anaphor by referring to "songbird populations", which is mentioned three sentences earlier. Anaphora relate sentences by means of anaphoric reference, and readers establish basic coherence in text by resolving anaphoric references. In the Landscape model an anaphoric device improves access to previously mentioned concepts in text by eliciting their re-activation. As a result these concepts will participate in additional reading cycles and hence form more and/or stronger connections to other concepts. As a result, they will feature more strongly in the memory representation of the text. Again, Figure 2 illustrates the effects of anaphors on activation patterns in the Songbird passage. There is ample evidence that anaphors lead to reactivation of their referents. For example, McKoon and Ratcliff (1980) asked participants to read short paragraphs in which the final sentence of the paragraph either did or did not contain an anaphoric reference to a concept mentioned in the first sentence of the paragraph. After each paragraph was read, participants were given a recognition test for the referent mentioned in the first sentence. When the last sentence contained an anaphor, participants recognized the referent more quickly than when the last sentence did not contain an anaphor, demonstrating that anaphors activate their referents. In a similar study, Dell, McKoon, and Ratcliff (1983) demonstrated that an anaphor not only reactivates its referent but also other concepts that were in the same clause or sentence as the referent. However, the reactivation of other concepts is weaker than that of the referent: Unlike the referent, the 'other' concepts quickly decline in activation level once the anaphor is processed. Similar findings on the role of anaphors in directing activation through reinstatement have been obtained under different circumstances and using a variety of measures (e.g., Clifton, Kennison & Albrecht 1997; Ehrlich & Rayner 1983; O'Brien, Duffy & Myers 1986; Vonk, Hustinx & Simons 1992).7 These findings support the Landscape model's predictions concerning the activation patterns that result from anaphoric references. As predicted, anaphors result in the reactivation of referents, which thus enter into additional reading cycles. Moreover, the observation that concepts that co-occurred with the referents also will be reactivated, albeit weakly, is readily understood in the context of cohort activation. By virtue of their co-occurrence, the referent and the other concepts in that cycle have become connected. When the referent is reactivated, cohort activation will lead to partial reactivation of the other concepts as well. The implications of anaphoric references for the eventual memory representation are less well investigated.
According to the Landscape model, the fact that anaphoric referents and, to a lesser extent, their cohorts participate in the current cycle and, through carry over, to some extent in subsequent cycles leads to a stronger memory trace of the reactivated concepts (through repeated activation) and to the establishment of additional connections in the memory representation (through additional co-occurrences with other concepts). Such effects have been observed with respect to concepts that are reactivated to establish causal coherence (e.g., O'Brien & Myers 1987; Van den Broek & Lorch 1993; Van den Broek & Thurlow 1990; Van den Broek, Lorch & Thurlow 1996) but as yet the predictions with respect to the effect of anaphoric references on the memory representation remain untested.
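The pattern reported for anaphors, strong reactivation of the referent, weak reactivation of the concepts that accompanied it, and a faster decline for the latter, can be sketched as follows. The activation levels, the decay rates, the function names, and the companion concepts 'migrate' and 'tropics' are illustrative assumptions only; they are not estimates from the studies cited above.

```python
def resolve_anaphor(activation, referent, companions,
                    referent_level=1.0, companion_level=0.3):
    """Reactivate the referent fully and its former companions weakly."""
    updated = dict(activation)
    updated[referent] = max(updated.get(referent, 0.0), referent_level)
    for concept in companions:
        updated[concept] = max(updated.get(concept, 0.0), companion_level)
    return updated


def carry_forward(activation, referent, companions,
                  referent_decay=0.6, companion_decay=0.2, default_decay=0.5):
    """Carry activations into the next cycle.

    The reinstated referent is assumed to persist longer than the weakly
    reactivated companions, which fade quickly.
    """
    carried = {}
    for concept, value in activation.items():
        if concept == referent:
            rate = referent_decay
        elif concept in companions:
            rate = companion_decay
        else:
            rate = default_decay
        carried[concept] = value * rate
    return carried


# 'they' in cycle 8 is resolved to the songbirds mentioned earlier.
cycle8 = resolve_anaphor({"hypothesis": 1.0}, "songbirds", ["migrate", "tropics"])
cycle9 = carry_forward(cycle8, "songbirds", ["migrate", "tropics"])
```

In the following cycle the referent retains a larger share of its activation than its companions do, so it has more opportunity to form connections with incoming concepts.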
Cataphors
Cataphors (sometimes called 'anticipatory anaphora') are linguistic cues that refer to subsequently mentioned concepts (Matthews 1997). An example of a cataphoric device is the unstressed, indefinite article this (Gernsbacher & Shroyer 1989). Thus, the concept 'egg' receives more activation if it is first introduced in a text by "And then the man picks up this egg" than when introduced by "And then the man picks up an egg." An illustration of a cataphoric device in the Songbird passage is the use of 'they' to signal the upcoming referent:
(3) However, when they set about documenting the presence or absence ..., researchers obtained ... (cycles 10 & 11)
Cataphoric devices signal to readers that the concept mentioned in or after the cataphor (in example 3, the as yet unnamed people who were documenting the absence or presence of songbirds) will become important or be mentioned again at a later point in the text. The cognitive role that cataphoric devices play during the reading process can be understood in terms of readers' attention allocation. In the Landscape model, concepts that occur in or immediately after cataphors will be highly activated. These concepts will remain activated for several cycles due to the fact that cataphors signal to readers that particular concepts will be mentioned again or become important at a later time. In contrast, concepts that are activated by non-cataphoric devices, such as a or an, will not remain activated over succeeding cycles because readers will not assume that such concepts will become relevant again. Because concepts marked by cataphors will be activated over many cycles, they will be strongly connected to other concepts in text. These connections will influence memory for the concepts signaled by cataphors. Figure 2 illustrates the impact of the cataphoric construction in the Songbird passage. It is interesting to contrast the cataphoric construction of the text in cycles 10 and 11 with the non-cataphoric form, "However, when the researchers set about documenting the presence or absence ..., they obtained ...". In the cataphoric version the contents of the entire cycle 10 ("the people who are documenting ...") is marked because its exact interpretation is not clear yet. As a result, all concepts are carried forward with increased activation (over and above carry over and cohort activation) and hence interact with the subsequent cycle (11). In contrast, in the non-cataphoric version only the concept "researchers" receives increased activation in cycle 11.
Thus, the net effect of the cataphor in this example is to increase the activation of all concepts from cycle 10 during processing of the next cycle and hence to create a more tightly
interwoven memory representation of the concepts in cycles 10 and 11. As always, such changes will reverberate throughout the processing and representation of the text.
Relative to anaphors, cataphors have received little attention in the psycholinguistic literature. However, there is indirect evidence that cataphoric devices influence attention allocation. For example, Gernsbacher and Shroyer (1989) asked participants to listen to a series of short narratives and to provide endings for each narrative. Half of the narratives included a sentence in which a concept was preceded by a cataphoric device, the indefinite this, whereas the other half of the narratives included a sentence in which the same concept was preceded by a or an. Concepts preceded by the indefinite this were mentioned more frequently in participants' continuations of the narratives than were concepts preceded by a or an. Thus, cataphoric devices increase subsequent referential access to the concepts that they cue. This, in turn, suggests that the cataphor directed the way in which the listeners allocated their attention. According to the Landscape model, cataphors influence the allocation of attention during reading in a similar way: Signaled concepts receive extra activation in the current cycle and are more strongly carried over into subsequent cycles. This leads both to an overall increase in the activation of the signaled concept over the course of reading and to the strengthening or construction of connections in predictable ways. To test these on-line and off-line predictions, studies on the effects of cataphoric devices in written discourse are in order.
2.2 Typographical cues
Just as the semantic form of text influences comprehension, so does its physical layout. Boldface type, italic type, underlining, capitalization, and color variation are examples of typographical cues that are often used by textbook authors to make certain words or phrases within text visually distinct. In Table 1, the italicized phrase "forest fragmentation effect" serves as a typographical cue. The purpose of using such cues is to draw readers' attention to particular information in text, such as topic sentences or key words. Indeed, there is considerable evidence that readers tend to remember cued information better than non-cued information (e.g., Cashen & Leicht 1970; Crouse & Idstein 1972; Foster & Coles 1977; Fowler & Barker 1974; Glynn & DiVesta 1979; Golding & Fowler 1992; Klare, Mabry & Gustafson 1955; Leon & Carretero 1992; Lorch, Lorch & Klusewitz 1995; Nist & Hogrebe 1987). As Lorch (1989) has pointed out, there is no reason to believe that certain typographical cues are
more effective than others, and researchers have not demonstrated the superiority of particular cues. Apparently, all varieties of typographical cues are processed in a similar manner and the memory representations that result from such processing will be similar, although it is plausible that their visual distinctiveness affects their efficacy in attracting attention.
Typographical cues direct readers' attention to specific textual content, much as function and relevance signals do. According to the Landscape model this greater attention to cued textual information results in a larger and stronger memory trace for this information, both because of the total amount of activation it receives and because of new or stronger connections between cued concepts and other, co-occurring concepts in the text. To illustrate, Figure 2 shows the effects of italicizing "forest fragmentation effect" on its activation pattern across reading cycles. The change in activation pattern, in turn, results in a 33% increase in the strength of this concept in the final memory representation and in a 150% increase in its connections to other concepts.⁸
The effect of typographical cues on memory for textual information is well-documented. There is ample evidence that the presence of typographical cues in text will facilitate recall of cued textual information, with little or no effect on recall of non-cued information (see Lorch, Lorch & Klusewitz 1995). The effects of typographical cues on the reading process and allocation of attention itself and, hence, how cued information is connected to other information in the text are not as well established. Apparently, typographical cues affect readers' memory representation of text by increasing the amount of attention that readers pay to cued portions of text (Lorch et al. 1995, Exp. 2).
When cues are present in text, readers take longer to read cued than non-cued parts, and the increased processing time allotted to cued information results in that information being recalled more frequently than non-cued information. These findings provide insight into the mechanisms which may lead cued information to attain a prominent position in readers' memory representations of text, but it should be noted that measures of reading time are only indirect indicators of attention allocation. Longer reading times for particular text segments suggest that readers are attending more to those segments, but they do not allow a firm conclusion about what actually is activated. The Landscape model allows predictions about the relative distribution and time course of activation as a result of these cues. The predictions are consistent with the reading time and memory findings, but converging evidence is needed to fully grasp the way in which typographical cues exert their effects on attention allocation during reading.
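The kind of prediction at issue, how a cue changes the time course of activation and thereby the resulting memory trace, can be made concrete with a toy simulation. The Python sketch below is an illustration only and is not the computer implementation of the Landscape model. The carry-over factor of 0.5, the input activations of 5, the concept "habitat", and the simple sums used for strength and connections are all assumptions made for this example. The cue increment of 2 follows the notes to this chapter, and the ceiling of 7 is assumed from the activation range mentioned there.

```python
# Toy illustration of cue-driven activation; NOT the published Landscape model.
CUE_BOOST = 2.0       # cue increment, as stated in the chapter's notes
CARRY_OVER = 0.5      # ASSUMED share of activation carried into the next cycle
MAX_ACTIVATION = 7.0  # ASSUMED ceiling, taken from the 5-7 range in the notes


def run_cycles(cycles, cued=frozenset()):
    """Return one activation dict per reading cycle.

    cycles: list of dicts mapping concept -> input activation in that cycle.
    cued:   set of (cycle_index, concept) pairs that carry a text cue.
    """
    history = []
    previous = {}
    for index, inputs in enumerate(cycles):
        # Start from whatever is carried over from the preceding cycle.
        current = {c: a * CARRY_OVER for c, a in previous.items()}
        for concept, activation in inputs.items():
            if (index, concept) in cued:
                activation += CUE_BOOST
            strongest = max(current.get(concept, 0.0), activation)
            current[concept] = min(MAX_ACTIVATION, strongest)
        history.append(current)
        previous = current
    return history


def node_strength(history, concept):
    """ASSUMED measure: total activation of a concept over all cycles."""
    return sum(cycle.get(concept, 0.0) for cycle in history)


def connection_strength(history, a, b):
    """ASSUMED measure: summed co-activation of two concepts."""
    return sum(cycle.get(a, 0.0) * cycle.get(b, 0.0) for cycle in history)


# Hypothetical three-cycle text; only the first two concept names come
# from the chapter, and all input values are invented.
CUE = "forest fragmentation effect"
text = [
    {CUE: 5, "songbirds": 5},
    {"songbirds": 5, "habitat": 5},
    {"habitat": 5},
]
plain = run_cycles(text)
marked = run_cycles(text, cued={(0, CUE)})

for label, run in (("unmarked", plain), ("marked", marked)):
    print(label,
          node_strength(run, CUE),
          connection_strength(run, CUE, "songbirds"))
```

In this toy run the cued concept accumulates more activation and a stronger co-activation link with "songbirds", while the uncued concept "habitat" is unaffected. That is the qualitative pattern described above; the particular numbers carry no empirical weight.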
2.3 Text-structure cues
Typographical and linguistic cues both operate on the content contained within the body of a text. In contrast, text-structure cues, such as titles and headings, operate by structuring the text as a whole. In terms of function, text-structure cues are similar to typographical cues, for both types of cues convey information about the relative importance of text elements to readers (Lorch 1989). The main difference is one of scope, with typographical cues operating primarily on the word or phrase that is being signaled and text-structure cues operating on the entire body of text.
Titles and headings
In many instances, the main body of a text is preceded by a title that contains information about the content of the text. Due to variations in spatial location and typeface, titles are often easily distinguishable from the text that succeeds them (Lorch 1989), and they can be thought of as "umbrellas" under which content pertaining to a particular topic or theme can be found. Thus, titles act as advance organizers (Kozminsky 1977). The title of the passage in Table 1, "Why American Songbirds are Vanishing", serves this function.
Like titles, headings usually differ from the text that they signal by means of spatial location and typeface variation. Headings are text-structure cues that delineate subsections of text (Lorch 1989). They signal the organization of the text and provide the reader with information about the content of the immediately following text. The main difference between headings and titles is that titles usually apply to text as a whole whereas headings apply to subsections within text.
In terms of the Landscape model, titles and headings direct readers' attention to particular content in text and bias readers' comprehension strategies. They do so in several ways. First, titles and headers contain content information that is entered into a processing cycle just like information in other text segments and, thus, influences the interpretation and activation patterns of subsequent text segments. Second, the content of titles and headings will be processed more deeply than that of regular text segments. To some extent this is due to the fact that titles and headings are physically distinct from the remainder of the text (much like segments within the text that are signaled by function or relevance cues), but, more importantly, titles and headings provide information about the theme of the text. As a result of their theme-carrying function, titles and headings may be processed more extensively, for example by
prompting the reader to recruit background knowledge, and they are more likely to be re-instated (or carried over from cycle to cycle) during the reading of the text that follows. Thus, the theme itself will be activated more strongly and more frequently, and this high level of activation will influence how the reader processes the subsequent text and the kind of information that the reader remembers. Third, by providing readers with information about the theme of the text, titles and headings may prompt readers to attempt to detect congruence between later segments and the theme and to selectively allocate attention to that information that is congruent. In this fashion, titles and headings may alter the readers' comprehension processes. Figure 2 depicts how the title in the Songbird passage influences activation patterns throughout reading. This, in turn, results in dramatic changes in the memory representation. For example, the strength of the concept 'songbirds' in the representation increases by 55% and the strength of its connections to other concepts by 150%.⁹
The results of several studies on the effects of titles on comprehension support the Landscape view of the function of text-structure cues. Kozminsky (1977) demonstrated the potential biasing effect of titles. Participants read one of three texts that differed only with respect to their titles, each of which emphasized different ideas in the text. The results of a free-recall test administered after completion of the reading task indicated that recall was biased toward the theme emphasized in the title. The role that theme information can play in directing attention and memory is reflected in the finding that readers process text differently based on their beliefs about the content of the text. In a set of classic studies, Pichert and Anderson (1977) asked participants to read a story from one of two perspectives, that of a home-buyer or that of a burglar.
Participants recalled different details from the passage depending on their particular perspective. Thus, readers use information about the content or theme of a text to assist them in deciding on appropriate strategies to employ during text-processing. Similarly, Zwaan (1994) demonstrated that readers process text differently depending on their belief about the genre of the text. Two groups of participants read the same texts, with half being told that the texts were excerpts from newspapers and the other half being told that the texts were literary stories. Expectations about text genre affected the way in which readers processed the text. Participants who thought they were reading literary stories took longer to read the texts, recalled significantly more surface information from the texts, and recalled significantly less situational information from the texts. Titles prompt readers to establish an appropriate context within which to
interpret the text (Lorch 1989). Titles achieve this by activating the appropriate background knowledge or schema that readers need in order to understand the text. When readers are given text without a title, they experience difficulty in comprehending the text, especially if the main idea of the text is rather ambiguous (Bransford & Johnson 1972; Dooling & Mullet 1973). In such cases, the presence of a title serves to "facilitate the assembly of previous knowledge that will incorporate the 'new' textual information" (Kozminsky 1977, p. 482). The effects of headings on text-processing and comprehension have received less attention than those of titles (Lorch 1989). Moreover, the extant research has been focused primarily on the influence of headings on readers' memory for textual content. In general, headings, like titles, provide readers with an explicit representation of the organization and theme of text as evidenced by the fact that readers recall more from text when supplied with a heading than when a heading is absent (Lorch 1989). The above research shows that text-structural cues such as titles and headings play an important role in readers' memory representations for text. Little is known, however, about how titles and headings actually are processed by readers and how exactly they affect the processing of subsequent text. The Landscape view holds that titles and headings will direct readers' attention during the reading process. By providing readers with information about the structure and theme of a text, these cues will be highly activated during the reading process and they will be reactivated or carried over throughout the reading process. Moreover, they prompt readers to attempt to connect subsequent text to the theme and thus direct attention allocation during later cycles. The ability to do so effectively is likely to differ among individuals. 
Thus, one would expect that good readers show stronger effects of text-structural cues such as titles and headings in their allocation of attention and eventual memory representation than do poorer readers.
3.
Concluding remarks
Attention allocation plays a very important role in text comprehension. In order for comprehension to be successful, readers must selectively attend to the most important parts of text, at the appropriate times during the reading process. When attention is allocated in this manner, those textual elements that receive increased attention will secure a more prominent position in readers' memory representations of text.
In this chapter we have attempted to illustrate how the content and form of text influence both the way in which text is processed and the eventual memory representation. The effect of content and form on comprehension is mediated by various linguistic, typographical, and text-structure cues that authors use to direct readers' attention to particular textual information. Some text cues have been studied primarily in terms of their effect on the memory representation that readers form of text, whereas others have been investigated mostly in terms of their effects on the allocation of attention during reading. The Landscape model provides an integrated view of the reading process and the construction of a memory representation (Van den Broek et al. 1996; Van den Broek et al. 1998). In doing so, it allows us to understand the effects that have been observed and to make informed predictions about as yet unexplored effects.
The Landscape model highlights the fact that text comprehension involves an interaction between reader and text characteristics. Text characteristics, such as the content and form of the text, signal to readers that attention must be allocated to certain aspects of the text. The Landscape model allows us to trace the impact of these characteristics, in the context of readers' limited attentional resources and standards for coherence, on the allocation of attention during reading and on the memory representation that gradually emerges. In doing so, it helps us understand the way in which textual cues act as attention-focusing instructions to the reader which, if properly followed, result in efficient and proficient reading and comprehension of the text.
Notes
* We would like to thank Yuhtsuen Tzeng for his assistance in modeling the phenomena described in this chapter. The work presented here was made possible by the Guy Bond Endowment for Reading Research and by the Golestan Foundation and The Netherlands Institute for Advanced Study.
1. In principle any type of text segment (e.g., sentences, clauses, even paragraphs) can be selected as input unit. Most researchers adopt sentences or clauses as units of analysis, however. The reason is that, although processing occurs continuously, there is ample evidence that major processing occurs at the end of clause and sentence boundaries and thus that they constitute processing units. Likewise, input segments can be decomposed into different types of concepts (e.g., propositions, nouns/verbs). In the current chapter we use sentences as input units for the cycles and clauses (Trabasso, Van den Broek & Suh 1989) or major propositions (Kintsch 1998) as constituent concepts within a cycle.
2. In older models of working memory and reading it was assumed, mostly for reasons of simplicity, that at any particular moment in time concepts either were or were not activated in working memory (e.g., Kintsch & Van Dijk 1978; Kintsch & Miller 1980). As models have become more sophisticated and as computational power has increased, this dichotomous view has been abandoned in favor of a continuous one, in which concepts fluctuate along a dimension of degrees of activation (e.g., Goldman & Varma 1995; Just & Carpenter 1992; Kintsch 1988; Myers & O'Brien 1998; Van den Broek et al. 1998).
3. We are providing only a summary of the main aspects of the Landscape model. Details of these and other aspects, as well as of their computational properties, can be found in Van den Broek et al. (1998).
4. These principles form the basis for the landscape for the Songbirds passage in Figure 1. Concepts that are explicitly mentioned in a cycle receive high activation (5-7), depending on their importance in the propositional structure of the sentence (see Kintsch & Van Dijk 1978), concepts that are activated from background knowledge (e.g., the superordinate category "Birds") or reinstated (e.g., because they establish causal or referential coherence) receive intermediate activation (4), and concepts that provide spatial coherence receive some activation (2). In addition, concepts can be activated to various degrees in later cycles as a result of carry over from the preceding cycle or of cohort activation.
5. In this illustration all cues increase the activation value of the signaled content by 2.
6. The strength with which a concept is encoded can be calculated easily using the computer implementation of the Landscape model (Van den Broek et al. 1998). For example, on a scale of 0 (not encoded) to 5 (strongly encoded) the concept 'density' will be encoded with strength 0.9 in the unmarked version and with strength 1.6 in the marked version. On a scale of 0 (no connections) to 60 (strong/many connections), 'density' is connected to other concepts with an overall value of 11 in the unmarked version and of 26 in the marked version.
7. Different anaphors (e.g., pronouns, demonstratives, repetitions) may have slightly different effects on processing and hence on attention allocation (cf. Garrod, Freudenthal, & Boyle 1994). Such differences can be expected to lead to subtle differences in attention allocation and memory representation.
8. The strength of the individual concept changed from 2.5 to 3.3 (on a 0-5 scale) and its total connections changed from 15 to 38 (on a 0-60 scale).
9. The strength of the individual concept changed from 2. to 3.7 (on a 0-5 scale) and its total connections changed from 18 to 48 (on a 0-60 scale).
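The percentage increases quoted in the main text can be checked against the before and after values given in notes 6 and 8. The short Python sketch below is offered only as a worked illustration of that arithmetic. The function name and the labels are hypothetical choices made for this example and are not part of the Landscape model software.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs as quoted in notes 6 and 8; the labels are
# hypothetical names chosen for this illustration.
reported = {
    "note 6, strength of 'density' (0-5 scale)": (0.9, 1.6),
    "note 6, connections of 'density' (0-60 scale)": (11, 26),
    "note 8, strength of cued concept (0-5 scale)": (2.5, 3.3),
    "note 8, connections of cued concept (0-60 scale)": (15, 38),
}

for label, (before, after) in reported.items():
    print(f"{label}: {percent_change(before, after):.0f}% increase")
```

For the values in note 8 this gives increases of roughly 32% in strength and 153% in connections, close to the rounded figures of 33% and 150% reported in the text.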
References
Albrecht, J. E., & O'Brien, E. J. (1993). Updating a mental model: Maintaining both local and global coherence. Journal of Experimental Psychology: Learning, Memory & Cognition, 19, 1061-1070.
Balota, D. A., Flores d'Arcais, G. B., & Rayner, K. (Eds.) (1990). Comprehension processes in reading. Hillsdale, NJ, USA: Lawrence Erlbaum Associates, Inc.
Bransford, J. D., & Johnson, M. K. (1972). Contextual prerequisites for understanding. Journal of Verbal Learning and Verbal Behavior, 11, 717-726.
Cashen, V. M., & Leicht, K. L. (1970). Role of the isolation effect in a formal educational setting. Journal of Educational Psychology, 61, 484-486.
Clifton, C., Jr., Kennison, S. M., & Albrecht, J. E. (1997). Reading the words her, his, him: Implications for parsing principles based on frequency and on structure. Journal of Memory & Language, 36(2), 276-292.
Crouse, J. H., & Idstein, P. (1972). Effects of encoding cues on prose learning. Journal of Educational Psychology, 63, 309-313.
Dell, G. S., McKoon, G., & Ratcliff, R. (1983). The activation of antecedent information during the processing of anaphoric reference in reading. Journal of Verbal Learning and Verbal Behavior, 22, 121-132.
Dooling, D. J., & Mullet, R. L. (1973). Locus of thematic effects in retention of prose. Journal of Experimental Psychology, 97, 404-406.
Ehrlich, K., & Rayner, K. (1983). Pronoun assignment and semantic integration during reading: Eye movements and immediacy of processing. Journal of Verbal Learning & Verbal Behavior, 22, 75-87.
Foster, J., & Coles, P. (1977). An experimental study of typographic cueing in printed text. Ergonomics, 20, 57-66.
Fowler, R. L., & Barker, A. S. (1974). Effectiveness of highlighting for retention of text material. Journal of Applied Psychology, 59, 358-364.
Garrod, S., Freudenthal, D., & Boyle, E. A. (1994). The role of different types of anaphor in the on-line resolution of sentences in a discourse. Journal of Memory & Language, 33, 39-68.
Gernsbacher, M. A. (1990). Language comprehension as structure building. Hillsdale, NJ: Erlbaum.
Gernsbacher, M. A. (Ed.). (1994). Handbook of psycholinguistics. San Diego, CA, USA: Academic Press, Inc.
Gernsbacher, M. A., & Shroyer, S. (1989). The cataphoric use of the indefinite this in spoken narratives. Memory & Cognition, 17, 536-540.
Glynn, S. M. (1978). Capturing readers' attention by means of typographical cueing strategies. Educational Technology, 18, 7-12.
Glynn, S. M., Britton, B. K., & Tillman, M. H. (1982). Typographical cues in text: Management of the reader's attention. In D. H. Jonassen (Ed.), The technology of text (Vol. 2, pp. 192-209). Englewood Cliffs, NJ: Educational Technology Publications.
Glynn, S. M., & DiVesta, F. J. (1979). Control of prose processing via instructional and typographical cues. Journal of Educational Psychology, 71, 595-603.
Golding, J. M., & Fowler, S. B. (1992). The limited facilitative effect of typographic signals. Contemporary Educational Psychology, 17, 99-113.
Goldman, S. R., & Varma, S. (1995). CAPping the construction-integration model of discourse comprehension. In C. A. Weaver, S. Mannes, & C. R. Fletcher (Eds.), Discourse comprehension: Essays in honor of Walter Kintsch (pp. 337-358). Hillsdale, NJ: Erlbaum.
Graesser, A. C., Singer, M., & Trabasso, T. (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101, 371-395.
Hidi, S. E. (1995). A reexamination of the role of attention in learning from text. Educational Psychology Review, 7, 323-350.
Just, M. A., & Carpenter, P. A. (1980). A theory of reading: From eye fixations to comprehension. Psychological Review, 87, 329-354.
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99, 122-149.
Kintsch, W. (1988). The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review, 95, 163-182.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. New York, NY, USA: Cambridge University Press.
Kintsch, W., & Van Dijk, T. A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363-394.
Klare, G. R., Mabry, J. E., & Gustafson, L. M. (1955). The relationship of patterning (underlining) to immediate retention and to acceptability of technical material. The Journal of Applied Psychology, 39, 40-42.
Kozminsky, E. (1977). Altering comprehension: The effect of biasing titles on comprehension. Memory and Cognition, 5, 482-490.
Langston, M. C., & Trabasso, T. (1998). Identifying causal connections and modeling integration of narrative discourse. In H. van Oostendorp, & S. R. Goldman (Eds.), The construction of mental representations during reading (pp. 29-69). Mahwah, NJ: Erlbaum.
Leon, J. A., & Carretero, M. (1992). Signal effects on the recall and understanding of expository texts in expert and novice readers. In A. J. M. Oliveira (Ed.), Structures of communication and intelligent help for hypermedia courseware (pp. 97-111). New York: Springer.
Linderholm, T., Gustafson, M., Van den Broek, P., & Lorch, R. F. (March, 1997). The effect of reading goals on inference generation during reading expository text. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.
Loman, N. L., & Mayer, R. E. (1983). Signaling techniques that increase the understandability of expository prose. Journal of Educational Psychology, 75, 402-412.
Lorch, R. F., Jr. (1989). Text-signaling devices and their effects on reading and memory processes. Educational Psychology Review, 1, 209-234.
Lorch, R. F., Jr., & Lorch, E. P. (1986). On-line processing of summary and important signals in reading. Discourse Processes, 9, 489-496.
Lorch, R. F., Jr., & Lorch, E. P. (1996). Effects of organizational signals on free recall of expository text. Journal of Educational Psychology, 88, 38-48.
Lorch, R. F., Jr., Lorch, E. P., & Klusewitz, M. A. (1995). Effects of typographical cues on reading and recall of text. Contemporary Educational Psychology, 20, 51-64.
Matthews, P. H. (1997). The Concise Oxford Dictionary of Linguistics (p. 48). Oxford University Press, Inc.
Mayer, R. E., Dyck, J., & Cook, L. K. (1984). Techniques that help readers build mental models from scientific text: Definitions pretraining and signaling. Journal of Educational Psychology, 76, 1089-1105.
McClelland, J. L., & Rumelhart, D. E. (1988). Explorations in parallel distributed processing: A handbook of models, programs, and exercises. Computational models of cognition and perception. Cambridge, MA: MIT Press.
McKoon, G., & Ratcliff, R. (1980). The comprehension processes and memory structures involved in anaphoric reference. Journal of Verbal Learning and Verbal Behavior, 19, 668-682.
McKoon, G., & Ratcliff, R. (1992). Inference during reading. Psychological Review, 99, 440-466.
Meyer, B. J. F. (1975). The organization of prose and its effects on memory. New York: Elsevier.
Myers, J. L., & O'Brien, E. J. (1998). Accessing the discourse representation during reading. Discourse Processes, 26, 131-157.
Nist, S. L., & Hogrebe, M. C. (1987). The role of underlining and annotating in remembering textual information. Reading Research and Instruction, 27, 12-25.
O'Brien, E. J., Duffy, S. A., & Myers, J. L. (1986). Anaphoric inference during reading. Journal of Experimental Psychology: Learning, Memory & Cognition, 12, 346-352.
O'Brien, E. J., & Myers, J. L. (1987). The role of causal connections in the retrieval of text. Memory & Cognition, 15, 419-427.
Pichert, J. W., & Anderson, R. C. (1977). Taking different perspectives on a story. Journal of Educational Psychology, 69, 309-315.
Risden, K. A. C. (1996). Causal inference in narrative text comprehension. Dissertation Abstracts International Section A: Humanities & Social Sciences, Vol. 57(6-A).
Trabasso, T., Van den Broek, P., & Suh, S. Y. (1989). Logical necessity and transitivity of causal relations in stories. Discourse Processes, 12, 1-25.
Van den Broek, P. (1994). Comprehension and memory of narrative texts: Inferences and coherence. In Gernsbacher, M. A. (Ed.), Handbook of psycholinguistics (pp. 539-588). San Diego, CA: Academic Press.
Van den Broek, P., Fletcher, C. R., & Risden, K. (1993). Investigations of inferential processes in reading: A theoretical and methodological integration. Discourse Processes, 16, 169-180.
Van den Broek, P., & Lorch, R. F. (1993). Network representations of causal relations in memory for narrative texts: Evidence from primed recognition. Discourse Processes, 16, 75-98.
Van den Broek, P., Lorch, E. P., & Thurlow, R. (1996). Children's and adults' memory for television stories: The role of causal factors, story-grammar categories, and hierarchical level. Child Development, 67, 3010-3028.
Van den Broek, P., Risden, K., Fletcher, C. R., & Thurlow, R. (1996). A "landscape" view of reading: Fluctuating patterns of activation and the construction of a stable memory representation. In B. K. Britton, & A. C. Graesser (Eds.), Models of understanding text (pp. 165-187). Mahwah, NJ: Erlbaum.
Van den Broek, P., Risden, K., & Husebye-Hartmann, E. (1995). The role of readers' standards for coherence in the generation of inferences during reading. In R. F. Lorch, Jr., & E. J. O'Brien (Eds.), Sources of coherence in reading (pp. 353-374). Hillsdale, NJ: Erlbaum.
Van den Broek, P., & Thurlow, R. (1990). Reinstatements and elaborative inferences during the reading of narratives. Paper presented at the annual meeting of the Psychonomic Society, New Orleans, LA.
Van den Broek, P., Young, M., Tzeng, Y., & Linderholm, T. (1998). The landscape model of reading: Inferences and on-line construction of a memory representation. In H. van Oostendorp, & S. R. Goldman (Eds.), The construction of mental representations during reading (pp. 71-98). Mahwah, NJ: Erlbaum.
Vonk, W., Hustinx, L. G., & Simons, W. H. (1992). The use of referential expressions in structuring discourse. Language & Cognitive Processes, 7, 301-333.
Zwaan, R. A. (1994). Effect of genre expectations on text comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 920-933.
CHAPTER 4
Lexical access in text production
On the role of salience in metaphor resonance*
Rachel Giora and Noga Balaban
Tel Aviv University
1.
Introduction
What kind of lexical processes are involved in text production initially? Do they mirror the processes involved in text comprehension? On the face of it, it seems plausible to assume that, at least in one respect, they should not: Since authors know what they have in mind, and since they know their intended meaning prior to the production of the word(s) they select to convey that meaning, they would access only that intended meaning. Text production, it might be assumed, then, would not involve accessing unintended senses of a word selected for expressing a certain meaning or concept (as has been shown for text comprehension, see later). Rather, the word's intended meaning (the one compatible with the context) would be accessed directly. However, since the text producer is also her own comprehender (Levelt 1989, p. 13), text production may very well resemble text comprehension. In this study, we will test this hypothesis with regard to metaphor production in discourse.
Lexical access involved in the very early stages of comprehension has been largely shown to be insensitive to contextual information (see Peleg, Giora & Fein 2001, but see Vu, Kellas & Paul 1998). The main thrust of lexical research into initial processes involved in comprehension and disambiguation has found ample evidence in favor of an exhaustive access model, or a variation of it, exhibiting sensitivity to meaning frequency. According to the exhaustive access model, lexical access is modular: Lexical processes are autonomous and impervious to context effects; all the word's coded meanings are accessed automatically upon its processing, regardless of contextual information or frequency (Fodor 1983; Onifer & Swinney 1981; Rayner, Pacht & Duffy 1994;
Swinney 1979). On the ordered access version of the model, lexical access is exhaustive but frequency sensitive: The more frequent meaning is accessed first, and search for the intended meaning proceeds only in case the more frequent meaning is incompatible with contextual information (for a review, see Gorfein 1989; Rayner et al. 1994; Simpson 1994). Context, then, affects interpretations only at a later stage and suppresses incompatible meanings (Fodor 1983; Swinney 1979).
The modular view of lexical access has been challenged by a direct access, interactionist hypothesis. According to the direct access model, lexical access is selective. Context directs access completely, so that only the appropriate meaning (of words) is made available for comprehension (Simpson 1981; Glucksberg, Kreuz & Rho 1986; Jones 1991; Martin, Vu, Kellas & Metcalf 1999; Tabossi 1988; Vu, Kellas & Paul 1998).
In the field of figurative language comprehension, the evidence adduced so far apparently supports a direct access model. The prevailing hypothesis is that in a rich and supportive context, figurative and literal interpretation should be accessed directly, without recourse to irrelevant interpretations. Particularly, the intended figurative meanings of metaphors, ironies, and idioms should be tapped directly without having to process the sentence's literal meaning at all (see Gibbs 1994 for a review). Similarly, in a context biased toward the literal meaning, only that meaning should be made available for comprehension. Literal and figurative interpretations, then, should involve equivalent processes sensitive to contextual interpretation. They should be processed automatically (Keysar 1989; Gildea & Glucksberg 1983; Glucksberg, Gildea & Bookin 1982), involve the same categorization procedures (Glucksberg & Keysar 1990; Shen 1997), and take equally long to read (Kemper 1981; Inhoff, Lima & Carroll 1984; Ortony, Schallert, Reynolds & Antos 1978).
The picture, however, has not been monolithic. Tapping online processes by measuring reading times at the end of figurative phrases rather than at the end of sentences showed that even when embedded in a context a few sentences long, metaphoric phrases required longer processing times than the same phrases used literally (Janus & Bever 1985). In addition, figurative referring expressions were found to take longer to read than their literal equivalents (Gibbs 1990). Familiar metaphors were found to be processed initially both literally and metaphorically, regardless of contextual information (Williams 1992; see note 1). They were further shown to retain their contextually incompatible, literal meaning in contexts biasing their interpretation toward the metaphoric
Lexical access in text production
meaning (Giora & Fein 1999b; Williams 1992). More recently, ironic utterances were found to take longer to read when embedded in ironically than in literally biasing contexts (Dews & Winner 1997, 1999; Giora, Fein & Schwartz 1998; Schwoebel, Dews, Winner & Srinivas 2000), and to be interpreted only literally initially (Giora et al. 1998; Giora & Fein 1999a). (For a reinterpretation of Gibbs's (1986) findings suggesting that ironies are interpreted literally rather than ironically first, see Giora 1995.) In contrast, conventional language was processed faster than less conventional language. For instance, idioms were found to take longer to read in literally than in idiomatically biasing contexts (Gibbs 1980), and were read faster than their variant versions (McGlone, Glucksberg & Cacciari 1994). Conventional ironies took equally long to respond to in ironically and literally biased contexts. Similarly, they were processed initially both literally and figuratively (Giora & Fein 1999a). In Pexman, Ferretti and Katz (2000), novel metaphors took longer to read than familiar metaphors, and in Turner and Katz (1997) familiar proverbs were faster to read than less familiar ones.
Which processing model then can best account for this array of inconsistent findings? Recently Giora (1997, in press) proposed that comprehension of figurative and literal language be viewed as governed by a general principle of salience, according to which salient meanings should always be accessed upon encounter. A meaning of a word or an expression is salient if it is coded in the mental lexicon. Salience, however, is a graded notion. Factors affecting degree of salience are conventionality, frequency, familiarity, and prototypicality. Thus, the institutional meaning of bank would be salient, that is, foremost on our mind, if we interact with commercial banks more often than with riverbanks. It would be less salient if the reverse holds.
Conversational implicatures constructed on the fly, however, would be nonsalient, because they are not coded in the mental lexicon. The graded salience hypothesis thus predicts parallel access for similarly salient meanings (the figurative and literal meanings of conventional metaphors), and sequential processes when salience imbalance is involved (e.g., novel metaphors whose literal but not figurative meaning is salient). Prior context may affect comprehension immediately. It may be predictive and avail the compatible meaning very early on. However, it is not sensitive to linguistic information and does not interact with lexical accessing. Consequently, it is not effective in blocking salient but contextually incompatible meanings (see also Peleg et al. 2001). As a result, salient but contextually
incompatible meanings that have been involved initially would be suppressed postlexically if they interfere with comprehension. They would not, if they are conducive to the compatible interpretation (the retention hypothesis; Giora in press; Giora & Fein 1999b). According to the graded salience hypothesis, then, when the most salient meaning is compatible (for instance, the figurative meaning of conventional idioms), it would be accessed directly (as shown by e.g. Gibbs 1980; Van de Voort & Vonk 1995) and integrate with contextual information. However, when a less rather than a more salient meaning is invited by context (e.g., the figurative meaning of novel metaphors, the literal meaning of conventional idioms, or a novel interpretation of a highly conventional literal expression), contextual information would not inhibit the salient meaning. Rather, that meaning would be accessed upon encounter regardless of context, and would be suppressed and replaced by the appropriate meaning, or retained for further processing, depending on the role it plays in comprehension (Giora in press; Giora & Fein 1999b). This holds even when context is strong and highly predictive (Gerrig 1989; Gibbs 1980; Giora et al. 1998; Peleg et al. 2001; Turner & Katz 1997). The graded salience hypothesis, thus, differs from the modular view in that it is salience sensitive and does not posit automatic suppression of contextually incompatible meanings (as assumed by Gernsbacher, Keysar & Robertson in press; Grice 1975; Swinney 1979).
In sum, the direct access view pairs with the plausible assumption that authors know what's on their mind and predicts that speakers and authors would access appropriate meaning selectively. In contrast, the modular view and the graded salience hypothesis predict that lexical access in production may resist context effects.
Diverging from the modular view, the graded salience hypothesis further predicts that contextually incompatible meanings would be retained if they do not obstruct the comprehension process.
2. Lexical access in production: The case of metaphoric language
As a working hypothesis, we assume here that the processes involved in text production mirror, at least partially, those involved in text comprehension. Given that speakers have access to both their internal and overt linguistic products, functioning both as producers and comprehenders of their own text (Levelt 1989), the factors found to be crucial for text comprehension are assumed here to govern text production as well.
2.1 Predictions
On this assumption, the (received) direct access view regarding understanding figurative language would predict that generating metaphors should not involve activating their incompatible literal interpretation, provided the context is strong and supportive of a metaphoric interpretation. This claim has been weakened recently only with regard to novel and less familiar metaphors, suggesting that their interpretation may involve some recourse to underlying conceptual metaphors (Keysar, Shen, Glucksberg & Horton 2000; Shen & Balaban 1999; see note 2), and hence to the literal meaning as well. For example, comprehension of the novel metaphor The microbe of a unity government (see Appendix) may involve accessing the conceptual metaphor POLITICS IS A DISEASE. Such a view thus implies that processing novel metaphors involves activation of some aspects of their literal meaning ('disease').
According to the modular view, contextually inappropriate meanings that have been activated upon production should be suppressed after a short delay (of 300-500 msec). In processing figurative language this means that the literal meaning of nonliteral utterances which has been activated initially (at about 0-300 msec) should be disposed of after it has been utilized.
In contrast, the graded salience hypothesis predicts that the literal meaning of familiar as well as less familiar or unfamiliar metaphors would be activated upon text production, because its components are salient and would be accessed automatically. It further predicts that it would be retained, because it supports the metaphoric interpretation (cf. the retention hypothesis above; for a different view, see Gernsbacher et al. in press).
To reject the direct access view, it is sufficient to show that either (a) familiar and less familiar or unfamiliar metaphors involve their contextually inappropriate literal meaning more or less indistinguishably; or alternatively (b) that conventional metaphors involve their figurative and contextually incompatible literal meaning indistinguishably. While such findings contest the direct access view, they are also inconsistent with the modular view, which predicts immediate suppression of such incompatible meanings. They are, however, accounted for by the retention hypothesis supplementing the graded salience hypothesis. Absence of traces of the literal meaning, however, will not contest the graded salience hypothesis (since speakers do not have to use available information), though it will be more consistent with the direct access and modular views.
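The predictions of the three views can be summarized in a small lookup, given here only as an illustrative sketch. All function and field names are hypothetical and are not part of the original study. The direct access view is encoded in its weaker version (Keysar et al. 2000), on the assumption that this version allows literal involvement only for less familiar metaphors.

```python
from typing import Callable, Dict, List, NamedTuple


class Prediction(NamedTuple):
    activated: bool  # literal meaning accessed during production
    retained: bool   # literal meaning still available in the following clauses


def direct_access_weak(familiar: bool) -> Prediction:
    # Assumption: only less familiar metaphors involve their literal meaning.
    involved = not familiar
    return Prediction(activated=involved, retained=involved)


def modular(familiar: bool) -> Prediction:
    # Literal meaning is accessed, then suppressed after a short delay.
    return Prediction(activated=True, retained=False)


def graded_salience(familiar: bool) -> Prediction:
    # Salient literal meaning is accessed and, per the retention hypothesis,
    # kept because it supports the metaphoric interpretation.
    return Prediction(activated=True, retained=True)


MODELS: Dict[str, Callable[[bool], Prediction]] = {
    "direct access (weak)": direct_access_weak,
    "modular": modular,
    "graded salience": graded_salience,
}


def models_allowing_literal_echo(familiar: bool) -> List[str]:
    """Names of the views under which a literal echo after the metaphor is expected."""
    return sorted(name for name, model in MODELS.items() if model(familiar).retained)


if __name__ == "__main__":
    for familiar in (True, False):
        label = "familiar" if familiar else "less familiar"
        print(label, "->", models_allowing_literal_echo(familiar))
```

Read this way, a literal echo following a highly familiar metaphor is expected only under the graded salience hypothesis, which is why familiar metaphors are the critical case.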
Given that online measures of text production are hardly available, one way to test these predictions is to study the ecology of metaphors in naturally occurring discourses. Mention of a meaning of a metaphor in the metaphor's immediate neighborhood, i.e., in the metaphor's clause or in the next two or three clauses, may count as evidence that that meaning was active in the producer's mind and was, therefore, made manifest. Rather than discarded as incompatible, that meaning was retained for further processes. Note that the various approaches in question (with the exception of the modular view) differ only insofar as unintended (i.e., literal) meanings of familiar metaphors are concerned. Thus, if familiar and less familiar metaphors both prime and retain their unintended literal meaning indistinguishably, this would support the graded salience hypothesis, but challenge both the modular and direct access views, the former on the basis of the suppression hypothesis and the latter on its selectiveness hypothesis. Given that in a metaphor-inviting context, familiar metaphors are expected to be processed only metaphorically, findings indicating that familiar metaphors were processed literally would be problematic for the direct access view: They will attest that these meanings have been accessed regardless of context. However, if only less familiar metaphors activate and retain their unintended literal meanings, this would be consistent with the weaker version of the direct access view proposed by Keysar et al. (2000) and only partly consistent with the graded salience hypothesis and modular view. To test these hypotheses we collected naturally occurring metaphors that were either elaborated on following their mention, or were not. We then looked into (a) whether the set of metaphors elaborated on differed in terms of familiarity/novelty from the set which received no elaboration. 
We further looked into (b) whether metaphors judged as highly familiar exhibit no elaboration, as would be predicted by the various versions of the direct access view. Findings to the contrary would favor the graded salience hypothesis.
2.2 Method
Materials. Our materials were metaphors appearing in newspaper articles. We randomly collected 60 metaphors from the columns section of Ha'aretz, an Israeli daily, during the months of August and September 1997. Thirty involved some mention or echo of their (unintended) literal interpretation, i.e., a word or an expression semantically related to their literal meaning, in the same or next clause(s) (e.g., 1a, 1b), and 30 did not (e.g., 2a, 2b, see also Appendix).
(1) a. The strikes in the Education system took place when the Union was putting up a fight against the government. In this fight, threats, sanctions and even a general strike were the weapons. (Ha'aretz, 4.9.97: B1)
    b. In this situation, the Treasury looks like an island of sanity in a sea of unconstrained demands. (Ha'aretz, 12.9.97: B1)
(2) a. He lost his health, and his spirit broke. (Ha'aretz, 1.9.97: B1)
    b. Every honest and benevolent person should have given a shoulder to the minister of Treasury so that he can succeed in implementing his plan. (Ha'aretz, 4.9.97: B1)
Participants. Forty native speakers of Hebrew (30 females, 10 males) participated in the experiment. They were all undergraduates in the department of Poetics and Comparative Literature, Tel Aviv University, aged 21-40. They participated in the experiment voluntarily.
Procedure. The participants were presented with the 60 metaphors and another five, contrived, novel metaphors. They were asked to rate them on a 1-7 familiarity scale: from the least familiar (1) to the most familiar (7) metaphor.
2.3 Results
The mean familiarity rate of each metaphor (ranging from 2 to 6.95) was the basic datum for the analysis. Findings showed that, as predicted by the graded salience hypothesis, but contra the direct access approach (see (a) above), metaphors followed by a mention of their literal meaning did not differ familiarity-wise from those that were not; the difference between the means was not significant (t = 0.96, p = 0.34, two-tailed). That is, the metaphors whose literal meaning was retained, i.e., echoed and elaborated on in the immediate or next clause(s) (e.g., (1a-b) above), were not evaluated as more or less familiar than those that received no literal extension (e.g., (2a-b) above). Moreover, a check of the number of metaphors which received the highest familiarity rates (6-7) reveals that 15 of them belonged in the group of (30) metaphors which had literal extensions (e.g., (1a-b) above) and 17 belonged in the other group of (30) metaphors whose literal meaning was not elaborated on (e.g., (2a-b) above). Thus, as predicted by (b) above, and in accordance with
the graded salience hypothesis, but contra the direct access and standard pragmatic models, highly familiar metaphors retained both their compatible and contextually incompatible (literal) meaning indistinguishably, suggesting that even highly conventional metaphors involve processing their salient though contextually incompatible meaning.
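As a rough plausibility check on the reported statistic, the two-tailed p-value for t = 0.96 can be recomputed. The sketch below assumes an independent-samples t-test with 30 + 30 - 2 = 58 degrees of freedom, which the text does not state explicitly; the helper names are illustrative, and the integration uses Simpson's rule from the standard library only.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value: 2 * (0.5 - area under the density between 0 and |t|)."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 2 * (0.5 - area)


if __name__ == "__main__":
    df = 30 + 30 - 2  # assumed: two independent groups of 30 metaphors each
    p = two_tailed_p(0.96, df)
    print(f"two-tailed p for t = 0.96, df = {df}: {p:.3f}")

    # Share of highly familiar metaphors (ratings 6-7) in each group of 30.
    print(f"with literal extension:    {15 / 30:.2f}")
    print(f"without literal extension: {17 / 30:.2f}")
```

Under the stated assumption, the recomputed value falls close to the reported p = 0.34, and the two groups contain similar shares of highly familiar metaphors (15 and 17 out of 30).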
3. Discussion and conclusions
Since we did not use an on-line measure, our findings cannot directly support an autonomous view of lexical access in production. However, they are certainly consistent with it, suggesting that context does not block activation of contextually incompatible but salient (i.e., coded) meanings. Even highly familiar metaphors, whose metaphoric meaning may be processed directly, avail their salient, literal meaning upon their production alongside the metaphoric meaning, though this meaning may be incompatible with contextual information. These findings are consistent with the graded salience hypothesis, which is (only) partially congruent with a modular view of lexical access in production (see note 3). They suggest that the processes involved in text production are similar to those involved in text comprehension. Activation of salient meanings is automatic, and does not interact with the context at an early stage: Context does not pre-select only the appropriate meaning; that is, it does not inhibit activation of salient incompatible meanings (cf. section 1, and see also Honeck 1997, p. 467), though it can of course predict it. Illustrative is the following example (cited in Honeck 1997, p. 47), where the salient meaning of leaf (a part of a plant) is activated in the idiom 'turn over a new leaf' even though this is not the meaning of leaf (page) on which the idiom is based. This 'error', we propose, attests to the involvement of salient (though incompatible) meanings in text production, as opposed to less salient (though more compatible) meanings (such as the 'page' meaning of leaf):
(3) Fred is cutting his lawn earlier now. I guess he's turning over a new leaf.
Our findings corroborate findings by Nayak and Gibbs (1990) quoted in Gibbs (1994, p. 301), which show that readers are sensitive to the salient, contextually incompatible meanings of figurative language when they are asked to rate the appropriateness of an incoming text-segment (and act as text producers, to a certain extent). For example, when asked to rate the appropriateness of idioms
for a particular paragraph ending, subjects tended to select as more appropriate the idiom stemming from the source domain enhanced by the paragraph. Thus, out of (4a) or (4b) in the context of (4), they selected (4a) whose (incompatible) literal meaning 'coheres' or 'resonates' with the salient literal meanings of the chain of metaphors instantiated in (4):
(4) Mary was very tense about this evening's dinner party. The fact that Bob had not come home to help was making her fume. She was getting hotter with every passing minute. Dinner would not be ready before the guests arrived. As it got closer to five o'clock the pressure was really building up. Mary's tolerance was reaching its limits. When Bob strolled in at ten minutes to five whistling and smiling, Mary
    a. blew her top
    b. bit his head off
These findings show that text progression is not only sensitive to the figurative meanings accessed by the text's comprehender/producer (which should have resulted in no appropriateness difference between the two possible idiomatic continuations), but that it is even more sensitive to salient though incompatible meanings. Though this experiment is only partially relevant to our research, involving comprehension and text appreciation rather than text production, its findings are suggestive of the same processes alluded to here.
Our findings can also be viewed as an instantiation of a more general phenomenon of "dialogic syntax" (Du Bois 1998, 2000a,b and see also Coates 1996; Levelt 1989). Dialogic syntax occurs when a speaker constructs an utterance based on an immediately co-present utterance. Du Bois discloses the ubiquity of "dialogic syntax", showing that a vast array of linguistic elements such as syntactic, semantic, pragmatic, lexical, and even phonetic patterns in one speaker's discourse can be traced back to an immediately co-present utterance. This suggests that activation of any linguistic element makes it available for the same or next speaker to elaborate on.
Activation of that element enhances it, which, in turn, suppresses other possible alternatives. Enhancement makes it accessible and, hence, a preferable candidate to be selected and elaborated on in the next utterance (Gernsbacher 1990). Our findings show that metaphors, not least familiar metaphors, are processed (also) literally in the mind of the discourse producer, thereby allowing reoccurrence of the salient/literal meaning in the next discourse segment. Evidence of similar effects of a given utterance on adjacent ones (cf. Du Bois 1998,
2000b) suggests that salience of meanings (and constructions), which is subject to local manipulation (enhancement/suppression) in discourse through dialogic resonance, is a major factor in discourse production.
Appendix
Translated sample items, all taken from Ha'aretz, August-September 1997, B1:
I. Metaphors with an extension of their literal meaning (in the Hebrew original):
(1) Of all the viruses and microbes which call upon our country, the worst is the microbe of a unity government. This plague, which usually breaks out abruptly...
(2) This is a story that has begun but has not yet ended. One important scene ended when Yitzhak Rabin was assassinated [...]. The second scene ended when Peres was Prime Minister ...
(3) The Palestinian affair is a time bomb that has to be dismantled/(disassembled) before it explodes.
(4) Israel needs ... not only those who flirt with the capital market, but those who marry it, for better or worse, in poverty and in wealth, until "a purchase proposal" do they part.
(5) Politically the present Croatian leaders' wishing to blur the impression of the "Ostasha" rehabilitation is understandable. But they will find it difficult to erase the moral stain of their attempt to rehabilitate the murderers and their accomplices.
(6) (Addressed to Albright): If you sink(/delve) into [meaning preoccupy yourself with] Dahania, or into the port, or into the security arrangements and into the days on which a bulldozer will pass or will not pass, you will drown in a sea of details.
(7) All this happened when the civil servants were drowsy while on duty. If they were alert/(wide awake), the deterioration [...] might have been prevented.
II. Metaphors with no extension of their literal meaning:
(1) In her position as the mother of the future king, [Diana] was stuck as a bone in the throat of the British monarchy. And from this position and being so bright, she opened a window into the inhumanity of the royal family.
(2) Ninety percent of the property in Israel has now been turned into state property [...] regardless of whether the laws passed the elementary test of justice and equality.
(3) They don't talk of small fish, they talk of terrorists with blood on their hands.
(4) Soon the patients rolled up their sleeves and their lawyers entered the battlefield.
Notes
* This research was supported by a grant from The Israel Science Foundation. We have also benefited from discussions with Mira Ariel, Jack Du Bois and Ray Gibbs. Thanks are also extended to Sharon Himmelfarb and two anonymous reviewers for their comments.
1. Williams (1992) actually studies polysemous words, but most of them were metaphorically based.
2. Inconsistently with the direct access view, the consensus is that comprehension of novel metaphors differs from comprehension of familiar metaphors and should involve activating their contextually incompatible salient meaning, i.e., their literal meaning (e.g., Keysar et al. 2000, and see Gibbs 1994 for a review).
3. The modular view of lexical access in speech production distinguishes between meaning-related and form-related processes (see, e.g., Levelt 1993; Dell & O'Seaghdha 1992). However, this view is only marginally relevant to our discussion here.
References
Blasko, G. D., & Connine, C. (1993). Effects of familiarity and aptness on metaphor processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 295-308.
Coates, J. (1996). Women talk. Oxford: Blackwell.
Dell, S. G., & O'Seaghdha, P. G. (1992). Stages of lexical access in language production. Cognition, 42, 287-314.
Dews, S., & Winner, E. (1997). Attributing meaning to deliberately false utterances: The case of irony. In C. Mandell & A. McCabe (Eds.), The problem of meaning: Behavioral and cognitive perspectives (pp. 377-414). Amsterdam: Elsevier.
Dews, S., & Winner, E. (1999). Obligatory processing of the literal and nonliteral meanings of ironic utterances. Journal of Pragmatics, 31, 1579-1599.
Du Bois, W. J. (1998). Dialogic syntax. Paper presented at The cognitive theories of intertextuality conference. Tel Aviv University.
Du Bois, W. J. (2000a). Santa Barbara corpus of spoken American English. CD-ROM. Philadelphia: Linguistic Data Consortium. [www.ldc.upenn.edu/Publications/SBC/].
Du Bois, W. J. (2000b). Reusable syntax: Socially distributed cognition in dialogic interaction. Paper presented at CSDL 2000: Fifth Conference on Conceptual structure, discourse, and language. University of California, Santa Barbara.
Fodor, J. (1983). The modularity of mind. Cambridge: MIT Press.
Gernsbacher, M. A. (1990). Language comprehension as structure building. Hillsdale, N.J.: Erlbaum.
Gernsbacher, M. A., Keysar, B., & Robertson, R. W. (in press). The role of suppression and enhancement in understanding metaphors. Journal of Memory and Language.
Gerrig, R. J. (1989). The time course of sense creation. Memory & Cognition, 17, 194-207.
Gibbs, R. W. Jr. (1980). Spilling the beans on understanding and memory for idioms in conversation. Memory & Cognition, 8, 449-456.
Gibbs, R. W. Jr. (1986). On the psycholinguistics of sarcasm. Journal of Experimental Psychology: General, 115, 3-15.
Gibbs, R. W. Jr. (1990). Comprehending figurative referential descriptions. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 56-66.
Gibbs, R. W. Jr. (1994). The poetics of mind. Cambridge: Cambridge University Press.
Gildea, P., & Glucksberg, S. (1983). On understanding metaphor: The role of context. Journal of Verbal Learning and Verbal Behavior, 22, 577-590.
Giora, R. (1995). On irony and negation. Discourse Processes, 19, 239-264.
Giora, R. (1997). Understanding figurative and literal language: The graded salience hypothesis. Cognitive Linguistics, 7, 183-206.
Giora, R. (1999). On the priority of salient meanings: Studies of literal and private figurative language. Journal of Pragmatics, 31, 919-929.
Giora, R. (in press). On our mind: Salience, context and figurative language. New York: Oxford University Press.
Giora, R., & Fein, O. (1999a). Irony: Context and salience. Metaphor and Symbol, 14, 241-257.
Giora, R., & Fein, O. (1999b). On understanding familiar and less-familiar figurative language. Journal of Pragmatics, 31, 1601-1618.
Giora, R., & Fein, O. (1999c). Irony comprehension: The graded salience hypothesis. Humor, 12, 425-436.
Giora, R., Fein, O., & Schwartz, T. (1998). Irony: Graded salience and indirect negation. Metaphor and Symbol, 13, 83-101.
Glucksberg, S., Gildea, P., & Bookin, H. G. (1982). On understanding nonliteral speech: Can people ignore metaphors? Journal of Verbal Learning and Verbal Behavior, 21, 85-98.
Glucksberg, S., & Keysar, B. (1990). Understanding metaphorical comparisons: Beyond similarity. Psychological Review, 97, 3-18.
Glucksberg, S., Kreuz, R., & Rho, S. H. (1986). Context can constrain lexical access: Implications for models of language comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 323-335.
Gorfein, S. D. (Ed.). (1989). Resolving semantic ambiguity. New York: Springer Verlag.
Grice, P. H. (1975). Logic and conversation. In P. Cole & J. Morgan (Eds.), Speech acts. Syntax and semantics, 3 (pp. 41-58). New York: Academic Press.
Hogaboam, T. W., & Perfetti, C. A. (1975). Lexical ambiguity and sentence comprehension. Journal of Verbal Learning and Verbal Behavior, 14, 265-274.
Honeck, R. P. (1997). A proverb in mind: The cognitive science of proverbial wit and wisdom. Mahwah, NJ: Erlbaum.
Inhoff, A. W., Lima, S. D., & Carroll, P. J. (1984). Contextual effects on metaphor comprehension in reading. Memory & Cognition, 12, 558-567.
Janus, R. A., & Bever, T. G. (1985). Processing of metaphoric language: An investigation of the three stage model of metaphor comprehension. Journal of Psycholinguistic Research, 14, 473-487.
Jones, J. L. (1991). Early integration of context during lexical access of homonym meanings. Current Psychology: Research & Reviews, 10, 163-181.
Kemper, S. (1981). Comprehension and interpretation of proverbs. Journal of Psycholinguistic Research, 10, 179-183.
Keysar, B. (1989). On the functional equivalence of literal and metaphorical interpretations in discourse. Journal of Memory and Language, 28, 375-385.
Keysar, B., Shen, Y., Glucksberg, S., & Horton, S. W. (2000). Conventional language: How metaphorical is it? Journal of Memory and Language, 43, 576-593.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, Mass.: MIT Press.
Levelt, W. J. M. (Ed.). (1993). Lexical access in speech production. Cambridge and Oxford: Blackwell.
Martin, C., Vu, H., Kellas, G., & Metcalf, K. (1999). Strength of discourse context as a determinant of the subordinate bias effect. The Quarterly Journal of Experimental Psychology, 52, (2).
McGlone, M. S., Glucksberg, S., & Cacciari, C. (1994). Semantic productivity and idiom comprehension. Discourse Processes, 17, 167-190.
Nayak, N., & Gibbs, R. W. Jr. (1990). Conceptual knowledge in idiom interpretation. Journal of Experimental Psychology: General, 116, 315-330.
Onifer, W., & Swinney, D. A. (1981). Accessing lexical ambiguities during sentence comprehension: Effects of frequency of meaning and contextual bias. Memory & Cognition, 9, 225-236.
Ortony, A., Schallert, D. L., Reynolds, R. E., & Antos, S. J. (1978). Interpreting metaphors and idioms: Some effects of context on comprehension. Journal of Verbal Learning and Verbal Behavior, 17, 465-477.
Peleg, O., Giora, R., & Fein, O. (2001). Can context selectively activate the contextually appropriate meaning of an ambiguous word? Metaphor and Symbol.
Pexman, P., Ferretti, T., & Katz, A. (2000). Discourse factors that influence irony detection during on-line reading. Discourse Processes, 29, 201-222.
Rayner, K., Pacht, J. M., & Duffy, S. A. (1994). Effects of prior encounter and global discourse bias on the processing of lexically ambiguous words: Evidence from eye fixations. Journal of Memory and Language, 33, 527-544.
Recanati, F. (1995). The alleged priority of literal meaning. Cognitive Science, 19, 207-232.
Schwoebel, J., Dews, S., Winner, E., & Srinivas, K. (2000). Obligatory processing of the literal meaning of ironic utterances: Further evidence. Metaphor and Symbol, 15, 47-61.
Shen, Y. (1997). Metaphors and global conceptual structures. Poetics, 25, 1-17.
Shen, Y., & Balaban, N. (1999). Metaphorical (in)coherence in discourse. Discourse Processes, 28, 139-154.
Simpson, G. B. (1981). Meaning dominance and semantic context in the processing of lexical ambiguity. Journal of Verbal Learning and Verbal Behavior, 20, 120-136.
Simpson, G. B. (1994). Context and the processing of ambiguous words. In M. A. Gernsbacher (Ed.), Handbook of Psycholinguistics (pp. 359-374). San Diego: Academic Press.
Simpson, G. B., & Burgess, C. (1985). Activation and selection processes in the recognition of ambiguous words. Journal of Experimental Psychology: Human Perception and Performance, 11, 28-39.
Swinney, D. A. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, 18, 645-659.
Tabossi, P. (1988). Accessing lexical ambiguity in different types of sentential contexts. Journal of Memory and Language, 27, 324-340.
Turner, N. E., & Katz, A. (1997). Evidence for the availability of conventional and of literal meaning during the comprehension of proverbs. Pragmatics and Cognition, 5, 203-237.
Van de Voort, M. E. C., & Vonk, W. (1995). You don't die immediately when you kick an empty bucket: A processing view on semantic and syntactic characteristics of idioms. In M. Everaert, E-J van der Linden, A. Schenk & R. Schreuder (Eds.), Idioms: Structural and psychological perspectives (pp. 283-299). Hillsdale NJ: LEA.
Vu, H., Kellas, G., & Paul, S. T. (1998). Sources of sentence constraint in lexical ambiguity resolution. Memory & Cognition, 26, 979-1001.
Williams, J. N. (1992). Processing polysemous words in context. Evidence from interrelated meanings. Journal of Psycholinguistic Research, 21, 193-218.
SECTION 2
Relational coherence in text and text processing
In this second section we switch from referential coherence and the activation of concepts to relational coherence. Over the last decade, coherence relations, or rhetorical relations, like CAUSE-CONSEQUENCE, CONTRAST and PROBLEM-SOLUTION have been studied intensively. One key issue has turned out to be the categorization of coherence relations and of their prototypical expressions, connectives: CAUSE-CONSEQUENCE relations are more similar to CLAIM-ARGUMENT relations than to LIST relations, just like the causal connective because has more in common with since than with and. Chapters 5-9 can be read as contributions to a discussion about these categorization issues. What all these text-linguistic chapters have in common is that none of them deals with connectives and markers as linguistic items per se, but that the analyses are taken to be relevant to mental representations of these linguistic elements.
In Chapter 5, Knott argues that the distinction often made between so-called pragmatic and semantic or content, epistemic and speech act coherence relations is not always clear-cut. As an alternative, he proposes an intention-based definition of pragmatic relations.
In Chapter 6, Noordman deals with the same distinction, now with regard to causal (because) and concessive (although) relations. He uses corpus research to analyze, first, how semantic and epistemic concession relations are realized linguistically, and, second, how such realizations affect the surrounding discourse. Noordman provides evidence from reading time studies that suggests that semantic concessions are easier to process than epistemic relations. Moreover, he shows how various ordering variants (e.g. although p, q versus q, although p) can be accounted for by the thematic development within the discourse surrounding these variants, thus establishing a clear link between the mental representation of discourse and its surface characteristics.
Research on relational coherence is strongly influenced by Mann and Thompson's Rhetorical Structure Theory (RST; Mann & Thompson 1988). One of the relations in RST, the Elaboration relation, is debated in Chapter 7 by Knott, Oberlander, O'Donnell and Mellish, because it does not fit in with other discourse relations and is often used as a waste-paper basket, to be used in text analysis when no other relation fits. As an alternative, the authors propose to account for coherence either by coherence relations (for local dependencies) or by entity-based moves (for non-local dependencies).
The two remaining chapters in this Section introduce other frameworks to extend the theoretical basis of the analysis of relational coherence. In Chapter 8, Pander Maat uses Sperber & Wilson's Relevance Theory (Sperber & Wilson 1986) to analyze the meaning of the conjunction and. He makes a number of controversial claims on the proper analysis of and, suggesting, among other things, that this connective does not link two segments directly, but makes them jointly relevant to the surrounding context. To support his analysis Pander Maat makes ample use of corpus analysis.
In Chapter 9, Snoeck Henkemans continues the discussion about different categories of coherence relations and connectives from the viewpoint of Argumentation Theory. She explores the relationship between text-linguistic accounts of connectives, like those presented in Chapters 5 through 8, on the one hand, and the tradition of Argumentation Theory on the other. More specifically, she discusses the starting-points of the pragma-dialectical account of argumentation, and the way in which the speech acts of argumentation and explanation are characterized. Then, she compares this account with the text-linguistically inspired approach. One of the conclusions is that it is sometimes unclear how propositional and illocutionary levels can be separated as different hierarchical levels in the analysis of explanation and argumentation.
Snoeck Henkemans argues that the pragma-dialectical approach creates a good basis for interpreting linguistic cues like connectives in a well-founded and systematic way.
References
Sperber, D. & Wilson, D. (1992). Relevance: Communication and cognition (2nd ed.). Oxford: Blackwell.
Mann, W. C. & Thompson, S. A. (1988). Rhetorical Structure Theory: A functional theory of text organization. Text, 8, 243-281.
CHAPTER
5
Semantic and pragmatic relations and their intended effects
Alistair Knott
University of Otago
1.
Introduction
This paper takes as its starting point an assumption that has received much attention in both computational and psycholinguistic treatments of discourse: that the coherence of an extended text can be explained in terms of an account of the relations which hold between its component spans. Naturally, when it is stated as baldly as this, the assumption is almost vacuously true. However, it is useful in providing the beginnings of a vocabulary for discussing the issue of text coherence. It means that the empirical question of what makes a text coherent can be re-expressed as two more specific questions. Firstly, how can we define the set of relations which are permitted within coherent text? And secondly, what are the structural constraints on the configuration of these relations within coherent text? In the present paper, I will concentrate on the first of these questions.
Accounts of the semantics of coherence relations have often suggested that they should be thought of as composite entities, defined in terms of a number of different dimensions. This assumption is implicit in systemic approaches to conjunctive relations (Halliday & Hasan 1976; Martin 1992), in psycholinguistic/text-linguistic approaches (Van Dijk 1979; Redeker 1990; Sanders, Spooren & Noordman 1992), and in computational/semantic approaches (Hobbs 1985; Elhadad & McKeown 1990; Knott & Mellish 1996; Oversteegen 1997). In all of these approaches, specifying the semantics of a coherence relation involves specifying a number of different values, which position it within a multidimensional space. Furthermore, if we assume that intersentential/interclausal conjunctions can be used to signal coherence relations in surface text, then the semantics of these conjunctions can be specified in a similar fashion. There is a large measure of variation in the above studies as to the closeness of the mapping between the set of coherence relations and the set of connectives. Likewise, the studies differ as regards the number of dimensions they propose and the degree of interdependence they envisage between relations. However, the principle is the same in each case.
One of the fundamental dimensions to have been proposed distinguishes between relations that hold between the content of the text spans they link and those that hold between the utterances of the text spans themselves, or the beliefs which underlie them. This distinction is most clearly illustrated with reference to two kinds of 'causal' relations: those that simply describe a cause and effect occurring in the world (such as example (1) below), and those that have an argumentative or rhetorical force (such as example (2)).
(1) Bill was starving, so he had a sandwich.
(2) Bill had five sandwiches, so he was/must have been starving.
In each case, so is taken to link a cause and effect. But while in example (1) the cause and effect are taken to be the eventualities described by the respective clauses, in (2) the speaker's belief in the eventuality described in the first clause is taken to cause her conclusion about the eventuality described in the second. Nearly all the theories that advocate a decomposition of relations include a dimension which reflects this difference. The main reason for this is the apparent productivity of the distinction, in particular in capturing alternative senses of sentence and clause connectives. There are many connectives, of several different types, which can be analyzed as ambiguous with respect to this dimension. As well as causal/inferential connectives like so and because, there are conditional connectives such as if, contrastive connectives such as but, disjunctive connectives such as or and otherwise, and temporal connectives such as then. In each case, a similar ambiguity seems to be discernible; the challenge is to find a formulation of the distinction which covers all the cases.
Several different terms have been used for the two types of relation, and slightly different definitions have been given. Halliday and Hasan (1976) and Martin (1992) refer to EXTERNAL and INTERNAL relations; Van Dijk (1979) refers to SEMANTIC and PRAGMATIC relations; and Redeker (1990) refers to IDEATIONAL and PRAGMATIC relations. I will adopt Van Dijk's formulation in the remainder of this paper.
In this paper, I begin in Section 2 by outlining the two types of relation in more detail. In Section 3, I then consider some problems with this account noted by Sweetser (1990),¹ which led to her proposing a new tripartite distinction in place of the bipartite one. In the new classification, the class of PRAGMATIC relations is divided in two: (1) the class of EPISTEMIC relations, which hold at the level of premises and conclusions about what is the case in the world, and (2) the class of SPEECH-ACT relations, which hold between the utterances themselves. I also present Sweetser's extension of the set of relations from causal to conditional, disjunctive and sequential relations. In Section 4, I consider the advantages and disadvantages of the proposals previously outlined, and also note some problems which apply equally to each proposal. In Section 5, I then propose an alternative, intention-based definition of PRAGMATIC relations which overcomes a number of these problems. An assessment of the new definition, noting advantages relating to the scope of its applicability and to its explanatory force, as well as cases which it cannot handle, is given in Section 6. I conclude in Section 7 by considering some possible solutions to these problems, introducing a new bipartite distinction, orthogonal to the SEMANTIC/PRAGMATIC distinction, between CAUSE-DRIVEN and RESULT-DRIVEN relations.
2.
The bipartite semantic/pragmatic distinction
Consider again the following sentences:
(1) Bill was starving, so he had a sandwich.
(2) Bill had five sandwiches, so he was/must have been starving.
The distinction between SEMANTIC relations such as example (1) and PRAGMATIC relations such as example (2) has been captured by many researchers by interpreting the latter type of relation as containing an implicit performative. In the example above, the performative could be made explicit as follows:
(3) Bill had five sandwiches, so I conclude he was starving.
Generalizing from this example, we can state that SEMANTIC relations hold directly between the propositional content of the two related utterances, while PRAGMATIC relations hold between the utterances themselves, interpreted as speech acts. The clearest expression of this idea comes from Van Dijk (1979, p. 449): "Pragmatic connectives express relations between speech acts, whereas semantic connectives express relations between denoted facts".
Many other theorists have advocated a similar distinction, although typically qualified in some way, including Halliday and Hasan (1976), Martin (1992), Redeker (1990), and Sanders et al. (1992). (Different qualifications will be mentioned in Sections 3 and 4.2.)
Interpreting PRAGMATIC relations as holding between speech acts allows us to give a connective like so a single denotation, which applies equally to PRAGMATIC contexts and to SEMANTIC ones. Rather than assuming a lexical ambiguity in so (as signaling either a cause in the world, or an argumentative relation), we can assume that the relation signaled is simple causality in both cases, with the difference between the cases stemming from a systematic variation in which propositions are being linked by this relation. The effect of this analysis is essentially to reduce relations at the PRAGMATIC level to relations at the SEMANTIC level. SEMANTIC relations are ones in which information about a relation between two propositions in the world is conveyed from the speaker to the hearer. PRAGMATIC relations, according to the definition just proposed, are exactly the same; it is just that the information which the speaker is communicating to the hearer is (partly) about her own speech acts.
A point to note about the definition of PRAGMATIC relations is that it extends in a useful way to relations involving sentences which are not in the indicative mood. Consider the following two sentences, for instance:
(4) Bill's starving. So why isn't he eating?
(5) Bill's starving, so give him something to eat.
The point is that some transformation on the latter sentence in each example is necessary to render it directly comparable to sentences in the indicative mood. Moreover, we want the transformation to reflect the real differences between the different kinds of sentences. Moving to the level of the speech act which underlies the sentence is a way of doing this.
The speech act behind a sentence can always be expressed as an indicative sentence, which gives us the possibility of making direct comparisons. Thus we can gloss (4) and (5) respectively as follows:
(6) Bill's starving. So I ask 'why isn't he eating'?
(7) Bill's starving, so I instruct you to give him something to eat.
In summary, the distinction between SEMANTIC and PRAGMATIC relations can be seen as bearing on two separate problems: the problem of identifying the ambiguity of certain connectives, and the problem of finding a uniform representation for sentences in different moods. It is fair to say that it is the former problem which originally motivated the distinction, and that it is an advantage of the distinction that it goes some way towards addressing the latter problem. However, as we will see, the distinction does not enable a complete solution to the latter problem. In fact, this should already be obvious from the fact that PRAGMATIC relations are to be analyzed simply as providing information about the world; a complete treatment of imperatives and interrogatives is bound to require something more than this.
3.
The tripartite CONTENT/EPISTEMIC/SPEECH-ACT distinction
Sweetser (1990) notes an important underspecification in a gloss such as that given in example (3). Is the implicit act of 'concluding' to be thought of as a speech act? There are good reasons for maintaining that this is inappropriate. For Sweetser, "(t)here is a class of causal-conjunction uses in which the causality is that between premise and conclusion in the speaker's mind (...) and there is another class of uses in which the causality actually involves the speech act itself."
Sweetser draws a sharp distinction between the act of drawing a conclusion and the act of stating it. A speaker is only obliged to state a conclusion she has reached "inasmuch as the rules of conversation make it incumbent on us to say things we believe to be true"; it is therefore inappropriate to gloss an argumentative relation such as that in example (2) in terms of speech acts. Essentially, what the speaker decides to say and what the speaker decides to believe are two quite different things, and it is not possible to accurately express one in terms of the other. Note that we can still analyze (2) as containing an implicit performative, but the performative verb should be clearly understood as describing a theorem-proving act on the part of the speaker rather than a linguistic one, thereby nailing an important ambiguity in a word like conclude, which can be understood in both senses.²
Sweetser proposes a division of the class of PRAGMATIC relations into two categories: SPEECH-ACT relations, which hold between the utterances themselves, and EPISTEMIC relations, which hold at the level of premises and conclusions about what is the case in the world. Her overall classification is tripartite; there is a third category of relations, termed CONTENT relations, which effectively cover the same ground as those termed SEMANTIC in the bipartite classification.
Note that the distinction between EPISTEMIC and SPEECH-ACT relations is to some extent independent of the question of whether a relation contains any non-indicative sentences. There can be SPEECH-ACT relations between two indicative sentences as well as EPISTEMIC ones. Example (8) is a case in point:
(8) The answer's on page 200, since you'll never find it for yourself.
Here we would certainly not want to suggest that the speaker is reaching a conclusion about where the answer is, on the basis of knowing that the hearer will not find it for himself. Glossing with an implicit speech-act performative ('I tell you that the answer's on page 200') is much more appropriate. However, there is still some degree of dependence on mood in Sweetser's definitions. As she says (1990, p. 78): "If an utterance is imperative or interrogative in form, then it cannot reasonably be causally conjoined to another utterance except at the speech-act level".
In practice, it seems likely that many advocates of the bipartite position are aware of the variations which Sweetser points out within the class of PRAGMATIC relations. Definitions of PRAGMATIC relations tend in fact to hedge the issue to some extent, frequently containing disjunctive elements or appealing to somewhat ill-defined higher-level constructs. For instance, Sanders (1997, p. 126) defines PRAGMATIC relations as "applying between the content of one span and the speaker's 'claim/advice/conclusion' about the content of the other", while for Redeker (1990, p. 369) "they hold between the 'beliefs and intentions' which underlie the two spans". Others appeal to rather nebulous higher-level constructs; for instance, for Halliday and Hasan (1976, p. 240) "the relationship in question is not so much a relationship between speech acts (though it may take this form) [...] as a relationship between different stages in the unfolding of a speaker's COMMUNICATION ROLE [...] his choice of speech role and rhetorical channel, his attitudes, his judgments and the like."
Clearly, theorists are looking for a way of generalizing across the various different types of PRAGMATIC relations. Sweetser, by dividing the class in two, has to some extent abandoned this goal, but by the same token her definitions are much more intelligible.
3.1
SPEECH-ACT and EPISTEMIC disjunctions
As mentioned in Section 1, one of the attractions of the SEMANTIC/PRAGMATIC distinction is its productivity: it seems to find application not only for causal relations, but also for other types of relations. A strong recommendation for Sweetser's finer-grained distinction is that it too seems productive across this range of relations. Some of the interesting distinctions between EPISTEMIC and SPEECH-ACT relations are outlined in the following three sections; I will omit discussion of the corresponding CONTENT relations, which should in each case be quite clear. Examples of EPISTEMIC and SPEECH-ACT disjunctive relations are given in (9) and (10) respectively.
(9) John is home, or somebody is picking up his newspapers.
(10) Would you like to come round tonight? Or is your car still in the shop?
Sweetser analyses example (9) as conveying that the alternative propositions presented are the only two possible conclusions that one can reach. On the other hand, she analyses (10) as containing a pair of alternative speech acts: the speaker is asking to be understood by the hearer as performing either one or the other. The distinction between the two analyses seems useful here; it is inappropriate to gloss example (9) in terms of speech acts, for the same reason as it is inappropriate to gloss example (2) in such terms, as discussed at the beginning of Section 3.
3.2
SPEECH-ACT and EPISTEMIC conditionals
Examples of EPISTEMIC and SPEECH-ACT conditional relations are given in examples (11) and (12) respectively.
(11) If John went to that party, he was trying to infuriate Miriam.
(12) How old are you, if it is not a cheeky question?
Example (11) is to be analyzed as expressing an implication relation between the speaker's beliefs: the speaker is really informing the hearer that 'if I [the speaker] believe that John went to the party, I believe that he was trying to infuriate Miriam'. On the other hand, example (12) is to be understood as the conditional performance of a speech act: the speaker only wants to ask about the hearer's age if it is not a cheeky question. Again, for the same reasons, it is not appropriate to gloss (11) in terms of speech acts, and thus there seems to be good reason for giving different analyses for the two cases.
3.3
SPEECH-ACT and EPISTEMIC temporal sequences
Finally, examples of EPISTEMIC and SPEECH-ACT temporal relations are given in examples (13) and (14) respectively.
(13) A: Why don't you want me to take basket weaving this summer?
     B: Well, Mary took basket weaving, and she joined a religious cult.
(14) Go to bed now! And no more backtalk!
In each of these examples, according to Sweetser, and is to be interpreted sequentially. However, while the temporal sequence in (14) relates to the order in which the speech acts are performed, the sequence in example (13) relates to the order of events in the epistemic world. The idea in the latter case is that the two propositions are both to be interpreted as premises in an argument that A should not take basket weaving, but that their ordering is significant. While this latter analysis is not completely clear to me, the fact that explicitly sequential conjunctions like to begin with and next can be used to link multiple premises in an argument has often been noted, and provides an equally good rationale for temporal EPISTEMIC relations. For instance, to echo examples given by Halliday and Hasan (1976):
(15) John's unsuitable for the job. To begin with, he is too young. Next, he is too hotheaded. Finally...
Sweetser's objections against speech-act glosses of argumentative relations are as telling in this case as in the others; an epistemic interpretation of the sequential relation is again preferable.
4.
Some problems with the existing distinctions
While the bipartite and tripartite classifications are both extremely useful in distinguishing between different types of coherence relation and uses of connective phrases, there remain some outstanding problems in each case. In this section, I will review some of these.
4.1
Generalizations across SPEECH-ACT and EPISTEMIC relations
One problem with the tripartite account is that EPISTEMIC and SPEECH-ACT
relations do seem to have a lot in common. For instance, as Sweetser points out, many connectives (such as French puisque) are appropriate for both EPISTEMIC and SPEECH-ACT relations, but not for CONTENT relations. It is useful to be able to capture this by positing a single category of relations into which they both fall. Indeed, this is an important reason for thinking of PRAGMATIC relations as a single class. On the other hand, as we have seen, the bipartite distinction hardly makes matters clearer. To my mind, those definitions of PRAGMATIC relations which are truly general (in the sense that they propose a single general characterization of all relations in this class) are open to the objections which Sweetser has leveled at them, while those definitions that allow for the distinction which Sweetser notes are essentially disjunctions, and hence tripartite in every respect except perhaps a terminological one. And so the problem of defining the commonalities between SPEECH-ACT and EPISTEMIC relations is still outstanding for both types of account.
4.2
Problems in defining the level at which relations hold
A related problem with both proposed classifications concerns another type of disjunction which is present, more or less implicitly, in the definition of PRAGMATIC/SEMANTIC/SPEECH-ACT relations. In each case, the general idea is to suggest that a basic relation, such as cause, disjunction or temporal sequence, applies between the two related spans, and that the different classes of relation are the result of this basic relation applying at different levels: between the propositions which are expressed by the spans, or between the speaker's beliefs in these propositions, or between the speaker's utterance of the propositions. To make this more concrete, we can imagine three functions that operate on an utterance U: one function c(U) which returns its propositional content, one function b(U) which returns the speaker's belief in its content (or some event relating to the genesis of this belief), and one function u(U) which returns the utterance itself (i.e. the identity function). We could then state, generally, that the coherence relation between two utterances U1 and U2 is to be analyzed as
R(f(U1), f(U2))³
where R denotes a 'basic' relation (cause, sequence or whatever), and f denotes one of the above functions. The definitions of the three categories of relation would then be straightforward:
for SEMANTIC (CONTENT) relations, f = c;
for EPISTEMIC relations, f = b;
for SPEECH-ACT relations, f = u.
However, while these definitions would be nice, they do not always result in the analyses that have been suggested as suitable for the examples we have seen thus far. In fact, getting the analysis right frequently depends on us applying different functions to the two utterances in the relation. For instance, consider the following SEMANTIC, EPISTEMIC, and SPEECH-ACT relations:
(16) Bill was starving, so he ate loads of food.
(17) Bill is starving, so he'll want loads of food.
(18) Bill is starving. So why isn't he eating?
In each case, the basic relation can be taken to be cause. In (16) and (17), the functions seem to work out: the relation applies between the propositional contents of the two spans in the former case, and between the beliefs in these propositions in the latter case. But in (18) we would not want to say that the relation applies between the two utterances. The cause of my asking the question 'why isn't Bill eating?' is simply the fact that Bill is starving, or possibly my belief in this fact; it is certainly nothing to do with my utterance of it. In fact, the case of SPEECH-ACT relations is even more problematic. While some SPEECH-ACT relations only involve one of the utterances, others involve both. The following relation is a case in point:
(19) Would you like to come round tonight? Or are you busy?
In this case, as we have seen, we must hold that the basic relation (here, disjunction) applies between the two interrogative speech acts to get the desired analysis. To sum up: in the accounts we have seen so far, it does not seem possible to state in a general way 'which variable changes' when we move from one level to another. Again, we have to rely on disjunction to get things right. For instance, Sanders et al. (1992) define the class of PRAGMATIC relations as those where the segments are related "because of the illocutionary meaning of one or both of the segments". What we really want is a formula with a free variable in it, and a different value for this variable for CONTENT, EPISTEMIC and SPEECH-ACT relations; but it is far from clear how this can be achieved.
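The functional notation of this section, and the difficulty it runs into with example (18), can be illustrated with a short Python sketch. This is only an illustration under stated assumptions and not part of the analysis itself: the Utterance class, the helper names uniform and mixed, and the predicate-style glosses such as starving(bill) are all hypothetical, and the functions c, b and u merely build strings that stand in for propositions, beliefs and utterances.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Utterance:
    """Toy utterance: surface text plus a hand-written gloss of its content."""
    text: str
    content: str


def c(utt: Utterance) -> str:
    """c(U): the propositional content of U."""
    return utt.content


def b(utt: Utterance) -> str:
    """b(U): the speaker's belief in the content of U."""
    return f"BELIEVE(speaker, {utt.content})"


def u(utt: Utterance) -> str:
    """u(U): the utterance itself (identity in the text; a string stand-in here)."""
    return f"UTTER(speaker, [{utt.text}])"


Level = Callable[[Utterance], str]


def uniform(R: str, f: Level, u1: Utterance, u2: Utterance) -> str:
    """The uniform schema R(f(U1), f(U2)): the same function on both spans."""
    return f"{R}({f(u1)}, {f(u2)})"


def mixed(R: str, f1: Level, f2: Level, u1: Utterance, u2: Utterance) -> str:
    """A non-uniform variant R(f1(U1), f2(U2)), outside the uniform schema."""
    return f"{R}({f1(u1)}, {f2(u2)})"


# Example (16), a SEMANTIC (CONTENT) relation: f = c works for both spans.
ex16 = (Utterance("Bill was starving", "starving(bill)"),
        Utterance("he ate loads of food", "eats_a_lot(bill)"))

# Example (18), a SPEECH-ACT relation (glosses are assumptions).
ex18 = (Utterance("Bill is starving.", "starving(bill)"),
        Utterance("So why isn't he eating?", "why_not_eating(bill)"))

print(uniform("CAUSE", c, *ex16))   # content level on both sides
print(uniform("CAUSE", u, *ex18))   # what f = u would predict for (18)
print(mixed("CAUSE", c, u, *ex18))  # the reading argued for in the text
```

On this sketch, the reading of (18) defended above needs c (or b) for the first span and u for the second, which is exactly what the single variable f in the uniform schema cannot express.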
4.3
Problems with imperative sequences
A general problem we have already mentioned is that the classifications of relations considered so far have a certain amount to say about relations involving non-indicative sentences, but seem unlikely as the basis of a complete account, either of these sentences or of the relations between them. To take a specific example of this problem, consider the case of a temporal sequence of imperatives. For instance:
(20) Peel the onions. Then chop them.
Interestingly, there are no examples in Sweetser of and or then being used to relate two imperatives in a temporal sequence. This is because the account of speech-act relations is clearly not suitable in such cases. We certainly do not want to interpret the temporal sequence here as holding between the speaker's utterances: the important thing, of course, is that the actions themselves need to be performed in the right sequence. The problem, then, is to come up with an account which does justice to cases such as (20) while still giving the kind of analysis of sequences of premises or speech acts that we have already seen in Section 3.3.
It should be quite clear that any complete account of non-indicative sentences would involve a departure from the current definitions of PRAGMATIC/SPEECH-ACT relations, in the sense that these definitions are all expressed in terms of the information which relations convey to a hearer, over and above any information which is contained in the individual utterances. It would obviously be wrong to analyze a simple imperative proposition in this way. There is clearly more to the imperative do X than the information 'I [the speaker] tell you to do X'; we have to make reference in the definition to the conditions under which the imperative has the desired effect on the hearer. The same seems uncontroversially true for the sequence of imperatives given above: the relation of sequence has to be seen as part of the desired effect of the utterance.
What is needed is a way of unifying this analysis with the analyses we have seen so far of imperatives, as they feature in other types of relation.
4.4
Problems of explanatory adequacy
The problem just mentioned for SPEECH-ACT relations has a parallel in the definitions proposed so far for EPISTEMIC or argumentative relations. Again,
this stems from the fact that the content of these relations is essentially taken to be informative. Consider the following example:
(21) Bill is starving, so he'll want loads of food.
According to the analyses seen so far, the relation in this text basically provides the hearer with information about the causes of one of the speaker's beliefs. This seems plausible enough in some contexts. However, a common, even prototypical, function of argumentative relations is to achieve a rhetorical effect on their hearers, causing them to believe a proposition in circumstances where simply stating it would not. (Indeed, the category of PRAGMATIC relations has long been associated with the effect of increasing some desire or positive regard in the hearer; see for instance Mann and Thompson (1988), Bateman and Rondhuis (1997).) Analyzing an argumentative relation as providing information about why the speaker believes something does not by itself explain how it can have this effect, at least not directly. It requires in addition the stipulation that when the hearer learns about the causal processes amongst the speaker's own beliefs, this will prompt similar processes amongst his own beliefs. This stipulation is not part of the relation definitions we have seen so far. Moreover, the idea that this kind of analogy is responsible for the effect of an argumentative relation is not particularly plausible: an argument that is compelling for one person is not always compelling for someone else.
4.5
Problems with 'conditional speech acts'
A final problem for the definitions we have seen so far relates to their application in conditional or disjunctive contexts. The problem is specific to the class of SPEECH-ACT relations. Consider, for instance, Sweetser's analysis of the following texts:
(22) Would you like to come round tonight? Or is your car still in the shop?
(23) How old are you, if it's not a cheeky question?
As we have seen, (22) is analyzed as a case of SPEECH-ACT disjunction: the disjunction is between whether the speaker is taken to perform the interrogative speech act associated with the first utterance, or that associated with the second. Example (23) is a SPEECH-ACT conditional. That is to say, the question 'how old are you?' is only to be understood as being asked if it is not a cheeky one. Both of these analyses rely on the notion of conditional speech acts. But
this notion is not unproblematic, as Sweetser recognizes. Is it really up to the hearer to decide whether the speaker has performed a speech act? Clearly, conversation involves a great deal of cooperation between speakers and the assumption of certain shared goals. But is the issue of whether a speaker has performed a speech act really a matter for negotiation between the two participants? It seems that to claim this is at least to diverge from what is normally meant by a speech act or utterance. Moreover, there are some cases where a speech-act analysis would require a fairly radical divergence from what is normally meant. Consider the following example:
(24) If you run out of money, come to me for a loan.
I should point out that (24) is not the kind of text that Sweetser uses to exemplify SPEECH-ACT conjunctions. However, as already stressed, it would be useful to have an account which was general enough to cover this case as well. The alternative of having distinct treatments for the two kinds of conditional imperative seems to involve a measure of redundancy. In any case, if we were to attempt a SPEECH-ACT reading of this example ('If you ever run out of money, I tell you to come to me for a loan'), we would certainly encounter problems: we would have to envisage the speech act as occurring some way into the future, at some as-yet unknown moment. Moreover, if we interpret the text as expressing a habitual conditional (where we could substitute if with whenever), the speech act must also be allowed to occur arbitrarily often. Clearly, in either case we are stretching the notion of 'speech act' beyond its original meaning.
5. A new bipartite definition
5.1 Intention-based definitions of coherence relations
The problems mentioned in Sections 4.4 and 4.3 both point towards the need to include reference to the intentions which underlie all utterances in the definitions of SPEECH-ACT and EPISTEMIC relations. In fact, reference to intentions in relations has long been advocated by those interested in discourse relations from the point of view of computational linguistics, in particular from text generation. For Mann and Thompson (1988), a specification of the effect achieved on a reader by juxtaposing the two spans in a relation is a central component of its definition. This idea in turn gave rise to a procedural conception of relations, as planning operators in discourse structuring systems; for a review, see Hovy (1993). More recently, Moore and Paris (1993) and Moore and Pollack (1992) have also emphasized the need for a representation of intentions in discourse, though in their model it is proposed that intentional relations should be represented alongside informational relations, rather than as an alternative to them. While this work provides us with many of the concepts needed to think of relations in terms of intentions, none of it directly addresses the issue we have been concerned with, namely the problems with and tensions between the definitions of PRAGMATIC, EPISTEMIC and SPEECH-ACT relations. An intention-based approach to the problems is proposed in the next section.
5.2 Intention-based definitions for SEMANTIC and PRAGMATIC relations
The idea I am proposing is quite simple. Let us assume we are characterizing the relation between two utterances U1 and U2. Again, we assume that we can identify a basic relation R (cause, disjunction, or whatever) that is to be used in this characterization, and that the task is to identify what this relation applies between. I am proposing that two classes of relation are sufficient to capture much of the data so far considered; retaining the bipartite terminology, these will be termed SEMANTIC and PRAGMATIC. In each case, a relation between U1 and U2 is expressed in terms of the intended effect of the complex utterance U1 + U2. We also make use of the functional notation introduced in Section 4.2. Two functions are needed: one function c(U) returns the propositional content of an utterance U as before; the other function ie(U) returns the intended effect of U. We can then define ie(U1 + U2) in terms of R, U1, U2, and these two functions:

Relation type    Definition
SEMANTIC         ie(U1 + U2) = believes(hearer, R(c(U1), c(U2)))
PRAGMATIC        ie(U1 + U2) = R(ie(U1), ie(U2))
In English: the intended effect of a SEMANTIC relation is that the hearer believes that the basic relation R holds between the propositional contents of the two related utterances, while the intended effect of a PRAGMATIC relation is that the basic relation actually holds between the intended effects of the two related utterances. In each case, the relation definition evaluates to a proposition, which can be thought of as the goal state which the complex utterance is
expressed to achieve. In the case of SEMANTIC relations, this proposition is that the hearer believes a second, embedded proposition, namely that the relation applies. For PRAGMATIC relations, the proposition that the relation applies is directly asserted, rather than scoped within a hearer belief. The other difference between SEMANTIC and PRAGMATIC relations concerns which function is applied to U1 and U2 to obtain the propositions between which the relation is asserted to apply. Before we proceed to some examples of these definitions in action, some axioms need to be specified for determining the intended effect ie(U) of an atomic utterance U. While the notion of the intentions behind an utterance is notoriously ill-defined, I will be using it in the following technical, and fairly concrete, sense:
a. The intended effect of an indicative sentence is that the hearer believes its propositional content.
b. The intended effect of an imperative sentence is that the hearer performs the action in question.
c. The intended effect of an interrogative sentence is that the hearer answers the question.
Note that in each case, the intended effect evaluates to a proposition. We can now illustrate how the definitions work in a range of cases.
5.3 Causal relations
The following two texts are SEMANTIC and PRAGMATIC causal relations respectively:
(25) Bill was starving, so he ate five sandwiches.
(26) Bill ate five sandwiches, so he was [must have been] starving.
For (25), the propositional content of the first utterance is that Bill was starving; the propositional content of the second utterance is that Bill ate five sandwiches; and the intended effect of the complex utterance is that the reader believes that Bill's being starving caused Bill's eating of five sandwiches. For (26), the intended effect of the first utterance is that the hearer believes that Bill ate five sandwiches; the intended effect of the second utterance is that the hearer believes that Bill was (must have been) starving; and the intended effect of the text as a whole is that the hearer's belief that Bill ate five sandwiches
142 Alistair Knott
actually causes the hearer's belief that Bill was starving. This analysis seems to provide a good account of an argumentative relation. In describing the cause in such a case as something which the speaker intends to occur between two hearer beliefs, it provides the basis for an explanatory account of how argumentative relations achieve their persuasive impact; this is something which in Section 4.4 we argued was independently needed anyway. Sweetser would classify (26), being argumentative, as EPISTEMIC. Now note that the new definition of PRAGMATIC relations applies equally well for SPEECH-ACT causal relations. For instance:
(27) We're late, so hurry up.
In this case, the intended effect of the first utterance is that the hearer believes that we are late; the intended effect of the second utterance (an imperative) is that the hearer hurries up, and the intended effect of the whole text is that the hearer's belief that we are late actually causes the hearer to hurry up. Again, this seems a plausible account of the text in question. In fact, it seems more direct than an account in terms of speech-act glosses: if we assume the speech-act gloss ('so I tell you to hurry up') there is still a step to take between the hearer's understanding the cause of the speaker's statement and his actually hurrying up. In comparing (26) and (27) the main thing to note is that the work which in Sweetser's account is done by differences between the definitions of EPISTEMIC and SPEECH-ACT relations is done in the new account simply by the differences between the intended effects of an imperative and an indicative sentence. As we have argued in Section 4.3, these are differences which any theory of sentence semantics needs to represent anyway. Moreover, in framing a single definition which covers EPISTEMIC and SPEECH-ACT relations, we are addressing the generalizations between the two classes of relation which, as noted in Section 4.1, are problematic for the previous accounts.
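The two definitions and the axioms for atomic utterances lend themselves to a small illustrative sketch, applied here to examples (25) to (27). Everything in it is an assumption made for illustration rather than part of the account itself: propositions are encoded as nested tuples, the class name Utterance and the predicate labels "performs" and "answers" are invented, and plain strings stand in for propositional contents.

```python
from dataclasses import dataclass

# Illustrative encoding (an assumption): a proposition is a nested tuple,
# e.g. ("believes", "hearer", p), and contents are plain strings.

@dataclass(frozen=True)
class Utterance:
    mood: str      # "indicative", "imperative" or "interrogative"
    content: str   # label standing in for the propositional content

def c(u):
    """c(U): the propositional content of an utterance."""
    return u.content

def ie(u):
    """ie(U): the intended effect of an atomic utterance (axioms a-c)."""
    if u.mood == "indicative":
        return ("believes", "hearer", u.content)
    if u.mood == "imperative":
        return ("performs", "hearer", u.content)
    if u.mood == "interrogative":
        return ("answers", "hearer", u.content)
    raise ValueError(f"unknown mood: {u.mood}")

def semantic(r, u1, u2):
    """ie(U1 + U2) = believes(hearer, R(c(U1), c(U2)))"""
    return ("believes", "hearer", (r, c(u1), c(u2)))

def pragmatic(r, u1, u2):
    """ie(U1 + U2) = R(ie(U1), ie(U2))"""
    return (r, ie(u1), ie(u2))

starving = Utterance("indicative", "Bill was starving")
ate = Utterance("indicative", "Bill ate five sandwiches")
late = Utterance("indicative", "we are late")
hurry = Utterance("imperative", "hurry up")

ex25 = semantic("cause", starving, ate)   # belief in a causal link
ex26 = pragmatic("cause", ate, starving)  # one belief causes another
ex27 = pragmatic("cause", late, hurry)    # a belief causes an action
```

In this sketch the only difference between ex26 and ex27 comes from ie applied to an indicative as opposed to an imperative; the relation definition itself is shared, which is the generalization claimed in the text.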
5.4 Disjunctive relations
Now consider some disjunctive relations.
(28) The milk is in the fridge, or it is on the sideboard.
(29) John is home, or somebody is picking up his newspapers.
(30) Would you like to come round tonight? Or is your car still in the shop?
Example (28) is a SEMANTIC relation. The analysis here is straightforward: the
propositional content of the first utterance is that the milk is in the fridge; the propositional content of the second utterance is that the milk is on the sideboard; and the intended effect of the complex utterance is that the hearer believes that either the milk is in the fridge or it is on the sideboard. Example (29), which is EPISTEMIC in Sweetser's terms, is PRAGMATIC in ours. Accordingly, the intended effect of the first utterance is that the hearer believes that John is at home, the intended effect of the second utterance is that the hearer believes that someone is picking up John's newspapers; and the intended effect of the complex utterance is that either the hearer believes John is home, or the hearer believes someone is picking up his newspapers. This analysis seems at least as good as that given by the definition of EPISTEMIC relations. Example (30) is also PRAGMATIC for the present account. Thus, the intended effect of the first utterance is that the hearer answers the question (i.e. says whether he would like to come round tonight). The intended effect of the second utterance is also that the hearer answers the question (i.e. says whether his car is still in the shop); and the intended effect of the complex utterance is that either the hearer answers one question, or answers the other. Again, this analysis seems to be perfectly good. In fact, it improves on the speech-act account: note that we do not have to introduce the problematic notion of 'conditional' speech acts.
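The scope difference between the SEMANTIC reading of (28) and the PRAGMATIC reading of (29) can be made concrete with a deliberately crude sketch. It assumes, purely for illustration, that a hearer's belief state is a set of formulas with no logical closure, and that a disjunction is the tuple ("or", p, q); the function names are invented.

```python
# Crude model (an assumption): a belief state is a set of formulas the
# hearer holds. A formula is a string or an ("or", p, q) tuple, and no
# logical closure is applied to the set.

def believes(state, formula):
    return formula in state

def semantic_or_goal(state, p, q):
    """believes(hearer, or(p, q)): one belief with disjunctive content."""
    return believes(state, ("or", p, q))

def pragmatic_or_goal(state, p, q):
    """or(believes(hearer, p), believes(hearer, q)): a disjunction of beliefs."""
    return believes(state, p) or believes(state, q)

fridge, sideboard = "the milk is in the fridge", "the milk is on the sideboard"
home, papers = "John is home", "somebody is picking up his newspapers"

# After (28): the hearer holds only the disjunctive belief.
after_28 = {("or", fridge, sideboard)}
# After (29), on one successful outcome: the hearer accepts the first disjunct.
after_29 = {home}

sem_28 = semantic_or_goal(after_28, fridge, sideboard)
prag_28 = pragmatic_or_goal(after_28, fridge, sideboard)
prag_29 = pragmatic_or_goal(after_29, home, papers)
```

Under these assumptions the goal state for (28) is met without the hearer settling on either disjunct, whereas the goal state for (29) requires that at least one of the two beliefs is actually formed.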
5.5 Conditional relations
Now consider how conditional relations fare under the new definition:
(31) If John goes to a party, he gets drunk.
(32) If John went to the party, he was trying to infuriate Miriam.
(33) How old are you, if it's not a cheeky question?
Example (31) is a SEMANTIC relation. The propositional content of the first utterance is that John goes to a party; the propositional content of the second utterance is that John gets drunk; and the intended effect of the complex utterance is that the hearer believes that if John goes to a party he gets drunk. Example (32), which is EPISTEMIC in Sweetser's terms, is PRAGMATIC in ours. Accordingly, the intended effect of the first utterance is that the hearer believes that John went to the party; the intended effect of the second utterance is that the hearer believes that John was trying to infuriate Miriam; and the intended effect of the complex utterance is that if the hearer believes that John went to
the party, the hearer believes that John was trying to infuriate Miriam. Again, this seems as good as the EPISTEMIC account. Example (33), which is SPEECH-ACT in Sweetser's terms, is also PRAGMATIC for the new definition. Thus, the intended effect of the first utterance is that the hearer answers the question (i.e. says how old he is); the intended effect of the second utterance is that the hearer believes that the question is not cheeky; and the intended effect of the complex span is that if the hearer believes the question is not cheeky, he answers it. This seems to me to be an improvement on the SPEECH-ACT account. For one thing we avoid the problematic notion of conditional speech acts, as before. Moreover, the analysis specifies that it is the hearer's judgment as to whether or not the question is cheeky on which something is conditional; the SPEECH-ACT account does not say anything about the perspective from which this is to be judged. Finally, consider another type of conditional imperative, the kind that Sweetser's account does not treat:
(34) If you run out of money, come to me for a loan.
The new definition of PRAGMATIC works just as well for this example as for the others: the intended effect of the complex span here is that if the hearer believes he has run out of money, the hearer comes to the speaker for a loan. 4
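The analysis of (34) can be sketched as a goal that is checked against later occasions, which is one way of seeing why no future or repeated speech act is needed. The sketch rests on assumptions that are not part of the account: the conditional is approximated by material implication, and the dictionary keys and the recorded occasions are hypothetical.

```python
def implies(antecedent, consequent):
    """Material implication, a crude stand-in for the conditional relation."""
    return (not antecedent) or consequent

def goal_met(occasion):
    """Intended effect of (34) checked on one occasion: if the hearer
    believes he has run out of money, he comes to the speaker for a loan."""
    return implies(occasion["believes_out_of_money"],
                   occasion["comes_for_loan"])

def habitual_goal_met(occasions):
    """Habitual ('whenever') reading: the goal holds on every occasion."""
    return all(goal_met(o) for o in occasions)

# Hypothetical later occasions; the utterance itself was made only once.
history = [
    {"believes_out_of_money": False, "comes_for_loan": False},
    {"believes_out_of_money": True, "comes_for_loan": True},
    {"believes_out_of_money": True, "comes_for_loan": False},
]

first_two_ok = habitual_goal_met(history[:2])  # goal satisfied so far
all_three_ok = habitual_goal_met(history)      # third occasion fails it
```

The point of the design is that the conditional ranges over hearer beliefs and hearer actions; the single act of utterance sits outside the conditional altogether.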
5.6 Sequential relations
Finally, consider some sequential relations.
(35) John peeled the onions. Next he chopped them.
(36) John's unsuitable. To begin with, he's too young. Next, he's hotheaded ...
(37) Peel the onions. Next, chop them.
Example (35) is a SEMANTIC relation: the intended effect of the complex utterance is that the hearer believes that John peeled the onions and then chopped them. Example (36) is a PRAGMATIC relation, and would presumably be taken as EPISTEMIC by Sweetser. On the new analysis, the intended effect of the complex utterance is that the reader first believes that John is too young, and then believes that John is hotheaded. Again, this analysis seems as good as the EPISTEMIC analysis; we are still talking about the ordering of events in the epistemic world, but it is the hearer's world rather than the speaker's. Example (37) is also PRAGMATIC in our terms. It is not the kind of example
of a speech-act sequence that gets discussed in the literature, because as we saw in Section 4.3, it would be wrongly analyzed in this case. However, by the new definition it is unproblematic: the intended effect of the first utterance is that the hearer peels the onions; the intended effect of the second utterance is that the hearer chops them, and the intended effect of the complex utterance is that the hearer first peels the onions and then chops them.
6. An assessment of the new definitions
6.1 Advantages of the new definitions
I will begin by briefly summarizing the advantages of the new definitions of SEMANTIC and PRAGMATIC. Firstly, the definitions permit some useful generalizations across EPISTEMIC and SPEECH-ACT relations. Because the intended effect of all imperative speech acts is different from that of an assertive one, expressing the definition of PRAGMATIC relations in terms of intended effects allows a single characterization of these two kinds of relation. Secondly, the new definitions provide a clearer picture of compositionality. They are compositional in two senses. Firstly, they allow a consistent account of the variable that changes when we move from SEMANTIC to PRAGMATIC relations, namely the identity of the function (c(U) or ie(U)) that is applied to the two related utterances. Moreover, they are compositional in a more conventional sense. When we think of compositional semantics, we say that the meaning of a sentence can be derived from the meaning of its constituents, plus their manner of combination. The new definition allows us to make a similar statement about intended effects, for PRAGMATIC relations: we can say that the intended effect of the complex text span formed by a PRAGMATIC relation is derivable from the intended effects of its two constituent spans, together with the relation which links them. Thirdly, the new definitions provide an explanatory story for argumentative relations. The new definition of PRAGMATIC relations, as applied to an argumentative text, does not simply represent it as a description of the speaker's own thought processes, but specifies how it is that it can, if successful, have a rhetorical impact on the hearer. Fourthly, the new definitions provide for a better treatment of conditional speech acts. As we have seen, by making the conditional events the intended effects of utterances, rather than the utterances themselves, we avoid a number
of problems in analyzing texts where 'if' and 'or' are used to conjoin imperative or interrogative sentences. Finally, the new definitions appear to provide the basis for a more general treatment of imperatives. We have seen that the new definition of PRAGMATIC relations allows them to apply more widely, to sequences of imperatives and conditional instructions, as well as to the set of cases dealt with under the rubric of SPEECH-ACT relations.
6.2 Some problems for the new definitions
While there are many advantages to the new definitions, I should also note two cases where they do not apply straightforwardly, and where some modification of the account appears necessary. Both of these cases involve relations which Sweetser has analyzed as SPEECH-ACT; and in each case there are aspects of this definition which the new definition of PRAGMATIC does not apparently capture.
Problematic SPEECH-ACT causes
Consider the following case of a causal SPEECH-ACT relation.
(38) Since you asked nicely, I am 78.
In this text, the definition of SPEECH-ACT relations seems more natural: it does indeed seem as though the primary effect here is informational, with the hearer being told why the speaker is making a certain utterance. If we use the new conception of PRAGMATIC relations, we get a strange analysis, where the intended effect of the complex utterance is that the hearer's belief that he asked nicely should cause the hearer to believe that the speaker is 78. This seems off the mark. There are other cases where the new PRAGMATIC definition is possible, but not particularly attractive. For instance:
(39) The rules can't be broken, so "no".
(40) Since we're on the subject, when was George Washington born?
In each case, the original definition of SPEECH-ACT relations is a lot more natural. In summary, it must be acknowledged that there are cases where the new definition of PRAGMATIC relations does not work. So we have not eliminated the need for a class of SPEECH-ACT relations altogether; we have just made it smaller, and possibly more homogeneous. The question therefore remains as to how to define the class of SPEECH-ACT
relations. It might be thought that, now the class is smaller, we could frame a definition for it on the same terms as those given for SEMANTIC and PRAGMATIC relations. We could imagine re-introducing the function u(U), which for an utterance U returns the utterance itself, and suggesting that for SPEECH-ACT relations, the intended effect of the complex utterance ie(U1 + U2) given some basic relation R is something like believes(hearer, R(u(U1), u(U2))). But this does not work. The problem is the same as before: we want to apply different functions to U1 and U2. (It is not because the speaker says that we are on the subject of George Washington that he asks about when he was born, but because he is on the subject, or believes he is.) The attractive feature of the new definitions for SEMANTIC and PRAGMATIC is that the same function is applied to both utterances.
Problematic SPEECH-ACT conditionals
Another class of problematic cases includes a group of conditionals of the following form:
(41) If you're hungry, there's some soup in the fridge.
Sweetser would analyze this as a SPEECH-ACT conditional: 'if you are hungry, then [let us consider that] I tell you that there is soup in the fridge'; the idea being that the assertive speech act is not understood to have been made unless the hearer is hungry. This certainly seems strange, for the reasons already discussed, but in this case it is not clear that the new definition of PRAGMATIC is any easier to understand. On the PRAGMATIC account, the intended effect of the complex utterance is that if the hearer believes that he is hungry, then he also believes that there is soup in the fridge. But surely, given the text in question, the hearer is going to believe there is soup in the fridge regardless of whether he believes he is hungry? The notion of conditional hearer actions (the obeying of orders, the answering of questions) is perfectly intelligible, but the notion of conditional hearer beliefs in this context seems at least as strange as that of conditional speech acts.
7. Future directions: generalizing over the intentions of participants and protagonists
Before concluding, I will sketch an extension (still very speculative) to the current definitions of SEMANTIC and PRAGMATIC relations, which points to
further possible generalizations using the concept of intentions. This involves the introduction of a second bipartite classification of relations, into CAUSE-DRIVEN and RESULT-DRIVEN relations, which I have discussed elsewhere (Knott & Mellish 1996).
7.1 Cause-driven and result-driven relations
The new distinction can be introduced by considering the following pair of sentences:
(42) John had been up all night, but he looked fresh as a daisy.
(43) John was hungry, but he couldn't find anything to eat.
But is typically analyzed as signaling a violated expectation of some kind. For instance, in (42), the fact that John has been up all night leads to the expectation that he will look tired, which is then not forthcoming. The analysis is often formalized in terms of presupposed defeasible rules (Knott & Mellish 1996; Lagerwerf 1998). In (42), for instance, the presupposed rule would include 'X has been up all night' on its left-hand side, and 'X looks tired' on its right: the but indicates that this rule is defeated in the situation being described. While this analysis works for (42), it does not work for (43), as is noted by Spooren (1989) and Knott and Mellish (1996). To conform to the proposed representation, we would have to assume a rule of inference whose premise is that X is hungry and whose conclusion is that X finds something to eat. More generally, to deal with cases like (43), we would need a rule of inference whose premise is the fact that an agent has a goal, and whose conclusion is that the goal is achieved. However, this rule of inference is clearly not always a valid one: agents in the world frequently have goals that cannot be achieved. Of course, if we also know that the conditions in the world are such that the goal is achievable, then the inference is perfectly possible to make. But that is precisely what is not known in an example like (43), until the clause introduced by but is processed. 5 Knott and Mellish (1996) propose that instead of attempting to reduce cases like (43) to a violated expectation, we should distinguish between two kinds of but: one, termed CAUSE-DRIVEN, which signals a violated expectation, and one, termed RESULT-DRIVEN, which signals a frustrated plan. The notion of a protagonist's plan is thus accepted as a primitive in the account. A generalization between the two types of but is then proposed, which draws on the fact that a defeasible rule can be used in two ways: deductively, to generate
an expectation of its right-hand side based on knowledge of its left-hand side, or abductively, to plan an action which achieves a state of affairs which features on its left-hand side in pursuance of a goal to achieve the state of affairs on its right-hand side. See Knott and Mellish (1996) for details of this distinction. The crucial point to note about the distinction for present purposes is that the definition of RESULT-DRIVEN relations makes explicit reference to the goals of a protagonist.
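The two ways of using a defeasible rule can be sketched as two small functions over one rule object. This is only an illustration of the deductive and abductive directions described above, not a rendering of the Knott and Mellish formalization: the class and function names are invented, and the rule assumed for (43), with 'John finds something to eat' on its left-hand side and 'John is no longer hungry' on its right, is a hypothetical choice.

```python
from dataclasses import dataclass
from typing import Optional, Set

@dataclass(frozen=True)
class DefeasibleRule:
    lhs: str  # left-hand side
    rhs: str  # right-hand side

def expect(rule: DefeasibleRule, facts: Set[str]) -> Optional[str]:
    """Deductive use: knowing the left-hand side, expect the right-hand side."""
    return rule.rhs if rule.lhs in facts else None

def plan(rule: DefeasibleRule, goal: str) -> Optional[str]:
    """Abductive use: wanting the right-hand side, plan to achieve the left."""
    return rule.lhs if rule.rhs == goal else None

# Rule presupposed by (42), as described in the text.
tired_rule = DefeasibleRule("John has been up all night", "John looks tired")
# Hypothetical rule for (43); the wording of both sides is an assumption.
eat_rule = DefeasibleRule("John finds something to eat",
                          "John is no longer hungry")

# CAUSE-DRIVEN but: the second clause of (42) denies this expectation.
expectation = expect(tired_rule, {"John has been up all night"})
# RESULT-DRIVEN but: the second clause of (43) blocks this planned step.
planned_step = plan(eat_rule, "John is no longer hungry")
```

On this sketch the same rule shape serves both readings; what differs is the direction in which it is consulted, and only the abductive direction takes a protagonist's goal as input.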
7.2 Speaker and protagonist goals
The presence of goal-based primitives in the definitions of PRAGMATIC and RESULT-DRIVEN relations opens up interesting possibilities for further generalizations. Specifically, there seem to be some interesting similarities between the goals a speaker pursues by making utterances, and the goals a protagonist being described in a discourse pursues by taking actions. Compare these two texts, for instance:
(44) Bill told Jim to go to bed, but Jim wasn't tired.
(45) Bill [to Jim]: Go to bed! Jim [to Bill]: But I'm not tired!
In each case, we can talk about Bill seeking to satisfy a goal that Jim goes to bed by telling him to do so, and thereby discovering a circumstance that indicates that this action on its own will not be enough. In each case we can also use the conjunction but. But the two texts are nevertheless quite different: in (44), the intention is that of the protagonist in a narrative monologue, while in (45) it is that of the first speaker in a conversational exchange. Moreover, (44) contains only indicative clauses, while (45) contains an imperative clause too. It seems quite likely that the definition of PRAGMATIC which is required to account for the imperatives in (45) will also suffice to introduce the goals necessary for the analysis of the but in this example as RESULT-DRIVEN. This possibility is something which will be pursued in further work.
8. Summary
This paper began by noting some difficulties with existing notions of PRAGMATIC, EPISTEMIC and SPEECH-ACT relations. A solution to some of these problems was then proposed, in the form of a new definition of PRAGMATIC relations framed in terms of the intended effects of a speaker's utterances. This new definition has a number of advantages over its predecessors, although there are still some remaining problems with it. Finally, a possible extension to the account was presented, by introducing a new category of RESULT-DRIVEN relations: there appear to be some interesting generalizations in prospect relating to speaker and protagonist goals. Whether these can be exploited in improved definitions of SEMANTIC and PRAGMATIC relations is a matter for future research.
Notes
1. Sweetser in fact addresses two early versions of the bipartite distinction, formulated by Ross (1967) and Davison (1973). However, her criticisms apply to the more recent versions too.
2. In fact, Sweetser (1990, p. 92) is unwilling to commit to the view that explicit performatives should be understood as contributing to the semantics of a sentence. But the rationale for this is mainly concern for a proper demarcation of the domains of 'semantics' and 'pragmatics'; the only relevant implication for our purposes is that the performative glosses are to be understood as falling in the latter domain rather than the former.
3. Or perhaps more precisely, as stating that the proposition R(f(U1), f(U2)) is asserted by the speaker.
4. Note that the account works equally well if the conditional is interpreted as a habitual: there are none of the problems associated with conditional, future or repeated speech acts that would arise if we extended the SPEECH-ACT account. Note that the protasis has to be evaluated from the perspective of the hearer in this case. This reading certainly seems possible, but it might be thought that it could also be evaluated from the speaker's perspective (i.e. the hearer comes to the speaker for a loan if the hearer believes the speaker believes he has run out of money). If this is considered a problem, it could be borne in mind that the intended effect of an indicative utterance with content p can also be taken to be that the hearer believes that the speaker believes that p. With this alternative interpretation, both of the required perspectives are available.
5. Spooren (1989) apparently suggests that a violated expectation does arise in a case such as (43), as the necessary additional premise (in this case, that there is food to be found) is an implicature of the utterance expressing the protagonist's goal (i.e. John was hungry). While we must certainly implicate the fact that John has a goal if this is not explicitly stated, there seems no good reason to suggest that the speaker's informing the hearer about this goal should lead to an implicature that the goal is satisfiable. If anything, hearers are used to hearing about situations in which a protagonist's goals are not satisfied: this is certainly the predominant case in narratives.
References
Bateman, J., & Rondhuis, K. J. (1997). Coherence relations: Towards a general specification. Discourse Processes, 24, 3-49.
Davison, A. (1973). Performative verbs, adverbs and felicity conditions: An inquiry into the nature of performative verbs. Ph.D. thesis, University of Chicago.
Elhadad, M., & McKeown, K. R. (1990). Generating connectives. In COLING-90, 97-101.
Halliday, M., & Hasan, R. (1976). Cohesion in English. London: Longman.
Hobbs, J. R. (1985). On the coherence and structure of discourse. Technical Report CSLI-85-37, Center for the Study of Language and Information, Stanford University.
Hovy, E. (1993). Automated discourse generation using discourse structure relations. Artificial Intelligence, 63, 341-385.
Knott, A., & Mellish, C. (1996). A feature-based account of the relations signalled by sentence and clause connectives. Language and Speech, 39, 143-183.
Lagerwerf, L. (1998). Causal connectives have presuppositions. Ph.D. thesis, Tilburg, The Netherlands: Katholieke Universiteit Brabant.
Mann, W. C., & Thompson, S. A. (1988). Rhetorical structure theory: A theory of text organization. Text, 8, 243-281.
Martin, J. (1992). English Text: System and Structure. Amsterdam: Benjamins.
Moore, J., & Pollack, M. (1992). A problem for RST: The need for multi-level discourse analysis. Computational Linguistics, 18, 537-544.
Moore, J. D., & Paris, C. L. (1993). Planning text for advisory dialogues: Capturing intentional and rhetorical information. Computational Linguistics, 19, 651-694.
Oversteegen, L. (1997). On the pragmatic nature of causal and contrastive connectives. Discourse Processes, 24, 51-86.
Redeker, G. (1990). Ideational and pragmatic markers of discourse structure. Journal of Pragmatics, 14, 367-381.
Ross, J. R. (1967). Constraints on variables in syntax. Ph.D. thesis, MIT. Now reprinted as Infinite Syntax. Norwood, NJ: Ablex. 1986.
Sanders, T. (1997). Coherence relations in context; on the categorization of positive causal relations. Discourse Processes, 24, 119-147.
Sanders, T. J. M., Spooren, W. P. M., & Noordman, L. G. M. (1992). Towards a taxonomy of coherence relations. Discourse Processes, 15, 1-35.
Spooren, W. (1989). Some Aspects of the Form and Interpretation of Global Contrastive Coherence Relations. Ph.D. thesis, Catholic University of Nijmegen, the Netherlands.
Sweetser, E. (1990). From Etymology to Pragmatics: Metaphorical and Cultural Aspects of Semantic Structure. Cambridge: Cambridge University Press.
Van Dijk, T. A. (1979). Pragmatic connectives. Journal of Pragmatics, 3, 447-456.
CHAPTER 6
On the production of causal-contrastive although-sentences in context
Leo G. M. Noordman*
Katholieke Universiteit Brabant
In our thinking and reasoning, concepts of causation and negation play an important role. For example, on the basis of observations in the world, we are able to predict the consequences of the facts we observe. We are able to anticipate future consequences of observed causes. We are able to reason about events that do not exist, and about hypothetical events. We can generate expectations about future events, but also negations of expectations. These processes may be rather complex; they involve causality, including expectations, and negation. The notions of causality and negation are pervasive in our thinking and reasoning processes. In general, we express our thinking and reasoning in language. Language is the expression of our train of thought and it reflects the discursive processes in our thinking. Consecutive sentences or utterances reflect in general the thinking process that we perform as speakers and writers and that we want our listeners and readers to perform as well. This chapter deals with the train of thought that occurs in our mind when we deal with causality and negation. This train of thought is referred to as a thinking or reasoning process in this chapter. These processes are observable in the language. What kinds of strategy do the speaker and writer use in formulating their complex thought? What kinds of decision do they make? The way in which we try to discover the reasoning processes is by analysing the product of these processes: the sentences that express the train of thought of the speaker and writer. Similarly, we are interested in the processes in understanding complex reasoning. How do listeners and readers understand the complex thought of their interlocutor? This article restricts itself to the analysis of concessive sentences expressed by the conjunction although. An example is sentence (1).
(1) Although John had worked hard, he failed the exam.
A concession expresses a complex thought. According to Quirk, Greenbaum, Leech and Svartvik (1985, p. 1098) "concessive clauses indicate that the situation in the matrix clause is contrary to expectation in the light of what is said in the concessive clause". That a concessive relation includes a causal and a contrastive relation is clearly expressed by Helwig and Buscha (1991): "an expected causal connection is ineffective. The ground that is mentioned in the subordinate clause does not have the effect that is expected on the basis of the law of cause and effect" (p. 591; translation LN); "A concessive relation includes a causal relationship as well as an adversative relationship" (p. 562; translation LN). And Grevisse (1986, p. 1667; translation LN) argues that "a concessive proposition indicates that the logical relation that one expects between the concession and the main verb does not hold. A concessive proposition expresses a cause that is ineffective and counteracted and that did not have the effect that one could expect". But now compare sentences (1) and (2).
(2) Although John failed the exam, he had worked hard.
Both sentences express a causal and a contrastive relation. The subordinate clause evokes an expectation on the basis of causal world knowledge. The expectation is contrasted with and denied by the content of the main clause. Although both sentences contain the same words, the expectation in the two sentences is quite different. In a sense, the sentences express different relations. So, the first question we have to address concerns the kinds of relation that are expressed by although-sentences. If the although-sentences have an underlying causal relation, it seems that there are different causalities involved. Since the focal point of this chapter is how causality is expressed in language, the analysis of the although-sentences focuses on the underlying causality.
The second question deals with the way in which the concessive relation affects the organization of the discourse. How do the although-sentences function in their context? If a concession expresses a denied expectation, one may wonder to what extent the expectation is evoked by the preceding context. If the expectation is denied, does the speaker/writer give in the subsequent text an explanation for the denial or does (s)he give an alternative reason or motivation for the event described in the main clause? How does the subsequent context follow up on the thoughts expressed in the subordinate clause and the main clause, i.e., how does the reasoning process develop? Several
On the production of causal-contrastive although-sentences in context
authors have argued that the two clauses have different status. This presumably has consequences for the function of the clauses in their contexts. The conceded part has little prominence (Grote, Lenke & Stede 1997). Elhadad and McKeown (1990) claim that the main clause in an although-sentence has directive status and the subordinate clause subordinate status. The argumentative orientation (Ducrot 1984) of the complex sentence as a whole is the argumentative orientation of the main clause (Elhadad & McKeown 1990). The same asymmetry is expressed by Mann and Thompson (1988) in terms of the notions nucleus and satellite. In an although-sentence that expresses a concessive relation, the main clause expresses the nucleus; the subordinate clause expresses the satellite. Nuclei are more central in a text than satellites. If one deletes the satellites from a text, what remains is still a coherent text; if one removes the 'most-nuclear' unit of the text, the text loses its significance. The nucleus and satellites also have different functions and effects for the writer/reader. "The nuclear portion realizes the primary goal of the writer and the satellite provides supplementary material" (Thompson & Mann 1987, p. 360). A concessive "relation holds when a writer chooses to strengthen a point by affirming that point in the face of a potentially opposing point" (Thompson & Mann 1987, p. 363). The nucleus is the point that is affirmed; the satellite is the opposing point. The writer "has positive regard for the situation presented in the nucleus" (Mann & Thompson 1988, p. 254). The nucleus expresses the belief, the approval or the intention of the writer. The writer intends that the reader's positive regard for what is described in the nucleus increases. A similar analysis is given by Spooren (1989) in his discussion of but-sentences.
Spooren demonstrates that the but-conjunct dominates the other conjunct in the sense that the but-conjunct reflects the opinion of the writer more than the other conjunct. If the main clause expresses the directive act and the subordinate clause a subordinate act, or if the main clause expresses the nuclear information and the subordinate clause the satellite, this may have consequences for the thematic development of the text. The directive act and the nuclear information reflect the purpose of the writer. Then one may expect that the subsequent context will be a continuation of the main clause. The research by Spooren (1989) is relevant in this respect. He demonstrated that the but-clause reflects the opinion of the writer, and that, accordingly, the subsequent text is more related to the but-conjunct than to the other conjunct. This asymmetry was not found for clauses connected by the conjunction and.
The question about the role of the context becomes all the more pertinent if one realizes that the same relation can be expressed in different ways by changing the order of the clauses. For example, the information in sentence (1) can be expressed in sentence (3).
(3) John failed the exam, although he had worked hard.
What will be a likely continuation of this sentence? Sentences (1) and (3) have the same nuclei. Spooren (1989) demonstrated that the subsequent context after a but-sentence is a continuation of the second clause, i.e. the but-conjunct. The question arises whether the subsequent context follows up on the but-conjunct because the but-clause is the last conjunct, or because it expresses the opinion of the writer. When sentences were connected by the conjunction and, there was no tendency for the subsequent context to follow up on the second conjunct. This suggests that the asymmetry found by Spooren is due to the conjunction but, and not to the linear order. On the basis of these considerations, one could expect that the context following sentences (1) and (3) is a continuation of the main clause, which expresses the nuclear information and the directive act, and not a continuation of the last clause. But then the question arises whether there are differences between sentences (1) and (3). Are there contextual constraints for these different kinds of the denial of expectation? It is likely that differences in clause order may be subject to thematic development in the text. One remark should be made with respect to the scope of the analysis of although-sentences in their discourse context in this chapter. Discourse reflects cognitive processes. These processes include the generation of the information by the writer, the expression of the information in such a way that the writer expresses his/her viewpoint and accommodates the viewpoint of other people including the potential reader. That these processes are involved is particularly clear in concessive sentences. Concessive sentences have in general an argumentative function in a discourse. The speaker/writer expresses his or her opinion in relation to other opinions. The nucleus has 'positive regard' and realizes the goal of the writer; the satellite is the opposing point.
This chapter does not claim to analyse the argumentative richness of concessive sentences. That goes beyond the scope of this chapter. The analysis deals with the argumentative function of although-sentences only to the extent that we investigate how the main and subordinate clauses function in the development of the thought the writer expresses in the preceding and subsequent context. In that
sense the present analysis can be considered as an empirical test of the notions nucleus and satellite: do the nucleus and the satellite behave in a different way with respect to the context? What emerges is "that the role of connectives is not only to indicate a logical or conceptual relation, but also to indicate the structural organization of discourse" (Elhadad & McKeown 1990, p. 98). This chapter will deal with these two aspects of concessive although-sentences. The first issue is: how is the logical or conceptual relation expressed in the although-sentence? The conceptual relation is an expectation, based on causality, that is denied. How is this relation expressed? We will analyse the underlying causality in although-sentences. The second issue is: how does the concessive relation fit in the organization of the discourse?
1. Different kinds of although-relations

1.1 Text analytic differences between the relations
The reasoning underlying sentence (1) can be represented as a syllogism in which the conclusion is contradicted by the statement in the main clause: 'If one works hard, one normally passes the exam. John worked hard. Therefore, John would normally pass the exam. But in fact, John failed the exam'. The major premise in this syllogism is a general expectation that is generated on the basis of the subordinate clause. The expectation is contradicted by the main clause for the specific situation described by the minor premise. The major premise in (1) is different from the premise in (2). The premise in (2) is: 'If one fails the exam, one normally has not worked hard'. Consider the premise in sentence (1): 'If one works hard, one normally passes the exam.' This clause expresses a contingency between working hard and passing the exam. Underlying sentence (1) is the assumption that working hard is a cause for passing the exam. The relation between the two propositions in the premise is a relation between two events in the world: 'Working hard' of somebody (John) leads to 'passing the exam' of that person. There is a causal relation between the two events stated in the premise and, consequently, there is a causal relation underlying the events expressed by sentence (1). The analysis of sentence (2) is formally similar, in the sense that a premise is derived in the same way as in sentence (1), but the premise is different. The derived premise, 'If one fails the exam, one normally has not worked hard', does not describe a contingency
between failing the exam and not working hard. It is not the case that failing the exam is a cause for not working hard. However, that sentence does express a contingency: failing the exam is the cause for the speaker's conclusion that the person in question probably did not work hard. So, the relation between the two propositions in the premise is not a relation between two events in the world, but between an event and a conclusion of the speaker. One can say that the relation between the propositions in (1) is a relation between locutions and in (2) between illocutions. It should be noted that the validity of the premise in sentence (2) rests on the validity of the premise in sentence (1): 'If one fails the exam, one normally has not worked hard' is justified because 'If one works hard, one normally passes the exam.' In the expectation underlying sentence (1) the consequence (passing) is deduced from the cause (working hard). I will refer to this as reasoning from cause to consequence. In the expectation underlying sentence (2) the reasoning is in the inverse direction: from the consequence one infers the cause. There is a kind of backward reasoning. The consequence is the sign for the cause. I will refer to this as reasoning from consequence to cause. One can argue that there is an incongruence in sentence (2) between what is conceptually the condition and the consequence and what is expressed in the sentence as antecedent and consequent. What conceptually is the consequence (failing the exam) is expressed as the antecedent in the sentence (if one fails, one has probably not worked hard). On the other hand, in sentence (1) there is a congruence between what is conceptually the cause and the consequence and what is linguistically expressed as antecedent and consequent.
The expectation that is generated in (1) can be represented as 'if p then not-q', where p stands for the subordinate clause in (1), and q for the main clause in (1) (representing sentence (1) as: Although p, q). The expectation generated in (2) can be represented as 'if p then conclude not-q'. Relations as exemplified by (1) are called 'semantic' relations by Sanders, Spooren and Noordman (1992), since the relation between the propositions in the language expresses a relation between events in the world. In contrast, relations as exemplified by (2) are called 'pragmatic' relations, since an illocution, or a speech act, is involved in the relation. Sweetser (1990) identifies the relations as content relations and epistemic relations respectively. 'Epistemic' refers to the fact that a conclusion by the speaker is involved in the relation. It should be noted that content and epistemic relations are defined in terms of (il)locutions and not in terms of (reversed) causality. Reversed causality will in general imply an epistemic relation. But non-reversed causality does not necessarily imply a
content relation. For example: 'I hear that it is raining, so the streets are wet' may mean that I conclude that the streets are wet, and it therefore expresses an epistemic relation ('so the streets must be wet'). Noordman (1979) investigated the psycholinguistic processing of these relations and called them 'condition-consequence relations' and 'inference relations' respectively. The term 'inference relation' expresses that the condition is inferred from the consequence. The sentence expresses an act of concluding. Traxler, Sanford, Aked, and Moxey (1997) used the labels 'causal' and 'diagnostic', where diagnostic expresses the inverse reasoning process from the consequence to the cause. In a more discourse semantics approach, Oversteegen (1997) described the relations involved in because-sentences and although-sentences in terms of presuppositions and operators involved in the presuppositions. As an example: sentence (2) contains the presupposition that in general if x fails the exam, one may conclude that x did not work hard. The phrase 'one may conclude' is called a belief operator. One may identify a third kind of although-relation. An example is: 'Although I sympathize with your problems, get the paper in tomorrow' (Sweetser 1990, p. 79). The underlying meaning is 'I command you ..., in spite of my sympathy'. The underlying premise is: if I sympathize with your problem, I will not command you to get the paper in tomorrow. The expectation can be represented as 'if p then say not-q'. The relation here involves an act of saying. Therefore, Sweetser calls these relations speech act relations (or rather speech act conjunctions). Oversteegen discusses these relations in terms of a speech act operator. The three kinds of although-relation discussed so far contain an underlying expectation that is denied by the main clause. Although-sentences expressing a denial of expectation form the topic of this chapter.
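The three expectation schemas discussed so far ('if p then not-q', 'if p then conclude not-q', 'if p then say not-q') can be made concrete in a small sketch. It is offered only as an illustration of the notation: the names `Relation` and `expectation` are hypothetical and do not come from the literature cited here, and the parenthesised rendering of not-q is an assumption made for readability.

```python
from enum import Enum


class Relation(Enum):
    """The three denial-of-expectation relations (labels after Sweetser 1990)."""
    CONTENT = "content"        # relation between events in the world
    EPISTEMIC = "epistemic"    # relation involving a conclusion of the speaker
    SPEECH_ACT = "speech act"  # relation involving an act of saying


# One template per relation, mirroring the schemas given in the text.
TEMPLATES = {
    Relation.CONTENT: "if {p} then not-({q})",
    Relation.EPISTEMIC: "if {p} then conclude not-({q})",
    Relation.SPEECH_ACT: "if {p} then say not-({q})",
}


def expectation(relation, p, q):
    """Return the expectation that 'Although p, q' evokes and then denies.

    p is the content of the subordinate clause, q that of the main clause.
    """
    return TEMPLATES[relation].format(p=p, q=q)


if __name__ == "__main__":
    # Sentence (1): Although John had worked hard, he failed the exam.
    print(expectation(Relation.CONTENT, "John worked hard", "John failed the exam"))
    # Sentence (2): Although John failed the exam, he had worked hard.
    print(expectation(Relation.EPISTEMIC, "John failed the exam", "John worked hard"))
```

The same pair of propositions yields different expectations depending on which one serves as p, which is the difference between sentences (1) and (2).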
There is a fourth kind of although-sentence that is slightly different: the concessive opposition. As an example, consider the sentence 'Although that fiscal regulation yields much money, it is not fair' in the context of a discussion whether a particular fiscal regulation has to be maintained or not. There is no expectation that if a fiscal regulation yields much money, the regulation is fair. It is not the case that an expectation is evoked that is denied by the main clause. Rather, the subordinate clause and the main clause are expressions of opposite arguments. The subordinate clause is an argument in favor of the fiscal regulation and the main clause is an argument against it. This can be represented as p → r, and q → ¬r, where p is the subordinate clause, q is the main clause, and r is a proposition that expresses the issue that is being discussed in the concessive opposition. In this
case, r is the issue whether that fiscal regulation should be maintained or abolished. In general r is a proposition that is introduced in the previous context (see Spooren 1989 for a similar analysis of concessive but-sentences). A concession is more symmetric than a denial of expectation, because both the main and the subordinate clauses in a concession refer to a proposition r that is derived from the preceding context. I prefer to distinguish between denials of expectation and concessive opposition, although Grote, Lenke and Stede (1997) correctly claim that "all concessions seem to share a common structure on an abstract level of knowledge representation" (p. 114). That abstract representation of concessions is: "On the one hand, A holds, implying the expectation of C. On the other hand, B holds, which implies Not-C, contrary to the expectation induced by A." (p. 95). But they have to make a distinction depending on whether not-C is explicitly mentioned or not. If A is the although-clause and not-C is the main clause, then the concession is a denial of expectation. If A is the although-clause and B is the main clause, the concession is a concessive opposition (Spooren 1989). In order to decide whether an although-sentence expresses a denial of expectation or a concession, one has to find out whether the subordinate clause is directly related to the negation of the main clause, or whether both the main clause and the subordinate clause are related to a proposition in the context.
Another way to distinguish between denial of expectation and concession is the following heuristic, which is based on the fact that concessions are more symmetric than denials of expectation: If one reverses the clauses without the conjunction ('Although p, q' into 'Although q, p') and if the difference between the two versions corresponds to the difference between a non-reversed and a reversed causal order, the original sentence was a denial of expectation; if there is no difference, the original sentence was a concession. The main focus in this chapter is on although-sentences that express a denial of expectation. The analysis of the concessive relations restricts itself to the underlying causal expectation: does the underlying expectation express a cause-consequence order or a consequence-cause order? I will use the notions default order causal relation and reversed order causal relation for this distinction.

1.2 Evidence for the differences in processing default order and reversed order causal relations
This section focuses on the processing of causal sentences that describe the relation between events in default order and in reversed order. Reversed order
causals are more complex than default order causals. In a reversed order causal, there is an underlying 'you may conclude that' and that conclusion is based on a default order causal relation: 'If you fail the exam, one may conclude that you probably have not worked hard, because if you work hard, you pass the exam'. To test this hypothesis, Noordman (1979) had participants read conditional sentences that expressed causal relations. There were two kinds of sentence, exemplified by 'If John is ill, he is not going to his work' and 'If John is not going to his work, he is ill'. The first kind of sentence corresponds to a default order causal relation and the second kind to a reversed order causal relation. In these examples, the reasoning in the default order relation is from cause to consequence; in the reversed order relation it is from consequence to cause. One may hypothesise that reasoning from cause to consequence is easier than from consequence to cause, since there is a congruence between what is cause and consequence in the real world and what is linguistically expressed in the sentence as antecedent and consequent. In the sentences expressing a reversed relation, there is no such correspondence: "there is an incongruence between what is the condition and the [consequent] according to the structure of the sentence and what is the condition and the consequence according to the knowledge of the listener and speaker. Handling this incongruence requires extra processing time. This incongruence can be solved in fact by processing the embedding proposition: 'it can be inferred [concluded] that ...'" (Noordman 1979, p. 97) and by processing the default order causal relation.
Method
Sixteen different item types were constructed by combining four dichotomies, only one of which is relevant for the present study: whether the sentence expressed a default order relation or a reversed order relation. There were eight topics of sentences, yielding 128 sentences altogether. Half of the sentences expressed a true situation in the world and half of the sentences expressed a situation that was not true in the world. There were two groups of sentences that were in all respects comparable. Each of the 20 participants in the experiment was presented with one group of 64 sentences. The sentences were presented by means of projectors with electronic shutters. The participants were required to judge as quickly as possible whether the sentence was true or false. The answers were given by pressing one of two response buttons. The time was measured from the moment the sentence was presented until the participant pressed a button.
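The arithmetic of this design (four dichotomies, eight topics, two groups) can be checked with a minimal sketch. Only the causal-order dichotomy is named in the text; the other three labels below are placeholders, and the assumption that the two groups are simply equal halves of the material is ours, since the actual assignment of sentences to groups is not described.

```python
from itertools import product

# Four two-level factors. Only "causal_order" is identified in the text;
# the remaining names and levels are hypothetical placeholders.
DICHOTOMIES = {
    "causal_order": ("default", "reversed"),
    "dichotomy_2": ("level_a", "level_b"),
    "dichotomy_3": ("level_a", "level_b"),
    "dichotomy_4": ("level_a", "level_b"),
}
N_TOPICS = 8
N_GROUPS = 2

# Every combination of levels is one item type.
item_types = list(product(*DICHOTOMIES.values()))

# Each item type is realised once for each topic.
sentences = [(topic, item) for topic in range(1, N_TOPICS + 1) for item in item_types]

# Assumed: the two comparable groups split the material evenly.
sentences_per_group = len(sentences) // N_GROUPS

if __name__ == "__main__":
    print(len(item_types), len(sentences), sentences_per_group)
```

Crossing the factors in this way gives the sixteen item types, 128 sentences and 64 sentences per group stated in the Method.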
Results and discussion
The average verification times for the correct answers are presented in Table 1.

Table 1. Mean verification times (ms) for the two relations.

Relation                 Mean verification time (ms)
default order causal     3547
reversed order causal    3833

Sentences expressing a default order causal relation were processed 286 ms faster than sentences expressing a reversed order causal: min F'(1,23) = 10.19, p < .01. The results show that sentences that express the real world cause as the antecedent clause and the real world consequence as the consequent clause are processed more quickly than sentences in which there is no such correspondence. This confirms the hypothesis that default order causal relations are more fundamental than reversed order causal relations.

1.3 Evidence for the differences between the relations in a corpus analytic study
In this section we will analyze a corpus of newspaper texts with respect to the occurrence of the different relations discussed in Section 1.1. The question is whether the different kinds of relation are found in natural texts, whether the different kinds of relation differ in their frequency of occurrence and whether these differences can be explained in terms of the characteristics of the relations. Two hundred and eleven although-sentences from the Dutch newspaper Volkskrant of 1993 were collected. The sentences were analyzed with respect to the nature of the relation they expressed. These analyses were performed by three judges. Differences in the analyses were discussed until agreement was reached. The report of the corpus analysis is based on Noordman and van Rijswijk (1997). What can be predicted on the basis of the characteristics of the relations about the frequency of occurrence of the relations? It has been argued that default order causal relations are less complex than reversed order causal relations. Therefore, it is expected that default order causal relations occur more frequently than reversed order causal relations. This prediction rests on the assumption that less complex linguistic structures occur more frequently than more complex structures. There is evidence for this assumption. For example, the number of syllables of the words and the morphological complexity of the words correlate negatively with the frequency of occurrence of
the words in a language (Baayen 1989; Koehler 1986; Nettle 1995; Zipf 1935, 1949). Unmarked adjectives, which from a cognitive point of view are more fundamental and from a morphological point of view are in general less complex, occur more frequently than marked adjectives (see the CELEX corpus; Baayen, Piepenbrock & van Rijn 1993). The second prediction refers to the order of the main and subordinate clause in the sentence. There is no general conclusion in the literature concerning the order of the clauses. Ramsay (1987) found that preposed if-clauses and when-clauses occur more frequently than preposed main clauses. On the basis of research on comprehension and acquisition of complex sentences (Clark & Clark 1977), we may say that preposed main clauses are easier than preposed subordinate clauses. Accordingly, we could hypothesize that preposed main clauses occur more frequently than preposed subordinate clauses. This has been confirmed for because-sentences by Renkema (1996). Thompson (1985) found that preposed subordinate clauses that express a purpose occur less frequently than postposed subordinate clauses. However, she argues correctly that such a comparison does not make sense, because the two kinds of sentence have different functions. On the basis of the functions of the subordinate clause and main clause of an although-sentence, we expect that the preposed subordinate clauses occur more frequently, because the subordinate clause expresses an expectation that is denied by the main clause. It makes sense that the expectation is denied only if it is evoked in the first place. These predictions can be considered as reflections of a preferred conceptual strategy in reasoning (Noordman & Vonk 1998). That preferred strategy is considered as hypothetical and its predictions can be tested in the corpus. The conceptual strategy in reasoning is that we preferably reason from causes to consequences, i.e.
that we deduce consequences from causes rather than causes from consequences. The basis for this strategy is our sensorimotor experience in the world. In fact, we learn the notion of causality and we learn causal relations by acting in the world, by observing the co-occurrences between causes and consequences in the world and by observing that causes precede consequences. If the cause-consequence order is conceptually more fundamental than the consequence-cause order, it seems likely that we prefer to reason from cause to consequence instead of from consequence to cause. If a sentence expresses a reasoning from cause to consequence, illustrated by the underlying expectation in (1), that sentence should be easier to understand than a sentence that expresses a consequence to cause reasoning process, illustrated by the underlying expectation in (2). Similarly, we should expect
that there is a preference to produce sentences that express a cause-consequence reasoning rather than sentences that express a consequence-cause reasoning. The preference for cause-consequence reasoning can in fact be considered to consist of two correspondences of the language with the world, or rather with our knowledge of the world: a conceptual correspondence and a linear correspondence. There is a conceptual correspondence if what in (our knowledge of) the world is the cause and consequence is expressed in the sentence as antecedent and consequent respectively. This is realized, for example, in the default order causal relation underlying sentence (1). In the reversed order causal relation of sentence (2), there is no such correspondence. The conceptual correspondence is expressed in the propositional representation of the sentence: the proposition that is expressed as antecedent corresponds to the cause in our knowledge of the world; the proposition that is expressed as the consequent corresponds to the consequence in our knowledge of the world. There is a linear correspondence if the cause in the underlying expectation is presented in the first clause and the consequence (both what is expected and its denial) in the second clause. In that case, the information on the basis of which the expectation is evoked is presented first, and what is expected together with the denial of what is expected is presented second. Both sentences (1) and (2) exhibit this linear correspondence. The first clause expresses the antecedent of the expectation and the second clause the denial. This is different in sentences (4) and (5).
(4) John failed the exam, although he had worked hard.
(5) John had worked hard, although he failed the exam.
The linear correspondence refers to the surface representation of the sentence: the first clause corresponds with the antecedent of the expectation and the second clause with the denial. According to the linear correspondence, the surface order of 'antecedent of expectation-denial' is preferred to the order of 'denial-antecedent of expectation'. These considerations lead to the following predictions. First, default order causal relations (as in sentences (1) and (4)) occur more frequently than reversed order causal relations (as in sentences (2) and (5)). Second, subordinate-main clause order (as in sentences (1) and (2)) occurs more frequently than main-subordinate clause order (as in sentences (4) and (5)). Since I have no predictions with respect to speech act relations, and since there were only two sentences expressing a speech act relation, they will not be discussed in this chapter.
An example of each of the three relations is given in (6), (7), and (8).
(6) Tourists who will go to Croatia this summer will not notice anything about the refugees who are presently still housed in hotels. The Croatian government will do everything to transfer the estimated eighty thousand refugees to non-tourist areas. This is what the Croatian Assistant Secretary of tourism, N. Bulle, said at the holiday fair that was opened in Utrecht last Tuesday. Although the big Dutch tour operators are avoiding Croatia and Slovenia this summer, both countries do everything to restore the holiday country image they had before the civil war. The former Yugoslavian federal states badly need the foreign currency to restore their damaged infrastructure and monuments and to get the economy going again. "Tourism is simply the shortest way to get the foreign currency" said a representative of the Croatian embassy.
The although-sentence in (6) expresses a reversed order causal relation. The underlying expectation in (6) is not: 'if big Dutch tour operators are avoiding Croatia and Slovenia, these countries do not do everything to restore the holiday country image'; on the contrary. The underlying expectation is: 'if big Dutch tour operators are avoiding Croatia and Slovenia, you may infer that these countries did not do everything to restore the holiday country image', where 'not doing everything to restore the image' is a cause for 'avoiding Croatia and Slovenia'.
(7) The leader of the Bosnian Serbs, Radovan Karadzic, expects heavy opposition of the rank and file of his party against the peace plan for Bosnia. On his way back from Geneva, where he assented to the constitution proposals of the mediators Lord Owen and Cyrus Vance, Karadzic said in Belgrade last Wednesday that he nevertheless thought that the parliament of the unilaterally proclaimed Serbian republic in Bosnia would agree with the plans. The peace that the Geneva conference was supposed to achieve still seemed far away on Wednesday.
Although Sarajevo experienced a relatively quiet day, there were heavy fights between Serbs and Muslims and between Muslims and Croats in other places. In Gornji Vakuf, there was the first British casualty last Wednesday, a 26-year-old UN soldier who was shot to death behind the steering wheel of his military vehicle.
The although-sentence in (7) expresses a concession. The subordinate clause is an argument against the claim that 'the peace seemed far away' and the main clause is an argument in favor of that claim.
(8) The League against Cursing came into conflict with the Ohra insurance company because of a TV commercial. The heavenly styled commercial shows a long queue of people in front of the gate of heaven. Peter passes by searching for many celebrities, but for the person with an Ohra insurance policy, who is at the back of the queue, all doors open without any problems. "Well, with Ohra you are somebody ... " the commercial finishes with a sigh. Although no wrong word is uttered anywhere, the League received many complaints from shocked Christian supporters. According to the League, one may only talk about matters like the Last Judgement with "holy esteem and the greatest respect". "You should not make light of it, or you would hurt many Christians in their deepest religious experiences. Those commercial makers would not dare to make such jokes about other religions, because that might have entirely different consequences."
The although-sentence in (8) expresses a default order causal relation. The underlying expectation is 'if the commercial does not contain any wrong words, there will be no complaints', where offensive language is the cause for complaints.

Table 2. Frequency of occurrence of although-sentences according to main-subordinate order and type of relation.

                               denial of expectation
Clause order                   default order causal    reversed order causal    concession
Although subordinate, main     101                     20                       26
Main, although subordinate     45                      10                       7
Results and discussion
The frequency of occurrence of the different types of although-sentences is presented in Table 2. As predicted, default order causal relations occur much more frequently than reversed order causal relations (146 vs. 30; chi-square = 76.45; df = 1; p < .001). The second prediction is confirmed as well: the subordinate-main order occurs much more frequently than the main-subordinate order (147 vs. 62; chi-square = 34.57; df = 1; p < .001). There is no difference between the three kinds of relation in this respect: chi-square = 1.42, df = 2, p = .49. The preference for the subordinate-main clause order was predicted in particular for the default order causal and the reversed order causal relations, because only for those relations is there an expectation that is first evoked and
On the production of causal-contrastive although-sentences in context 167
then denied. Concessives were supposed to be more symmetric, because both main clause and subordinate clause relate to the same contextually given proposition. This point will be discussed again in the next section. The results strongly support the preference for reasoning from cause to consequence. First, default order causal relations, in which the real world cause and consequence are expressed in the antecedent and consequent of the sentence, respectively, are more frequent in the corpus than reversed order causal relations, in which the antecedent expresses the real world consequence. This was called conceptual correspondence. Second, the fact that the order of subordinate-main clause occurs more frequently than the reversed order indicates the preference to mention first the cause for the expectation and subsequently the negation of that expectation. This was called linear correspondence. In the subordinate-main order, there is a linear correspondence between the order of the clauses and the order of cause for the expectation and (negated) consequence. In a default order causal relation, the consequence is an event in the world; in a reversed order causal relation the consequence is a conclusion about the world.
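The chi-square values reported for Table 2 can be re-derived from the cell frequencies. The sketch below is an illustration only: it assumes that the two frequency comparisons are goodness-of-fit tests against equal expected frequencies and that the third test is a test of independence on the 2 x 3 table. The function names are hypothetical and not part of the original study.

```python
import math


def chi_square_equal_frequencies(observed):
    """Goodness-of-fit statistic against equal expected frequencies (assumed test)."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi_square_independence(table):
    """Pearson chi-square statistic for a rows x columns frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Table 2: rows = clause order, columns = default, reversed, concession.
table_2 = [
    [101, 20, 26],  # Although subordinate, main
    [45, 10, 7],    # Main, although subordinate
]

# Default order causal (101 + 45) vs. reversed order causal (20 + 10).
relation_chi2 = chi_square_equal_frequencies([146, 30])
# Subordinate-main (row 1 total) vs. main-subordinate (row 2 total).
order_chi2 = chi_square_equal_frequencies([sum(table_2[0]), sum(table_2[1])])
# Clause order by kind of relation.
interaction_chi2 = chi_square_independence(table_2)

# Upper-tail probabilities: erfc(sqrt(x / 2)) for df = 1, exp(-x / 2) for df = 2.
relation_p = math.erfc(math.sqrt(relation_chi2 / 2))
order_p = math.erfc(math.sqrt(order_chi2 / 2))
interaction_p = math.exp(-interaction_chi2 / 2)

print(f"relation type:     chi-square = {relation_chi2:.2f}, p = {relation_p:.3g}")
print(f"clause order:      chi-square = {order_chi2:.2f}, p = {order_p:.3g}")
print(f"order by relation: chi-square = {interaction_chi2:.2f}, p = {interaction_p:.2f}")
```

Under these assumptions the three statistics agree with the reported values (76.45, 34.57 and 1.42), which also confirms the row and column totals of Table 2.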
1.
Thematic constraints for the different relations
This section addresses the question of how the different relations are embedded in their context. How is the information in the main clause and in the subordinate clause related to the preceding and subsequent context? This question can also be phrased from a processing point of view. Suppose that the speaker/writer at a particular moment in the discourse, given what (s)he has just uttered, has decided to formulate an expectation and its denial, and to express this in terms of an although-sentence. A number of conceptual choices have then been made already, for example, the kind of conceptual relation that is to be expressed, the expectation and the contrast the speaker wants to make, and, consequently, the content of the main and subordinate clauses. Other choices have yet to be made, in particular the choice of the order of the main clause and the subordinate clause. And that choice may depend on the thematic development of the discourse. What are the factors underlying the thematic structure of the sentence? How does an although-sentence function in the thematic structure of the discourse? Specific predictions can be formulated on the basis of the role of the main clause and subordinate clause and on the basis of local continuity constraints.
168 Leo G.M. Noordman
If we assume that the main clause in a complex sentence expresses the most important information, one can expect that the subsequent text is a thematic continuation of the main clause rather than of the subordinate clause. There are more specific arguments for this prediction that are based on the character of although-sentences that express a denial of expectation. First, in these sentences, the subordinate clause evokes an expectation that is denied by the main clause. It then seems rather obvious that the writer will continue to discuss this denial of expectation, by giving, for example, the reason why the expectation is denied or by discussing the consequences of the denial. Second, an argument can be derived from the analysis of but-sentences given by Spooren (1989). Denials of expectation and concessive oppositions can be expressed by but-sentences as well as by although-sentences. Spooren demonstrated that in but-sentences that express denials of expectation and concessions, the subsequent text tended to be a continuation of the but-clause rather than of the other clause. We assume that a denial of expectation or a concession can be expressed both by 'p but q' and by 'although p, q'. Consequently, it is expected that the text following an although-sentence is a continuation of the main clause rather than of the subordinate clause. These arguments are in agreement with the theories by Mann and Thompson and Elhadad and McKeown discussed above; they can be considered a prediction derived from these theories. The second prediction regards the local continuity of the text. In general, consecutive sentences in a text show local continuity. Each sentence is related to its predecessor. This can be realised, for example, by the fact that what is new information in one sentence is given information in the subsequent sentence.
Consequently one may predict that in the construction 'Although subordinate, main' the subordinate clause will be mainly connected to the preceding context and the main clause to the subsequent context. In the construction 'main, although subordinate' the subordinate clause will be mainly connected to the subsequent context and the main clause to the preceding context. The combined predictions with respect to thematic continuity, based on the function of the main and subordinate clause and on the basis of local continuity, are summarised in Table 3.

Table 3. Prediction of thematic continuity of the although-sentences with their preceding and subsequent context, in terms of the main clause effect and the local continuity effect

                      subordinate-preceding     main-preceding            subordinate-subsequent    main-subsequent
subordinate-main      local continuity effect   -                         -                         main clause effect, local continuity effect
main-subordinate      -                         local continuity effect   local continuity effect   main clause effect

Two remarks should be made with respect to the concessive although-sentences. The arguments for thematic continuity apply both to denials of expectation and to concessive sentences, except one. That was the argument that when an expectation is evoked by the subordinate clause and is denied in the main clause, it is likely that the subsequent discourse gives the justification or the consequence of its denial. This argument is specific to the denials of expectation. So, it may be the case that the continuity of both the subordinate clause and the main clause with the subsequent context is more equal than in the case of denials of expectation. The second remark is that in case of a concessive although-sentence, there is a contextually given proposition with which both the subordinate clause and the main clause are connected. Both the main clause and the subordinate clause evoke expectations. In that sense, concessives are more symmetric than denials of expectation. Consequently, one may expect that the tendency that only the first clause will be related to the preceding context will be less true; both the main clause and the subordinate clause will be related to the preceding context. The continuity of both the subordinate and main clause with the preceding context will be more equal than in the case of denials of expectation.
Results
The materials consisted of a selection of 83 sentences from the newspaper corpus discussed in the previous section. The thematic continuity of the main and subordinate clauses with both the preceding and the subsequent context was analysed. In this analysis it was judged whether the main clause and the subordinate clause were related as regards their content to the preceding and subsequent context. Is there continuity between the message in the clauses and in the context; does the context follow up on what is communicated by the
clause, and the other way around? As a check on this decision, we tried to identify the relation between the clause and the context, using lists of relations such as those presented by Mann and Thompson (1986, 1988). For each sentence four dichotomous scores were obtained. Consider the earlier examples. In text 6 the subordinate clause deals with the Dutch tour operators. Neither the preceding context nor the subsequent context deals with these tour operators. The main clause deals with the efforts of the two countries to restore their image, which is in continuity with both the preceding and the subsequent contexts. In text 7, the quiet day in Sarajewo in the subordinate clause is contrastively related to the preceding sentence. The heavy fights in the main clause illustrate the absence of peace in the preceding context. The subsequent context is a direct continuation of the heavy fights in the main clause; it does not follow up on the subordinate clause. In text 8, the subordinate clause (there is no wrong word in the commercial) follows up on the preceding context. The preceding context does not deal with the complaints in the main clause. The subsequent context elaborates on the complaints in the main clause; it does not follow up on the 'no wrong word is uttered' in the subordinate clause.

Table 4. Proportion of sentences that show a thematic continuity of the main clause and the subordinate clause with the preceding context and the subsequent context, separately for the different orders of the clauses and the different relations. The number of sentences is within parentheses.

                     denial of expectation
                     default order causal          reversed order causal         concession
                     subor.-main   main-subor.     subor.-main   main-subor.     subor.-main   main-subor.
                     (16)          (17)            (16)          (10)            (17)          (7)
subor.-preceding     .75           .06             .31           .10             .88           .43
main-preceding       .56           .94             .56           .80             .71           .86
subor.-subsequent    .19           .12             .00           .50             .29           .57
main-subsequent      .94           .88             .94           .40             .88           .71

Table 4 presents the proportion of sentences that showed thematic continuity of the main and subordinate clauses with the preceding and subsequent context. The thematic continuity scores differed for the three kinds of relation: F(6, 231) = 2.97; p < .01. Therefore, the results will be discussed for each relation separately.
Default order causal relations
The thematic continuity scores for the four conditions of the sentence 'Although subordinate, main' differ significantly: Cochran Q = 19.29; p < .001. Both predictions were confirmed. The high continuity of the main clause with the subsequent context illustrates both the main clause effect and the effect of local continuity. The fact that the subordinate clause is more highly connected to the preceding context than the main clause also confirms the local continuity hypothesis. The same effects were observed for the sentence 'main, although subordinate'. First, the subsequent text has the highest thematic continuity with the main clause. Second, in agreement with the local continuity hypothesis, the preposed main clause is more strongly connected with the preceding context and the postposed subordinate clause with the subsequent context. An additional finding is that the main clause is not only highly connected with the subsequent context, but also with the preceding context. The preceding context anticipates the main clause more than the subordinate clause (McNemar: p = .01). So, the main clause effect not only regards the subsequent context, but also the preceding context. Both constructions show the same effects of the main clause and of local continuity. But the roles of the main and subordinate clauses are quite different in the two constructions. The two constructions serve quite different thematic functions. The sentence 'Although subordinate, main' functions as a hinge joint: the first clause (the subordinate clause) is connected to the preceding context and the second clause (the main clause) is connected to the subsequent context. In contrast, the sentence 'main, although subordinate' shows low continuity of the subordinate clause with the preceding context; the main clause is highly connected both with the preceding and the subsequent context.
These observations show that these sentence forms correspond to quite different decisions on the part of the writer in producing an although-sentence. They can be described in terms of the following complex writing strategy for an although-sentence: 'Formulate the information that is thematically connected with the preceding context in the first clause'. Then the choice of the main-subordinate order depends on considerations of thematicity as follows: 'If this information, that is connected to the preceding context, does not remain the theme, formulate this information as the subordinate clause. It is then the basis
for the expectation that is going to be denied in the main clause and the main clause will be the theme of the subsequent text. On the other hand, if the information that is connected to the current context does remain thematic, formulate that information as the main clause, even though the information on the basis of which the expectation is evoked (the subordinate clause) is formulated as the second clause'. This last possibility is less preferred. It occurs less frequently; in our corpus only in 45 of the 146 cases (see Table 2). This supports the idea that a writer in general first evokes an expectation and then denies it. This is in agreement with the earlier discussed preference to reason from cause to consequence. The question arises whether it is possible to express quantitatively the different factors that determine the thematic continuity of the although-sentences. This is the more interesting because the factors are partially confounded as can be seen from Table 3. There are in fact four factors involved. The main clause effect specifies the first factor: the subsequent context is thematically related to the main clause (factor a). The results indicated that, in addition, the main clause is also thematically related to the preceding context (factor b). The local continuity effect can be split up into two factors: the second clause in a complex sentence is thematically related to the subsequent context (factor c), and the first clause is thematically related to the preceding context (factor d). Each factor expresses a preference of how to formulate the although-sentence. The factors express tendencies that account for the thematic continuity. The four factors constitute a model for the thematic continuity of the sentence with its context. The model is a purely descriptive device. 
It will be used to find out whether the set of four variables can account for the continuity data for the different relations and whether there are differences between the relations. The factors are considered as strictly additive. If we can specify the values of these factors in such a way that they account for the thematic continuity scores, then they give insight into the relative importance of the factors for the continuity of the text. The numerical values of the four factors can be estimated on the basis of the data. They are presented in Table 5. The correlation between the model (the four factors) and the data is significant; r =.98; p < .001.

Table 5. Parameters for the thematic continuity and fit of the data with the model

                                     denial of expectation
                                     default order causal   reversed order causal   concession
parameter a                          .79                    .39                     .33
parameter b                          .34                    .50                     .25
parameter c                          .00                    .50                     .20
parameter d                          .48                    .25                     .38
fit of the data with the model: r    .98                    .99                     .92

a: continuity of main clause with subsequent context
b: continuity of main clause with preceding context
c: continuity of second clause with subsequent context
d: continuity of first clause with preceding context

Two remarks should be made with respect to these factors. First, factor c, independently of factor a, does not play a role. There is no continuity of the second clause with the subsequent context, if the second clause is not the main clause. That was clear for the 'main, although subordinate' sentences. Second, the most important factor appears to be the tendency to formulate the although-sentence in such a way that the main clause is thematically related to the subsequent context. The main result is that the model describes the thematic continuity quite well. The factors account for the thematic continuity. Application of the model to the reversed order causal sentences and concessive sentences will show whether there are different preferences involved regarding the thematic continuity of these sentences.
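The additive four-factor model can be checked against Tables 4 and 5 with a few lines of code. The sketch below is a reading of the model, not the author's own procedure: it assumes that the predicted continuity of each cell is simply the sum of the factors that apply to it (for example, a + c for a main clause in second position with the subsequent context), and that the fit is the Pearson correlation between the eight predicted and eight observed scores per relation. The assignment of the Table 5 values to the three relations is likewise taken from the printed table as read here. All names in the code are hypothetical.

```python
import math

# Observed proportions from Table 4. Order of the eight cells per relation:
# subordinate-main order: sub-preceding, main-preceding, sub-subsequent, main-subsequent;
# main-subordinate order: the same four cells.
OBSERVED = {
    "default order causal": [.75, .56, .19, .94, .06, .94, .12, .88],
    "reversed order causal": [.31, .56, .00, .94, .10, .80, .50, .40],
    "concession": [.88, .71, .29, .88, .43, .86, .57, .71],
}

# Parameters (a, b, c, d) from Table 5 (column assignment is an assumption).
PARAMETERS = {
    "default order causal": (.79, .34, .00, .48),
    "reversed order causal": (.39, .50, .50, .25),
    "concession": (.33, .25, .20, .38),
}


def predict(a, b, c, d):
    """Sum the factors that apply to each cell (assumed strictly additive)."""
    return [
        d,      # subordinate-main: subordinate is first clause -> preceding
        b,      # subordinate-main: main clause -> preceding
        0.0,    # subordinate-main: first, subordinate clause -> subsequent
        a + c,  # subordinate-main: main clause is second clause -> subsequent
        0.0,    # main-subordinate: second, subordinate clause -> preceding
        b + d,  # main-subordinate: main clause is first clause -> preceding
        c,      # main-subordinate: subordinate is second clause -> subsequent
        a,      # main-subordinate: main clause -> subsequent
    ]


def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


FIT = {
    relation: pearson_r(predict(*PARAMETERS[relation]), OBSERVED[relation])
    for relation in OBSERVED
}

for relation, r in FIT.items():
    print(f"{relation}: r = {r:.2f}")
```

Under these assumptions the correlations come out at .98, .99 and .92, the fit values listed in Table 5, which suggests that this additive reading is at least consistent with the reported parameters.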
Reversed order causal relations
The analysis of the continuity data for the reversed order causal relations indicates that the continuity scores differ between the four conditions of the 'although subordinate, main' sentences; Cochran Q = 27.39, p < .001. The main clause hypothesis is confirmed again. The subsequent text has the highest thematic continuity with the main clause. The local continuity hypothesis is also confirmed, although to a lesser extent than for the default order causal relations. The preposed subordinate clause is thematically more related to the preceding context than to the subsequent context, but the thematic continuity with the postposed main clause is stronger. The postposed main clause has the strongest continuity with the subsequent context.
The differences for the four conditions of the 'main, although subordinate' sentences are also significant; Cochran Q = 7.78; p < .05. However, the main clause hypothesis is not confirmed. Confirming the local continuity hypothesis, the preposed main clause is more highly connected with the preceding context than the postposed subordinate clause. The continuity of the subsequent context with the postposed subordinate clause is not stronger than with the main clause. In this case, the effect of local continuity and the main clause effect cancel each other out. In addition, it is clear that for both sentence constructions, the continuity of the main clause with the preceding context is strong: the preceding context is more highly connected to the main clause than to the subordinate clause (McNemar; p < .001). Can the four-factor model that we applied to the default order causal relations account for the continuity of the reversed order causal relations? If so, do the parameters have different values and, if so, what is then the interpretation? With the estimated values of the parameters as presented in Table 5, we obtained a correlation of r = .99; p < .001 between the model and the data. Accordingly, the model describes the continuity scores quite well. The values of the parameters are slightly different from the parameters for the default order causal relations. Two differences between the reversed order causal relations and default order causal relations are worth mentioning. First, in case of a 'main, although subordinate' reversed order causal relation, the continuity of the postposed subordinate clause with the subsequent context is relatively strong. Accordingly, parameter c is high. Second, in case of an 'although subordinate, main' reversed order causal relation, the continuity of the preposed subordinate clause with the preceding context is rather weak. Accordingly, parameter d is rather low. A rather speculative interpretation is the following.
In an although-sentence that expresses a reversed order causal relation, the subordinate clause expresses what is conceptually the consequence and the main clause expresses what is conceptually the cause. Accordingly, the strong continuity of the subordinate clause in the sentence 'main, although subordinate' with the subsequent context indicates that the subsequent context is a continuation of the consequence. The low continuity of the subordinate clause in the 'although subordinate, main' sentence with the preceding context indicates that the preceding context preferably does not deal with the consequence, but with the cause. What we see is that for sentences that express a consequence-cause reasoning, there is in the relation of the sentence with its context a slight preference for cause-consequence reasoning. But this preference can be cancelled by the main clause effect, as is evidenced by the high continuity of the
'although subordinate, main' sentence with the subsequent context. This post hoc analysis needs further empirical study.
Concessive relations
The thematic continuity scores for the sentence 'although subordinate, main' differ between the four conditions: Cochran Q = 17.04; p < .001. The effect of the main clause is observed again. The subsequent context has a higher continuity with the main clause than with the subordinate clause. This is also evidence for the local continuity effect. The hypothesis of local continuity gets additional support: the preposed subordinate clause has a higher continuity with the preceding context than with the subsequent context, but this continuity is about the same as the continuity of the main clause with the preceding context. The differences between the four conditions of the sentence 'main, although subordinate' are not significant; Cochran Q = 2.72; p = .43. It should be noted that there were only seven sentences of this type in the corpus. It was suggested that both clauses in a concessive opposition are related to a proposition that is evoked by the preceding context. In that sense, a concessive relation was expected to be more symmetric than a default order causal or reversed order causal relation, which both express a denial of an expectation. Indeed, the continuity scores for the main clause and the subordinate clause with the previous context are more similar for the concessive relations than for the default order causal and reversed order causal relations. Similarly, the continuity scores for the main clause and the subordinate clause with the subsequent context are more similar for the concessive relations than for the default order causal and reversed order causal relations. This was expected because concessions do not express a denial of expectation as do default order causal and reversed order causal relations. How well does the four-factor model describe the data? The parameters were estimated on the basis of the data. They are presented in Table 5. The correlation between the model and the data was significant: r = .90; p = .001.
The values of the parameters are in general lower than for the default order causal and reversed order causal relations. Since each parameter expresses an asymmetry, this fact illustrates that the concessive relations are more symmetric in their continuity with the context than the other relations.
3.
Conclusion
This chapter is concerned with the question of how a particular kind of complex reasoning is expressed in language. The kind of reasoning that is investigated deals with causation and negation. Linguistically, the reasoning is expressed in sentences that are connected by the conjunction although; these are causal-contrastive sentences. When we express a causal-contrastive sentence, we have to make a number of decisions. The causal relation can be expressed in a number of ways; the evoked expectation and its contrast can be expressed in different orders; there are several ways in which the main clause and the subordinate clause can be related to the embedding context. The question addressed in this chapter is how causal-contrastive sentences are expressed in language. The aim of the analysis is to get insight into the reasoning process as it is reflected in causal-contrastive sentences. The main findings are that the majority of although-sentences express a causal relation in the default order of cause-consequence, and that the dominant order of the clauses is subordinate clause first, and main clause second. These observations can be considered as a preferred strategy to express causal-contrastive thought. They can be interpreted as correspondences between language structure and cognitive structure. The assumption is that causes precede consequences in the world. Consequently, we conceive causal relations as cognitive structures in which causes precede consequences. That is indeed the way in which we experience causality in the real world and the way in which we learn the concept of causality. The results concerning the although-sentences can be formulated in terms of two principles of correspondence between linguistic structure and cognitive structure. The first is what we call conceptual correspondence. That reflects the observation that default order causal relations occur more frequently than reversed order causal relations.
In although-sentences that express a default order causal relation, the (causal) expectation is expressed as an expectation that is based on the cause. In the although-sentences that express a reversed order causal relation, the causal expectation is expressed as an expectation based on the consequence. In that sense, the reasoning process develops from cause to consequence in the default order causal sentences and from consequence to cause in the reversed order causal sentences. The second correspondence principle is what we call linear correspondence. Linear correspondence deals with the order of the clauses in the surface structure of the sentence. It implies a correspondence between the order of the clauses and the order of the events
in the expectation, the clause expressing the cause for the expectation preceding the clause expressing its denial. Evidence for this linear correspondence was the observation that the subordinate clause, which expresses the cause as the basis for the expectation, preferably precedes the main clause. In addition to these two principles that express a preferred correspondence between linguistic structure and cognitive structure, two other principles were found that affect the thematic continuity between a complex causal-contrastive sentence and its context. The first principle is what we called the main clause effect. This implies that the subsequent context is a thematic continuation of the main clause. In addition, it was observed that there is a tendency for the main clause to be a continuation of the preceding context as well. The second principle is what we called the local continuity effect. This implies that the highest continuity is obtained between the first clause and the preceding context, and between the second clause and the subsequent context. This chapter deals only with although-sentences, but the relations expressed by although-sentences can be expressed by other conjunctions as well, e.g., but, though (Dutch al). These conjunctions behave in a different way. They occur preferably in the second clause. It is an empirical question to what extent the principles discussed in this chapter apply to these other sentences. Is there also for the other kinds of concession a preference to express the expectation in the order of cause-consequence? If a conjunction, i.e., though or the Dutch al, occurs predominantly in the second clause, is this because of high continuity of the main clause and the subordinate clause with the preceding and the subsequent context, respectively, which overrules the linear correspondence, or are other factors involved? Obviously, further study is needed on the contextual constraints of the different types of concessive sentences.
A few remarks should be made with respect to the thematic continuity scores for the different kinds of relation that are expressed by although-sentences. The type of although-sentence that occurs most frequently is the causal-contrastive sentence that expresses a default order causal relation and that has a preposed subordinate clause. This type of although-sentence reflects the two correspondence principles and the two thematic principles. Thematically, this sentence functions as a hinge joint: the preposed subordinate clause is highly connected with the preceding context and the main clause is highly connected with the subsequent context. This sentence construction occurs more frequently than the construction with a preposed main clause. In that construction, the main clause has the highest connection both with the preceding context and the subsequent context. These observations can be formulated in terms of a writing
strategy: Formulate in the first clause the information that is highly connected to the preceding context. If this information does not remain the theme in the subsequent context, formulate it as the subordinate clause. If this information will remain thematic, formulate it as the main clause. The reversed order causal relations showed two interesting facts that contrasted with the default order causal relations: for the 'although subordinate, main' sentences, the low continuity of the subordinate clause with the preceding context, and for the 'main, although subordinate' sentences, the high continuity of the subordinate clause with the subsequent context. Both effects can be interpreted as a reflection of a reasoning process from cause to consequence, but they need further empirical support. The concessive relations are strictly speaking not causal relations, at least not in the sense that an expectation is based on the subordinate clause and denied by the main clause. It was argued that both clauses in a concessive relation are related to a proposition that is evoked in the preceding context. In that sense, a concessive relation should be more symmetric than a default order causal relation. That was reflected indeed in the thematic continuity scores. The differences between the two clauses in thematic continuity with the preceding context as well as with the subsequent context were much smaller than for the default order causal relations. Both the main clause effect and the local continuity effect have two aspects, depending on whether the thematic continuity with the preceding context or with the subsequent context is at issue. This yielded four parameters. The four parameters accounted very well for the thematic continuity scores. They can be considered as the underlying factors that play a role in formulating the although-sentences.
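The writing strategy formulated in this section can be read as a small decision procedure. The sketch below is an illustration under simplifying assumptions: the two pieces of content are already chosen, one of them is known to be connected to the preceding context, and the only remaining decision is the clause order. The function and parameter names are hypothetical and do not come from the chapter.

```python
def choose_construction(linked_info_remains_theme: bool) -> str:
    """Pick the clause order for the context-linked information.

    The context-linked information always goes into the first clause.
    If it stays the theme, it is realised as the main clause; otherwise
    it is realised as the (preposed) subordinate clause.
    """
    if linked_info_remains_theme:
        return "main, although subordinate"
    return "although subordinate, main"


def realise(linked_info: str, other_info: str, linked_info_remains_theme: bool) -> str:
    """Build the although-sentence for two clause contents (lower-case, no final stop)."""
    construction = choose_construction(linked_info_remains_theme)
    if construction == "although subordinate, main":
        # Hinge-joint pattern: the subordinate clause links back to the
        # preceding context, the main clause sets up the subsequent text.
        sentence = f"although {linked_info}, {other_info}"
    else:
        # The main clause links back and also stays thematic afterwards.
        sentence = f"{linked_info}, although {other_info}"
    return sentence[0].upper() + sentence[1:] + "."


# Example (8): the 'no wrong word' information follows up on the preceding
# context, but the subsequent text is about the complaints.
hinge = realise(
    "no wrong word is uttered anywhere",
    "the League received many complaints",
    linked_info_remains_theme=False,
)
print(hinge)
```

For the clause contents of example (8) the sketch produces the 'although subordinate, main' order that the newspaper text actually uses. It deliberately ignores the frequency preference for that order and the conceptual choices that precede the ordering decision.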
Note
* The author would like to thank two anonymous reviewers for their helpful comments on an earlier draft of this article.
References
Baayen, R. H., Piepenbrock, R., & van Rijn, H. (1993). The CELEX Lexical Database (CD-ROM). Linguistic Data Consortium, University of Pennsylvania, Philadelphia.
Baayen, R. H. (1989). Corpus-based approach to morphological productivity. PhD Dissertation, Free University, Amsterdam.
Clark, H. H., & Clark, E. V. (1977). Psychology and Language. New York: Harcourt, Brace, Jovanovich.
Ducrot, O. (1984). Le dire et le dit. Paris: Les Éditions de Minuit.
Elhadad, M., & McKeown, K. R. (1990). Generating connectives. In H. Karlgren (Ed.), Papers presented to the 13th International Conference on Computational Linguistics, Vol. 3 (pp. 97-101). Helsinki: University of Helsinki.
Grevisse, M. (1986). Le bon usage. Paris: Duculot.
Grote, B., Lenke, N., & Stede, M. (1997). Ma(r)king concessions in English and German. Discourse Processes, 24, 87-117.
Helbig, G., & Buscha, J. (1991). Deutsche Grammatik. Berlin: Langenscheidt.
Koehler, R. (1986). Zur linguistischen Synergetik: Struktur und Dynamik der Lexik. Bochum: Brockmeyer.
Mann, W., & Thompson, S. A. (1986). Relational propositions in discourse. Discourse Processes, 9, 57-90.
Mann, W., & Thompson, S. A. (1988). Rhetorical structure theory: Toward a functional theory of text organization. Text, 8, 243-281.
Nettle, D. (1995). Segmental inventory size, word length, and communication efficiency. Linguistics, 33, 359-367.
Noordman, L. G. M. (1979). Inferring from Language. Berlin: Springer.
Noordman, L. G. M., & van Rijswijk, W. (1997). De functie van het voegwoord 'hoewel' voor de samenhang van tekst. Taalbeheersing, 3, 252-264.
Noordman, L. G. M., & Vonk, W. (1998). Memory-based processing in understanding causal information. Discourse Processes, 26, 191-212.
Oversteegen, E. (1997). On the pragmatic nature of causal and contrastive connectives. Discourse Processes, 24, 51-85.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of English usage. London: Longman.
Ramsay, V. (1987). The functional distribution of preposed "if" and "when" clauses in written narrative. In R. S. Tomlin (Ed.), Coherence and Grounding in Discourse (pp. 383-408). Amsterdam: Benjamins.
Renkema, J. (1996). Cohesion analysis and information flow: the case of "Because" versus "because". In C. Cremers & M. den Dikken (Eds.), Linguistics in the Netherlands (pp. 233-244). Amsterdam: Benjamins.
Sanders, T. J. M., Spooren, W. P. M., & Noordman, L. G. M. (1992). Toward a taxonomy of coherence relations. Discourse Processes, 15, 1-35.
Spooren, W. P. M. S. (1989). Some aspects of the form and interpretation of global contrastive coherence relations. PhD Dissertation, Nijmegen University, Nijmegen.
Sweetser, E. (1990). From Etymology to Pragmatics. Metaphorical and Cultural Aspects of Semantic Structure. Cambridge: Cambridge University Press.
Thompson, S. A. (1985). Grammar and written discourse: initial vs. final purpose clauses in English. Text, 5, 55-85.
Thompson, S. A., & Mann, W. C. (1987). Antithesis: a study in clause combining and discourse structure. In R. Steele & T. Threadgold (Eds.), Language Topics. Essays in honour of Michael Halliday (pp. 359-381). Amsterdam: Benjamins.
Traxler, M. J., Sanford, A. J., Aked, J. P., & Moxey, L. M. (1997). Processing causal and diagnostic statements in discourse. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 88-101.
Zipf, G. K. (1935). The psycho-biology of language. Boston: Houghton & Mifflin.
Zipf, G. K. (1949). Human behavior and the principle of the least effort: an introduction to human ecology. New York: Hafner.
CHAPTER 7

Beyond elaboration: the interaction of relations and focus in coherent text

Alistair Knott, Jon Oberlander, Michael O'Donnell and Chris Mellish
University of Otago/University of Edinburgh
1. Introduction
Many theories of discourse propose that a coherent text is one whose clauses, sentences and text spans (or perhaps the propositions expressed by these text units) stand in particular relations to one another. The basic motivation in these theories stems from the observation that a text is more than a sequence of independent units: whether a particular unit makes sense in a given discourse depends not only on this unit by itself, but also on its relationship with the other units in the discourse. This claim has been spelled out in many different ways, but there are two requirements that any such theory must meet before it has empirical content and can be tested against the facts.

Firstly, a particular set of relations must be specified. It is vacuous to say that texts cohere in virtue of the relations that hold between their constituent units unless we specify what these relations are. There are as many 'possible relations between text units' as there are pairs of text units, and clearly since not all pairs of text units are coherent, we must select only some relations from this set. We can refer to the task of choosing a suitable set of relations as the task of developing a theory of 'relation semantics'. Well-known theories of relation semantics include the set of 23 relations proposed in the original formulation of RST (Mann & Thompson 1988), the pair of relations DOMINANCE and SATISFACTION-PRECEDENCE proposed by Grosz and Sidner (1986) and the sets of conjunctive relations proposed by Halliday and Hasan (1976) and Martin (1983). Many of the papers in this volume are concerned with the task of
defining a single class of relation, or of distinguishing a number of similar relations from one another.

Secondly, a theory of relations must provide an account of whereabouts in a coherent text relations are expected to be found. This account must begin by specifying what the atomic units of the analysis are. (Are they sentences? Clauses? Propositions within clauses? Units larger than sentences?) It must also state in a general way what structure of relations between these units will suffice to ensure its coherence. Clearly a text can be coherent without there being the right kind of relation between each pair of atomic units. Adjacency, or proximity, is an important factor. Often a notion of compositionality is also invoked, whereby two adjacent units linked by a relation are taken to form a new, composite unit, which can itself be linked by relations to other units. A theory which specifies whereabouts in a coherent text we can expect to find relations can be termed a theory of 'span structure'. Many of the most influential theories of this kind (including RST and Grosz and Sidner's theory) adopt the compositionality assumption in some form, and construe a coherent text as a tree of text units, in which complex units are formed from smaller units between which relations hold.

A theory of span structure and a theory of relation semantics are two logically separable components of a theory of discourse coherence. But naturally, adopting a specific theory of span structure can place constraints on what would be a sensible choice of theory of relation semantics. In this paper, we consider a case in point. The simple and parsimonious theory of span structure proposed by RST necessitates the inclusion in its theory of relation semantics of a rather idiosyncratic relation called OBJECT-ATTRIBUTE ELABORATION. We consider a number of problems with this relation in its own right, and also a number of problems with RST's theory of span structure. We propose a revised account of relation semantics and span structure which omits OBJECT-ATTRIBUTE ELABORATION and addresses these problems.

To illustrate both the problems with RST and the new account of discourse structure, we will use naturally occurring and constructed texts in the genre of 'museum guidebook descriptions'. The problems with RST were originally noticed when we built a text generation system that produces texts in this genre using a straightforward implementation of the theory. The theory implemented in the system was modified as a result, to overcome these problems.
2. RST's theory of span structure
In RST's theory of span structure, relations hold between text spans. Most relations have 'nucleus-satellite structure'; one of the spans (the nucleus) is associated with the writer's main communicative goal, and the other one (the satellite) is there to help bring about this goal, or to provide subsidiary information. Atomic text spans are basically clauses. Complex text spans are structures called 'schema applications'. A schema application for a nucleus-satellite relation is a set of adjacent text spans (either simple or complex), one of which is a nucleus, and the rest of which are linked to this nucleus by applications of a given nucleus-satellite relation. An example is given in Figure 1. Nuc is a nucleus span; Sat1, Sat2 and Sat3 are satellite spans linked to this nucleus by the relation R. CS is the complex span which is formed as a result.
Figure 1. A schema application of the nucleus-satellite relation R (the complex span CS dominates Nuc, Sat1, Sat2 and Sat3)
In this paper, we will be considering three central assumptions underlying RST's theory of span structure.

(1) Compositionality. The first assumption relates to how the semantics of a complex text span are derived from the semantics of its constituent spans. The assumption is that a complex span comprising a nucleus and a number of satellites can be linked to another text span with a rhetorical relation if its nucleus span can be so linked; in other words, for the purposes of linking spans together, the semantics of a span reduces to the semantics of its nucleus. This assumption is implicit in RST's principal test for nuclearity, which specifies that the coherence of a text is largely preserved if the satellites in a given complex text span are removed, but is lost if its nucleus is removed. The assumption has been stated more explicitly by Marcu (1997), who calls it the 'strong compositionality' assumption.

(2) Continuous constituency. The second assumption relates to the distances over which relations are allowed to apply. Basically, RST requires that the nucleus N and satellite S of a relation R must either be adjacent text spans, or if not adjacent (as for instance in the case of Nuc and Sat3 in Figure 1), the text spans intervening between N and S must also be linked to N as satellites of the relation R.

(3) Tree structure. In a coherent text, each text span (except for the complex span which constitutes the entire text) must be involved in exactly one schema application. This ensures firstly that there can be no sub-spans in the text that aren't linked to any other spans, and secondly that there are no overlapping complex spans; basically, it specifies that a coherent text is a tree of schema applications.
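The three assumptions can be made concrete with a small data structure. The following is only a minimal sketch under our own assumptions, not the implementation used in any RST system: the names `Span`, `atom`, `schema`, `is_continuous` and `nuclear_clause` are hypothetical, clauses are identified by integer positions, and tree structure is assumed by construction (each span is built into one parent) rather than checked.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """A text span: atomic (one clause) or complex (a schema application)."""
    first: int                        # position of the first clause covered
    last: int                         # position of the last clause covered
    relation: Optional[str] = None    # relation name; None for atomic spans
    nucleus: Optional["Span"] = None
    satellites: List["Span"] = field(default_factory=list)


def atom(position: int) -> Span:
    """An atomic span covering a single clause."""
    return Span(position, position)


def schema(relation: str, nucleus: Span, satellites: List[Span]) -> Span:
    """A schema application: one nucleus plus satellites under one relation."""
    parts = [nucleus, *satellites]
    return Span(min(p.first for p in parts), max(p.last for p in parts),
                relation, nucleus, list(satellites))


def is_continuous(span: Span) -> bool:
    """Continuous constituency: the parts of every schema application
    cover a contiguous run of clauses with no gaps and no overlaps."""
    if span.nucleus is None:
        return True
    parts = sorted([span.nucleus, *span.satellites], key=lambda p: p.first)
    for left, right in zip(parts, parts[1:]):
        if right.first != left.last + 1:
            return False
    return all(is_continuous(p) for p in parts)


def nuclear_clause(span: Span) -> int:
    """Compositionality: for linking purposes a span reduces to its nucleus,
    so follow nuclei down to a single clause."""
    while span.nucleus is not None:
        span = span.nucleus
    return span.first


# The configuration of Figure 1: Nuc at position 0, three satellites after it.
fig1 = schema("R", atom(0), [atom(1), atom(2), atom(3)])
print(is_continuous(fig1), nuclear_clause(fig1))    # True 0

# A satellite at position 2 with an unrelated clause at position 1 in between.
print(is_continuous(schema("R", atom(0), [atom(2)])))    # False
```

Representing the text as nested `Span` objects makes overlapping complex spans hard to express at all, which is one way of seeing why the tree-structure assumption is usually left implicit in implementations.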
These assumptions have proven very successful in identifying all and only well-structured texts. We illustrate with part of a text produced by ILEX-2, a generation system which delivers a sequence of descriptions of artifacts in a tour of a museum gallery.

(1) (1) This jewel draws on natural themes for inspiration; (2) it is a remarkably fluid piece. (3) Indeed, Organic style jewels usually draw on natural themes for inspiration; (4) for instance the organic brooch we saw earlier looked crystalline.
Figure 2. RST analysis of Example 1 (tree diagram with the relation labels AMPLIFICATION, MOTIVATION and EXAMPLE over spans (1) to (4))
The structure for this text is given in Figure 2. By the compositionality assumption, the top-level AMPLIFICATION relation holds between the complex spans (1-2) and (3-4) in virtue of their respective nuclear spans, (1) and (3). The expansions of (1) with (2), and of (3) with (4), take place independently of the higher-level relation. By continuous constituency, satellite spans appear adjacent to their nuclei. By tree structure, each sub-span in the text is involved in exactly one schema application. In this case, adherence to these assumptions results in a well-structured text.
3. Some structural problems with ELABORATION
As well as being simple and parsimonious, RST's theory of span structure is able to account for the coherence of a large number of texts. However, the theory has been criticized from several perspectives. For instance, the assumption of tree structure has been questioned by Sibun (1992), and the assumption of continuous constituency has been questioned by Kittredge, Korelsky and Rambow (1991). Our central concern in this paper is to associate these structural problems with one RST relation in particular, namely OBJECT-ATTRIBUTE ELABORATION.¹
Mann and Thompson define this relation to hold between two spans if the nucleus 'presents' an object (i.e. contains a mention of it) and the satellite subsequently presents an attribute of that object. The precise meaning of 'attribute' is not clear, but any proposition which provides additional information about the object would seem to qualify. In the type of text which our system produces (a sequence of descriptions of a collection of related entities), this relation is heavily applicable, and the problems we note are thus quite widespread.

3.1 Discontinuous constituency
An initial problem is illustrated in the following text, taken from a museum guidebook.

(2) (1) In the women's quarters the business of running the household took place. (2) Much of the furniture was made up of chests arranged vertically in matching pairs (...). (3) Female guests were entertained in these rooms, which often had beautifully crafted wooden toilet boxes with fold-away mirrors and sewing boxes, and folding screens, painted with birds and flowers. (4) Chests were used for the storage of clothes (...)
In this text, an entity mentioned in the middle of the first paragraph, chests, becomes the central topic of the second paragraph. We can refer to this move pre-theoretically as a 'resumption'.² The move is clearly legitimate in the above context, and yet an analysis in terms of a tree of relations is difficult. The problem is that sentence (4) needs to be seen as the satellite of an ELABORATION relation, but the obvious nucleus for this relation, sentence (2), is not accessible. If we analyze sentences (2) and (3) as ELABORATIONS of sentence (1), as seems necessary, we have effectively closed off sentence (2) as the nucleus for further ELABORATIONS. In order to treat sentence (4) as an ELABORATION of sentence (2), we would have to analyze sentence (3) as being subordinate to sentence (2): this analysis seems inappropriate; moreover, it makes the position of the paragraph break hard to explain. Note that we cannot just ignore the relationship between sentences (2) and (4) in our representation of the text: it is only because the chests are mentioned in the former sentence that they are a relevant topic for discussion. To account for coherence in this case, it seems we must either abandon compositionality, in some circumstances, or adopt a notion of discontinuous constituency for text spans, or abandon the requirement that each subspan in a text is involved in at least one schema application.

A particularly common manifestation of this problem is in cases of parallelism within discourse structure. Especially in descriptive texts, it is common for a number of entities to be introduced sequentially in a sequence of spans, and then elaborated on in subsequent spans in the order of their introduction. Accounting for these subsequent mentions as ELABORATIONS of the spans where they were introduced is not possible without violating adjacency or compositionality constraints. Mann and Thompson acknowledge from the outset that RST cannot account for the constraints which apply in such contexts. McKeown (1985) deals extensively with cases of parallelism in text, although this account is not set in the context of a theory of coherence relations. Kittredge et al. (1991) give several examples of parallelism; indeed, in one case they identify ELABORATION as the relation responsible for the problem.
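The difficulty with Example 2 and with parallel structures can be checked mechanically. The sketch below is a deliberate simplification of our own: every sentence is treated as an atomic span, embedding is ignored, and a relation is written as a hypothetical `(nucleus, satellite)` pair of sentence numbers. The function `violating_links` is not part of any RST toolkit; it simply reports the links that break continuous constituency.

```python
from typing import Dict, Iterable, List, Set, Tuple

Link = Tuple[int, int]   # (nucleus sentence, satellite sentence)


def violating_links(links: Iterable[Link]) -> List[Link]:
    """Return the links whose nucleus and satellite are separated by a
    sentence that is not itself a satellite of the same nucleus."""
    links = sorted(links)
    satellites_of: Dict[int, Set[int]] = {}
    for nucleus, satellite in links:
        satellites_of.setdefault(nucleus, set()).add(satellite)

    bad = []
    for nucleus, satellite in links:
        between = range(min(nucleus, satellite) + 1, max(nucleus, satellite))
        if any(k not in satellites_of[nucleus] for k in between):
            bad.append((nucleus, satellite))
    return bad


# Example 2: (2) and (3) elaborate (1); (4) resumes the chests from (2).
print(violating_links({(1, 2), (1, 3), (2, 4)}))   # [(2, 4)]

# Parallelism: two entities introduced in (1) and (2), elaborated in (3) and (4).
print(violating_links({(1, 3), (2, 4)}))           # [(1, 3), (2, 4)]
```

In both cases the offending links are exactly the resumptions, which is the pattern the rest of the paper sets out to explain.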
3.2 Nuclearity and embedding
The preceding section presents a case where a 'context-free' theory of span structure undergenerates the space of possible texts. There are also cases where it overgenerates; again, these relate principally to the ELABORATION relation. There often seem to be difficulties in embedding ELABORATIONS within other relations. Consider this constructed text:

(3) (1) Arts-and-Crafts jewels tend to be elaborate. (2) However, this jewel has a simple form.

This text contains a CONCESSION relation whose nucleus is (2) and whose satellite is (1). In principle, we could expand either span with additional relations. But note what happens when we embed an ELABORATION under span (1):

(4) (1) Arts-and-Crafts jewels tend to be elaborate. (1a) They are often mass-produced. (2) However, this jewel is simple in form.

Sentence (1a) elaborates on (1) by providing more information about Arts-and-Crafts jewels. However, it also makes it hard to attach sentence (2) to sentence (1). Note that there is a coherent interpretation of the text, if (1a) is treated as expanding on the proposition that Arts-and-Crafts jewels are elaborate, for instance by arguing for it, or by providing an example, rather than simply as 'saying something else about Arts-and-Crafts jewels'. However, we have chosen the elaborating sentence to make these interpretations implausible. Besides, under these interpretations the embedded relation is no longer object-attribute ELABORATION; that is precisely our point.

Note also that the problem is not just due to difficulties with 'high-level' relations in general, or with 'left-branching' tree structures. Compare a text with different embedded relations:

(5) (1) Arts-and-Crafts jewels tend to be elaborate. (1b) Ornateness was the fashion at the turn of the century. (1c) And not just in jewelry either. (2) However, this jewel is simple in form.

The structure of this text is given in Figure 3. In this text, there are two levels of embedding, not just one: sentence (1) is related to sentence (1b) via an EXPLANATION relation, and sentence (1b) is itself related to sentence (1c) via an AMPLIFICATION relation. Nonetheless, the CONCESSION relation between sentences (1) and (2) is still intelligible. Note that there is no way that the relationship can be understood as applying between sentence (2) and the sentence immediately preceding it.

Figure 3. Analysis of example 5 (tree diagram: CONCESSION at the top level, with EXPLANATION and AMPLIFICATION embedded over spans 1, 1b and 1c)

Arguably, there may be a limit to the depth of embedding permissible for any relation, particularly for left-branching RS trees. We will discuss this idea more in Section 6. But ELABORATION's apparent resistance to even the simplest kind of embedding suggests that it is qualitatively different from the other relations.
4. Problems with ELABORATION as a coherence relation
We now turn to some problems with ELABORATION as a component of a theory of relation semantics.
4.1 Elaboration as a relation between entities
One initial point to note is that the relation of ELABORATION is not really a relation between propositions in the same way that the other relations in RST are. A relation like EXPLANATION genuinely holds between two elements which are propositions: it holds if one proposition provides an EXPLANATION of the other, and there is no simpler way to state the relationship than this. It is not possible to identify subcomponents of the related propositions which stand in a relationship to each other that allows us to deduce that an EXPLANATION relation holds between the propositions. The same holds for the other RST relations. If a CAUSE relation holds between two propositions, it is not possible to identify components of these propositions (for instance entities or predicates) whose relationship by itself allows us to deduce that a CAUSE relation holds between the propositions they are part of. But for the ELABORATION relation, this is possible. An ELABORATION relation between two propositions holds in virtue of a particular relationship (namely, identity) holding between component elements of the respective propositions (namely, entities). It is only indirectly a relationship between the propositions, in virtue of this direct relationship between entities. Many of the problems we will mention below seem to stem from this basic point.

4.2 Overlap with the focus metaphor
The discourse phenomena described by ELABORATION appear to overlap extensively with phenomena described by other theories of discourse, namely those concerned with focus structure. Consider firstly theories of local focus; in particular, Grosz, Joshi and Weinstein's (1995) account of centering. A primary concern for this theory is to catalogue the different discourse structures which can obtain in cases where two adjacent sentences make reference to a common entity. The issue is explored both in hypotheses about how this entity should be referred to in the second sentence (for example, pronominally) and about which sentence configurations make for 'good continuations'. The centering account is explicitly entity-based, and is expressed at a level of detail far greater than that given in the definition of ELABORATION, which prima facie covers the same cases. Moreover, it is not bound by the hierarchical constraints imposed on RST relations which were shown to be problematic for ELABORATION: adjacent sentences are related in chains, rather than in trees.

Consider also global focus. It is often useful to speak about the global focus of a passage of text, for instance if we are summarizing it, or trying to resolve anaphora within it. But it is not possible to represent the global focus of a text within the vocabulary of RST. Consider a simple passage, in which an entity is described in a sequence of adjacent clauses. An RST analysis could identify the first of these clauses as the nucleus of an ELABORATION schema application, whose satellites are the remaining clauses. But this analysis accords a spurious significance to the proposition expressed by the first clause. It is not the proposition which is being elaborated on, but the entity.

Proponents of RST are likely to concede that notions of local and global focus are necessary in addition to the account that it provides. But our point is that when these extra primitives are included in a theory of text coherence, the ELABORATION relation essentially becomes redundant, and makes no contribution of its own. The aspects of text coherence which it represents are also modeled, and better modeled, by the entity-based metaphor of focus.
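The point that ELABORATION reduces to identity between entities, and that a passage's global focus is a property of entities rather than of any one proposition, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the `Proposition` type, the function names, and the hand-assigned entity sets for the sentences of Example 4 are ours, and no claim is made that this is how centering or RST represents propositions.

```python
from typing import FrozenSet, List, NamedTuple


class Proposition(NamedTuple):
    text: str
    entities: FrozenSet[str]   # entities mentioned (annotated by hand here)


def shared_entities(p: Proposition, q: Proposition) -> FrozenSet[str]:
    """Entities that the two propositions have in common."""
    return p.entities & q.entities


def licenses_elaboration(nucleus: Proposition, satellite: Proposition) -> bool:
    """On the entity-based reading, object-attribute ELABORATION can hold
    only if some entity is mentioned by both propositions."""
    return bool(shared_entities(nucleus, satellite))


def global_focus(passage: List[Proposition]) -> FrozenSet[str]:
    """Entities mentioned in every proposition of the passage."""
    if not passage:
        return frozenset()
    return frozenset.intersection(*(p.entities for p in passage))


p1 = Proposition("Arts-and-Crafts jewels tend to be elaborate.",
                 frozenset({"Arts-and-Crafts jewels"}))
p1a = Proposition("They are often mass-produced.",
                  frozenset({"Arts-and-Crafts jewels"}))
p2 = Proposition("However, this jewel is simple in form.",
                 frozenset({"this jewel"}))

print(licenses_elaboration(p1, p1a))   # True
print(licenses_elaboration(p1, p2))    # False
print(global_focus([p1, p1a]))         # frozenset({'Arts-and-Crafts jewels'})
```

Note that nothing comparable can be written for a relation like CONCESSION: whether (1) and (2) stand in that relation depends on what the propositions say, not on which entities they mention.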
4.3 Linguistic signals
It has often been observed that ELABORATION is one of the few relations for which there are no conjunctive linguistic signals. There are simply no sentence or clause connectives for signaling this relation. Connectives like indeed, in fact or also do not always work: often, the best method of signaling this relation seems simply to be to close the nucleus sentence with a period, and begin a new sentence for the satellite.

Mann and Thompson are at pains not to tie relations directly to linguistic signals. But there would undeniably be advantages to being able to make such connections. In practice, computational treatments of RST, whether in text generation or discourse structure parsing, do link relations to surface signals. And ELABORATION is invariably treated differently from other relations in these contexts. For instance, in Scott and De Souza's (1990) list of distinctive methods for signaling RST relations in generated texts, the ELABORATION relation is to be signaled by a relative clause whose head noun denotes the entity being elaborated on. Marcu's (1997) algorithm for identifying the relations in a text from surface cues relies principally on discourse markers for all relations except ELABORATION (and JOINT); for these latter two relations, word co-occurrence measures provide the strongest surface indicators.

There are also theoretical reasons for holding that relations are associated with particular classes of linguistic expression. The present authors have argued that the set of linguistic resources available for signaling relations in a language can provide valuable evidence for determining how the set of relations in that language should be defined (Knott 1996; Knott & Mellish 1996), and that the lack of conjunctive signals for ELABORATION provides evidence that it is different from other relations.
5. An elaboration-less model of text coherence
While many of the problems with ELABORATION have been noted in the past, the question of what an account of discourse relations would look like without this relation has not been seriously considered. It is this question that we would like to address. In this section, we outline a revised version of RST in which ELABORATION is omitted from the set of relations. We would like to preserve as much as possible of the RST-based model, while taking account of the exceptions due to ELABORATION noted above.

We propose that the global coherence of a text is determined by global focus, rather than by a tree structure of relations between high-level text spans.³ At a high level of structure, we take a coherent text to be a sequence of focus spaces which succeed each other in a legal manner. We will term a focus space an 'entity-chain': basically a portion of text in which the global focus is some particular entity. An entity-chain is made up of a sequence of RS trees, each constructed just as in RST, but minus the ELABORATION relation. These trees can either be simple trees consisting of just one text span, or more complex trees with several layers of hierarchy. In each case, we can define the 'top nucleus' of the tree to be the leaf-level text span which is reached by following the chain of nuclei from its root; in other words, it is the nucleus of the nucleus of (...) the nucleus of the tree. A legal entity-chain whose focus is entity E is one where the top nucleus of each tree is a fact about E.⁴ Note that the facts within a single tree do not all have to be about the entity in focus. Coherence between these facts is not determined by there having entities in common, but by there being relationships of the right sort between the propositions they express.

A legal sequence of entity-chains is a sequence in which the focused entity in each chain is mentioned in a proposition within the n previous chains. In our text generation system, n is effectively set to 4, although it is likely that the value of n should vary depending on the length of intervening chains. Determining the factors which contribute to the value of n is a matter for further empirical investigation. Our main claim is that the admissibility of a chain with a particular focus at a particular point in a text is a function of its linear distance from the previous mention of the focused entity, rather than of its relationship to the 'right frontier' of a discourse structure tree. An example of a legal sequence of four entity-chains EC1, EC2, EC3 and EC4 is given in Figure 4.
Figure 4. A legal sequence of entity-chains
Within each entity-chain, atomic RS trees are denoted by rectangular boxes and non-atomic RS trees are denoted by triangles. The directed arcs indicate resumption relations: links from an entity-chain to the sentence which introduces it. Note that these arcs do not have to link adjacent entity-chains, and can cross one another.

The model of text structure just outlined has been implemented in the ILEX-2 text generation system; see Mellish et al. (1998) for details. An example of a text generated by the system is given below:

(6) (1) This piece is a necklace. (2) It was designed by a jeweler called Jessie King. (3) It was designed in 1905. (4) It is made of silver and enamel. (5) Jessie King was a famous designer. (6) She was Scottish, (7) but she worked in London. (8) It was in London that this piece was made. (9) Like the previous piece, (10) this piece is in the Arts-and-Crafts style. (11) Although the previous piece had a simple shape, (12) Arts-and-Crafts style jewels tend to be elaborate; (13) for instance, this piece has detailed florals.

There are three entity-chains in the text: E0 (spans 1-4) is about a particular jewel, E1 (spans 5-8) is about the jewel's designer, and E2 (spans 9-13) is about the style it is in. Within these chains there are a number of local RS trees: spans 6-7 (top nucleus span 7), spans 9-10 (top nucleus span 10), and spans 11-13 (top nucleus span 12). Resumptions occur from E1 to E0, and from E2 to E0. Note that neither of these resumptions is to material in an adjacent text span. Nevertheless, the resulting text seems a good optimization of focus and relation-based constraints.
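The definitions of top nucleus, legal entity-chain and legal sequence of entity-chains given in this section can be sketched in code. This is an illustrative reconstruction only and not the ILEX-2 implementation: the class and function names are hypothetical, the `about` and `mentions` annotations are supplied by hand (the working definition of aboutness is deferred to Mellish et al. 1998), the toy text is a schematic one rather than an encoding of Example 6, and we assume that the opening chain of a text is exempt from the resumption requirement.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Fact:
    """A leaf-level text span, annotated by hand for this sketch."""
    span: int
    about: str            # the entity the fact is taken to be 'about'
    mentions: Set[str]    # every entity mentioned, including `about`


@dataclass
class RSTree:
    """A nucleus-satellite tree using any relation except ELABORATION."""
    relation: str
    nucleus: "Node"
    satellites: List["Node"] = field(default_factory=list)


Node = Union[Fact, RSTree]


@dataclass
class EntityChain:
    focus: str
    trees: List[Node]


def top_nucleus(node: Node) -> Fact:
    """Follow the chain of nuclei from the root down to a leaf."""
    while isinstance(node, RSTree):
        node = node.nucleus
    return node


def facts(node: Node) -> List[Fact]:
    """All leaf-level facts in a tree."""
    if isinstance(node, Fact):
        return [node]
    found = facts(node.nucleus)
    for satellite in node.satellites:
        found.extend(facts(satellite))
    return found


def is_legal_chain(chain: EntityChain) -> bool:
    """Legal if the top nucleus of every tree is a fact about the focus."""
    return all(top_nucleus(t).about == chain.focus for t in chain.trees)


def mentioned_in(chain: EntityChain) -> Set[str]:
    return {e for t in chain.trees for f in facts(t) for e in f.mentions}


def is_legal_sequence(chains: List[EntityChain], n: int = 4) -> bool:
    """Each chain's focus must be mentioned within the n previous chains.
    Assumption: the first chain has no predecessors and is exempt."""
    for i in range(1, len(chains)):
        window = chains[max(0, i - n):i]
        if not any(chains[i].focus in mentioned_in(c) for c in window):
            return False
    return True


# A schematic three-chain text about a jewel, its designer and its style.
style_tree = RSTree("CONCESSION",
                    nucleus=Fact(7, "style", {"style"}),
                    satellites=[Fact(6, "other jewel", {"other jewel"})])
chains = [
    EntityChain("jewel", [Fact(1, "jewel", {"jewel", "designer"}),
                          Fact(2, "jewel", {"jewel", "style"})]),
    EntityChain("designer", [Fact(3, "designer", {"designer"}),
                             RSTree("CONTRAST",
                                    nucleus=Fact(5, "designer", {"designer", "London"}),
                                    satellites=[Fact(4, "designer", {"designer"})])]),
    EntityChain("style", [style_tree]),
]

print(all(is_legal_chain(c) for c in chains))   # True
print(is_legal_sequence(chains, n=4))           # True
print(is_legal_sequence(chains, n=1))           # False
```

The last two calls show the role of n: the third chain resumes an entity introduced two chains earlier, which is legal with a window of four chains but not if only the immediately preceding chain is consulted. The satellite in `style_tree` is about a different entity from the chain's focus, which is permitted because only the top nucleus is constrained.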
6. Relations at higher levels of hierarchy
The idea of abolishing relations in the global structure of a text is certainly quite radical. It is a central tenet of RST that relations can apply between text spans of arbitrary size; and indeed, there are many complex texts in which relations do seem to apply at a high level of hierarchy. What should the present account say about these relations?

One thing to note immediately about these high-level relations is that they are not associated with surface conjunctive signals in the same way as low-level relations are. Low-level relations between clauses and sentences can typically be signalled directly by conjunctions, but conjunctions cannot be used to link arbitrarily large passages of text. If an explicit signal is needed, a slightly different mechanism is used, which involves a new mention of the top nucleus of the first span. Either this proposition is simply reiterated, in what Walker (1993, 1996) calls an 'informationally-redundant utterance', or the proposition is referred to as an entity, via the mechanism of nominalization or discourse deixis (Webber 1991). Assume we are given a large span of text S1, containing an argument that Kennedy was assassinated by the CIA, whose top nucleus is naturally enough the proposition Kennedy was assassinated by the CIA. If we want to continue with a second span S2 concluding that we can't trust the CIA, the three methods outlined above could be illustrated as follows.

(7) (S1.) Given that Kennedy was assassinated by the CIA, the organization is clearly untrustworthy. (Informationally-redundant utterance.)

(8) (S1.) Kennedy's assassination by the CIA proves that the organization is untrustworthy. (Nominalization.)

(9) (S1.) This proves that the organization is untrustworthy. (Discourse deixis, only possible when the top nucleus of S1 can be referred to anaphorically.)

Using informationally-redundant utterances, any of the methods for signaling relations between small text spans are available for larger spans too. Likewise, the mechanisms of nominalization and discourse deixis provide the means for expressing high-level relations between propositions. But note that these latter methods for signaling high-level relations involve treating propositions as entities about which things can be predicated. What is more, nominalization is a device which allows arbitrary reference to recent propositions in the text; the propositions which can be referred to are not limited to those on the right frontier of a discourse structure tree. Given these two considerations, we suggest that high-level relations signaled using nominalizations can be thought of, and are perhaps better thought of, in entity-based terms, as signals of resumptions. If we allow that any proposition in a text introduces itself as a possible topic for resumption, in addition to any entities it refers to directly, then the model of global text structure we presented in Section 5 seems to extend very well to the kind of high-level relations we have been discussing in this section.
7. Discussion
This paper has discussed a number of problems with RST's theories of span structure and relation semantics which stem from its use of the relation OBJECT-ATTRIBUTE ELABORATION. It argues that a better account of text coherence can be developed by abandoning this relation, and allowing that the metaphor of 'relations between propositions' only provides a partial account of text coherence. A new account of coherence is put forward in which a model of relations is supplemented with an entity-based model of focus structure. While previous accounts have suggested that relations and focus provide simultaneous constraints on coherence, the central idea in the new account is that two adjacent text spans are coherent if either there is a suitable relation between the propositions they express, or they are linked by a legal focusing move. There are three principal advantages of the new account. Firstly, by removing ELABORATION from the set of relations, we are able to eliminate some redundancy from any account of coherence which features constraints due both to relations and to local/global focus. Secondly, the new account promises to allow a tighter association between the primitives in a discourse theory and the linguistic means by which they are expressed: coherence relations can be associated with sentence and clause conjunctions, while focus-based moves are associated with nominal referring expressions. These associations are beneficial both for the first-order task of analyzing texts, and the second-order task of defining the set of relation-based and entity-based primitives on which first-order analyses can draw. Thirdly, the division of labor between relations and focus in the new account produces a better match to the data in some respects. At low levels of hierarchy in a text, ELABORATION cannot be embedded inside RS trees in the way that other relations can.
At high levels of hierarchy, non-local and crossing dependencies do seem to occur in text structure, but they seem to be restricted to cases of resumption, either of an entity mentioned in a recent proposition, or of this recent proposition itself. The new proposal is certainly still at a preliminary stage of development. Empirical work is needed to investigate the claims about non-local and crossing resumption relations. The space of texts generated by our text planner provides some tentative evidence of the existence of text structures which RST cannot analyze, but a study of naturally-occurring text would provide a much better testbed for the theory. What is more, there remains much to be worked out in the new model. For one thing, the weaker constraints it imposes at the level of global structure may well lead it to overgenerate the space of coherent
texts; additional constraints may need to be specified. Additional constraints are also likely to be needed to determine the internal composition of entity chains. These are avenues we are currently pursuing.
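The disjunctive criterion summarized in this section (two spans cohere if either a relation holds between their propositions or a legal focusing move links them) can be illustrated with a small sketch. It is only an illustration under stated assumptions: the `Span` class, the relation table, and the test for a legal focusing move (simplified here to resumption of an entity mentioned in the earlier span, or of that span's proposition) are hypothetical names introduced for this example and are not part of the model's specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """A text span (hypothetical): a proposition label and the entities it mentions."""
    proposition: str
    entities: frozenset = frozenset()
    focus: Optional[str] = None  # entity or proposition label this span resumes


def has_relation(a: Span, b: Span, relations: set) -> bool:
    """True if the (assumed) relation table links the two propositions."""
    return (a.proposition, b.proposition) in relations


def is_legal_focus_move(a: Span, b: Span) -> bool:
    """Simplifying assumption: b resumes an entity mentioned in a, or a's proposition."""
    return b.focus is not None and (b.focus in a.entities or b.focus == a.proposition)


def coherent(a: Span, b: Span, relations: set) -> bool:
    """Spans cohere if EITHER a relation OR a legal focusing move links them."""
    return has_relation(a, b, relations) or is_legal_focus_move(a, b)


# Invented example spans, for illustration only.
s1 = Span("p1", frozenset({"necklace", "designer"}))
s2 = Span("p2", frozenset({"designer"}), focus="designer")
s3 = Span("p3", frozenset({"weather"}), focus="weather")
```

With these invented spans, `coherent(s1, s2, set())` holds through the focusing move alone, while `s1` and `s3` cohere only if the relation table contains the pair `("p1", "p3")`. The additional constraints the authors say may be needed to prevent overgeneration are deliberately not modeled.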
Notes
1. Our objections do not extend to other types of ELABORATION; for instance, what Mann and Thompson call PROCESS-STEP ELABORATION or GENERALIZATION-SPECIFIC ELABORATION. In what follows, references to ELABORATION are exclusively to the object-attribute variety, unless otherwise stated.
2. The notion of a resumption bears some resemblance to Grosz and Sidner's notion of a 'digression'. This is a discourse segment which (a) is not related to the immediately preceding segment by dominance or satisfaction-precedence, and (b) contains mention of an entity salient in the interrupted segment. However, Grosz and Sidner's definition implies that a link due to a common entity can only occur between adjacent segments; our claim is that resumptions can occur between non-adjacent segments.
3. In this respect, our proposal is similar to that made by Mooney, Carberry and McCoy (1990).
4. A working definition of what it is for a fact to be 'about' a certain entity is given in Mellish et al. (1998).
References
Grosz, B. J., & Sidner, C. L. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics, 12, 175-203.
Grosz, B. J., Joshi, A. K., & Weinstein, S. (1995). Centering: A framework for modeling the local coherence of discourse. Computational Linguistics, 21, 203-225.
Halliday, M., & Hasan, R. (1976). Cohesion in English. London: Longman.
Kittredge, R., Korelsky, T., & Rambow, O. (1991). On the need for domain communication knowledge. Computational Intelligence, 7, 305-314.
Knott, A. (1996). A data-driven methodology for motivating a set of coherence relations. Ph.D. thesis, Department of Artificial Intelligence, University of Edinburgh.
Knott, A., & Mellish, C. (1996). A feature-based account of the relations signaled by sentence and clause connectives. Language and Speech, 39, 143-183.
Mann, W. C., & Thompson, S. A. (1988). Rhetorical structure theory: A theory of text organization. Text, 8, 243-281.
Marcu, D. (1997). The rhetorical parsing, summarisation and generation of natural language texts. Ph.D. thesis, Department of Computer Science, University of Toronto.
Martin, J. R. (1983). Conjunction: The logic of English text. In J. S. Petöfi & E. Sözer (Eds.), Micro and Macroconnexity of Texts (pp. 1-72). Hamburg: Helmut Buske Verlag.
McKeown, K. R. (1985). Text generation: Using discourse strategies and focus constraints to generate natural language text. Cambridge: Cambridge University Press.
Mellish, C., O'Donnell, M., Oberlander, J., & Knott, A. (1998). An architecture for opportunistic text generation. In Proceedings of the Ninth International Workshop on Natural Language Generation (pp. 28-37). Montréal.
Mooney, D., Carberry, M. S., & McCoy, K. F. (1990). The generation of high-level structure for extended explanations. COLING 90, 2, 276-281.
Scott, D. R., & de Souza, C. S. (1990). Getting the message across in RST-based text generation. In R. Dale, C. Mellish, & M. Zock (Eds.), Current Research in Natural Language Generation (pp. 47-73). London: Academic Press.
Sibun, P. (1992). Generating text without trees. Computational Intelligence, 8, 102-122.
Walker, M. (1993). Informational redundancy and resource bounds in dialogue. Ph.D. thesis, University of Pennsylvania.
Walker, M. (1996). Limited attention and discourse structure. Computational Linguistics, 22, 255-264.
Webber, B. (1991). Structure and ostension in the interpretation of discourse deixis. Natural Language and Cognitive Processes, 6, 107-135.
CHAPTER 8
Unstressed en/and as a marker of joint relevance
Henk Pander Maat
University of Utrecht
Since the 1970s, the minimal meaning and the multiple uses of the conjunction and have drawn the attention of many linguists. In this chapter I will characterize and as a marker of what will be called joint relevance, that is, as indicating that its conjuncts jointly fulfill an informational role. A number of hypotheses concerning the joint relevance of interclausal coordinations will be tested by means of corpus analysis. This chapter has the following structure. I will first review some pragmatic work on and, including the publications of Carston (1993) and Blakemore (1987), who were the first to suggest a notion of joint relevance. I will then explicate this notion in terms of the question-related topic concept advanced by Van Kuppevelt (1995), Klein (1991), and Klein and Von Stutterheim (1989). The claim will be that and marks topic continuity. Subsequently, the notion of topic continuity will be discussed both for intraclausal and interclausal and-conjunctions.¹ In Section 2, the proposed account is evaluated by means of a corpus analysis of interclausal conjunctions. The vast majority of 222 interclausal conjunctions are shown to exhibit joint relevance. Two kinds of joint relevance environments are distinguished: the conjunction may support or elaborate an assumption or it may answer a single question. The majority of conjunctions answer questions arising from the preceding text utterances. An analysis of interconjunct relations reveals that joint relevance may be accomplished both by referentially distinct but functionally equivalent conjuncts and by conjuncts that present a single situation from different perspectives. Finally, some theoretical implications of the proposed account are discussed. The meaning of interclausal and is claimed to be procedural in that it constrains implicatures concerning the conjuncts. Furthermore, joint relevance
relations are placed within a classification of additive and comparative coherence relations. Finally, the joint relevance account is related to recent work on the role of connectives in the construction of discourse representations.
1. Joint relevance and other pragmatic characterizations of and
1.1 The pragmatic inference account of and
The classic pragmatic account of interclausal and springs from Grice's observations on 'asymmetric' clausal coordinations, such as (1):
(1) She gave him her key and he opened the door.
Grice (1981) claims that in cases like (1), the meaning of and remains that of the logical conjunction operator '&': P & Q is true if and only if both conjuncts are true. However, the 'and-then' sense of and in (1) is a standard implicature due to the fourth sub-maxim of Manner, 'Be orderly'. The implicatures conveyed by and are not confined to this maxim. Instances of other kinds of implicatures are presented in Posner (1980):
(2) Annie fell into a deep sleep and her facial color returned.
(3) The window was open and there was a draft.
According to Posner, if the speaker of (2) had wanted to convey that Annie's facial color did not return while she was asleep but at a completely different time, he should have specified this other time. Otherwise, he would have been violating the first submaxim of Quantity, which states that speakers should make their contribution as informative as is required. And the speaker of (3) would be guilty of communicating irrelevant information (Grice's third Maxim) if he did not want to convey that the draft is coming from the window. Similar explanations may be given for some further standard inferences which are made on the basis of (1), e.g.:
(4) She gave him her key with the purpose of letting him open it.
(5) He used her key to open the door.
Of course, one may disagree about the way these inferences should be derived. For instance, Carston (1988) argues that (4) and (5) can only be accounted for
by a general principle of Relevance. She also presents a series of arguments for the claim that the inferences must be considered explicatures instead of implicatures, that is, they are recovered as part of the truth-conditional content of (1). However, when it comes to the characterization of and, the pragmatic inference account suffers from two fundamental problems. As is acknowledged by most authors, many of the inferences in question are retained when and is replaced by a full stop. That is, we are dealing here with inferences based solely on the fact that two utterances are juxtaposed. This could be taken as support for the pragmatic account, but it raises the question why and is used here at all. Our account of and should teach us what distinguishes and-conjunctions from these sequences of juxtaposed sentences. The second problem is, as we will see in Section 1.2, that in some contexts and-conjunctions block inferences that are allowed by juxtaposed sentences. This too requires an explanation in terms of the distinctive features of conjunctions and juxtapositions.
1.2 Relevance theoretic observations on and
The difference between and-conjunctions and juxtaposed sentences has been addressed most explicitly by Carston in a later publication (1993). Here she takes up some remarks of Blakemore (1987) to the effect that conjoined propositions carry the presumption of optimal relevance as a whole. That is, a conjoined utterance instructs the hearer "to recover a maximally relevant conjoined proposition - one that has relevance over and above that of its individual conjuncts" (Blakemore 1987, p. 121). Taking this as a point of departure, Carston tries to explain why some propositions cannot be combined into a conjoined proposition with and. One of her observations is that explanatory utterances cannot be prefaced by and: see (6b), in which and blocks the explanation interpretation that is suggested by (6a). Now explanations can be considered answers to implicit 'why?' or 'how come?' questions; and the processing of the and-conjunction as an integrated pragmatic unit rules out the interpretation of the first conjunct as an independent unit which may raise implicit questions, thus giving rise to a second conjunct containing the answer. This applies not only to causally oriented explanations but also to elaborating statements: in (7b) the elaboration interpretation suggested by (7a) is blocked:
(6) a. John broke his leg. He slipped on a banana skin.
    b. John broke his leg and he slipped on a banana skin.
(7) a. I ate somewhere nice last week; I ate at Macdonald's.
    b. I ate somewhere nice last week and I ate at Macdonald's.
One more environment which disallows and is an argument-conclusion sequence, so that (8a) can, but (8b) cannot have the conclusion interpretation:
(8) a. These are his footprints; he's been here recently.
    b. These are his footprints and he's been here recently.
Carston's explanation for all this is that explanatory, elaborative and conclusive relations can only be specified between processing units, and that and-conjunctions are by definition only one processing unit. Carston assumes that 'processing units' are to be defined in syntactical terms (1993, p. 41 ff.), so that complex sentences such as clausal coordinations are one processing unit by virtue of being a syntactical unit. However, this cannot be upheld, since her analysis does not hold for every clausal coordination: she confines herself to and-coordinations. Thus the connective and seems to be decisive, not the coordination as such. In Section 3.2 below I will briefly discuss the difference between and and but, and I will conclude that joint relevance only applies to and. While Carston accounts for the suggestion of joint relevance in syntactic terms, Blakemore (1987, p. 120 ff.) claims that it can be fully explained by the truth-functional meaning of and together with the Principle of Relevance. In this paper, I will take a different position on the nature of the joint relevance conveyed by and-conjunctions. We will return to this question in Section 3.1. Whatever its theoretical status, I share the intuition that and-conjunctions are characterized by joint relevance. How should we explicate this notion? Again, Blakemore (1987) provides some clues. She notes that in some cases the relevance of the conjoined proposition lies in the fact that it is a list, such as (9):
(9) I wrote some letters and painted the ceiling.
And one of the reasons for producing such a list is: to provide a single answer to a single (implicit or explicit) question - for example [...] 'What did you do in the weekend?' - so that the relevance of the utterance hinges in the fact that each conjunct is interpreted against the same set of contextual assumptions (1987, o.c. 120).
That is, the joint relevance of the conjuncts may lie in the fact that they constitute an answer to a single question, which may be implicit.
The claim that joint relevance has to do with answering a single implicit question is not theoretically explicated by Blakemore. In Subsection 1.3 I will relate this claim to recent reconceptualizations of the notion of topic.
1.3 Joint relevance and shared topic questions
The notion that utterances can be considered as answers to implicit questions is not just a plausible metaphor; in fact it has been forcefully argued for by Van Kuppevelt (1995) and Klein and Von Stutterheim (1989). Van Kuppevelt defines the topic of an utterance in terms of the (explicit or implicit) question that is being answered by a segment of discourse. That is, assuming a question like (10a), the topic part TP of utterance (10b) relates to the question, whereas the focus part FP contains the proper answer:²
(10) a. Who hit Bill?
     b. FP[HARRY] TP[hit Bill]
More specifically, the topic of (10b) is defined as the set of possible answers to the topic question, in this example the set of persons that may have hit Bill.³ Topic is a dynamic, context dependent notion, so that successive utterances will often have different topics. But a topic may also be preserved across successive utterances. For one thing, the topic may be developed further by means of subtopic-constituting questions, which are due to unsatisfactory answers to preceding questions. For instance, the dialogue in (10) may be continued with a question like which Harry do you mean? But a topic may also be continued when more than one utterance is needed to fully answer the topic question. This situation gives rise to non-hierarchical (list-like) multi-part answers. Sometimes these may provide the main structure of entire texts, for instance in narratives answering questions like what happened to the protagonist at t1 ... tn? (see Klein & Von Stutterheim 1987, 1989) or instruction texts answering questions like how can I replace the flat tyre on my car? These questions project a number of utterances containing chronologically related events or actions involving a certain protagonist. This imposes certain constraints on the topic-comment modulations of the individual utterances constituting the answer. For instance, the topic part of main structure utterances in narratives and instructions will contain a reference to the protagonist, simply because the underlying topic question does so.
Van Kuppevelt (1995) and Klein and Von Stutterheim (1989) have discussed the phenomenon of question-topic continuation in order to account for conventionally expected types of coherence, characteristic of certain genres of discourse. However, choices regarding topic development (changing the topic, elaborating on it by means of subtopics or simply continuing it) arise all the time when speakers progress from one clause to the next, and these choices need to be recognizable for hearers in order to ensure cooperation and understanding. I suggest that connectives are to be seen as indicating the direction of topic development chosen by the speaker. Many connectives (e.g. because) seem to indicate types of elaboration by subtopics. The topic questions indicated by such connectives have an interesting characteristic: they do not focalize certain parts of the sentence, but the sentence as a whole. For instance, when no connective is present, the sentence last week Harry had the flu may receive different intraclausal topic-focus articulations according to the context (see (11) and (12)). However, when the sentence is preceded by because, the entire sentence is presented as the answer to a why-question (see (13)). This applies regardless of whether Harry has been mentioned in the preceding sentence or not (see (14)).
(11) Harry is often ill these days. TP[Last week Harry] FP[had the flu]. (What was wrong with Harry last week?)
(12) Many people have the flu these days. TP1[Last week] FP[Harry] TP2[had the flu]. (Who had the flu last week?)
(13) The report has not been finished yet, because FP[last week Harry had the flu]. (Why hasn't the report been finished yet?)
(14) Harry did not finish the report yet, because FP[last week Harry/he had the flu]. (Why didn't Harry finish the report yet?)
A second major type of topic development relevant for characterizing connectives, which has received less explicit attention in the literature, is topic continuation.⁴ My proposal in this article is to view the clause connector and as the prototypical and most general indicator of a shared topic, in the sense that the and-conjuncts constitute a joint relevance unit. Like connectives indicating
subtopics, and suggests topic questions which concern the conjoined clauses as a whole. To lend this suggestion some prima facie plausibility, let us return to the examples (6a) and (6b), repeated here for convenience. It was observed above that and does not support the explanation reading:
(6) a. John broke his leg. He slipped on a banana skin.
    b. John broke his leg and he slipped on a banana skin.
Carston has accounted for this by saying that explanations are to be viewed as answers to how come?-questions occasioned by the first conjunct; the explanation reading is blocked by and because and indicates that the sentence has a single joint relevance. That is, no new questions can be answered in the and-conjunct. Now how do we explain that reversing the order of the conjuncts of (6b) results in a sentence that does allow the causal interpretation?
(15) John slipped on a banana skin and (he) broke his leg.
This seems to be due to the fact that (15) is a perfectly appropriate answer to a question like why has John not been at work for two weeks?, asked by one of John's colleagues to another. This question may be answered in just one conjoined sentence containing a cause-consequence sequence. In fact, the question may be paraphrased as follows: what event or sequence of events has caused John's absence from work? Obviously, this question cannot be answered by a backward causal relation passage like the one in (6b), nor can any other single question. Of course it may be said that the first conjunct of (6b) answers it, but the point is that the second conjunct answers a new question, arising from the first conjunct. That is why and is inappropriate here. This line of reasoning implies that the joint relevance of the conjuncts guides the interpretation of the mutual relation between the conjuncts. Thus, the relation between the conjuncts depends on the relation between the conjunction on the one hand and some set of contextual assumptions on the other. In cases like (15) it is clear that an adequate answer presupposes the temporal and causal relation between two subsequent conjuncts. However, in other contexts like what did you do on your free weekend?, one could envisage a list of activities as an answer, see (9) above.
1.4 Joint relevance in intraclausal and interclausal conjunctions
In this section, I will provide a more general characterization of joint relevance that applies both to intraclausal and interclausal conjunctions. This is important, since the majority of and-occurrences constitute intraclausal conjunctions. Schematically, these conjunctions can be represented as C(A&B) or as (A&B)C, in the cases of forward and backward ellipsis of C respectively. Though the exact line between elliptical and non-elliptical conjunctions is a matter of theoretical debate, there can be no doubt that some conjunctions do not involve ellipsis, for instance arguments of predicates like consist of or between and fixed expressions like one and a half, and more and more. For the moment we will concentrate on elliptical cases. What essentially happens in these cases is that the structure C(A&B) or (A&B)C is preferred over the alternative interclausal structure CA & CB or AC & BC. That is, the conjuncts A and B are not individually related to some context C, but only as a whole. Of course, this 'unitizing' effect of and-coordination has been clearly recognized on the syntactic level: as a rule, the conjunction constitutes the same syntactic category as the conjuncts taken by themselves would have done (e.g. Halliday & Hasan 1976, p. 234; Baker 1995, p. 503). That is, coordination is considered as a procedure which turns two constituents of type X into one complex X. The joint relevance account essentially proposes to extend this principle in two ways. First, it is extended to another level of linguistic description, namely that of information structure; second, it is taken to apply not only to intraclausal but also to interclausal conjunctions. To see why the feature of jointness applies only to the level of information structure and not to the level of semantic interpretation, consider the following two pairs of examples:
(16) Clever and honest students always succeed.
(17) Clever students and honest students always succeed.
(18) Old and young members were invited.
(19) Old members and young members were invited.
Semantically, (16) is ambiguous between a collective reading (in which you need to be clever and honest to succeed) and a distributive reading (in which one of the two properties suffices); by contrast, (17) excludes the collective reading. In this pair of examples, the unitizing effect of the conjunction seems to operate not only on the syntactic but also on the semantic level: in (16) the
conjoined adjectives may be read as jointly describing the referent of the head noun, while this interpretation is ruled out in (17) since there are two head nouns in this sentence, the referents of which receive separate characterizations. However, (18) and (19) show that this parallelism of syntactic and semantic structure does not always apply. The conjoined adjectives in (18) cannot be read collectively, since a member cannot be old and young at the same time. That is, semantically (18) does not differ from (19). Hence, the feature of jointness associated with the conjunction does not apply to the semantic level here. Instead, the relevant level of analysis for 'jointness' effects is that of information structure. The difference between (18) and (19) is that they construct different topic-focus articulations. Assuming that the sentences have an argument-focus and not a predicate-focus reading (see Lambrecht 1994), we may say that different parts within the argument are focused: in (18) old and young is focal, while in (19) the focus is old members and young members. In other words, (18) and (19) may answer the question which members were invited? and who were invited? respectively. A similar difference in focus applies to (16) and (17). How does this account of conjoined constituents performing a joint informational role apply to full clause conjunctions? Consider the following two sentences.
(20) Sheila was doing housework. Peter was watching television.
Normally, two successive sentences without referential ties can be read as answers to different questions. For instance, the sentences in (20) can be seen to answer the questions what was Sheila doing? and what was Peter doing? respectively. More generally, assuming that the arguments of the sentences are topical, two sentences to be represented as P(A). Q(B) may answer questions concerning A and B respectively. Now consider the conjunction P(A) and Q(B).
(21) Sheila was doing housework and Peter was watching television.
When these two sentences are conjoined, the reader is invited to design a shared topic question. This minimally requires her to invent a category under which Sheila and Peter may be subsumed. When Sheila and Peter belong together in some way (e.g. they are spouses), the most natural formulation would simply be Sheila and Peter. And since both predicates can easily be subsumed under the variable doing something, the shared topic question would
run as follows: what were Sheila and Peter doing? However, my claim is that conjoining two sentences typically involves more than subsuming two intraclausal topics and foci into one category. On an intuitive level, this can be demonstrated by embedding (20) and (21) in a discourse environment inducing a topic question which does not simply follow the intraclausal articulations:
(22) Yesterday night, I visited Sheila and Peter. They are an old-fashioned couple.
     a. Sheila was doing housework. Peter was watching television.
     b. Sheila was doing housework and Peter was watching television.
In (22), the contextually induced topic question is something like why are they an old-fashioned couple? This topic question does not focalize the two predicates of the conjoined clauses individually. Instead, the situation referred to by the two clauses as a whole is taken as evidence for a certain statement. In this context the conjunction is clearly more appropriate than a sequence of juxtaposed sentences. In (22), the full clause conjunction constitutes a focal unit, that is, it answers a question induced by an earlier utterance. In principle, the joint informational role of clause conjunctions may be topical, focal or both. However, I would venture the hypothesis that the most natural role of full clause conjunctions is that of an extended focus expression on the discourse level. This may be seen by inspecting several discourse environments for clause conjunctions. First, let us presume that the clause conjunction follows another utterance dealing with the same general theme. It is natural to expect then that the preceding utterance is topical in the sense that it gives rise to a question to be answered by the conjunction (the implicit question is indicated as (Q)):
(23) D. (Q) A and B.
[Full clauses A and B are the focus expression responding to some question concerning discourse context D.]
Of course the clause conjunction may also open the text or some thematic unit therein, followed by a further utterance C concerning the same theme, answering a question arising from the conjunction. In that case the conjunction is only topical:
(24) A and B. (Q) C.
[C is the focus expression responding to some question concerning A and B. A and B are no more than topic providers.]
However, this configuration is implausible from a processing point of view: it resembles a backward ellipsis construction in that the reader will have to wait till the next sentence before she is able to represent the joint relation of the conjunction to the context.⁵ It seems more efficient for the reader to be either informed of the topic before reading the conjunction (see (23)), or to be able to identify the topic while processing the conjunction. Therefore, the structures (25a) and (25b) are expected to be more frequent than (24).
(25) a. (Q) A and B. (Q) C
     b. (Q) A and B.
[A and B contain focus expressions responding to some topic that is identified on the basis of coreferential elements in A and B, on the basis of contextual assumptions, or on the basis of both; an additional utterance C may respond to some additional question concerning A and B, but it need not be present.]
In every plausible configuration the conjunction is a joint answer to some contextually induced question. Whether this question arises from an earlier utterance (see (23)) or from the conjunction and contextual assumptions regarding it (see (25)) is a matter of empirical investigation. Let us finally consider the collective-distributive indeterminacy once more, not with respect to semantic interpretations but with respect to information structure (23). Assume that two clauses A and B answer a question Q regarding D. In principle, the distributive reading of this structure would amount to splitting it (see (26)), implying that a passage like (27) should be read as (28):
(26) [D. (Q) A and B.] = [D. (Q) A and D. (Q) B.]
(27) I don't like him. He is mean and he has a foul mouth.
(28) I don't like him because he is mean and I don't like him because he has a foul mouth.
(29) I don't like him because he is mean and has a foul mouth.
Our conception of jointness implies that (26) is incorrect, because there the conjuncts only have a parallel relevance. Instead they are to be read as jointly relevant, as is the case in interpretation (29). We will return to this distinction in Section 3.2. In this section I have proposed a general account of and-coordinations: conjunctions are to be seen as information structural units, both intra- and interclausal. For interclausal conjunctions this line of reasoning resulted in a
specific hypothesis: interclausal conjunctions typically constitute a joint answer to some contextually induced question. In the remainder of this paper, I will empirically investigate the joint relevance account by a corpus analysis of the discourse environments of interclausal and-conjunctions.
2. Types of joint relevance: environments of and-conjunctions
2.1 Corpus
Since intuitions about the discourse contexts of and-conjunctions are unreliable, corpus research is needed. This is not to deny that intuitions play a role in the interpretation of corpus examples; the decisive difference between the introspective methodology and corpus research is, however, that the introspective researcher uses intuitions both for the collection and the analysis of linguistic data while the corpus researcher confines the role of intuitions to the analytical stage of research. Three subcorpora were assembled for this study: stock market reports and reports on soccer matches in newspapers, and information leaflets on professions distributed by trade organizations. Each subcorpus presents a rather constrained textual environment, so that certain thematic patterns recur. For instance, in stock market reports and regularly links statements about two stock market funds going up or going down. Soccer reports often narrate how a certain goal came about by sequentially describing the activities of the players that were involved in its accomplishment. And profession leaflets often list the activities that are performed in the course of some professional duty, linked by and. These are three highly different environments for and, but all three are conventional, as their frequencies testify. From both the stock market reports and the soccer reports, 50 and-conjunctions of full clauses were assembled; the smaller subcorpus of information leaflets on professions only yielded 47 cases. Conjunctions of sentences separated by full stops, which are very infrequent, were left out. The vast majority of the conjunctions contain no punctuation at all; only 7 cases, spread over the three subcorpora, had commas. Conjunctions with additional connectives in the second conjunct were left out of the corpus, because they might obscure the contribution of and to the interpretation of the sentence.
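The composition of the corpus can be tallied in a few lines of Python. This is a bookkeeping sketch only: the dictionary keys are labels chosen for the example, and it assumes that the 25 subject-ellipsis conjunctions described in this section were drawn from each of the three genres, a reading supported by the total of 222 interclausal conjunctions reported in the chapter introduction.

```python
# Full-clause and-conjunctions per subcorpus, as reported in the text.
# The keys are labels chosen for this sketch.
full_clause = {"stock market reports": 50, "soccer reports": 50, "profession leaflets": 47}

# Assumption: 25 subject-ellipsis conjunctions were added per genre.
subject_ellipsis_per_genre = 25

n_full = sum(full_clause.values())
n_ellipsis = subject_ellipsis_per_genre * len(full_clause)
n_total = n_full + n_ellipsis

print(n_full, n_ellipsis, n_total)  # prints: 147 75 222
```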
From each of the three genres, 25 conjunctions with subject ellipsis in the second conjunct were added to the corpus. This was done in order to find out whether the joint relevance account is equally relevant for full clause conjunctions and reduced conjunctions. One might imagine that the elliptical conjunctions do not share a topic question, but only a topical part, namely the element that is reduced in the second conjunct. If however elliptical conjunctions are found to share a topic that exceeds the ellipted element, this supports the general formulation of the joint relevance account offered in Section 1.4 above.
The analysis will have to answer the following questions: (1) Does the majority of the conjunctions, both the full clause conjunctions and the ones with subject ellipsis, indeed jointly answer a question? (2) If so, does this question concern situations or referents from an earlier utterance or from the conjunction itself and contextual assumptions regarding it? (3) In what way is the mutual relation between the conjuncts constrained by the joint relevance of the conjunction?
In the remainder of this section, I will discuss the three environments found: the conjunction may support an assumption (see 2.2) or it may answer a single question (2.3 and 2.4); only sporadically does it answer two questions (see 2.5). After a discussion of the findings so far (2.6), we will review the distributional differences between full second clauses and elliptical ones (2.7). Finally, constraints on mutual conjunct relations will be examined (2.8).
2.2 Supporting a single assumption
The analysis revealed two kinds of joint relevance structures. In the first structure, the conjuncts jointly support some assumption, most typically contained in the immediately preceding utterance. Their function is to argumentatively or informationally strengthen this assumption by means of arguments, examples or elaborations. The second kind of structure is not hierarchical: the conjunction constitutes a topical unit by itself, or it simply continues the discourse by answering a question about an earlier utterance or a discourse referent from this utterance without being subordinated to it. This section deals with the clausal and-conjunctions supporting an assumption. These conjunctions clearly are the majority (N=164; 74%). I will discuss the three most frequent kinds of joint support environment. The first one is argumentation (N=41). The conjunction may adduce two arguments (as in (30)), in which case it will be further called a list, or it may present two
segments which only together constitute an argumentation (see (31) and (32)). These cases will be referred to as sequences, because here the conjuncts cannot be reordered without seriously changing the interpretation. In (31) this is due to temporal order, but in (32) causal implications are added:
(30) D Als je er over denkt om kapper of haarstylist te worden, ben je op de juiste weg naar een goede toekomst. Het kappersvak biedt je verschillende mogelijkheden en het geeft je een grote mate van zekerheid en zelfstandigheid.
     E If you consider becoming a hairdresser, you are on the right path to a promising future. The hairdresser's profession offers several possibilities and it provides a high degree of security and independence.
(31) (This fragment follows a passage about the first goal, which was caused by a mistake of a goalkeeper named Bolesta.)
     D De oude Pool Bolesta, 35 jaar en normaal bankzitter, viel zes minuten later minder te verwijten. Overmars sjeesde aan de linkerkant voorbij Senden, gaf de bal voor en Ronald de Boer schoot de bal onnadenkend de bovenhoek in (0-2).
     E The old Pole Bolesta, 35 years and normally a benchwarmer, could not be reproached that much six minutes later. Overmars raced past Senden at the left wing, centred and Ronald de Boer thoughtlessly shot the ball into the upper corner (0-2).
(32) D De Geusselt (het MVV-stadion, [hpm]) is de Kuip (het Feyenoordstadion, [hpm]) niet, zo bleek gisteravond in Maastricht. MVV bood heel wat meer verzet dan Feyenoord in het bekerduel en Ajax moest zich tevreden stellen met een onbevredigende 1-1.
     E The Geusselt (the MVV stadium, [hpm]) is not the Kuip (the Feyenoord stadium, [hpm]), as it turned out yesterday in Maastricht (home town of MVV, [hpm]). MVV offered a lot more resistance than Feyenoord did in the National Cup match and Ajax had to content itself with an unsatisfactory 1-1.
A second environment is exemplification (N=31).
Exemplifications are especially frequent in stock market reports, in which general statements about a certain sector of industry are followed by specific information on certain companies. In the list variant the and-conjunction presents two illustrations of some general announcement (see (33)); but the conjoined segments may also jointly constitute an example (see (34)). Though this fragment does not display temporal or causal priority, it may nevertheless be called a sequence because reversing its order seriously affects its interpretation: the second utterance is much more specific than the first. We will call this structure a specification sequence.
(33) D De internationals waren zonder uitzondering flink hoger. Koninklijke Olie boekte een winst van 1,90 gulden op 149,90 gulden. Unilever pakte anderhalve gulden op 192,50 gulden en Akzo steeg negentig cent op 139 gulden.
     E The internationals (= stocks of internationally operating companies, [hpm]) without exception were considerably higher. Royal Oil gained 1,90 guilders at 149,90 guilders. Unilever took one and a half guilders at 192,50 guilders and Akzo went up ninety cents at 139 guilders.
(34) D De beleggers zoeken hun heil steevast in de veilig geachte kwaliteitsfondsen met een defensief karakter. Heineken, Grolsch, Hagemeyer, Wolters Kluwer, Unilever en de verzekeringsfondsen lagen daardoor zeer gevraagd. Heineken wist met een winst van 4,50 gulden te sluiten op 185,70, het hoogste niveau dit jaar. Gedurende de dag wist het fonds zelfs even 187,80 gulden aan te tikken. Grolsch lag uiteraard ook goed en voegde 4,80 toe op 206,80 gulden.
     E The investors invariably seek refuge in quality funds with a defensive character, considered safe. As a result, Heineken, Grolsch, Hagemeyer, Wolters Kluwer, Unilever and the insurance funds were in high demand. Heineken managed to gain 4,50 guilders and closed at 185,70, this year's highest level. During the day it even managed to reach 187,80 guilders for a while. Grolsch of course was in good shape and added 4,80 at 206,80 guilders.
A third environment is elaboration (N=58). While exemplifications illustrate a statement about a set of situations, elaborations provide additional details on a situation or some discourse referent. Process elaborations answer the question how did this happen? or to what degree did this happen? For instance, in (35) the way a certain goal was accomplished is elaborated on, while in (37) the rises announced in the first sentence are elaborated on quantitatively. Entity elaborations answer questions like what (kind of) X is meant? For instance, in (36) the conjunction provides details on the kinds of personal attention needed by airplane passengers.
Again, we may distinguish between list variants (see the two kinds of attention specified in (36)) and sequences jointly constituting an elaboration. In (35), a temporal sequence gives details on how a goal was accomplished. In (37) we find another type of sequence, the relative-absolute sequence: first a situation is represented from a comparative perspective, then this is done in absolute terms. Jointly the members of this sequence specify the size of the rise in Amsterdam.
(35) D Vijf minuten na zijn entree zette hij Ajax op 2-0. Hij gleed dwars door het hart van de defensie en schoot de bal laag en zacht in de linkerhoek.
     E Five minutes after entering the field he put Ajax at a 2-0 lead. He penetrated right through the heart of the defence and softly shot the ball in the lower left corner.
(36) D Bij alle werkzaamheden moet de steward(ess) er steeds aan denken, dat persoonlijke aandacht door de passagiers zeer gewaardeerd wordt. Zij verwachten antwoord op al hun vragen en hebben soms hulp nodig, zoals bijvoorbeeld bij het invullen van douane- en emigratiepapieren.
     E During all his activities the steward(ess) must realize that personal attention is highly appreciated by the passengers. They expect answers to all their questions and sometimes need help, for instance with filling in customs and emigration documents.
(37) D [...] De koersen van de meeste andere aandelen sloten maandag hoger. De bijna drie cent hogere dollar en de vaste stemming op de belangrijkste buitenlandse beurzen hadden duidelijk effect op het Beursplein. De EOE-index won 2,51 punten en sloot op 294,46.
     E [...] The prices of most other shares rose this Monday. The fact that the dollar gained almost three cents and the steady trading on the major foreign exchanges clearly affected the 'Beursplein' (the square where the Amsterdam Stock Exchange resides, [hpm]). The EOE index gained 2,51 points and closed at 294,46.
So far, we have reviewed cases in which the assumption supported by the conjunction is stated explicitly in the preceding text.6 However, some conjunctions (N=14) are preceded by an adversative connective indicating that the conjunction is to be taken as supporting the denial of an implicated assumption.
(38) D Alle obstakels voor een struikelpartij waren voorhanden, maar Feyenoord verzaakte niet en won simpel met 3-0.
     E Every conceivable obstacle to stumble over was present, but Feyenoord did not fail and simply won 3-0.
Here, the relevant question is did the expected failure actually take place? The conjunction supports the implicated negative answer to this question, the implicature being linguistically triggered by but.
2.3 Delimiting the concept of 'jointly supported assumption'
It would be natural to expect that an and-conjunction may not only support assumptions that are linguistically triggered, but also assumptions which are entirely implicit. For instance, the second sentence of the next fragment informs us about the share price development of two funds from the same branch of industry.
(39) D Het chemiefonds DSM klom zes dubbeltjes naar 82,70 gulden en collega Akzo ging 30 cent vooruit naar 144,40 gulden.
     E The chemical stock DSM climbed sixty cents to 82,70 and its colleague Akzo improved 30 cents to 144,40 guilders.
Are we entitled to infer from this conjunction a general statement like chemical industry funds did well yesterday? Perhaps this inference is not intended. An alternative rendering of the joint relevance would be the question what happened to the share prices of chemical funds yesterday? However, this question allows the two share prices to move in every conceivable direction, including two opposite directions. And an earlier study (Pander Maat 1999) has shown that opposite share price movements of funds in the same branch of industry are not regularly marked by and; instead they are introduced by connectives like maar (but) or daarentegen (as against this). Taking into account that share prices within the same branch of industry are expected to move in the same direction, this finding implies that the choice of the connective is sensitive to the assumptions that are contextually valid: and is used to indicate that a certain situation follows the assumption, while but indicates that this situation is contrary to this assumption. So the account of and and but might be as follows. When the question is what happened to the share prices of two companies c1 and c2?, where c1 and c2 are members of branch C, the answer may be a conjoined utterance consisting of
descriptions of the share price developments of c1 and c2. The actual content of the descriptions may follow the same-direction assumption or may contradict it. In the first case, the conjuncts are linked by and, in the second case by but. What is indicated by and here is both that a single question is answered and that the answer parts support some contextually valid assumption. This account, which in fact is a modified and generalized version of the argumentative co-orientation account presented by Dutka (1993), is compatible with the large number of cases in which the conjunction supports an explicitly stated assumption. An additional attraction of it is that it provides a general account of both and and but, which would be characterized as bearing opposing values on the feature 'assumptional polarity'. However, the assumptional polarity account cannot be the whole story about and. One problem with it is the fact that cases of implicit assumptions such as (39) are rare, even in stock market reports, a genre for which they seem rather suitable. The overwhelming majority of and-conjunctions listing two share price developments are preceded by general statements. In other words, and-conjunctions typically support assumptions that have been stated earlier, while the assumptional polarity account suggests that explicit and implicit assumptions should both occur at least regularly, because theoretically they are equally plausible. However, a more important problem with the assumptional polarity account is that assumptions are not always easy to construe, as will be shown in Section 2.4.
2.4 Answering a single question
We have already considered a case in which the existence of an implicit common assumption may at least be doubted, an implicit common question being the alternative representation of the joint relevance. For a considerable portion of our cases (N=52; 23%) such a question is the only conceivable analysis.
(40) D Deze tactiek (van PSV, [hpm]) slaagde, maar ging met grove overtredingen gepaard. Scheidsrechter Van der Ende trad er behoorlijk tegen op en trotseerde de fluitconcerten (PSV speelde thuis, [hpm]) door uitsluitend PSV'ers te bekeuren.
     E This tactic (of PSV, [hpm]) was successful, but it involved bad fouls. Referee Van der Ende properly dealt with these and fined only PSV players, defying the catcalls. (literally: and defied the catcalls by fining only PSV players.)
(41) D Het aantal patiënten dat gebruik maakt van specialistische hulp stijgt jaarlijks. Dat komt onder meer door de zogenaamde vergrijzing: er komen meer oude mensen en de mensen bereiken een steeds hogere leeftijd.
     E The number of patients receiving specialist care rises every year. One of the causes for this is the so called ageing: the population comprises more old people and people get older all the time.
The conjunction in (40) tells us how the referee dealt with the fouls committed by one of the teams; the conjunction itself is a specification sequence in which an evaluation of the referee's reaction is followed by a factual description. Just like assumptionally related conjuncts, multipart answers may take the form of a sequence or a list. The conjunction in (41) presents a clarification of the concept of ageing (marked as potentially incomprehensible by the modifier 'so called'); the clarification lists the two phenomena which constitute 'ageing'.
As we have seen, a supported assumption is typically stated in the preceding text, except when it is implicated by adversative connectives. That is, in this structure much of the topical material is textually present. To what degree do implicit questions concern earlier statements or parts thereof? In most cases (N=37; 71%) the implicit question is concerned with the proposition of the preceding utterance or with some central referent of this utterance: the 'fouls' in (40) and the concept ageing in (41). The alternative (N=15; 29%) is that the conjunction is a self-contained thematic unit:
(42) D Unilever verliet de markt 90 cent beter op 214,80 gulden. Hoogovens kon zich aanvankelijk handhaven, maar eindigde 1,10 gulden lager op 28,10 gulden. Ahold bleef lijden onder de aankondiging van de claimemissie en moest een gulden terug naar 94,40 gulden.
     E Unilever left the market 90 cents better at 214,80 guilders. Hoogovens initially was able to maintain its position, but closed 1,10 guilders lower at 28,10 guilders. Ahold continued to suffer from the claim emission announcement and had to give up one guilder at 94,40 guilders.
In (42), a causal sequence, the implicit question is fairly clear: how did the share prices of Ahold develop? This question is induced on an entirely contextual basis. It exemplifies a highly conventional question for stock market reports.
2.5 Two pieces of general information on a discourse referent
Until now, the topic questions invoked to account for the pragmatic unity of and-conjunctions all included a predicate-argument combination P(A). The questions used thus far can be represented as follows:
(1) Why (do you say) P(A)? (argumentation)
(2) What are examples of P(A)? (exemplification)
(3) How did P(A) happen? / To what degree did P(A) happen? (process elaboration)
The apparent exception to this are questions concerning entity elaborations and clarifications, which can be represented as what is meant by A? But in fact these questions do contain a predicate from a restricted class: mean by, define as, consist of and the like. In a few cases however (N=6; 3%), such a question cannot be formulated.
(43) D Je kunt ook via een dagopleiding ziekenverzorgende worden. Die opleiding, die je volgt aan een school voor Middelbaar Dienstverlenend en Gezondheidszorg Onderwijs (MDGO) duurt drie jaar en kan gevolgd worden op 12 plaatsen in Nederland.
     E You may also train to be a nurse by means of daytime training. This training, which is to be followed at a school for Secondary Education in Service and Health Care, takes three years and may be followed at 12 different places in The Netherlands.
The only question conceivable in (43) is something like what can you say about A? However, this question lacks an informative predicate. The only unity displayed by these cases is that they provide two items of general, introductory information on some discourse referent. This implies that one cannot conjoin just any two items of information. For instance, (43) is appropriate, but the fictitious variants (44) and (45) are not:
(44) *This training takes three years and your teacher for Anatomy is P. Gandolf.
(45) *This training takes three years and the lessons always start at half past eight.
In sum, conjunctions presenting two general items of information are two-topic conjunctions; however, they may still be said to constitute a functional unit, in the sense that the conjunction is aimed to succinctly characterize some discourse referent.
2.6 What has been learnt so far
We have identified three contexts for and. Providing two items of introductory information is such a rare environment of and that it is safe to say that and as a rule requires the conjuncts to answer a single question. This does not only affirm the hypothesis regarding the joint informational role of conjunction advanced in Section 1.4, but also its specification: conjunctions are focal, rather than topical in character. Moreover, the majority of conjunctions not only answer some question, but a rather specific question, namely a question about the argumentative basis or about the specific information underlying some statement. It is remarkable that the majority of conjunctions is to be found in such a restricted set of environments. Another important result of this corpus analysis is that typically (93%), preceding discourse at least partially provides the topical material to be included in the question to be answered by the conjunction. The textual context is particularly rich when preceding utterances are supported. In these cases, one only needs to add a question word (why, how, what) in order to get a fully specified topic. Again, it is remarkable that the majority of conjunctions is anchored in such a rich textual environment. This finding is in striking contrast with a common analytical procedure followed in linguistic work on and: presenting an isolated conjunction and inviting the reader to conceive of a context that renders this conjunction acceptable. Moreover, conjunctions do not only address topics which are retrospectively identifiable from the preceding text; a considerable number of them also realize predictable discourse patterns, known to constitute a regular ingredient of a certain discourse genre.
For instance, evaluative statements in soccer reports are conventionally known to be backed up, general statements in stock market reports are always exemplified, and it is hardly surprising that a job description starts by enumerating the activities that make up a profession.
2.7 Differences between elliptical and full second clauses
As said, the comparison of elliptical and full second clauses aims to see whether elliptical conjunctions are found to share a topic that exceeds the ellipted element; if so, this would show that the joint relevance principle is valid beyond what is linguistically signalled. As is shown in Table 1, conjunctions with subject ellipsis in the second clause more often answer single questions and more often provide two pieces of information than full clause conjunctions (χ² = 12.55, df = 2, p < .01).
Table 1. Distribution of full and ellipsed conjunctions over three discourse environments (numbers and column percentages).
                             full clause    ellipsed      total
assumption                   118 (80.3%)    46 (61.3%)    164
question                      28 (19.0%)    24 (32.0%)     52
two pieces of information      1 (0.7%)      5 (6.7%)       6
total                        147            75            222
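The test statistic reported for Table 1 can be re-derived from the cell counts. What follows is a minimal sketch, assuming that the reported value is a standard Pearson chi-square test of independence on the 3 x 2 table; the function and variable names are illustrative and not part of the original study. For exactly 2 degrees of freedom the p-value has the closed form exp(-χ²/2), which is used here in place of a statistics library.

```python
import math

# Observed counts from Table 1: rows are discourse environments,
# columns are (full clause, ellipsed second clause).
observed = {
    "assumption": (118, 46),
    "question": (28, 24),
    "two pieces of information": (1, 5),
}


def pearson_chi_square(table):
    """Return (chi2, df) for a Pearson test of independence on an r x c table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(rows):
        for c, count in enumerate(row):
            # Expected count under independence of environment and clause type.
            expected = row_totals[r] * col_totals[c] / grand_total
            chi2 += (count - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df


chi2, df = pearson_chi_square(observed)
# With exactly 2 degrees of freedom the chi-square survival function is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

# Column percentages, as printed in the table.
col_totals = [sum(c) for c in zip(*observed.values())]
percentages = {
    env: tuple(round(100 * n / t, 1) for n, t in zip(counts, col_totals))
    for env, counts in observed.items()
}

print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
print(percentages)
```

Under these assumptions the statistic comes out at about 12.55 with 2 degrees of freedom, in line with the value reported above, and the column percentages match those in the table.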
This result is not surprising, since the non-hierarchical single question context more often has a discourse referent as its central element than the assumption support context, in which typically the entire proposition is at issue. And topical referents are often syntactic subjects that may be omitted in the second conjunct. Likewise, two pieces of elementary information necessarily concern a common referent. In fact, they illustrate that referential continuity is the absolute minimum of coherence required in conjunctions. Nevertheless, subject ellipsis is more often than not associated with assumptions. That is, nothing prevents a conjunction with a reduced second clause from realizing one or more arguments, exemplifications or elaborations. Moreover, the fact that only 7% of the elliptical conjunctions answer more than one question indicates that the shared topic in these conjunctions typically exceeds the ellipted element in the second clause, thus supporting the general validity of the joint relevance requirement for conjunctions.
Ellipsis is only one kind of referential cohesion between the conjuncts. Therefore, the consequences of full versus ellipted second conjuncts may be seen more clearly when conjunctions containing other referential ties, such as pronominal or nominal anaphors, are left out. Table 2 shows that when no other referential ties are present, full clause conjunctions may only support assumptions. That is, while supporting an assumption does not always mean that referential ties are absent, the 'reverse' does hold: the absence of referential integration, whether by ellipsis or by other means, prevents a conjunction from
answering a single question. Examples of conjunctions without referential ties have been presented in (31), (32) and (33).
Table 2. Distribution of full and ellipsed conjunctions over three discourse environments (numbers and column percentages): cases without other referential ties.
                             full clause    ellipsed      total
assumption                    26 (100%)     40 (62.5%)     66
question                       0            19 (29.7%)     19
two pieces of information      0             5 (7.8%)       5
total                         26            64             90
2.8 Relations between and-conjuncts
The present approach implies that the mutual relation between the and-conjuncts is constrained by the joint relevance of the conjunction. Let us see how the conjunct relations satisfy these constraints. First of all, it is not very surprising that lists are a suitable structure for joint relevance units. All too often, one argument is not enough; and even more often, general statements can only be exemplified by more than one instance. More interesting is the fact that a considerable part of the and-conjunctions (N=128; 58%) are sequences of some kind. Four types of sequence have been mentioned: temporal, causal, specification and relative-absolute sequences. I will now discuss their contribution to joint relevance. The fact that temporal sequences may constitute joint relevance units is understandable given the fact that they may be regarded as 'event lists' answering questions like what happened (then)? A specific manifestation of such event-directed questions is how was this goal accomplished?, a regular topic in soccer reports; it is answered by an event sequence, typically concluded by the score. Hence, the regular association of and with chronological successivity is a topic-related phenomenon. It has nothing to do with and as such, but with the need to produce an informative reply to a question concerning several successive events. Other kinds of topic questions impose other kinds of conjunct relations. For instance, when the question is concerned with a single event, joint relevance may only be realized by presenting two versions of the event.
Though temporal and-sequences are 'asymmetric' since their order is fixed, they have symmetric traits too. Semantically they often display parallelism in that a topical participant is involved in two past tense action predicates. And taken as a piece of information, each clause is equivalent in that it contains one sub-event of the macro-event which was questioned. As against this, the interesting feature of the next three sequences is their informational asymmetry: the second conjunct is decisive for the answer, in the sense that only this conjunct provides full information. In causal sequences (see (42) and (32)), the first clause explains why a certain result came about; however, it does not provide an explanation which assumes prior knowledge of the explanandum, but rather gives a causal background that heavily implicates already what happened. In specification sequences (see (38) and (40)), the first conjunct provides a qualification of the situation, often evaluatively phrased, while the second provides the actual fact of the matter. And in the relative-absolute sequence the first conjunct presents a certain situation from a comparative perspective (see (34) and (37)), while the second conjunct characterizes the situation in itself.7 We could say that in all these sequences, the movement is 'forward'. In temporal sequences, this forward direction should be taken in the chronological sense; in the informationally asymmetric sequences, the forward movement takes place in the discourse itself: the first conjunct characterizes the situation from a certain causal, evaluative or comparative perspective, while the second actually tells us what happened. Of course, this forward movement is not conditioned by and; it is a feature of unfolding discourse, even within such tightly bound units as and-conjunctions.
This informational structure also explains why causal, specification and relative-absolute sequences are suitable to answer one single question: the conjuncts present different formulations of the single situation, so that the variable in the underlying question is instantiated twice, as it were. To sum up, the conjunct relations enabling and-conjunctions to answer a single question come in two kinds. Symmetrically related conjuncts are referentially distinct, but functionally equivalent and semantically parallel members making up a larger argumentative, descriptive or narrative discourse unit. In asymmetrically related conjuncts a single situation is presented from different perspectives.
3. Theoretical implications of the joint relevance account
3.1 Joint relevance as a procedural meaning
I will discuss two points in this final section: the conceptual status of the proposed semantic description of and, and its ramifications for the classification of coherence relations. As we have seen, Posner (1980) and Carston (1988) appear to regard and as semantically empty (apart from its truth-functional '&' meaning). The relation between the conjuncts can only be recovered by means of conversational implicatures. And while Carston (1993) and Blakemore (1987) have observed that conjoined structures suggest some kind of joint relevance, they do not consider this suggestion as lexically encoded. Blakemore seems to think that it is due to the truth-functional meaning of and together with the Principle of Relevance, while Carston (1993, p. 41), though not denying this, adds the claim that the joint relevance suggestion of and-conjunctions is determined by the grammatical fact that two clauses are conjoined into one sentence. The same line of argument is to be found in Wilson and Sperber (1993, p. 7). However, coordination as such cannot be decisive here, since other coordinating connectives do not behave like and. I will demonstrate in Section 3.2 below that but-conjunctions cannot be jointly relevant in the way and-conjunctions are. Nevertheless, the relevance theoretic framework offers two distinctions in terms of which the semantics of and may be formulated, namely the distinction between conceptual and procedural meaning and that between explicatures and implicatures. According to Wilson and Sperber (1993), conceptual representations have two distinctive features: their logical properties (they enter into entailment and contradiction relations) and their truth-functional properties (they can be used to describe or partially characterize a state of affairs). Most regular 'content' words are conceptual and truth-conditional.
As against this, procedural elements encode constraints on the inferences that have to be made in order to arrive at the intended interpretation of the utterance containing the item. Some procedural elements are truth-functional; for instance, pronouns guide the search for a referent that is part of the proposition expressed, thus constraining explicatures. Other procedural elements, such as the discourse connectives so and after all (Blakemore 1987), are not truth-functional because they constrain the generation of implicatures regarding the relevance of the proposition they accompany.
My proposal is to regard 'joint relevance' as the procedural meaning of and. That is, and tells the hearer to process the conjuncts together in relation to a single set of assumptions, namely those embodied in the single Wh-question they are jointly answering. In other words, and encodes an instruction to construct a common information structural configuration for its conjuncts. Stated in relevance-theoretic terms, one may say that and indicates that its conjuncts jointly yield some contextual effect (though it does not specify what kind of effect). This proposal leads to a rather different view on and than the one behind the pragmatic inference account. And does not just mean '&'. This truth-functional notion is of course present in its meaning, but it is the least interesting element of it. After all, '&' is also present in the meaning of but, and if it constituted the essential semantic component of these connectives, the difference between and and but would be a mere matter of conversational implicature. However, it is not; it is a question of lexical semantics. The fact that and encodes constraints on the information structural interpretation of its conjuncts renders a conceptual analysis of it impossible. It might be asked whether the notion of 'jointness' cannot be interpreted both on a conceptual and a procedural level. In itself, this is conceivable, because there is such a concept as jointness, a notion which is indispensable when it comes to the semantics of items such as together on the one hand and separately on the other (and arguably both). However, it was shown in Section 1.4 that and cannot be captured in terms of this notion, since it does not by itself enforce the collective semantic interpretation of its conjuncts. As a result, a semantic analysis of and will inevitably end up by overgeneralizing a subset of joint relevance environments. For instance, Lang (1984) claims that every coordinate structure invokes a Common Integrator which is exemplified by the conjuncts.
This approach presumes that coordinate structures tend to parallelize the interpretation of the conjuncts, in that they are taken to refer to concepts of an equal conceptual level. When applied to interclausal and, Lang's approach can only account for the use of and in exemplifications. For these uses it is correct to say that two different situations are presented from the same conceptual perspective. However, for causal, specification and relative-absolute sequences the reverse is true: they present the same situation from two different conceptual angles. Another example is provided by Bar-Lev and Palacas (1980), who have proposed to account for interclausal and by a notion of 'semantic command', which states that the second conjunct is "not chronologically or causally prior"
to the first conjunct. This proposal can account for temporal and causal sequences and for list cases, but it does so only negatively, that is, by describing what they are not. More importantly, it misses the feature of joint relevance underlying these interconjunct relations.

3.2. Joint and parallel relevance
The second issue I want to raise here is what the present analysis of and may teach us about the classification of coherence relations in discourse. Sanders, Spooren & Noordman (1992, 1993) have proposed an interesting typology of relations, based on four binary parameters: (1) basic operation: CAUSAL relations versus ADDITIVE relations; (2) source of coherence: SEMANTIC relations concern real world situations while PRAGMATIC relations concern statements (subtype EPISTEMIC relations) or speech acts (subtype CONVERSATIONAL relations); (3) polarity (POSITIVE versus NEGATIVE relations); (4) order (only productive for CAUSAL relations: cause precedes consequence or vice versa). Sanders, Spooren & Noordman consider and to be a prototypical marker of positive SEMANTIC additive relations. However, their concept of ADDITIVE relations is modelled on the truth-functional notion of conjunction, since it only requires that both discourse segments are true for the speaker. In earlier work (Pander Maat 1998, 1999) I have argued that two subtypes need to be distinguished within this broad field of 'non-causal' relations: strictly ADDITIVE and COMPARATIVE relations. In strictly ADDITIVE relations, which are not defined for polarity, the segments may be related referentially but are not related to a shared set of contextual assumptions. In the present study, strictly ADDITIVE relations are exemplified by the conjunctions presenting two items of introductory information on a discourse referent. Above it was shown that and is only seldom additive in this sense. By contrast, the segments of a COMPARATIVE relation are both related to a more general assumption concerning the described entities (in the case of SEMANTIC COMPARATIVE relations) or an inference supported by the segments (EPISTEMIC COMPARATIVE). In POSITIVE COMPARATIVE relations, the assumption is supported; in NEGATIVE COMPARATIVE relations its validity is restricted:
(46) SEMANTIC POSITIVE (supported assumption: Financial stocks did well yesterday)
     ABN Amro gained 0,40 cents at 56,68. Besides, ING went up 60 cents at 75,34.
(47) SEMANTIC NEGATIVE (weakened assumption: Financial stocks did well yesterday)
     ABN Amro gained 0,40 cents at 56,68, but ING lost 60 cents at 75,34.
(48) EPISTEMIC POSITIVE (supported assumption: I should buy this coat)
     This coat is beautiful. Besides, it is not expensive.
(49) EPISTEMIC NEGATIVE (weakened assumption: I should buy this coat)
     This coat is beautiful, but it is rather expensive.
In Pander Maat (1999), and is discussed as a marker of POSITIVE COMPARATIVE relations, that is, no conceptual distinction is made between and and items like besides. The present chapter, however, gives strong reasons to introduce such a distinction. The joint relevance associated with and is characterized by the fact that the conjuncts are related to their context only as a whole. That is, an and-conjunction performs a role that could not be performed by one of its conjuncts on its own. This is clearly true in the case of and-sequences, which are made up of qualitatively different members. For instance, in specification sequences like (50) or causal sequences like (51) the conjuncts cannot function on their own:

(50) The EOE-index gained 2,51 points and closed at 294,46.
(51) The Geusselt (the MVV stadium, [hpm]) is not the Kuip (the Feyenoord stadium, [hpm]), as it turned out yesterday in Maastricht (home town of MVV, [hpm]). MVV offered a lot more resistance than Feyenoord did in the National Cup match and Ajax had to content itself with an unsatisfactory 1-1.

However, this characteristic also applies to the list cases. When a statement is supported by two arguments linked by and, the suggestion is that only the conjunction of the arguments yields argumentative support. In an example like (52), in which each argument on its own is insufficient, only and can be used, not besides.

(52) Going out for dinner was the only option.
     a. There was no food in the house and the shops were already closed.
     b. There was no food in the house. *Besides, the shops were already closed.
This observation only applies to unstressed and. At least in Dutch, stressed and may fulfill functions analogous to besides, or even but also (as in He is reliable AND intelligent.). The conclusion is that the comparative relations in (46)-(49) are not joint relevance relations, but parallel relevance relations. These two relations (joint relevance and parallel relevance) are different members of a class which I propose to call shared relevance relations. Their structures are represented schematically in Figures 1 and 2.

[Figure 1. Parallel relevance configurations (S1 = first segment, S2 = second segment). Two panels: 'parallel relevance: positive' (S1 and S2 both linked to a shared assumption, connective: moreover) and 'parallel relevance: negative (contrastive)' (S1 and S2 both linked to a shared assumption, connective: but).]

[Figure 2. Joint relevance configuration ('S1 and S2' linked as a whole to an assumption/question).]
As shown in Figure 1, the segments of parallel relevance relations are both related to a contextual assumption, positively or negatively. If both relations are inferentially positive (or negative, but this is irrelevant since in that case both relations are positive with regard to the negation of the assumption), the relation represented by the two-pointed arrow is positive. The assumption is supported. If the relations between both conjuncts and the assumption have opposing directions, the relation is negative. Doubt is cast upon the assumption. While parallel relevance relations essentially involve three terms, joint relevance relations have two terms. This is because their segments are not individually related to the assumption or question, but only as a whole. Because there is only one relation involved between segments and context, the asymmetric joint relevance relation has no negative counterpart. While comparative relations may be 'negated' by negating the second segment and replacing besides by but (see e.g. (48) and (49) above), this is impossible in fragments like (50) and (51).
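The polarity rule for parallel relevance can be made concrete with a small sketch. The Python fragment below is only an illustration and not part of the original analysis: the function name `parallel_polarity` and the encoding of each segment's inferential relation to the assumption as +1 (supports) or -1 (counts against) are assumptions introduced here.

```python
def parallel_polarity(s1_sign: int, s2_sign: int) -> str:
    """Return the polarity of a parallel relevance relation.

    Each argument encodes how one segment relates to the shared
    assumption: +1 = supports it, -1 = counts against it
    (hypothetical encoding, chosen only for this sketch).
    """
    if s1_sign not in (1, -1) or s2_sign not in (1, -1):
        raise ValueError("signs must be +1 or -1")
    # Same direction: the assumption (or its negation) is supported.
    # Opposite directions: doubt is cast upon the assumption.
    return "positive" if s1_sign == s2_sign else "negative"


# (48) 'beautiful' (+1) and 'not expensive' (+1), linked by besides
print(parallel_polarity(1, 1))
# (49) 'beautiful' (+1) and 'rather expensive' (-1), linked by but
print(parallel_polarity(1, -1))
```

No analogous function can be written for joint relevance: there the segments relate to the context only as a whole, so there is a single relation and no pair of signs to compare. Parallel and joint relevance relations thus differ in whether polarity is defined for them at all.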
Their different behavior regarding polarity is an important argument for the proposed distinction between parallel and joint relevance relations. However, it appears to be supported by other evidence as well. There are many connectives that can be used in parallel contexts, but not in joint relevance environments: also, besides, and moreover, to mention a few.8 Thus, a careful inspection of additive and comparative connectives appears to modify our current classification of coherence relations. Needless to say, this account of and leaves a number of questions unanswered. The most pressing ones concern the relations of and to other coordinating connectives realizing other varieties of shared relevance, especially but and or. A preliminary sketch of the relation between and and but was offered above, indicating that these connectives are not just positive and negative expressions of the same relation, but that they show qualitative differences. At first sight, or also indicates shared relevance in that its conjuncts, if focal, constitute two possible answers to a single question. However, or is more likely to express a kind of parallel relevance, since disjunctions often suggest that only one of the conjuncts is correct. In sum, the major challenge in future work is to further develop the concept of shared relevance.

3.3. Joint relevance and the representation of coherent discourse
In this paper, it is claimed that and encodes an instruction for the interpretation of its conjuncts as related to a single set of assumptions. Since this claim clearly concerns the process of building a discourse representation, it is worthwhile to review some recent work on the role of connectives in discourse processing, with special attention to the treatment of and. Segal, Duchan and Scott (1991) and Caron (1997) discuss the role of and as a segmentation marker in narrative and instructive discourse respectively, and their conclusions are strikingly similar. According to Caron (1997, p. 67), reporting results from Caron-Pargue (1993), et and et puis "are consistently used to imply that the same system of reference is in use". By contrast, with après the hearer is informed that a new system of reference has to be constructed. Likewise, Segal et al. (1991, p. 41 ff.) observe that unlike then, and is often used in contexts of "deictic continuity", i.e. sequences in which time, place and persons are kept constant. When time does move forward in an and-conjunction, it most often contains an activity-goal sequence (e.g. he waited until the giant was sleeping and he chopped off his head).
These observations are entirely compatible with the joint relevance account advocated here, since they describe textual features regularly present in joint relevance units. However, referential integration by itself is not the heart of the matter. It often co-occurs with joint relevance, but is not necessarily attached to it, as was shown in Table 2 above. The role of connectives in building discourse representation has recently also become a topic for experimental investigations (Millis & Just 1994; Millis, Golding & Barker 1995; Deaton & Gernsbacher (in press)). These experiments have focused primarily on the connective because, examining its influence on the recall of and the on-line activation pattern for the related clauses, and on the inferences generated in the course of the comprehension process. In these experiments, subjects are presented with pairs of statements which may or may not be related by connectives. In several experiments, because has been compared with and, which is considered a neutral 'base line' connective, imposing no particular constraints on its conjuncts. For instance, Millis et al. (1995, p. 39) say that and "has little intrinsic meaning", and Deaton and Gernsbacher (in press) state in their introduction:

Although readers might infer a causal or temporal relation between the two events based on the meaning of the two clauses, the conjunction and connotes only that two events occurred.
In this paper, I hope to have shown that this view of and as semantically empty is inadequate. For one thing, and is not indifferent to the direction of the CAUSAL relation inferred between the conjoined propositions: it blocks backward causal interpretations (as opposed to forward causal interpretations), see Section 1.3. But more fundamentally, experiments presenting pairs of isolated statements cannot capture the instruction encoded by and, since this instruction refers the reader to some kind of contextual information. This is not to say that the experimental methodology is unfit to deal with the meaning of and, quite the contrary. If only the two clauses or sentences are preceded by a sentence that explicitly or implicitly conveys a framework for their interpretation, the influence of and may be investigated by comparing an and-conjunction with a version simply juxtaposing the two sentences. My hypothesis would be that the activation level of the first sentence after reading the third will be higher in the and-condition than in the juxtaposition-condition.
Notes

1. From here on I will use the term 'conjunction' to refer to the conjuncts together with their connective, taken as a whole.

2. Van Kuppevelt uses the term comment, but I prefer the more commonly used term focus (Klein 1991; Lambrecht 1994; Vallduví & Engdahl 1996).

3. Conceptualizing topics in terms of questions raises the question whether such a topic concept can also account for the relevance of non-declarative sentences. Although this is not the place for a full justification of the question topic concept, it must be remarked that it lends itself quite easily to the analysis of imperative sentences, which are to be seen as answering the question what should you do? Of course, the relevance of non-clausal linguistic utterances (e.g. hm or damn!) does not lend itself to analysis in these terms.

4. A third type of topic development indicated by connectives is topic change (by the way, incidentally) or closure (well then).

5. Rickheit and Sichelschmidt (1992) have found that forward ellipsis constructions require less reading time than backward ellipsis constructions. This may be explained as follows. In forward elliptical constructions, an appropriate object node has been established during reading the first conjunct, which can be easily accessed when reading the second, reduced conjunct. In cataphoric coordinations, no referent is specified in the first term, so that readers have to store 'stand-alone' items until they are able to integrate these items upon reading the second term.

6. Besides argumentations, exemplifications and elaborations, some less frequent patterns involving the support of explicit assumptions occurred, for instance conjunctions providing explanations or reasons for some situation (N=16).

7. The fact that conjoined clauses may refer to the same situation seems to have a parallel in NP conjunctions, since it calls to mind the 'appositional coordination' of coreferential NPs noted by Quirk and Greenbaum (1976, p. 178), as in This temple of ugliness and memorial to Victorian bad taste was erected at the Queen's express wish.

8. On the symmetry of also, see Blakemore (1992, p. 142 ff.) on 'parallel implications'.
References

Bar-Lev, Z., & Palacas, A. (1980). Semantic command over pragmatic priority. Lingua, 51, 137-146.
Blakemore, D. (1987). Semantic constraints on relevance. London: Basil Blackwell.
Blakemore, D. (1992). Understanding utterances. An introduction to pragmatics. Oxford etc.: Blackwell.
Caron, J. (1997). Toward a Procedural Approach of the meaning of Connectives. In J. Costermans & M. Fayol (Eds.), Processing Interclausal Relationships. Studies in the Production and Comprehension of texts (pp. 53-73). Mahwah, NJ: Lawrence Erlbaum Associates.
Caron-Pargue, J. (1983). Langage et argumentation: Étude d'enchaînements d'énoncés. In La Pensée naturelle. Paris: P. U. F., 229-240.
Carston, R. (1988). Implicature, explicature and truth-theoretic semantics. In R. Kempson (Ed.), Mental representation: The interfaces between language and reality (pp. 15~182). Cambridge: Cambridge University Press.
Carston, R. (1993). Conjunction, explanation and relevance. Lingua, 90, 27-48.
Deaton, J. A., & Gernsbacher, M. A. (in press). Causal Conjunctions: Cue Mapping in Sentence Comprehension. Journal of Memory and Language.
Dutka, A. (1993). Les connecteurs argumentatifs en polonais. In N. Dittmar & A. Reich (Eds.), Modality in language acquisition (pp. 97-109). Berlin: De Gruyter.
Grice, H. P. (1981). Presupposition and conversational implicature. In P. Cole (Ed.), Radical pragmatics (pp. 183-198). New York: Academic Press.
Klein, W., & Von Stutterheim, C. (1987). Quaestio und referentielle Bewegung in Erzählungen. Linguistische Berichte, 109, 163-183.
Klein, W., & Von Stutterheim, C. (1989). Referential movement in descriptive and narrative discourse. In R. Dietrich & C. F. Graumann (Eds.), Language processing in social context (pp. 39-76). Amsterdam: Elsevier Science Publishers.
Kuppevelt, J. van (1995). Discourse structure, topicality and questioning. Journal of Linguistics, 31, 109-147.
Lakoff, R. (1971). Ifs, and's and but's about conjunction. In C. J. Fillmore & D. T. Langendoen (Eds.), Studies in linguistic semantics (pp. 114-149). New York: Holt, Rinehart & Winston.
Lambrecht, K. (1994). Information structure and sentence form. Cambridge: CUP.
Lang, E. (1984). The semantics of coordination. Amsterdam: John Benjamins.
Millis, K. K., Golding, J. M., & Barker, G. (1995). Causal connectives increase inference generation. Discourse Processes, 20, 29-49.
Millis, K. K., & Just, M. A. (1994). The influence of connectives on sentence comprehension. Journal of Memory and Language, 33, 128-147.
Pander Maat, H. (1998). Classifying negative coherence relations and negative connectives. Journal of Pragmatics, 30, 177-204.
Pander Maat, H. (1999). The differential linguistic realization of additive and comparative coherence relations. To appear in: Cognitive Linguistics, 10.
Posner, R. (1980). Semantics and pragmatics of sentence connectives in natural languages. In J. Searle, F. Kiefer & M. Bierwisch (Eds.), Speech act theory and pragmatics (pp. 169-203). Amsterdam: Reidel.
Quirk, R., & Greenbaum, S. (1976). A university Grammar of English. London: Longman.
Rickheit, G., Günther, U., & Sichelschmidt, L. (1992). Coherence and coordination in written text: reading time studies. In D. Stein (Ed.), Cooperating with written texts. The pragmatics and comprehension of written texts (pp. 103-127). Berlin/New York: Mouton de Gruyter.
Sanders, T. J. M., Spooren, W. P. M., & Noordman, L. G. M. (1992). Toward a taxonomy of coherence relations. Discourse Processes, 15, 1-35.
Sanders, T. J. M., Spooren, W. P. M., & Noordman, L. G. M. (1993). Coherence relations in a cognitive theory of discourse representation. Cognitive Linguistics, 4, 93-133.
Segal, E. M., Duchan, J. F., & Scott, P. J. (1991). The role of interclausal connectives in narrative structuring: evidence from adults' interpretations of simple stories. Discourse Processes, 14, 27-54.
Vallduví, E., & Engdahl, E. (1996). The linguistic realization of information packaging. Linguistics, 34, 459-519.
Wilson, D., & Sperber, D. (1993). Linguistic form and relevance. Lingua, 90, 1-25.
CHAPTER 9

Argumentation, explanation and causality
An exploration of current linguistic approaches to textual relations

A. Francisca Snoeck Henkemans
University of Amsterdam
1. Introduction
When analyzing argumentative discourse, the analyst attempts to get a clear overview of the relevant elements in the text and of the relations between these elements. Which elements and relations are included in the analysis depends on the approach to argumentation from which the analysis is undertaken. In the pragma-dialectical approach to argumentation that has been developed by van Eemeren and Grootendorst (1984, 1992), argumentation is viewed as a means of dispute resolution. Van Eemeren and Grootendorst have presented an ideal model of a critical discussion which specifies which elements are relevant to the resolution of a dispute, in what stage they are situated and what their contribution is to resolving the disagreement. The model thus serves as a heuristic tool for the analysis, or reconstruction, of argumentative discourse.1 The analysis must be further justified by referring to the details of the presentation and the context. In order to investigate which elements in the presentation can be helpful in analyzing argumentative discourse, a research project was started at the University of Amsterdam that concentrates on verbal indicators provided by the Dutch language of the communicative and interactional functions of argumentative moves. The project aims at making an inventory of potential indicators, classifying their indicative force in terms of the pragma-dialectical model for critical discussion, and describing the conditions that need to be fulfilled for a certain verbal expression to serve as an indicator of a specific argumentative move.
Taking the model of a critical discussion as a starting-point, methods are developed to give a systematic reconstruction of argumentative language use. In this endeavor, all aspects that are relevant to resolving a difference of opinion are identified and analyzed: the stages involved in the resolution process, the explicit and implicit presentation of standpoints, arguments and other crucial speech acts, the structure of argumentative discourse, the types of arguments or 'argumentation schemes' that are being used, counterarguments, concessions and rebuttals, etcetera. In carrying out our research, we make use of pragma-linguistic descriptions of words and expressions, such as connectives, that can be indicative of aspects of argumentative discourse. In the project, special attention is paid to a number of well-known problems of analysis. Such problems involve distinctions between discourse elements (speech acts or combinations of speech acts) which may be hard to make in practice, but which are nonetheless important, since the outcome of the evaluation of the argumentation may vary depending on the choices made during the analysis. One such problem is distinguishing between arguments and explanations, and this problem will be the central issue of this chapter. The speech acts argumentation and explanation can easily be confused in practice, since they both involve some form of reasoning, which in both cases may be based on a causal relationship. Connectives such as 'because' and 'so' may be used both as indicators of argumentation and explanation. Nonetheless, it is important to distinguish between these two speech acts, since they differ in illocutionary aim, and need to be evaluated accordingly. Whereas explanations are designed to increase the listener's comprehension, argumentation is aimed at enhancing the acceptability of a standpoint. In this chapter, I shall begin by explaining the theoretical framework offered by the pragma-dialectical approach. 
I shall discuss the starting-points of the analysis of argumentative discourse and represent the pragma-dialectical characterization of the speech acts of arguing and explaining. Next, I shall compare our approach with the relevant linguistic literature. Finally, I shall shed some light on the way in which the problem of distinguishing between arguments and explanations may be dealt with.
2. A pragma-dialectical perspective on the analysis of argumentation and explanation
In the pragma-dialectical research program, argumentation is approached from four basic meta-theoretical starting-points: the subject matter under investigation is to be 'externalized', 'socialized', 'functionalized', and 'dialectified' (van Eemeren et al. 1996, p. 276-280). What are the implications of these starting-points when applied to the problem of analyzing argumentative discourse? First, functionalization. When analyzing argumentation, the purpose for which the argumentation is put forward is to be duly taken into account. Functionalization can be realized by making use of theoretical instruments from speech act theory, making the speech act the basic unit of analysis, with the propositional and the illocutionary level as its component levels. By making use of pragmatic insights, the functions and structures of the speech acts performed in argumentative discourse can be adequately described. Second, socialization. When analyzing argumentation, one should realize that argumentation does not consist in one single individual privately drawing a conclusion, but takes place in the context of a process of joint dispute resolution. In order to do justice to the fundamentally dialogical character of argumentative discourse, the analysis should be aimed at elucidating the collaborative way in which the protagonist and the antagonist respond to each other's (real or projected) questions, doubts and objections. Third, externalization. The analysis does not focus on psychological dispositions or internal thought processes of the people involved in an argument, but on the externalizable commitments created by their performance of speech acts. Fourth, dialectification. Dialectification is achieved by regimenting the exchange of speech acts directed at resolving a difference of opinion in an ideal model for critical discussion. When analyzing argumentative discourse, this model serves as a point of reference.
In the pragma-dialectical approach, both argumentation and explanation are seen as complex speech acts, which (1) have two communicative functions at the same time: that of an assertive at the sentence level, and that of argumentation or explanation at the higher textual level, and (2) cannot stand by themselves; they must be connected in a particular way to another speech act by the same speaker. In the case of argumentation, there should exist a relation of support between the argument and a standpoint; in the case of explanation, there is to be an explanatory relation between the explanation and the assertion
that expresses the state of affairs that is to be explained (van Eemeren & Grootendorst 1992, p. 29). There are two levels of analysis for argumentation and explanation: the illocutionary level at which there is a means-end relationship between the two speech acts, and the propositional level, where various relations are employed in order to achieve the effect of making it understood how something came into being, or of convincing the other party of the acceptability of a standpoint. An important difference between argumentation and explanation (apart from the difference in illocutionary aim) is that in explanations only causal relations may be employed at the propositional level, whereas for argumentation there are no such restrictions.2 Arguments may also be based on a relation of concomitance (for instance, when the argument is presented as a sign or a symptom of what is stated in the standpoint) or on a relation of comparison (by pointing out a resemblance or similarity between that which is stated in the argument and that which is stated in the standpoint).3 Table 1 gives an overview of the main similarities and differences between argumentative and explanatory relations.

Table 1. Pragma-dialectical characterization of argumentative and explanatory relations

Relation between speech act 1 and 2                                   Complex speech act

Propositional level
  relation of comparison                                              argumentation
  relation of concomitance                                            argumentation
  causal relation                                                     argumentation/explanation

Illocutionary level
  Speech act 1 is a means of rendering speech act 2 comprehensible    explanation
  Speech act 1 is a means of rendering speech act 2 acceptable        argumentation
From this overview, it emerges that especially arguments that are based on a causal relationship at the propositional level may be hard to distinguish from explanations, particularly when the context provides no clues as to whether the speaker is attempting to resolve a problem concerning the comprehensibility
or a problem concerning the acceptability of another speech act. In the linguistic literature, extensive study has been made of discourse connectives and other indicators of textual relations. What do these linguistic approaches have to offer for the problem of distinguishing between arguments and explanations?
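The two levels of Table 1 can be read as a simple lookup, which makes the source of the ambiguity visible. The following Python sketch is an illustration only and not part of the pragma-dialectical apparatus: the names `PROPOSITIONAL_LEVEL`, `ILLOCUTIONARY_AIM` and `candidate_speech_acts` are hypothetical, as is the choice of short string labels for the table's rows.

```python
# Which complex speech acts each propositional relation can underlie,
# following Table 1 (dictionary names and labels are hypothetical).
PROPOSITIONAL_LEVEL = {
    "comparison": {"argumentation"},
    "concomitance": {"argumentation"},
    "causal": {"argumentation", "explanation"},
}

# What speech act 1 is meant to render speech act 2, per Table 1.
ILLOCUTIONARY_AIM = {
    "comprehensible": "explanation",
    "acceptable": "argumentation",
}


def candidate_speech_acts(relation, aim=None):
    """Return the complex speech acts compatible with the given clues.

    `relation` is the propositional relation; `aim` is the illocutionary
    aim if the context reveals it, or None if the context gives no clue.
    """
    candidates = set(PROPOSITIONAL_LEVEL[relation])
    if aim is not None:
        candidates &= {ILLOCUTIONARY_AIM[aim]}
    return candidates


# A causal relation without contextual clues stays ambiguous.
print(sorted(candidate_speech_acts("causal")))
# Once the aim is known, the ambiguity disappears.
print(sorted(candidate_speech_acts("causal", aim="comprehensible")))
```

In this reading, only the causal row leaves two candidates when the illocutionary aim is unknown, which is the case the chapter singles out as hard to analyze.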
3. Other approaches
In our attempt to apply linguistic descriptions of markers of various kinds of textual relations in the analysis of argumentative discourse, we have encountered a number of obstacles. A major cause of this is that the most prominent approaches to indicators of textual relations are developed from a meta-theoretical perspective that is in a number of respects different from the functionalizing and externalizing approach favored in pragma-dialectics. In addition, there is usually a difference of purpose: linguists are not particularly interested in analyzing argumentative discourse; their distinctions are therefore often not geared to solving our problems of analysis, such as distinguishing between argumentation and explanation. Broadly speaking, two major types of approach to indicators of textual relations can be distinguished. First, top-down approaches, in which textual relations are classified in terms of a limited number of theoretical notions, and subsequently an attempt is made to determine which connectives can be used to mark these relations. Among the representatives of this approach are Mann and Thompson (1988), Sweetser (1990) and Sanders (1992, 1997). In the second type of approach, the bottom-up approach, first an inventory of indicators is made and then it is attempted to find out (in a pretheoretical way), by analyzing texts or by using substitution tests, how these indicators can be used. As a result, a set of features comes up 'inductively' that is necessary for giving a systematic description of the various indication devices. Representatives of this type of approach are Knott and Mellish (1996) and Schiffrin (1987).
In principle, both types of approach are relevant to our research, but the top-down approaches are more closely related to the pragma-dialectical approach to argumentation analysis.4 Making use of the insights gained in the top-down approaches, however, is not unproblematic, due to a number of incompatibilities with the pragma-dialectical starting-points. Many of the prominent approaches to textual relations are only partly functionalized, for example, Mann and Thompson's (1988), Sanders' (1992, 1997) and Sweetser's (1990) approaches. In principle, these authors do all make use of concepts from speech act theory and they seem to favor a functional approach. But seen from a speech act perspective, their approaches suffer from a number of inconsistencies. Although the terminology differs from case to case, all the authors mentioned make a distinction between (at least) two types of relation: semantic relations and pragmatic relations. This distinction, although it is reminiscent of the distinction between propositional and illocutionary relations, does not coincide with the latter distinction in any of the approaches. For some authors, for instance Sweetser (1990), semantic relations are relations between states of affairs in reality. They belong to the external (sociophysical) domain. Propositional relations, however, are not limited to states of affairs. A propositional relation may also exist between two normative propositions. Other authors, such as Mann and Thompson (1988), give a definition of semantic relations that is closer to the concept of propositional relation, but they do not make a hierarchical distinction between relations at the propositional and relations at the illocutionary level. Propositional relations are treated as functioning on a par with illocutionary relations. For this reason, the classes of text relations in Mann and Thompson's taxonomy are by definition not mutually exclusive, while it is not explicitly recognized by these authors that relations between two speech acts can exist on two levels at the same time. This is, for example, the case in an argumentation: the functional argument-standpoint relation is situated at the illocutionary level, whereas the causal, symptomatic or analogical relationship on which the support the argument lends to the standpoint is based, is situated at the propositional level.
I shall illustrate the problem by discussing some distinctions made in Mann and Thompson's Rhetorical Structure Theory. Mann and Thompson (1988) consider their theory to be a functional theory of text structure. They distinguish a large number of textual relations varying in the effect they intend to achieve in the reader. For each relation between two text spans, called the 'nucleus' and the 'satellite', they formulate a number of constraints reminiscent of the felicity conditions for speech acts.5 The textual relations are divided into two groups, subject-matter relations and presentational relations:
subject-matter relations are those whose intended effect is that the reader recognizes the relation in question; presentational relations are those whose intended effect is to increase some inclination in the reader, such as the desire to act or the degree of positive regard for, belief in, or acceptance of the nucleus (1988, p. 257).
Argumentation, explanation and causality 237
Although their taxonomy seems to be based on a twofold criterion, on closer inspection Mann and Thompson make a distinction between three types of effect aimed for by the writer: (1) increasing the reader's comprehension of the nucleus (Background), (2) increasing the reader's acceptance of the nucleus (Evidence, Motivation) and (3) getting the reader to recognize the relation between nucleus and satellite (Circumstance, Solutionhood). The first two effects are situated at the illocutionary level, whereas the third type of effect concerns the propositional level. Since recognition of the relation is a preliminary condition for understanding and acceptance, it is clear that the classes of text relations distinguished by Mann and Thompson are not of the same hierarchical order, and that therefore the classes overlap.6 The fact that every illocutionary relation is by definition based on a subject matter (or 'propositional') relation is not recognized by Mann and Thompson. Since they do not make a hierarchical distinction between subject matter and presentational relations, nor give an account of the illocutionary purpose, or interactional goal, for which a subject matter relation can be employed by the writer, the analyses offered by them are only partially functional.
Sanders (1997) is an author who does acknowledge that both a propositional (or 'locutionary') and an illocutionary relation may exist at the same time, but he claims this is not necessarily the case, and regards the propositional relation as of secondary importance for pragmatic relations:
pragmatic relations can, but need not be based on a connection in the real world. [ ... ] in the case of a pragmatic relation the level of connection of the CR [coherence relation] is the illocutionary level. This connection possibly exists in addition to a locutionary connection, but the relevant level of connection is the illocutionary one (1997, p. 123).
An example of a pragmatic relation where a propositional connection (or, in Sanders' words, 'a real world' connection) is absent, is according to Sanders 'Theo was exhausted, because he told me so'. In fact, there is a connection at the propositional level between the statements '[It is true that] Theo was exhausted' and 'because he told me so', namely: 'It is characteristic of things that Theo tells me that they are true' or some similar sort of linking statement. At the propositional level, the argumentation in Sanders' example is based on a relation of concomitance.
In other approaches, the lack of externalization is problematic. Sweetser's (1990) Multiple-Domains Theory is a good example. According to Sweetser, in sentences that contain 'causal' conjunctions the relationship between the conjoined clauses can be based on three types of causality: (a) real-world causality, (b) epistemic causality or (c) speech-act causality. From her examples, it becomes clear that the relation of epistemic causality must be similar to the relation between a standpoint and an argument for the truth or acceptability of its propositional content. The problem is, however, that Sweetser gives an internalizing definition of the epistemic relation which creates a fundamental difference between the epistemic causal relation and the argumentative relation.7 Sweetser does not situate the epistemic relation at the speech act level, but at the level of the speaker's thought processes. Epistemic causal relations are analyzed by her as statements about a writer's conclusions and how they were reached. The internalizing approach underlying the concept of epistemic causality runs counter to the pragma-dialectical starting-point of externalization, which requires the argumentation theorist to concentrate on the speech acts performed and the externalized or externalizable commitments of the arguer rather than on the beliefs and inferences involved in the reasoning process of drawing a conclusion.8
A different type of problem concerns the applicability of the distinctions made in the linguistic literature to the analysis of argumentation. Sometimes the definitions of textual relations are not differentiated enough for practical purposes. Often all inference-relations are brought together under the general heading of 'causal relation'. Then it is not possible to make a distinction between establishing a causal connection, describing a causal relation, and making use of a causal relation in an explanation or in an argument. There are authors who make all kinds of subdistinctions, but it is often not obvious that they are relevant for analytical purposes. Sometimes it is even difficult to discover to which subcategory or subcategories the argumentative relation belongs.
In Mann and Thompson's Rhetorical Structure Theory, for instance, it is hard to determine which relations are to be regarded as argumentative. Apart from obvious candidates, such as the presentational relations 'Motivation', 'Evidence' and 'Justify', there are subject-matter relations such as 'Cause' and 'Reason' that could or could not be used argumentatively. The same is true of 'Solutionhood': the intended effect of this relation is that the reader recognizes that one part of the text presents a solution to a problem presented elsewhere. Such a solution, however, can be presented for descriptive purposes, but also for argumentative purposes.
4. Argumentation and explanation
The taxonomies of textual relations proposed in the linguistic literature do not provide a good starting point for making an inventory of potential indicators of arguments and explanations. One of the main problems is the authors' failure to make a hierarchical distinction between relations at the propositional level and relations at the illocutionary level. Sanders (1997), who does allow for the possibility of relationships existing at these two levels, at the same time claims that there is always only one relevant level of connection: the propositional level in the case of semantic relations and the illocutionary level in the case of pragmatic relations. Is this a tenable claim when applied to the distinction between explanations and arguments? Sanders (1997, p. 121) believes that pragmatic relations do not cohere because they connect real world events, but because they connect communicative actions. To make this clear, he gives the following example:
(1) The neighbors are not at home. The lights in their living room are out.
According to Sanders, in example (1) "the two discourse segments are related because we understand the second part as evidence for the claim in the first, and not because there is a causal relation between two states of affairs in the world: It is not because the lights are out that the neighbors are at home" (1997, p. 121). Nonetheless, there is a propositional relation between the two discourse segments, and in this case even a 'real-world' or factual one, since both the standpoint and the argument contain a factual proposition. In example (1), this is not a causal relation, but a relation of concomitance between two 'real-world' situations: "Characteristically, people's lights are on when they are at home." That the argumentation must be based on such a factual relation in order for the utterances to be coherent, becomes clear if we contrast example (1) with example (2), in which such a 'real-world' connection seems to be missing:
(2) The neighbors are not at home. The carpet in their living room is green.
The coherence of the relationship between an argument and a standpoint thus also depends on the acceptability of the (factual or non-factual) propositional relationship between the two speech acts. Another reason why an analysis at both the propositional and the illocutionary level is required in the case of text relations, is that the judgment that a
relationship is explanatory (and thus 'semantic' in Sanders' terminology) or argumentative ('pragmatic'), presupposes an analysis of the context: is the context appropriate for an argument or for an explanation? In order to make this clear I shall give an overview of the main conditions of the speech acts of arguing and explaining. My overview will be based on Houtlosser's (1995, p. 22~227) analysis of the speech act complex of giving an explanation, which is inspired by Van Eemeren and Grootendorst's (1984, p. 44-45; 1992, p. 31) analysis of the complex speech act of argumentation.9 While argumentation is an attempt to convince the listener of the acceptability of a standpoint with respect to a proposition, an explanation is aimed at increasing the listener's understanding of the proposition represented by the explained statement (explanandum). There are no restrictions on the propositional content of a standpoint that is supported by an argument, but both the propositional content of the explained statement and the explaining statements are condition-bound: the explained statement should refer to a factual state of affairs and the explaining statements should mention the cause of this state of affairs. Another important difference concerns the contextual prerequisites for argumentation and explanation: argumentation is put forward when the speaker expects that the acceptability of the standpoint is at issue, whereas giving an explanation is pointless when the speaker does not believe that the explained statement has already been accepted by the listener as depicting a true state of affairs.10 For this reason, certainty moves, according to Govier, in an argument from the premises to the conclusion, whereas in an explanation it moves from the fact explained to the explaining statements:
In the explanation, the explained statement is as certain as the explaining statements and often more certain. In an argument the premises are typically more certain than the conclusion (1987, p. 162).
Govier appears to call a statement 'certain' if the speaker believes that this statement will not be disputed by the listener or that it has already been accepted by the listener. An important type of indication that the speaker believes that the listener may take issue with one of his statements is, according to Houtlosser, the epistemic use of modal terms: The epistemic use of modal terms can modify an assertive's force. Then the acceptability of the assertive is brought into question by way of an implicature (1995, p. 311).
Because modal terms indicate that, as far as the listener is concerned, the acceptability of an assertive might be at issue, they can be instrumental in distinguishing arguments from explanations. This can be illustrated by means of the following sentences in example (3):
(3) a. [1] He left his job, [2] because he couldn't get along with his colleagues.
    b. [1] He left his job, [2] most probably because he couldn't get along with his colleagues.
    c. [1] He may have left his job, [2] because he couldn't get along with his colleagues.
Without any specific contextual information, example (3a) and (3b) are best analyzed as explanations. In (3a), both the 'because'-clause [2] and the first clause [1] are unmodified; they are therefore presented as equally certain. In (3b), clause [2] is less certain than clause [1]. Example (3c) is a case of argumentation: clause [1] is modified; thus it is presented as disputable, whereas clause [2] is presented as certain. In deciding whether a relation is explanatory or argumentative, an illocutionary level analysis of the contextual conditions that need to be fulfilled for each of these speech acts, plays a decisive role. Although Sanders (1997) does not explicitly acknowledge this, he does refer to the appropriate contexts for arguing and explaining in his definitions of prototypical pragmatic and semantic relations:11
Prototypical [ ... ] pragmatic [ ... ] relations are cases in which the writer argues for something she claims to be true. [ ... ] They often contain linguistic elements expressing the evaluation from the perspective of the author (1997, p. 125).
Prototypical [ ... ] semantic [ ... ] relations concern events which have already taken place [ ... ] so that there can be no dispute about the 'truth' of the statement (1997, p. 125).
In other words: the contextual issue whether a statement is disputable or not, is already present in Sanders' definitions of semantic and pragmatic relations. Another type of linguistic clue for distinguishing between arguments and explanations that is situated at the illocutionary level is the type of connective that is employed to connect two clauses. According to Houtlosser (1995, p. 227), it is a preparatory condition for an explanation that the speaker believes that the listener's understanding of how the state of affairs mentioned in the explanandum has come into being is insufficient. By informing the listener
about the cause of this state of affairs, the speaker hopes to enhance the listener's understanding. Given these conditions, it may be assumed that the speaker believes he is providing new information to the listener in the explaining statement. This is different in the case of argumentation: here the speaker must assume that the premises will be acceptable to the listener. Although this does not necessarily mean that the premises never contain new information, ordinarily the speaker will expect the premises to be acceptable to the listener because they are taken as given, or have the status of facts for the listener. In her pragmatic study of English causality conjunctions, Vandepitte (1993, p. 91) has shown that the selection of conjunctionals such as 'because', 'as' and 'since' is influenced by what the speaker thinks the listener already knows. Causal relations whose cause or reason is assumed to be known by the listener are most frequently introduced by means of 'as' or 'since'. The conjunctional 'because' is least frequently used to introduce a manifest state of affairs. If a causal relation is marked by 'as' or 'since', there is good reason to assume that this relation is part of an argument rather than an explanation.
Clues for the identification of argumentation and explanation can also be found at the propositional level. When confronted with a case where it is doubtful whether we are dealing with an explanation or with argumentation it is an adequate analytic strategy to determine first whether the conditions for the propositional content of an explanation have been fulfilled. I shall give some examples where this is not the case. Examples (4)-(8) represent different cases in which the conditions for explanation are not satisfied:
(4) There must have been a sea here once, because the ground is full of seashells.
(5) She will have poor eyesight when she grows up, because she's always reading in bed.
(6) You must mail these forms today, because otherwise we won't get our subsidy.
(7) Burmese cats make a lot of noise, because all my three Burmese cats did that.
(8) He must be quite bright, because his sister was also a good student.
In example (4) the 'because'-clause mentions the effect instead of the cause: the reasoning is from effect to cause. In example (5) the reasoning is from cause to effect, but it is used to make a prediction. Since a prediction concerns a state of affairs that is to be realized, and not a factual state of affairs, (5) cannot
be an explanation. 12 Assuming that there are no contextual indications that example (6) is used to report an obligation, but instead functions as an indirect directive, the first clause cannot function as an explanandum, because the proposition is not a descriptive proposition in which a factual state of affairs is described, but an inciting proposition. 13 In example (7) the reasoning is not based on a causal relation, but on a symptomatic relation. Moreover, the 'because'-clause contains a particular proposition, whereas the first clause contains a proposition whose scope is universal. In an explanation, particular facts can be explained by referring to a general rule or principle, but a general rule cannot be explained by mentioning particular facts. Example (8), finally, is not based on a causal relation, but on an analogy. By linking relations at the propositional level systematically with relations at the illocutionary level, it is thus possible to obtain information that is crucial to the analysis of argumentative discourse. 14 A piece of reasoned discourse can only be an explanation if the reasoning is at the propositional level based on a causal relation, not on a symptomatic relation or an analogy. Moreover, this causal relation should be construed in such a way that the effect is mentioned in the explained statement and the cause in the explaining statement, instead of the other way around. The explained statement should contain a descriptive proposition, not an evaluative or inciting one. This proposition should refer to a factual state of affairs, not to a state of affairs that is still to be realized. Since an explanation must be based on a causal relation, identifying the type of relation that the reasoning is based on at the propositional level is a crucial step in the analysis. Indicators of propositional relations are therefore an important source of clues for distinguishing arguments from explanations. 15
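The propositional-level conditions summarized above can be read as a checklist. The following is only a minimal sketch of that checklist, not a procedure taken from the chapter: the class `ReasonedPair`, its field names and the feature values assigned to each example are hypothetical, hand-coded from the discussion of examples (3a) and (4)-(8). Where the text does not comment on a feature, the permissive value is assumed, so that each example isolates a single failed condition. Passing the checks shows only that a pair can be an explanation; the illocutionary-level context still has to be analyzed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReasonedPair:
    """Hand-annotated features of a '[1] X, because [2] Y' pair (hypothetical schema)."""
    relation: str             # "causal", "symptomatic" or "analogy"
    because_clause_role: str  # "cause" or "effect"; only read when relation is causal
    proposition: str          # type of clause [1]: "descriptive", "evaluative" or "inciting"
    factual: bool             # False if clause [1] concerns a state of affairs still to be realized


def failed_conditions(pair: ReasonedPair) -> List[str]:
    """Return the propositional conditions for explanation that the pair violates."""
    failures = []
    if pair.relation != "causal":
        failures.append("not based on a causal relation")
    elif pair.because_clause_role != "cause":
        failures.append("because-clause mentions the effect, not the cause")
    if pair.proposition != "descriptive":
        failures.append("explained statement is not descriptive")
    if not pair.factual:
        failures.append("state of affairs is not yet factual")
    return failures


def can_be_explanation(pair: ReasonedPair) -> bool:
    """True if no propositional condition is violated (necessary, not sufficient)."""
    return not failed_conditions(pair)


# Assumed annotations; features not discussed in the text get the permissive value.
EXAMPLES = {
    "3a": ReasonedPair("causal", "cause", "descriptive", True),
    "4": ReasonedPair("causal", "effect", "descriptive", True),
    "5": ReasonedPair("causal", "cause", "descriptive", False),
    "6": ReasonedPair("causal", "cause", "inciting", True),
    "7": ReasonedPair("symptomatic", "cause", "descriptive", True),
    "8": ReasonedPair("analogy", "cause", "descriptive", True),
}

if __name__ == "__main__":
    for label, pair in EXAMPLES.items():
        print(label, can_be_explanation(pair), failed_conditions(pair))
```

The sketch deliberately leaves out the particular-versus-universal scope contrast noted for example (7); adding it would require one more annotated field.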
5. Conclusion
As I hope to have shown, the functionalizing speech act perspective inherent in the pragma-dialectical approach creates a fruitful starting point for making a systematic inventory of linguistic clues at the different hierarchical levels of argumentative discourse. By systematically distinguishing between relations at the propositional and relations at the illocutionary level, and by combining pragmatic analyses of the contextual preconditions for performing the speech acts of arguing and explaining with the use of pragma-dialectical analytical instruments, a basis can be created for using linguistic insights in a well-founded and systematic way.
Notes
1. For a detailed discussion of the ideal model for critical discussion and the main starting points of pragma-dialectics, see van Eemeren and Grootendorst (1984) and van Eemeren et al. (1996).
2. I am using the term 'explanation' in the narrow sense of making it understood how something came into being. The term is sometimes also used in a broader sense, as a synonym for elucidating, amplifying or describing, and then, of course, other types of propositional relations may be used than just causal relations.
3. Van Eemeren and Grootendorst only apply these relations in the context of argumentation: each of these relations underlies a different type of argumentation. In the linguistic literature (Halliday & Hasan 1976; Martin 1992) more or less similar distinctions between relations are made (comparative, consequential, temporal and additive relations), that can be situated both at the propositional level (external/semantic relations) and at the illocutionary level (internal/pragmatic relations).
4. Top-down approaches start from a theoretical stance taken toward the phenomena, and thus can be seen as 'a priori' approaches instead of inductive 'a posteriori' approaches. The pragma-dialectical approach is a priori in the sense that it begins with a (normative) model of critical discussion (Van Eemeren, Grootendorst, Jacobs & Jackson 1993, p. 52-54), which defines the set of relevant speech acts and relations between speech acts.
5. Apart from relations consisting of a nucleus and a satellite, Mann and Thompson also distinguish 'multinuclear' relations (1988, p. 247). The large majority of relations, however, holds between a nucleus and a satellite.
6. In her discussion of the Rhetorical Structure Theory, Kroon (1995) notes that there are many cross-classifications possible between subject-matter relations and presentational relations. According to her, this problem arises 'from the fact that RST fails to make sufficiently clear in what respects subject-matter relationships differ from presentational relations, and whether and how they interrelate' (1995, p. 22).
7. Even if Sweetser's epistemic causal relation were to be reinterpreted as a relation between speech acts, it would still not fully satisfy the concept of argumentation. Sweetser's epistemic causal-conjunctions always have factual conclusions: certain knowledge 'causes' the speaker to conclude that something is true. There is no place for argumentation in support of evaluative or inciting conclusions.
8. A similar type of criticism is put forward by Knott and Mellish (1996, p. 153), who point out that "an account is missing of how an argumentative text [ ... ] achieves a rhetorical effect on the reader - how it persuades the reader" of the conclusion that is presented by the writer. They concede that there may be contexts where Sweetser's analysis of epistemic relations is preferable, for instance when dealing with writers who are simply expressing their own chain of reasoning out loud for scrutiny by a reader whose authority they accept, but they do not consider this a prototypical use of argumentative relations (Knott & Mellish 1996, p. 154).
9. Van Eemeren and Grootendorst (1984, p. 11~117; 1992, p. 29) observe that it is important to distinguish between argumentation and explanation (or amplification) since explanations are designed to increase the listener's comprehension, and will have to be judged accordingly. Starting from this observation, Houtlosser gives a specification of the felicity conditions of explanations and compares them with the felicity conditions of argumentation as formulated by van Eemeren and Grootendorst (1984, p. 45-46; 1992, p. 30-33). His more detailed analysis is consistent with Johnson and Blair's (1983) and Govier's (1987) analyses of the differences between arguments and explanations.
10. This difference in the speaker's expectations of the listener's beliefs is seen by several authors as the most important difference between argument and explanation (Govier 1987; van Eemeren & Grootendorst 1984, 1992; Johnson & Blair 1983).
11. Some of the linguistic tests that are used by Kroon (1995, p. 104) and others to distinguish semantic from pragmatic relations can also be explained contextually. Pragmatic relations can, for instance, not be the focus of a cleft-construction (? Is it because there are puddles on the pavement that it has been raining?), whereas semantic relations can. In this cleft-construction, the fact that it has been raining is taken for granted, or presupposed. This means that one of the contextual prerequisites for argumentation has not been met, whereas the contextual prerequisite for explanation (that there be no dispute about the state of affairs to be explained) is met.
12. Lyons (1977, p. 815) points out that predictive statements can also be objectively modalized: "The speaker can treat the future as known [ ... ] whether he is epistemologically justified in doing so or not". In that case, the utterance functions as an 'act of telling' that must be distinguished from a prediction.
13. Van Eemeren and Grootendorst (1992, p. 159) make a distinction between three main types of proposition: descriptive, evaluative and inciting propositions. Descriptive propositions describe facts or events, evaluative propositions express an assessment of facts or events, and inciting propositions call on to prevent a particular event or course of action.
14. Klein (1987) arrives at a similar conclusion; he specifies which types of propositional relations can go together with particular illocutionary relations.
15. Van Eemeren and Grootendorst (1992, p. 98-99) and Van Eemeren, Grootendorst and Snoeck Henkemans (2001, ch. 6) list a number of expressions that can be used to indicate a particular argumentation scheme.
References
Eemeren, F. H. van, & Grootendorst, R. (1984). Speech acts in argumentative discussions. A theoretical model for the analysis of discussions directed towards solving conflicts of opinion. Dordrecht/Cinnaminson: Foris Publications, PDA 1.
Eemeren, F. H. van, & Grootendorst, R. (1992). Argumentation, communication and fallacies. A pragma-dialectical perspective. Hillsdale, NJ: Lawrence Erlbaum Associates.
Eemeren, F. H. van, Grootendorst, R., Jackson, S., & Jacobs, S. (1993). Reconstructing argumentative discourse. Tuscaloosa: University of Alabama Press.
Eemeren, F. H. van, Grootendorst, R., & Snoeck Henkemans, A. F. (2001). Argumentation. Analysis, evaluation, presentation. Hillsdale, NJ: Lawrence Erlbaum.
Eemeren, F. H. van, Grootendorst, R., Snoeck Henkemans, F., Blair, J. A., Johnson, R. H., Krabbe, E. C. W., Plantin, C., Walton, D. N., Willard, C. A., Woods, J., & Zarefsky, D. (1996). Fundamentals of argumentation theory. A handbook of historical backgrounds and contemporary developments. Mahwah, NJ: Lawrence Erlbaum.
Govier, T. (1987). Problems in argument analysis and evaluation. Dordrecht: Foris Publications, PDA 5.
Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. London: Longman.
Houtlosser, P. (1995). Standpunten in een kritische discussie. Een pragma-dialectisch perspectief op de identificatie en reconstructie van standpunten. Amsterdam: IFOTT [Standpoints in a critical discussion. A pragma-dialectical perspective on the identification and reconstruction of standpoints (with a summary in English)].
Johnson, R., & Blair, J. A. (1983). Logical self-defense. 2nd edition. Toronto: McGraw Hill Ryerson.
Klein, J. (1987). Die konklusiven Sprechhandlungen: Studien zur Pragmatik, Semantik, Syntax und Lexik von 'Begründen', 'Erklären warum', 'Folgern' und 'Rechtfertigen'. Tübingen: Niemeyer.
Knott, A., & Mellish, C. (1996). A feature-based account of the relations signalled by sentence and clause connectives. Language and Speech, 39, 143-183.
Kroon, C. (1995). Discourse particles in Latin. A study of 'nam', 'enim', 'autem', 'vero' and 'at'. Amsterdam: J. C. Gieben.
Lyons, J. (1977). Semantics. Vol. II. Cambridge: Cambridge University Press.
Mann, W. C., & Thompson, S. A. (1988). Rhetorical structure theory: toward a functional theory of text organization. Text, 8, 243-281.
Sanders, T. J. M. (1992). Discourse structure and coherence. Aspects of a cognitive theory of discourse representation. Diss. Katholieke Universiteit Brabant.
Sanders, T. J. M. (1997). Semantic and pragmatic sources of coherence: On the categorization of coherence relations in context. Discourse Processes, 24, 119-147.
Schiffrin, D. (1987). Discourse markers. Cambridge: Cambridge University Press.
Sweetser, E. E. (1990). From etymology to pragmatics. Metaphorical and cultural aspects of semantic structure. Cambridge: Cambridge University Press.
Vandepitte, S. (1993). A pragmatic study of the expression and the interpretation of causality conjuncts and conjunctions in modern spoken British English. Brussels: Koninklijke Academie voor Wetenschappen, Letteren en Schone Kunsten van België.
SECTION 3
From text representation to knowledge representation
Relational coherence, i.e. coherence relations and the way in which they are expressed in text, is an important issue in studies of text representation. In the previous section, these relations were primarily discussed from a linguistic, analytical point of view. However, given the starting point that coherence relations are of a conceptual nature, a question that arises is: Are the coherence relations that were identified in many studies of discourse structure actually different from the relations that exist in our knowledge base? And what is the relationship of coherence relations with the inferences comprehenders make while processing a discourse? These are the questions investigated by Graesser, Wiemer-Hastings & Wiemer-Hastings in Chapter 10. The authors stress the relevance of an analysis of the domain knowledge language users have with respect to the topic of the text under study and discuss the validity of their taxonomy of inferences, especially in relation to classifications like those presented in Section 2. In Chapter 11 we move further away from the linguistic analyses of text structure in the direction of knowledge representation. Britton, Schaefer, Bryan, Silverman & Sorrells develop a quantitative model for the way in which people think about 'bodies' of technical and scientific information. Thinking was induced by asking participants to think about these bodies of knowledge. The model was used to make quantitative predictions of what specific thoughts participants would have after thinking about this specific information. The chapter's basic claim is that the participants' thinking process consists of spreading activation and that the final configuration is the end product of the spreading activation process. It is this end product that the model predicts.
Computational modeling and experiments are used to show the validity of their account, which is clearly related to Chapter 10 in the sense that it is argued that text representation depends on pre-existing knowledge. The explicit account of a text representation as being dynamic in nature is similar to the approach Gaddy et al. take in Chapter 3.
CHAPTER 10
Constructing inferences and relations during text comprehension
Arthur C. Graesser,* Peter Wiemer-Hastings and Katja Wiemer-Hastings
University of Memphis / University of Edinburgh / Northern Illinois University
Comprehenders make inferences when they read text, watch film, and observe the real world. This intuition is shared by virtually everyone in linguistics (including psycholinguistics, computational linguistics, sociolinguistics, and text linguistics) and in cognitive science (discourse psychology, artificial intelligence, philosophy of mind, social cognition, cognitive anthropology). However, there has been considerable controversy over matters of inference generation. What inferences are generated? When are they generated? What sources of information need to be intact when inferences are generated? What cognitive processes and representations produce inferences during comprehension? And what, precisely, is an inference anyway?
Twenty years ago there was very little scientific knowledge about inferences in text comprehension. Most research efforts concentrated on the representation of explicit text and the process of linking anaphoric expressions (e.g., noun phrases, pronouns) to previous explicit text constituents. Times have changed in the world of discourse psychology. There have been serious efforts by discourse psychologists to dig deeper and understand how readers construct 'situation models', i.e., mental models of what the text is about. For example, the situation model for a story would consist of a microworld with characters who perform actions in pursuit of goals, events that present obstacles to goals, conflicts between characters, emotional reactions, the spatial setting, the style and procedure of actions, objects, properties of objects, traits of characters, and mental states of characters. Inference generation is inextricably bound to the process of constructing a situation model. The research efforts in recent years have produced a wealth of theoretical positions in discourse psychology, each of
which makes distinctive claims about situation model construction and inference generation: The constructionist theory (Graesser, Singer, & Trabasso 1994), the construction-integration model (Kintsch 1998), the structure building framework (Gernsbacher 1997), the event indexing model (Zwaan, Langston & Graesser 1995; Zwaan & Radvansky 1998), the resonance model (Myers & O'Brien 1998; O'Brien, Raney, Albrecht & Rayner 1997), the landscape model (van den Broek, Young, Tzeng & Linderholm 1999), the schema copy plus tag model (Graesser, Kassler, Kreuz & McLain-Allen 1998), the 3CAPS model (Goldman, Varma & Cote 1996), and the minimalist hypothesis (McKoon & Ratcliff 1992). Our scientific knowledge of inference generation and situation model construction has evolved from barren to overwhelming in just two decades. It is beyond the scope of this chapter to introduce and clarify all of the controversies associated with inference research. Our objective is more modest. We want to show how world knowledge plays a central role in the mechanisms that construct inferences and the relations that bind text constituents. Any model will be a dismal failure if it confines its analysis to linguistics and the surface cues in the explicit text. Our chapter is divided into three sections. First, we describe Graesser's constructionist theory of inference generation (Graesser et al. 1994). This theory offers discriminating predictions about what knowledge-based inferences are generated when readers construct a situational model for a text. Second, we briefly describe a three-pronged method for investigating inference generation. This method is needed to provide a rigorous scientific investigation of inference generation. Third, we present a catalogue of relations that are used to connect text constituents and the conceptual entities in world knowledge structures.
The catalogue of relations is based on analyses of world knowledge, but the constraints of world knowledge are to some extent manifested in language and rhetoric. The underlying assumption is that world knowledge is sufficiently constrained and systematic that it can be incisively integrated with theories of language processing that have traditionally been confined to the lexicon, syntax, and semantics (see Tomasello 1998).
1. A constructionist theory of inference generation
Many inferences are the same when comprehending a sequence of events via text versus film versus observation of the real world. For example, consider the following cryptic sequence of descriptions.
(1) A diner. A couple sits down at a table. The young woman has a distressed look on her face. She takes a letter out of her purse. She slides it to the young man. The young man lifts up the letter. He reads it. Soon he is stunned. He sits motionless. The tears start to fall. The young woman gets up from the table. She stares at the floor. She leaves.
The 13 verbal descriptions could easily be captured by 13 corresponding film clips without appreciably changing the meaning. There would be appropriate camera angles and distances for capturing the sequence of scenes, actions, and emotional expressions. Similarly, a customer at a nearby table could witness the same sequence of 13 'observations' without appreciably changing the meaning. When comprehending "A couple sits down at the table," comprehenders infer the 'superordinate goals' (motives) of getting food and having a conversation. When comprehending "She slides the letter to the young man", comprehenders infer that the woman has the superordinate goal of getting the man to read the letter. When comprehending "Soon he is stunned", comprehenders infer the 'causal antecedent' state that the letter has disappointing news. This inference is verified further and strengthened when comprehending the subsequent two sentences ("He sits motionless," "Tears start to fall"). Most comprehenders infer the plausible causal antecedent event that the couple breaks up, at or before "She leaves." It would appear, therefore, that the generation of these knowledge-based inferences is governed by mechanisms that transcend the medium; the same inferences occur while reading text, watching film, and observing the real world. In a nutshell, the medium is not the message. We could speculate on the minimum set of statements/clips/observations that would be needed to convey the break-up scenario. The following four might work: The young woman has a distressed look on her face, the young man reads the letter, tears (of the young man) start to fall, and the young woman leaves.
The following four would definitely not work: A diner, a couple sits down at the table, the young man lifts up the letter, and the young woman gets up from the table. The minimal set would be the same for text, film, and the world. If we were to rate the statements on importance to the break-up scenario, the ratings would correlate very highly among the text, the film, and the world. So once again, the medium is not the message. Nevertheless, there are some nontrivial differences between text, film, and the real world. These differences have repercussions on the sort of inferences that would be anticipated in these three modes. A large amount of information
is perceptually available to the viewer in the case of films and the real world. The viewer directly observes the spatial layout of the environment, visual features of objects and people, the actions that are performed by people, the manner in which actions are performed, and events that are visually prominent. The viewer does not have to generate inferences about this information that has strong visual support. However, the viewer does need to generate inferences that explain the visible actions, events, and states; these inferences include the superordinate goals, causal antecedents, emotions, and perhaps traits of people. These inferences are normally invisible, but they do explain what is seen. The theory of inference generation that we will be advocating, called the 'constructionist theory' (Graesser et al. 1994; Graesser & Zwaan 1995; Graesser & Wiemer-Hastings 1999), assumes that these 'explanation-based' inferences are routinely generated while comprehending information in all modes: text, film, and the world. An adult comprehender has had years of experience (16 hours per day) generating explanation-based inferences, so these strategies of generating inferences are well weathered and automatized in the cognitive system. It is very adaptive for a person to construct these explanation-based inferences because they are associated with the achievement of goals, the causes and consequences of obstacles, and survival in the social and physical world (Mooney 1990; Noordman & Vonk 1998; Schank 1986). The constructionist theory offers discriminating predictions about the classes of inferences that adults make when they comprehend text. Some classes of inferences are made routinely and quickly, whereas others are made sporadically or are entirely missed unless the reader has an idiosyncratic goal to focus on a particular dimension of knowledge.
When adults comprehend text, for example, they routinely make the explanation-based inferences, but they do not consistently generate many other categories of inferences, such as the spatial layout, visual features of objects and people, and the style or method of performing actions. The latter information receives strong visual support when films are viewed and the real world is observed, so the inference strategies during text comprehension are minimally developed in these arenas. According to the constructionist theory, adult readers generate the explanation-based inferences consistently and quickly, whereas the construction of the other classes of inferences mentioned above (called 'elaborative' inferences) is much more variable and time-consuming. For example, it would take several minutes and considerable cognitive resources to construct a mental map of a region with people and objects distributed in a particular layout. Therefore, it
is unlikely that a detailed mental map is constructed while reading a text at the normal rate of 250-400 words per minute. Yet the mind can perceive a spatial layout in less than a second when viewing a film or the real world. It is important to acknowledge that the predictions of the constructionist theory are not equivalent to alternative theories of inference generation in discourse psychology. For example, McKoon & Ratcliff's (1992) minimalist hypothesis does not predict that explanation-based inferences are routinely made. The minimalist hypothesis instead predicts that the only inferences that readers routinely and quickly generate are those that are readily available in working memory and that are needed to establish local text coherence. Early theories of mental models (Glenberg, Meyer & Lindem 1987; Johnson-Laird 1983) predicted that readers construct spatial inferences, whereas the constructionist theory predicts that it is too time-consuming to construct a rich spatial situation model during normal reading. Graesser et al. (1994) review the evidence for the predictions of the constructionist theory and how its predictions are different from alternative theories, models, and hypotheses in discourse psychology. However, it is beyond the scope of this chapter to compare and contrast the various theoretical predictions. Text, film, and the world differ on another important dimension. Texts and films are intentionally created by humans in an effort to convey messages to the comprehender. The information must be delivered in a coherent fashion that conveys the message (e.g., point, moral, macrostructure). In the case of text, the pragmatic communicators of the message are the writer and the narrator. In the case of film, there is the script writer, the director, and the camera-operator. However, this pragmatic 'message level' is absent in the real world. People in the real world do not enact scripts in the service of messages to viewers.
They just live their lives, and much of life is rather uneventful. Thus, the content of texts and films is intentionally and coherently constructed in the service of a message, whereas these constraints are absent in the real world. According to the constructionist theory of inference generation, comprehenders of text attempt to construct a meaning representation and supporting inferences in a fashion that achieves 'coherence' at local and global levels. Readers generate inferences that fill gaps in the main messages. They generate inferences that explain why the writer bothers to mention something that otherwise would be insignificant (e.g., an 'out of the blue' clue in a mystery novel). This attempt to construct local and global coherence is appropriate when comprehenders process text, film, and other communication artifacts. Such
mechanisms are strategic in the sense that the reader invokes cognitive procedures deliberately, systematically, and continuously. In contrast, a search for coherence is not an appropriate strategy when viewing life's activities because much of life does not unfold in coherent packages. Consequently, adults have much less experience generating coherence-based inferences; such strategies are acquired only when they read text and view films. Coherence-based inferences may not be routinely made during comprehension by readers who rarely read texts and view films because they lack the experience needed to overlearn such comprehension strategies. Adults who read infrequently may have trouble constructing inferences that address the global message of a text and the pragmatic level (e.g., the motives and attitudes of the writer). The constructionist theory stipulates that readers generate two other classes of inferences. These inferences are difficult to predict ahead of time because they are contingent on the detailed composition of the text and the mental state of the reader. 'Passive-activation' inferences are encoded because they are activated and reactivated by multiple sources of information (e.g., words, propositions, contents of working memory, scripts, global macrostructures), or are strongly activated by one important information source. There are several psychological models that make specific predictions about what inferences are activated and encoded through passive activation mechanisms, such as Kintsch's construction-integration model (1998), the resonance model (Myers & O'Brien 1998; Myers, O'Brien, Albrecht & Mason 1994), and the minimalist hypothesis (McKoon & Ratcliff 1992). 'Reader-goal' inferences are motivated by the idiosyncratic goals that the reader has while reading the text. For example, if the reader has the goal of tracking the personality of one of the characters, then there will be many inferences about that character's personality.
If the reader wants to trace the spatial layout of a room in a short story, then relevant spatial inferences are generated. There is ample experimental evidence that readers generate spatial inferences if the instructions or the experimental task encourage them to construct a spatial mental model (Glenberg et al. 1987; Morrow, Greenspan & Bower 1987; Rinck, Williams, Bower & Becker 1996), whereas these spatial inferences are rarely constructed when naturalistic stories are comprehended without goals that monitor spatial processing (Zwaan & Van Oostendorp 1993). The constructionist theory stipulates that many other classes of inferences are not routinely constructed during comprehension. For example, readers do not normally generate the 'logic-based' inferences that are derived from most
of the well-formed rules in a syllogistic reasoning task, such as modus tollens [i.e., (A implies B) and (not B), therefore (not A)] and DeMorgan's rule. It takes time (measured in minutes) and an external memory (such as truth tables or Venn diagrams) to construct such analytical inferences, so they will not be made at a reading rate of 250-400 words per minute. Humans apparently can handle modus ponens [i.e., (A implies B) and A, therefore B] because this rule is easy on the cognitive system; the expressions do not involve negation and the inferences can be produced by pattern-triggered associative operations. Readers do not generate most 'statistics-based' inferences during comprehension because such inferences would take minutes or hours to construct, even by those with high expertise in statistics. Some readers can accommodate simple statistical inferences, such as averages or ranges on a small set of numbers. Readers are not very good at generating 'causal consequence' inferences (or expectations) that forecast what should happen many steps into the future. Readers can sometimes predict what will immediately occur after a text event, particularly if the context is so constrained that there are only one or two likely consequences. However, most expectations end up getting disconfirmed by future occurrences, particularly in a dynamic world. The picture that we have sketched so far suggests that adult readers routinely generate only a small set of inferences, rather than promiscuously generating many classes of inferences. Readers routinely generate inferences that involve explanations, passive-activations, readers' goals, and coherence (the latter perhaps being confined to readers who have high reading fluency). Readers do not tend to generate inferences that involve detailed elaborations (e.g., spatial layout, visual features of objects and people, the manner in which actions and events occur), distant causal consequences, logical syllogisms, and statistics.
Instead of assuming that "anything goes", there are constraints from cognition, the world environment, and perhaps biology that limit the classes of inferences that are generated. The constructionist theory offers one position that specifies what inferences are most likely to be generated and what aren't.
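The two logical rules mentioned above, modus ponens and modus tollens, can be checked mechanically with the kind of 'external memory' the text refers to, namely a truth table. The following Python fragment is only an illustrative sketch: the function names are hypothetical, and the third argument form (affirming the consequent) is not discussed in the chapter and is added here purely as a contrast case of an invalid form. The sketch says nothing about cognitive difficulty; it only shows that both rules are equally valid on paper, even though readers reportedly handle them very differently.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: 'a implies b' fails only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion) -> bool:
    """An argument form is valid if the conclusion holds in every truth-table
    row in which all of the premises hold."""
    for a, b in product([True, False], repeat=2):
        if all(p(a, b) for p in premises) and not conclusion(a, b):
            return False
    return True


# Modus ponens: (A implies B) and A, therefore B.
modus_ponens = is_valid(
    premises=[lambda a, b: implies(a, b), lambda a, b: a],
    conclusion=lambda a, b: b,
)

# Modus tollens: (A implies B) and (not B), therefore (not A).
modus_tollens = is_valid(
    premises=[lambda a, b: implies(a, b), lambda a, b: not b],
    conclusion=lambda a, b: not a,
)

# Contrast case added for illustration (not from the chapter):
# (A implies B) and B, therefore A. This form is invalid.
affirming_consequent = is_valid(
    premises=[lambda a, b: implies(a, b), lambda a, b: b],
    conclusion=lambda a, b: a,
)

print(modus_ponens, modus_tollens, affirming_consequent)
```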
2. A three-pronged method of investigating inference generation
Inferences are not directly manifested in the text, so there needs to be a method of exposing the inferences and testing whether they are generated during normal reading. A three-pronged method has been advocated for studies of
inference generation (Graesser et al. 1994; Magliano & Graesser 1991; Suh & Trabasso 1993; Trabasso & Magliano 1996). The three prongs are (1) theoretical predictions, (2) collection of verbal protocols, and (3) collection of on-line behavioral measures.
2.1 Theoretical predictions
This prong simply asserts that it is important to identify what classes of inferences are predicted to be generated by the various theoretical models. For example, the constructionist theory would predict that the explanation-based inferences are routinely generated, but not the detailed elaborative inferences. In contrast, McKoon and Ratcliff's (1992) minimalist hypothesis would predict that neither of these classes of inferences is routinely generated. As mentioned above, there is no lack of theories and models in discourse psychology to test.
2.2 Verbal protocols
Verbal protocols are collected from a normative group of readers as they comprehend the text, sentence by sentence. These verbal protocols expose the inferences that surface to consciousness when the reader has the luxury of thoughtful reflection on the text. If an inference appears in the verbal protocols, then the experimenter is on safe grounds in arguing that the readers have a sufficient amount of background knowledge to produce the inference. There is a methodological danger in experimenters generating their own inference test items and presuming that readers have the knowledge to generate such inferences. The most common form of verbal protocol is the 'think aloud' protocol. Readers think aloud and express whatever comes to mind as they comprehend the text, sentence by sentence (Cote, Goldman & Saul 1998; Trabasso & Magliano 1996; Zwaan & Brown 1996). In question answering tasks, the readers answer particular questions about each sentence, such as why, how, and what-happens-next (Graesser & Clark 1985). In question asking tasks, the readers ask questions that come to mind about each sentence (Olson, Duffy & Mack 1985). All of these verbal protocol techniques expose the inferences that readers make with minimal time constraints. Analyses of think aloud protocols have confirmed some of the predictions of the constructionist theory. Most of the content of the think aloud protocols consists of explanations, whereas causal consequences and elaborations are considerably less frequent. This pattern of results has occurred for simple
children's stories (Trabasso & Magliano 1996), literary stories for adults (Zwaan & Brown 1996), and expository text (Cote et al. 1998). As predicted by the constructionist theory, the inferences that rise to consciousness should include superordinate goals of characters, causal antecedents, and other forms of explanations, whereas there should be a lower incidence of elaborative associations and distant causal consequences (i.e., expectations, forecasts, predictions). Verbal protocol analyses are also used to expose the content of the information sources that supply the inferences. In essence, verbal protocols are used to perform knowledge extraction (or knowledge engineering) on the packages of generic world knowledge that supply the inferences in the situation model. Many of the knowledge-based inferences are inherited from the word concepts in the explicit text, particularly the nouns, verbs, and adjectives. According to the estimate of Graesser and Clark (1985), approximately 63% of the knowledge-based inferences in stories are inherited from the word concepts. An additional 9% of the inferences are inherited from global concepts that are related to the story, but are not explicitly stated (such as the concepts of fairytale and conflict). The remaining 28% of the story inferences are 'novel-situational' inferences; the content of these inferences did not match any information units in the word concepts or the global concepts. Graesser and Clark's analysis of the verbal protocols was analogous to research on the phenomenon of conceptual combination. The basic challenge in this research area is to explain the extent to which the representation of a combined concept (e.g., "lamp oil") can be derived from the constituent word concepts ("lamp" and "oil") (Hampton 1987; Rips 1995; Wisniewski 1997). One straightforward approach to investigating conceptual combination is to have participants list attributes of concept A, concept B, and the combined concept AB.
To what extent do the attributes listed in the combined concept AB match the attributes listed in A and B? What sort of attributes in a generic concept (such as "oil") end up appearing in the attributes of diverse concept combinations (such as "corn oil" versus "lamp oil" versus "baby oil")? What are the emergent attributes in combination AB that cannot be inherited from either concept A or concept B? It is beyond the scope of this chapter to discuss the principles and theories of conceptual combination. The point we wish to emphasize is that an analysis of these verbal protocols provides a rich data base for inducing the mechanisms of conceptual combination and for testing models that make distinctive predictions. Moreover, Graesser and Clark (1985) adopted a similar approach by using verbal protocols to explore the inferences
that are inherited from generic concepts versus those that are novel (emergent) in the situation model.
2.3 On-line behavioral measures
The comprehender has the luxuries of time and thoughtful reflection when verbal protocols are collected. Just because an inference appears in a think aloud protocol does not necessarily mean the inference is encoded during normal comprehension. Readers do not have the benefits of reflection and time while reading at a rate of 250-400 words per minute. Therefore, some of the inferences exposed by the verbal protocols may not be generated on-line during normal reading. Conversely, some of the inferences encoded during normal reading may not appear in the verbal protocols; such inferences may be difficult to express in words or may not be amenable to conscious introspection. Verbal protocols do not provide a perfect window to the inferences that get constructed during normal reading, so it is important to collect on-line behavioral measures in rigorous tests of an inference theory. Discourse psychologists have explored a large number of measures and tasks that tap on-line comprehension processes and inference generation (Graesser, Millis & Zwaan 1997; Haberlandt 1994). For example, the typical dependent measures include self-paced reading times for text segments (e.g., words, clauses, sentences), fixation times on words during eye-tracking, lexical decision latencies on test words (i.e., whether a test string is a word or a nonword), naming latencies on test words, latencies to verify whether a test sentence is true or false, latencies to decide whether a test segment has been presented earlier (i.e., recognition memory), and speeded recognition judgments under a deadline procedure. All of these methods have been used in our investigations of inference generation, but most of our studies have investigated sentence reading times (Graesser & Bertus 1998), lexical decision latencies (Long, Golding & Graesser 1992; Magliano, Baggett, Johnson & Graesser 1993; Millis & Graesser 1994) and word naming latencies (Long et al. 1992).
The results of these studies have been compatible with the results of the think aloud studies and the constructionist theory: the explanation-based inferences are encoded more quickly and strongly during comprehension than are distant causal consequences and frivolous elaborative inferences. One line of research will be described in order to illustrate the careful experimental control that was imposed on some tests of the constructionist theory. After each sentence in a text was read, a test string appeared (cued with
asterisks or in some other fashion). Some of the test strings matched a distinctive noun, verb, or adjective in the inference under consideration. For example, "eat" would be the inference test word from the inference "the couple wanted to eat." This inference would be theoretically predicted while reading the second sentence of the example break-up scenario ("A couple sits down at a table."). Test words were prepared for inferences in different theoretical categories, such as superordinate goals, subordinate actions, causal antecedents, causal consequences, and elaborative states. The samples of these test words were equilibrated on measures derived from verbal protocols that were collected from other groups of participants. For example, if 25% of the participants had expressed the superordinate goal inferences in the verbal protocols, then 25% would also have expressed the subordinate actions. Similarly, the samples of test words were equilibrated on the likelihood that they matched information units from the word concepts that appeared in the explicit sentences. For example, consider the word concepts that appeared in the text up through the second sentence in the break-up scenario: diner, couple, sit, and table. Knowledge extraction methods had been used (from previous groups of participants) to expose the content of each of these generic word concepts. The test inferences in the different classes were equilibrated on the likelihood that they matched content in the four word concepts. There were additional experimental controls in these studies. We imposed careful control over comprehension time by using a rapid serial visual presentation task. The sequence of words in the sentence is presented for a short duration (250-500 milliseconds per word): A, couple, sits, down, at, a, table.
The time-course of inference activation is traced by presenting the inference test word ("eat") at specific durations after the end of the sentence (e.g., 150, 500, 1000, versus 2000 milliseconds). An inference is assumed to be encoded if its test word has a comparatively short latency and this facilitation is sustained over time (rather than quickly decaying). An inference is assumed to be activated if its latency in the inference context is shorter than the same word in the context of another text. Thus, an inference activation score is measured as [latency (unrelated context) - latency (inference context)]. These well controlled experiments supported the predictions of the constructionist theory. That is, the explanation-based inferences had higher inference activation scores than did the distant causal consequences and the elaborative inferences (Long et al. 1992; Magliano et al. 1993; Millis & Graesser 1994). The results of the experiments that collected on-line behavioral measures were therefore compatible with the verbal protocol studies.
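The inference activation score defined above can be illustrated with a short computation. The Python sketch below is hypothetical in several respects: the latencies are invented for illustration and are not data from any of the cited studies, averaging over trials is an assumption about how latencies would be aggregated, and the rule used for 'sustained' facilitation (a positive score at every delay) is only one possible reading of the encoding criterion. The delays of 150, 500, 1000, and 2000 milliseconds are those mentioned in the text.

```python
from statistics import mean


def activation_score(unrelated_ms, inference_ms):
    """Inference activation score:
    mean latency (unrelated context) - mean latency (inference context).
    A positive score means the test word was faster in the inference context."""
    return mean(unrelated_ms) - mean(inference_ms)


def is_sustained(scores_by_delay, min_facilitation_ms=0):
    """Assumed criterion: facilitation exceeds the threshold at every delay."""
    return all(score > min_facilitation_ms for score in scores_by_delay.values())


# Invented latencies (ms) for the test word "eat", keyed by the delay (ms)
# between the end of the sentence and the test word. Not real data.
hypothetical_latencies = {
    150: {"unrelated": [650, 670], "inference": [620, 640]},
    500: {"unrelated": [655, 665], "inference": [615, 625]},
    1000: {"unrelated": [650, 660], "inference": [620, 630]},
    2000: {"unrelated": [660, 660], "inference": [630, 640]},
}

scores = {
    delay: activation_score(cell["unrelated"], cell["inference"])
    for delay, cell in hypothetical_latencies.items()
}

for delay, score in scores.items():
    print(f"delay {delay} ms: activation score {score} ms")
print("sustained facilitation:", is_sustained(scores))
```

Under these invented numbers the score stays positive at all four delays, which is the pattern the text associates with an encoded inference; a score that starts positive and falls to zero would instead suggest activation that quickly decays.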
3. A catalogue of relations based on world knowledge
According to the constructionist theory, readers attempt to achieve local and global coherence when they comprehend text. The coherence is sometimes driven by explicit features of the text, such as anaphoric references, connectives, transitional phrases, rhetorical predicates, and signaling devices. However, sometimes the coherence relations are constructed inferentially. Good readers are able to infer the appropriate coherence relations that bind text constituents. It is important to acknowledge that there is no guarantee that coherent text representations are constructed because the process is contingent on the reader's judgment that the author intended to construct a coherent message. All bets are off if the text is so disjoint and inconsiderate that the reader gives up trying to construct a coherent message. However, most naturalistic texts have some semblance of coherence and most readers do make the effort to achieve a coherent construction. Coherence relations have been extensively analyzed in the fields of text linguistics and discourse processes (for example, consider the chapters in the present volume by Sanders & Spooren, by Knott, by Pander Maat, by Noordman, and by Snoeck Henkemans). Researchers in these fields have proposed taxonomies of coherence relations that allegedly are needed to explain the structure and processing of oral and written discourse (Givon 1993; Halliday & Hasan 1976; Mann & Thompson 1986; Sanders 1997; Sanders, Spooren & Noordman 1992). For example, a relatively small set of coherence relations appears to underlie the connectives that explicitly occur in texts. There are connectives that signify 'temporality' (e.g., before, after, during, and, then), 'causality' (because, so), 'intentionality' (in order to, for the purpose of), 'opposition' (but, however, on the other hand), 'logical implication' (therefore, thus), and so on. 
A text is regarded as coherent to the extent that its explicit statements can be connected to each other conceptually. Local coherence is achieved if the reader can connect the incoming sentence to information in the previous sentence or to the content in working memory. Global coherence is achieved if the incoming sentence can be connected to the text macrostructure (i.e., major message or point) or to information much earlier in the text that no longer resides in working memory. Available research in discourse psychology suggests that readers attempt to achieve coherence at both local and global levels (Albrecht & O'Brien 1993; Myers et al. 1994; Singer, Graesser & Trabasso 1994), although there is some debate about the consistency with which global coherence is achieved.
The establishment of text coherence is sometimes facilitated when there are explicit connectives that signal how text constituents should be connected (Britton & Gulgoz 1991; Deaton & Gernsbacher, in press; Millis & Just 1994). However, explicit textual cues that signal coherence relations are not always necessary for the establishment of conceptual coherence because these links can sometimes be filled in inferentially during the construction of the situation model. The reader requires fewer explicit cues to the extent that there is ample world knowledge about the content of the sentences. Zwaan has proposed an 'event indexing' model that accounts for the reader's construction of a multithreaded situation model while reading stories (Zwaan, Magliano & Graesser 1995; Zwaan et al. 1995; Zwaan & Radvansky 1998). This model assumes that the reader monitors five conceptual dimensions: protagonists, temporality, spatiality, causality, and intentionality. A break in continuity may exist on any of these dimensions when an incoming sentence is read. Protagonist discontinuity exists when an incoming event E has a character that is different from the characters in the previous sentence P. A temporal discontinuity exists when event E occurs much later in time than P, or is part of a flashback. A spatial discontinuity exists when event E occurs in a spatial setting that is different from P. A causal discontinuity exists when event E is not causally related to P. An intentional discontinuity exists when event E is part of a character's plan that is different from any plans associated with P. Research by Zwaan supports the claim that reading times for event E increase as a function of the number of dimensional discontinuities that exist. Also, a discontinuity on any one of these dimensions significantly increases reading time. World knowledge plays a central role during the process of establishing continuities on these dimensions, as well as other types of coherence relations.
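The five discontinuity checks just described can be written out as a small sketch. Everything in the Python fragment below is an illustrative assumption rather than part of the event indexing model itself: the `Event` fields, the integer coding of story time, the `max_time_gap` threshold for 'much later', and the hand-coded values for two sentences of the break-up scenario. The third event (a waiter clearing the table much later) is invented and does not occur in the example text.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Event:
    """Hand-coded description of one story event (hypothetical encoding)."""
    protagonists: FrozenSet[str]   # characters involved in the event
    time_step: int                 # coarse position in story time
    setting: str                   # spatial setting
    caused_by_previous: bool       # is the event causally related to P?
    plan: Optional[str]            # plan the event belongs to, if any


def discontinuities(prev: Event, cur: Event, max_time_gap: int = 1) -> Dict[str, bool]:
    """Flag a break in continuity on each of the five monitored dimensions."""
    gap = cur.time_step - prev.time_step
    return {
        "protagonist": not (cur.protagonists <= prev.protagonists),
        "temporality": gap > max_time_gap or gap < 0,  # much later, or a flashback
        "spatiality": cur.setting != prev.setting,
        "causality": not cur.caused_by_previous,
        "intentionality": cur.plan is not None and cur.plan != prev.plan,
    }


def discontinuity_count(prev: Event, cur: Event) -> int:
    """Number of dimensions on which the incoming event breaks continuity."""
    return sum(discontinuities(prev, cur).values())


# "She slides it to the young man." followed by "The young man lifts up the letter."
slides = Event(frozenset({"woman", "man"}), 5, "diner", True, "man reads letter")
lifts = Event(frozenset({"man"}), 6, "diner", True, "man reads letter")

# Invented continuation, not in the example text: a waiter clears the table much later.
clears = Event(frozenset({"waiter"}), 60, "diner", False, "clear the table")

print(discontinuity_count(slides, lifts))
print(discontinuity_count(lifts, clears))
```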
Consequently, we believe that it is worthwhile to take stock of the relations that have been identified by researchers who investigate the representation of world knowledge. In addition to the coherence relations that have been identified in text linguistics and discourse processes, the fields of artificial intelligence and cognitive science have devoted 30 years of research investigating the conceptual structures that represent world knowledge (Graesser & Clark 1985; Lehmann 1992; Schank & Reisbeck 1981). These conceptual structures consist of a set of nodes (concepts, states, events, actions, goals) that are connected by relational arcs (IS-A, HAS-AS-PARTS, CAUSE, REASON, etc.). The relations that exist in world knowledge structures presumably are very relevant to the coherence relations that connect text constituents.
Arthur C. Graesser, Peter Wiemer-Hastings, and Katja Wiemer-Hastings
We recently developed a catalogue of relations in a project funded by the Office of Naval Research. This relation catalogue is presented in Table 1 in the Appendix. The objective of the research project was to build a computer tool that guides the elicitation of knowledge from expert tactical planners in the military (Williams, Hultman & Graesser 1998). This knowledge is extracted during the process of building expert systems. There is a broad landscape of knowledge in tactical planning: planning networks, causal networks, conflict scenarios, organizational hierarchies, friend-foe networks, taxonomic hierarchies, spatial structures, visual descriptions, and logical structures. Although the project focused on tactical planning, the catalogue of relations was expected to adequately cover a large spectrum of knowledge domains. If a researcher wanted to be extremely precise, there would be literally thousands of relation categories that capture very subtle semantic distinctions. However, a precise analysis runs the risk of being so subtle that a knowledge engineer or discourse analyst would have trouble remembering and applying it; there would be low agreement between a pair of researchers as to whether a particular relation is relevant. It is widely acknowledged that interjudge agreement decreases when there is a large number of categories and the theoretical distinctions are extremely fine-grained (D'Andrade & Wish 1985). At the other end of the continuum, there are liabilities to having a very small number of relation categories; the set of categories would be so crude that the judges end up glossing over critical theoretical distinctions. The relation categories in Table 1 approach an optimal point in this tradeoff. It is possible to obtain satisfactory interjudge agreement by researchers trained to use these categories.
At the same time, the categories are functionally useful in computational models of question answering, summarization, recall, and planning (Graesser & Clark 1985; Graesser, Gordon & Brainerd 1992). The relation catalogue in Table 1 contains 22 basic relations altogether. For each relation, there is a definition, a composition rule, and an example. For some of these basic relations, there are additional relations that are synonyms, inverses, negations, and subtypes. Therefore, this scheme could be expanded to over 100 relations after considering the synonyms, inverses, subtypes, and negations of the 22 basic relations. The catalogue accommodates all of the relations that are included in Graesser, Gordon, and Brainerd's (1992) analysis of world knowledge and most of the relations that were reported by the 35 contributors to Lehmann's edited volume Semantic Networks in Artificial Intelligence (1992). We did not include relations that are involved in sentence syntax
and in case structure thematic roles (e.g., agent, object, recipient, etc.). A large subset of the relations in Table 1 are also relations that exist in WordNet (Miller 1990). WordNet is a large lexicon of nouns, verbs, and adjectives, which contains syntactic and semantic features that reflect the use of words in language.
Each of the relations in Table 1 connects two nodes. Each node is assigned to one of five categories:
Concept (C): A noun-like concept, such as "captain," "ship," and "battle."
State (S): An ongoing characteristic that remains unchanged within a relevant time frame, such as "the ship has missiles" and "the water is salty."
Event (E): A state change that occurs within a relevant time frame, such as "the ship sank" and "the fleet threatened the enemy."
Goal (G): A state or event that an agent wants to achieve, such as "the captain wants the ship to reach the port" and "the captain wants to communicate with the enemy."
Style (Sy): The qualitative manner or intensity in which an event unfolds, such as the ship moved "in a zigzag path" and the missile moved "slowly."
It should be noted that the 'intentional actions' of agents are not primitive node categories in this analysis. Instead, they are amalgamations of goal nodes linked to states, events, or style specifications, which signify a positive outcome. Therefore, the action "the captain sank the ship" is an amalgamation of two nodes:
(2) (G: captain wanted to sink the ship) -OUTCOME→ (E: the ship sank).
There is another constraint that the captain executed some plan that led to the sinking ship. Our analysis segregates goals from the events in the world because it is very important in planning to differentiate plans (which may or may not be implemented) from events that actually occur in the world.
Each relation has a 'composition rule', which specifies the node categories that can be linked by a particular type of relation. For example, the IS-A relation can link only concept nodes, whereas the REASON relation can link only goal nodes. "Any" signifies that any of the five node categories may apply. Braces {} signify that a set of node categories may occur. For example, {E | S | Sy} signifies that a node may be either an event, a state, or a style specification.
The position that we are advocating is that the relations in this catalogue
play an important role in connecting explicit text constituents and in building situation models during comprehension. The relations in Table 1 appear in the structures that result when coherent meaning representations are built. Regarding inferences, some relations play a more important role than others. According to the constructionist model, readers attempt to explain why events, actions, and states occur, so the important conceptual relations to monitor during inference generation are CAUSE, INITIATES, OUTCOME, REASON, and TEMPORAL. In contrast, relations such as HAS-AS-PART, MANNER, PROPERTY, QUANTITY, SPATIAL, and SUBPROCESS are not as prevalent in the inference mechanisms. An analysis of world knowledge is needed in order to build a complete psychological theory of how humans build coherent messages. Instead of starting with text and language, and asking what text connections are explicitly articulated, we start with world knowledge and ask what relations are prevalent when we make sense of the world. Instead of specifying what coherence relations are needed to bind text constituents, one can inquire what coherence relations are needed in order to build a computational model that performs a variety of processing tasks (such as question answering, recall, summarization, and planning). Of course, what ultimately is desired is a theory of comprehension that specifies how the meaning representations are constructed on the basis of both world knowledge and the surface linguistic cues. A productive direction for future research would be to specify detailed mappings among (a) surface linguistic cues, (b) world knowledge structures, and (c) cognitive processes. The contributors to this volume have explored the cross-fertilization of research in linguistics and psychology. This interdisciplinary effort is difficult. There is a strong complacent inclination for the fields to remain isolated and insulated from each other. Nevertheless, the efforts will hopefully yield fresh insights. If our claims in this chapter are on the mark, it will also be worthwhile to add computational linguistics and artificial intelligence to the fold.
Note
* Correspondence concerning this chapter should be addressed to Arthur C. Graesser, Department of Psychology, Campus Box 526400, The University of Memphis, Memphis, TN 38152-6400, [email protected]. This research was partially funded by grants to the first author by the Office of Naval Research (N00014-95-1-1113 and N00014-98-1-0331) and the National Science Foundation (SBR 9720314).
References
Albrecht, J. E., & O'Brien, E. J. (1993). Updating a mental model: Maintaining both local and global coherence. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 1061-1070.
Britton, B. K., & Gulgoz, S. (1991). Using Kintsch's computational model to improve instructional text: Effects of repairing inference calls on recall and cognitive structures. Journal of Educational Psychology, 83, 329-404.
Cote, N., Goldman, S. R., & Saul, E. U. (1998). Students making sense of informational text: Relations between processing and representations. Discourse Processes, 25, 154.
D'Andrade, R. G., & Wish, M. (1985). Speech act theory in quantitative research on interpersonal behavior. Discourse Processes, 8, 229-259.
Deaton, J. A., & Gernsbacher, M. A. (in press). Causal conjunctions and implicit causality cue mapping in sentence comprehension. Journal of Memory and Language.
Gernsbacher, M. A. (1997). Two decades of structure building. Discourse Processes, 23, 265-304.
Givon, T. (1993). Coherence in text, coherence in mind. Pragmatics & Cognition, 1, 171-227.
Glenberg, A. M., Meyer, M., & Lindem, K. (1987). Mental models contribute to foregrounding during text comprehension. Journal of Memory and Language, 26, 69-83.
Goldman, S. R., Varma, S., & Cote, N. (1996). Extending capacity-constrained construction-integration: Toward "smarter" and flexible models of text comprehension. In B. K. Britton & A. C. Graesser (Eds.), Models of understanding text (pp. 73-114). Mahwah, NJ: Erlbaum.
Graesser, A. C., & Bertus, E. (1998). The construction of causal inferences while reading expository texts on science and technology. Journal of the Scientific Studies of Reading, 2, 247-269.
Graesser, A. C., Gordon, S., & Brainerd, L. E. (1992). QUEST: A model of question answering. Computers & Mathematics with Applications, 23, 733-1992.
Graesser, A. C., & Clark, L. C. (1985). Structures and procedures of implicit knowledge. Norwood, NJ: Ablex.
Graesser, A. C., Kassler, M. A., Kreuz, R. J., & McLain-Allen, B. (1998). Verification of statements about story worlds that deviate from normal conceptions of time: What is true about Einstein's Dreams? Cognitive Psychology, 35, 246-301.
Graesser, A. C., Millis, K. K., & Zwaan, R. A. (1997). Discourse comprehension. Annual Review of Psychology, 48, 163-89.
Graesser, A. C., Singer, M., & Trabasso, T. (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101, 371-395.
Graesser, A. C., & Wiemer-Hastings, K. (1999). Situation models and concepts in story comprehension. In S. R. Goldman, A. C. Graesser, & P. van den Broek (Eds.), Narrative comprehension, causality, and coherence (pp. 77-92). Mahwah, NJ: Erlbaum.
Graesser, A. C., & Zwaan, R. A. (1995). Inference generation and the construction of situation models. In C. A. Weaver, S. Mannes, & C. R. Fletcher (Eds.), Discourse comprehension: Strategies and processing revisited (pp. 117-139). Hillsdale, NJ: Erlbaum.
Haberlandt, K. (1994). Methods in reading research. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 1-31). New York: Academic Press.
Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. London: Longmans.
Hampton, J. A. (1987). Inheritance of attributes in natural concept conjunctions. Memory & Cognition, 15, 55-71.
Johnson-Laird, P. N. (1983). Mental models. Cambridge, MA: Harvard University Press.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. Cambridge, MA: Cambridge University Press.
Lehmann, F. (Ed.). (1992). Semantic networks in artificial intelligence. Oxford, England: Pergamon Press.
Long, D. L., Golding, J. M., & Graesser, A. C. (1992). Test of the on-line status of goal-related inferences. Journal of Memory and Language, 31, 634-647.
Magliano, J. P., & Graesser, A. C. (1991). A three-pronged method for studying inference generation in literary text. Poetics, 20, 193-232.
Magliano, J. P., Baggett, W. B., Johnson, B. K., & Graesser, A. C. (1993). The time course of generating causal antecedent and causal consequence inferences. Discourse Processes, 16, 35-53.
Mann, W. C., & Thompson, S. A. (1986). Relational propositions in discourse. Discourse Processes, 9, 57-90.
McKoon, G., & Ratcliff, R. (1992). Inference during reading. Psychological Review, 99, 440-466.
Miller, G. A. (1990). WordNet: An on-line lexical database. International Journal of Lexicography, 3.
Millis, K., & Graesser, A. C. (1994). The time-course of constructing knowledge-based inferences for scientific texts. Journal of Memory and Language, 33, 583-599.
Millis, K. K., & Just, M. A. (1994). The influence of connectives on sentence comprehension. Journal of Memory and Language, 33, 128-147.
Mooney, R. J. (1990). A general explanation-based learning mechanism and its application to narrative understanding. San Mateo, CA: Morgan Kaufman.
Morrow, D. G., Greenspan, S. L., & Bower, G. H. (1987). Accessibility and situation models in narrative comprehension. Journal of Memory and Language, 26, 165-87.
Myers, J. L., & O'Brien, E. J. (1998). Accessing the discourse representations during reading. Discourse Processes, 26, 131-157.
Myers, J. L., O'Brien, E. J., Albrecht, J. E., & Mason, R. A. (1994). Maintaining global coherence during reading. Journal of Experimental Psychology: Human Learning, Memory, and Cognition, 20, 876-86.
Noordman, L. G. M., & Vonk, W. (1998). Memory-based processing in understanding causal information. Discourse Processes, 26, 191-212.
O'Brien, E. J., Raney, G. E., Albrecht, J. E., & Rayner, K. (1997). Processes involved in the resolution of explicit anaphors. Discourse Processes, 23, 1-24.
Olson, G. M., Duffy, S. A., & Mack, R. L. (1985). Question asking as a component of text comprehension. In A. C. Graesser & J. B. Black (Eds.), The psychology of questions (pp. 219-226). Hillsdale, NJ: Erlbaum.
Rinck, M., Williams, P., Bower, G. H., & Becker, E. S. (1996). Spatial situation models and narrative understanding: Some generalizations and extensions. Discourse Processes, 21, 23-56.
Rips, L. J. (1995). The current status of research on conceptual combination. Mind and Language, 10, 72-104.
Sanders, T. J. M., Spooren, W. P. M., & Noordman, L. G. M. (1992). Toward a taxonomy of coherence relations. Discourse Processes, 15, 1-36.
Sanders, T. J. M. (1997). Semantic and pragmatic sources of coherence: On the categorization of coherence relations in context. Discourse Processes, 24, 119-148.
Schank, R. C. (1986). Explanation patterns: Understanding mechanically and creatively. Hillsdale, NJ: Erlbaum.
Schank, R. C., & Reisbeck, C. K. (1981). Inside computer understanding. Hillsdale, NJ: Erlbaum.
Singer, M., Graesser, A. C., & Trabasso, T. (1994). Minimal or global inference during reading. Journal of Memory and Language, 33, 421-441.
Suh, S. Y., & Trabasso, T. (1993). Inferences during reading: Converging evidence from discourse analysis, talk-aloud protocols, and recognition priming. Journal of Memory and Language, 32, 279-300.
Tomasello, M. (Ed.). (1998). The new psychology of language: Cognitive and functional approaches to language structure. Mahwah, NJ: Erlbaum.
Trabasso, T., & Magliano, J. P. (1996). Conscious understanding during comprehension. Discourse Processes, 21, 255-287.
Van den Broek, P., Young, M., Tzeng, Y., & Linderholm, T. (1999). The landscape model of reading: Inferences and the on-line construction of memory representations. In H. van Oostendorp & S. R. Goldman (Eds.), The construction of mental representations during reading. Mahwah, NJ: Erlbaum.
Williams, K. E., Hultman, E., & Graesser, A. C. (1998). CAT: A tool for eliciting knowledge on how to perform procedures. Behavior Research Methods, Instruments & Computers, 30, 565-572.
Wisniewski, E. J. (1997). When concepts combine. Psychonomic Bulletin and Review, 4, 167-183.
Zwaan, R. A., & Brown, C. M. (1996). The influence of language proficiency and comprehension skill on situation model construction. Discourse Processes, 21, 289-327.
Zwaan, R. A., Langston, M. C., & Graesser, A. C. (1995). The construction of situation models in narrative comprehension: An event-indexing model. Psychological Science, 6, 292-297.
Zwaan, R. A., Magliano, J. P., & Graesser, A. C. (1995). Dimensions of situation model construction in narrative comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 386-397.
Zwaan, R. A., & Radvansky, G. A. (1998). Situation models in language comprehension and memory. Psychological Bulletin, 123, 162-185.
Zwaan, R. A., & Van Oostendorp, H. (1993). Do readers construct spatial representations in naturalistic story comprehension? Discourse Processes, 16, 125-143.

Table 1.
Catalogue of Relations

(1) AND
Definition: Both A and B exist or occur.
Composition rule: {Any} ←AND→ {Any} [Note: The arc is bi-directional.]
Example: (C: ships) ←AND→ (C: planes)

(2) BECOMES
Definition: One concept changes into another concept.
Composition rule: (C) -BECOMES→ (C)
Example: (C: civilian) -BECOMES→ (C: enemy)

(3) CAUSE
Synonyms: CONSEQUENCE
Subtypes: DIRECT-CAUSE, ENABLES
Inverse: PRECONDITION, PREREQUISITE
Definition: A directly or indirectly causes or enables B. The beginning of A precedes the beginning of B.
Composition rule: {E | S | Sy} -CAUSE→ {E | S | Sy}
Example: (E: The plane flew too low) -CAUSE→ (E: Radar flagged the plane)

(4) CONTROLS
Synonym: SUPERVISES
Definition: A has direct control over the activities of B.
Composition rule: (C) -CONTROLS→ (C)
Example: (C: captain) -CONTROLS→ (C: lieutenant)

(5) DISABLES
Synonyms: BLOCKS
Definition: A stops or prevents B.
Composition rule: {G | E | S | Sy} -DISABLES→ {G | E | S | Sy}
Example: (S: ship has no fuel) -DISABLES→ (E: ship moves to island)

(6) FRIEND (and FOE)
Synonyms: LIKES, ALLY
Negation: FOE, ENEMY, DISLIKES
Definition: One animate entity is in an alliance with another animate entity.
Composition rule: (C) ←FRIEND→ (C) [Note: The arc is bi-directional.]
Example: (C: Iran) ←FOE→ (C: Iraq)

(7) HAS-AS-PART
Synonyms: HAS-COMPONENT
Inverses: IS-A-PART-OF
Definition: A has a part or component B.
Composition rule: (C) -HAS-AS-PART→ (C)
Example: (C: The U.S.S. Nimitz) -HAS-AS-PART→ (C: flight deck)
(8) IMPLIES
Synonyms: IF-THEN
Definition: If A exists or occurs, then B exists or occurs. A and B overlap in time.
Composition rule: {Any} -IMPLIES→ {Any}
Example: (S: The fleet has many ships) -IMPLIES→ (S: The fleet is powerful)

(9) INITIATES
Synonyms: ELICITS
Inverses: CONDITION, CIRCUMSTANCE, SITUATION
Negation: DISABLES
Definition: A initiates or elicits a goal.
Composition rule: {E | S | Sy} -INITIATES→ (G)
Example: (S: The ship was low on fuel) -INITIATES→ (G: captain get fuel for ship)

(10) IS-A
Subtypes: IS-A-KIND-OF, IS-AN-INSTANCE-OF, IS-A-MEMBER-OF, IS-A-TYPE-OF
Synonyms: SUPERCONCEPT
Inverses: KIND, INSTANCE, MEMBER, SUBTYPE, SUBCONCEPT
Negation: IS-NOT-A, OPPOSITE
Definition: A is a subcategory or instance of B.
Composition rule: (C) -IS-A→ (C)
Example: (C: The U.S.S. Nimitz) -IS-A→ (C: ship)

(11) MANNER
Definition: A specifies the manner in which a state change B occurs or a goal is achieved. A and B overlap in time.
Composition rule: {E | Sy} -MANNER→ {E | Sy}; (G) -MANNER→ (G)
Example: (E: the troops moved) -MANNER→ (Sy: quickly)

(12) NAME
Definition: A concept or other node is named with a particular label.
Composition rule: {Any} -NAME→ (name label)
Example: (E: USA attacks Iraq) -NAME→ ("Desert Storm")

(13) OR
Definition: Either A or B exist or occur, but not both.
Composition rule: {Any} ←OR→ {Any} [Note: The arc is bi-directional.]
Example: (G: fleet threaten enemy ship) ←OR→ (G: fleet destroy enemy ship)

(14) OUTCOME
Synonyms: RESULT
Definition: A specifies whether or not the goal B is achieved.
Composition rule: (G) -OUTCOME→ {E | S | Sy}
Example: (G: commander destroy ship) -OUTCOME→ (E: ship was destroyed)

(15) PROPERTY
Synonyms: ATTRIBUTE, CHARACTERISTIC, FEATURE
Definition: A concept has a particular characteristic.
Composition rule: (C) -PROPERTY→ {E | S | Sy}
Example: (C: ship) -PROPERTY→ (S: the ship is buoyant)
(16) QUANTITY
Synonyms: NUMBER, FREQUENCY
Definition: The number of instances of a concept or event.
Composition rule: {C | E} -QUANTITY→ (number)
Example: (E: The gun fired) -QUANTITY→ (12 times)

(17) REASON
Synonyms: PURPOSE, MOTIVE
Inverses: METHOD, PLAN, STEP, PRE-ACTION
Definition: Goal A is the reason or motive for implementing a method, plan, or action B. The outcome of B is achieved before the outcome of A. A is a superordinate goal of B.
Composition rule: (G) -REASON→ (G)
Example: (G: commander fire missile) -REASON→ (G: commander destroy ship)

(18) REFERENTIAL POINTER (RP)
Synonyms: REFERS-TO
Definition: One node refers to another set of nodes.
Composition rule: {Any} -RP→ {Any}
Example: (C: battle) -RP→ [(E: ship 1 fires at ship 2), (E: ship 2 fires at ship 1)]

(19) SIMILAR-TO
Subtypes: EQUIVALENT
Negation: DISSIMILAR, CONTRASTS, CONTRADICTS
Definition: Two nodes are very similar in content or features.
Composition rule: {Any} -SIMILAR-TO→ {Any}
Example: (C: ship 1) -SIMILAR-TO→ (C: ship 2)

(20) SPATIAL RELATION
Definition: One spatial region, object, or part is spatially related to another.
Composition rule: (C) -SR→ (C)
Example: (C: mast) -ABOVE→ (C: deck)
The following subtypes of spatial relationships are self-explanatory:
ABOVE: BELOW (inverse)
BESIDE: BY (synonym)
BETWEEN
CONTAINS: IS-IN (inverse)
CONNECTED-TO
EAST-OF: WEST-OF (inverse)
INSIDE-OF: OUTSIDE-OF (inverse), WITHIN (synonym)
LEFT-OF: RIGHT-OF (inverse)
NEAR
NORTH-OF: SOUTH-OF (inverse)
ON-TOP-OF: UNDERNEATH (inverse), SUPPORTS (inverse)
SURROUNDS: ENCAPSULATES (synonym)
TOUCHES: ABUTS (synonym)
The following relations specify a quantity in 3-dimensional space:
POSITION C-rule: (C) -POSITION→ (x, y, z)
ORIENTATION C-rule: (C) -ORIENTATION→ (rotated 45 degrees on x, y)
(21) SUBPROCESS
Definition: Event A is a subprocess of event B.
Composition rule: (E) -SUBPROCESS→ (E)
Example: (E: ship fires missile) -SUBPROCESS→ (E: chamber releases missile)

(22) TEMPORAL RELATION
Definition: Point or duration A is related to point or duration B in time.
Composition rule: {G | E} -TR→ {G | E}
Example: (E: missile hits ship) -BEFORE→ (E: ship sinks)
Subtypes:
BEFORE: AFTER (inverse)
DURING
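The composition rules in the catalogue lend themselves to a mechanical check of whether a proposed arc is well formed. The sketch below encodes a handful of the rules from Table 1; the dictionary layout and the function name `is_well_formed` are assumptions made for illustration, not part of the original catalogue or tool.

```python
# Node categories from the chapter: Concept, State, Event, Goal, Style.
ANY = frozenset({"C", "S", "E", "G", "Sy"})

# A few composition rules transcribed from Table 1. Each relation maps to a
# list of (allowed source categories, allowed target categories) pairs.
COMPOSITION_RULES = {
    "AND": [(ANY, ANY)],
    "IS-A": [({"C"}, {"C"})],
    "CAUSE": [({"E", "S", "Sy"}, {"E", "S", "Sy"})],
    "INITIATES": [({"E", "S", "Sy"}, {"G"})],
    "OUTCOME": [({"G"}, {"E", "S", "Sy"})],
    "REASON": [({"G"}, {"G"})],
    # MANNER has two alternative rules in the catalogue.
    "MANNER": [({"E", "Sy"}, {"E", "Sy"}), ({"G"}, {"G"})],
}


def is_well_formed(source_cat, relation, target_cat):
    """Check an arc (source) -RELATION-> (target) against the composition rules."""
    rules = COMPOSITION_RULES.get(relation)
    if rules is None:
        raise KeyError(f"no composition rule recorded for {relation!r}")
    return any(source_cat in src and target_cat in tgt for src, tgt in rules)


# (G: commander destroy ship) -OUTCOME-> (E: ship was destroyed)
print(is_well_formed("G", "OUTCOME", "E"))
# An event cannot be the source of a REASON arc, which links only goals.
print(is_well_formed("E", "REASON", "G"))
```

A check of this kind could be applied to each arc that a knowledge engineer or discourse analyst proposes, so that category errors are caught before interjudge agreement is computed.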
CHAPTER 11
Thinking about bodies of knowledge
Tests of a model for predicting thoughts
Bruce K. Britton, Peter Schaefer, Michael Bryan, Stacy Silverman, and Robert Sorrells
University of Georgia
1. Introduction
This chapter presents a model for thinking about bodies of knowledge, along with two experimental tests of that model. We induced bodies of knowledge in the participants by having them read expository texts that described bodies of technical and scientific information. Then we induced thinking by asking the participants to think about the bodies of knowledge. The model simulated the thinking process, yielding quantitative predictions of what specific thoughts the participants would have after thinking about the bodies of knowledge. We tested the predictions by looking for the predicted thoughts in two groups of participants. One group had been asked to think, and we compared them to otherwise equivalent participants who had not been asked to think. Our chapter's basic claim is that the participants' thinking process about a body of knowledge consists of 'spreading activation and relaxation'. Spreading activation and relaxation is a process in which activated concepts from a body of knowledge spread their activation to related concepts, which causes the related concepts to become activated, and in turn to activate the concepts they are related to, and so on. The spread of activation continues as long as thinking is progressing, i.e., as long as the activations of concepts keep changing. When thinking stops progressing, the activations of the concepts stop changing and the participant is left with a final configuration of activations of those concepts. That final configuration of concept activations is the end product of the participant's thinking. That end product is what the model predicts. To make the predictions, the model simulates the thinking process by
simulating the process of spreading activation and relaxation. The simulation produces an end product, which is a final configuration of the activations of the concepts. The end product produced by the model's simulation is then compared to the end products elicited from participants who had been asked to think versus those who had not. The expected result is that the participants who thought will have an end product more similar to that produced by the model's simulation of thinking. This was done in two tests of our simulation model that are reported as experiments 1 and 2. In related work, the same spreading activation and relaxation process has been used to model thinking in the text comprehension process (Britton & Sorrells 1998) and to model psychological processes other than thinking (e.g., Anderson, Silverstein, Ritz & Jones 1977; Anderson 1995; Britton 1995; Britton & Eisenhart 1993; Stahl, Hynd, Britton, McNish & Bosquet 1996). Different spreading activation processes have been used for these purposes by e.g., Kintsch (1988, 1998);¹ McClelland and Rumelhart (1988); Thagard (1989); and Thagard and Verbreught (1998).² The next three sections of this Introduction describe the knowledge representation and processing assumptions of the simulation model, and explain how it was tested.
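The general idea of spreading activation followed by relaxation can be sketched in a few lines. The update rule below (each concept adds the weighted activation of its neighbours, and activations are clipped to the range -1 to +1) is an assumption chosen for illustration and is not taken from the chapter; the function name `relax`, the `rate` parameter, and the three-concept weight matrix are likewise hypothetical.

```python
def relax(weights, activation, rate=1.0, tol=1e-6, max_iter=1000):
    """Spread activation over a symmetric weight matrix until it stops changing.

    Returns the final activation vector and the number of update cycles used.
    The clipped linear update is an illustrative assumption.
    """
    n = len(activation)
    current = list(activation)
    for cycle in range(1, max_iter + 1):
        updated = []
        for i in range(n):
            # Input arriving from every other concept, weighted by connection strength.
            incoming = sum(weights[i][j] * current[j] for j in range(n) if j != i)
            # Keep activations bounded so the process can settle.
            updated.append(max(-1.0, min(1.0, current[i] + rate * incoming)))
        # Relaxation: stop when no activation changes by more than the tolerance.
        if max(abs(u - c) for u, c in zip(updated, current)) < tol:
            return updated, cycle
        current = updated
    return current, max_iter


# Hypothetical three-concept network: concepts 0 and 1 support each other,
# and both are opposed to concept 2.
W = [
    [0.0, 0.9, -0.9],
    [0.9, 0.0, -0.54],
    [-0.9, -0.54, 0.0],
]
final, cycles = relax(W, [1.0, 0.0, 0.0])
print(final, cycles)
```

Starting with only the first concept active, the activation spreads to its positively connected neighbour and suppresses the opposed concept; the settled vector plays the role of the "final configuration of activations" that the text describes as the end product of thinking.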
2. Representation: Eliciting the meanings of concepts-in-context from subject matter experts
To illustrate the knowledge representations used by the model, Figure 1 shows an example text about burning wood and Figure 2 shows a network of concepts characterizing the corresponding knowledge structure.
Wood does not begin burning instantly when a log is tossed onto a fire. In fact, heat is robbed from the existing fire to get the new log burning and producing heat. Heat from the existing fire turns excess water in the new log to steam. Still absorbing heat, the wood begins to break down into flammable gases and charcoal. In the third stage these gases begin to burn and produce some heat. In the fourth and final stage, the charcoal residue burns and gives off significant heat, the point at which you begin to benefit from a wood fire.
Figure 1. Example text about wood burning (from the Verbal Comprehension subtest of the Armed Forces Qualification Test).
Figure 2. Network of concepts characterizing the knowledge structure corresponding to the wood burning text. The concepts of the body of knowledge are printed in ovals, and the connections between the concepts are shown as the lines connecting the ovals. Positive connections between concepts are shown as the solid lines and negative ones as the dashed lines. The strength of each positive and negative connection is shown by the darkness of the lines.

2.1 How the knowledge structure was elicited
Considering first the individual concepts within the circles in Figure 2, our method for selecting the concepts was informed by the fact that Britton and Tidwell (1995) had found that there were high levels of agreement between qualified judges on which concepts in a text are most important. So although the concepts were selected by the first author as the most important in the texts (for an unrelated project in 1993), we expect that other judges would have selected the same or similar concepts. Moving on to the network of relations, they are represented in Figure 2 by the lines in the network. The relations represent the strength and valence of the connections between the concepts in the ovals: all the connections are bidirectional; the solid lines represent positive, excitatory connections; the dashed lines represent negative, inhibitory connections; thicker lines represent stronger connections. (No information is represented by the locations of the nodes
or the distances between them.) We elicited the network from experts in the subject matter of the text. Because experts have privileged access to the correct knowledge structure for their special subject (at least for relatively complex scientific and technical bodies of knowledge like those used in these experiments), eliciting knowledge from them is presumably the best way, perhaps the only effective way, to approximate correctly the relevant bodies of knowledge (Britton & Tidwell 1995; Jonassen, Beissner & Yacci 1993). We elicited the network by first asking the experts to read the text. Then the experts (for the wood text they were two members of the Departments of Forestry and of Agricultural Engineering at the University of Georgia) were given all pairs of the most important concepts in the text. For each pair of concepts they were asked to choose the correct connection between the members of each pair, where 'correct connection' means "the magnitude and valence of the connection a reader would properly have if the text was correctly understood." These connections were elicited from the experts by asking them to respond to a series of scales like that in Figure 3, in which the pair of concepts is shown at the top and the scale runs from the maximum positive relation (very closely related) to the maximum negative, i.e., opposed, relation (very distantly related). The participants were instructed that the "very distantly related" end of the scale referred to the relation of opposition. (The extreme values of +1 and -1 did not appear on the scale because they were reserved for the identity relation of a term with itself and with its abstract negative respectively, and since the respondents were not presented with each term paired with itself, nor with each term paired with its abstract negative, those values could not reasonably be selected by the respondent; only the discrete values shown from +.9 to -.9 could be selected.)
The matrix of Figure 4 shows the average rating that the experts chose for each pair of concepts. The entries above the diagonal are the same as those below, representing the bidirectionality of the connections. (Figure 4 provides exactly the same information as Figure 2, except in numerical form.)

[Figure 3 scale. Top: pair of concepts. Labeled points: +.9 Very Closely Related; +.6 Moderately Closely Related; +.3 Somewhat Closely Related; -.3 Somewhat Distantly Related; -.6 Moderately Distantly Related; -.9 Very Distantly Related.]
Figure 3. Example scale used to elicit the magnitude and valence of the connections between pairs of concepts. See text for explanation.
[Figure 4: 6 × 6 matrix of relatedness ratings over the concepts 1. log-when-first-thrown-on-fire; 2. produces-significant-heat; 3. absorbs-heat; 4. burns-at-once; 5. charcoal; 6. flammable-gases.]
Figure 4. Matrix of average ratings of relatedness of two subject matter experts for each pair of terms.
2.2
Meaning-in-context versus dictionary meaning
This method specifies the meaning of each concept in terms of its ratings of relatedness with the other concepts. Our rationale for specifying the meaning of concepts in this way has two complementary aspects. First, every concept takes its meaning from its relation with other concepts: no concept exists in isolation. This can easily be demonstrated by trying to consider any concept in isolation; i.e., by trying to give meaning to any concept without relating it to other concepts. Take for an example the concept of "dog." To consider it in isolation, its relation to other concepts must be removed, so any relation to such concepts as animate, four-legged, furry, etc. must be removed. In fact any concept that has any relation to dog must be disregarded if dog is to be considered in isolation. Obviously, once this process has been completed, no meaning is left behind. The second aspect of the rationale for specifying the meanings of concepts in this contextualized way is based on the notion that concepts have two kinds of meaning. One is the non-contextualized meaning, which is the meaning found in the dictionary. The other is the meaning-in-context, which is the meaning used when the concept is encountered in a context (in the example the context is the wood burning body of knowledge). The meaning-in-context is specified by the concept's relation to the other concepts in its context. The meaning-in-context normally does not appear in the dictionary, because it is
only appropriate for a particular context within a particular body of knowledge. The meanings normally used in ordinary language are the meanings-in-context, because words and concepts are virtually never used out of context (except in the discourses of lexicographers, linguists, discourse researchers, etc., which are metalinguistic rather than ordinary language). So it is the meaning-in-context that must be used in studies of language-in-use. Together, the notions that the meaning of a concept is specified by how it relates to other concepts, and that concepts-in-use have meanings-in-context, combine to form the composite idea that the meaning-in-context of a concept can be specified by its relations to other concepts in its context. This view of the meaning of concepts is operationalized in the method used here for eliciting the experts' knowledge about the text. For example, the meaning of each concept in the wood burning body of knowledge, in terms of its relation to the other concepts, can be read along the rows of Figure 4. Each row of the matrix can be read somewhat like a recipe: to get the meaning-in-context of the concept of "log when first thrown on fire," take a large part (.9 units) of "absorbs heat." Then in opposition (the opposition reflects the minus sign) put a large part (-.9 units) of "produces significant heat" and similarly of "charcoal," along with somewhat less of opposition to "burns at once" and "flammable gases." So for the meaning of "log when first thrown on fire" the result can be shown graphically as in Figure 5, with the concepts nearest the topmost concept being closely related to it and the concepts farthest from the topmost concept (i.e., those near the bottom) in opposition to it.
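The recipe reading of a matrix row can be sketched in a few lines of Python. This is only an illustration: the +.9 and -.9 ratings are the ones stated in the text, while the two -.54 ratings are an assumption read from the partly legible Figure 4, and the function name `meaning_in_context` is hypothetical.

```python
# Connections of "log when first thrown on fire" to the other concepts.
# +0.9 and -0.9 are stated in the text; the two -0.54 values are an
# assumption read from the partly legible Figure 4.
row = {
    "absorbs heat": 0.9,
    "produces significant heat": -0.9,
    "charcoal": -0.9,
    "burns at once": -0.54,
    "flammable gases": -0.54,
}


def meaning_in_context(connections):
    """Return (concept, rating) pairs ordered from most related to most opposed."""
    return sorted(connections.items(), key=lambda item: item[1], reverse=True)


for concept, rating in meaning_in_context(row):
    relation = "related to" if rating > 0 else "opposed to"
    print(f"{abs(rating):.2f} units {relation} {concept}")
```

Sorting the row from the highest to the lowest rating gives the same top-to-bottom ordering that Figure 5 depicts, with "absorbs heat" nearest the target concept and the strongly opposed concepts farthest from it.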
(Note that the distances between "log when first thrown on fire" and the other concepts in Figure 5 represent only the meaning of "log when first thrown on fire"; the distances between the other concepts in Figure 5 do not represent the distances between those concepts in the body of knowledge. The meaning-in-context of each concept would have to be specified in a separate figure.) The meaning shown in Figure 5 cannot be found in the dictionary or probably anywhere else outside this context. Nevertheless, the meaning is clearly one reasonable approximation to the meaning that a knowledgeable and skilled reader would have for that concept after correctly understanding the text. The main representation assumption used here is that the matrix of concept relatedness ratings approximates the experts' knowledge structure for the body of knowledge expressed in the text; that is, that matrix represents the network of ideas that constitutes the body of knowledge corresponding to the text as properly understood. The validity of the representation assumptions used here, like any representation assumptions, can be evaluated partly on the basis of the results of empirical studies that use those representation assumptions, and partly on the basis of general criteria like parsimony.
[Figure 5 diagram: log-when-first-thrown-on-fire at the top, absorbs-heat nearest to it, and burns-at-once, flammable-gases, charcoal, and produces-significant-heat farther below.]
Figure 5. Graphical depiction of the meaning of "log when first thrown on fire" in terms of the other concepts. See text for explanation.
3.
Processing assumptions for participants and the model: Thought as spreading activation
According to our hypothesis, the thinking process consists of spreading activation around the network of concepts that constitutes the body of knowledge. Spread of activation refers to the transmission of activation from one concept to another. The notion of 'activation' roughly corresponds to the everyday language folk-psychology notion of something 'coming to mind.'
3.1
The first steps of spreading activation: From time 1 to time 2
When concept A is in an activated state, that causes the activation of the concepts connected to A. In everyday language we could say that A 'brings those other concepts to mind.' In the wood burning body of knowledge, suppose that the concept of "log when first thrown on fire" is the first one to become active (i.e., it comes to the participant's mind at a particular point in time; call it time 1). If spreading activation then occurs, then at the next point in time (time 2) the participant should bring "absorbs heat" to mind because they are strongly connected, as shown by the thick solid line in Figure 2, corresponding to the .9 entry in Figure 4. Also at time 2, activation simultaneously spreads to all the other concepts to which "log when first thrown on fire" is connected. So "log when first thrown on fire" also brings to mind its opposition to "burns at once" (i.e., when a log is first thrown on the fire it does not burn at once, as shown in Figure 2 by the negative connection coded by the dashed lines). The meaning of the negativity of the activation of "burns at once" may be clarified by considering it as entirely equivalent to increasing the positive activation of the concept "does not burn at once." Similar considerations apply to the effect of the activation of "log when first thrown on fire" on the time 2 activations of "produces significant heat," "charcoal," and "flammable gases." So at time 2, all the concepts connected to "log when first thrown on fire" have some amount of activation. How much activation does each concept receive? According to our model, it depends on: (a) how activated is the concept that sent the activation, and (b) the strength of association between the sending concept and the other concepts. Roughly, the higher the activation of the sending concept and the stronger the connection between it and the receiving concept, the more activation is spread from the sender to the receiver, and so the more activation the receiving concept gets.
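The step from time 1 to time 2 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the sending concept starts with an activation of 1.0; the two -.54 connection values are an assumption read from the partly legible Figure 4, and the function name `spread_from` is hypothetical.

```python
# Connection strengths from "log when first thrown on fire" to the other
# concepts. The two -0.54 values are assumed from the partly legible Figure 4.
connections = {
    "absorbs heat": 0.9,
    "produces significant heat": -0.9,
    "charcoal": -0.9,
    "burns at once": -0.54,
    "flammable gases": -0.54,
}


def spread_from(sender_activation, connections):
    """Activation sent to each receiver: sender activation times connection strength."""
    return {
        concept: sender_activation * strength
        for concept, strength in connections.items()
    }


# Time 1: only the sending concept is active (activation of 1.0 is assumed).
time2 = spread_from(1.0, connections)
for concept, activation in time2.items():
    print(f"{concept}: {activation:+.2f}")
```

A negative value for a receiver such as "burns at once" corresponds to the reading given above: it is equivalent to positive activation of "does not burn at once." Halving the sender's activation halves everything it transmits, which is factor (a) in the model; the connection strengths are factor (b).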
The best way to imagine such a spread of activation is to use Figure 2 to visualize some quantity of activation from "log when first thrown on fire" being transmitted as a larger or smaller blob along its connection to "absorbs heat." On the way to "absorbs heat" the activation blob is weighted (in quantitative terms, multiplied) by the strength of the connection. The weighting reflects the notion that a stronger connection leads to more activation being transmitted. (Notice that "log when first thrown on fire", like all the concepts, also transmits its activation back to itself along the connection shown from itself back to itself; sending activation to "absorbs heat" does not cause the original activation of "log when first thrown on fire" to dissipate.) Once the activations have been spread, then for each concept the activations coming into it are added up, yielding a total activation value for that concept. That completes one step of spreading activation.
3.2
The remaining steps of spreading activation
What happens next? From subjective, introspective experience we know that the thinking process often continues beyond its first step. In the model, what determines whether thinking continues? According to the hypothesis tested here, thinking continues if the whole set of concepts has a different pattern of activations at the end of a step of thinking than it had at the beginning of that step of thinking. In the example, the pattern of activations is obviously different at the end of time 2 than at its beginning: at the beginning only "log when first thrown on fire" was active, while at the end all six concepts have some activation. So thinking will continue for another step. Each such step of spreading activation starts with the pattern of activations found at the end of the previous step, and takes the next step of thinking by spreading that activation, just as before. But if the pattern of activations at the end of a step of thinking is the same as (or very, very similar to) that at the beginning of that step of thinking, then changes due to spreading activation have stopped. Using everyday language folk-psychology terms, what has been described can be put as follows: If a participant has one or more concepts in mind, and then starts thinking, what happens is that the concept(s) bring to mind the other concepts they are connected with. How much each other concept comes to mind depends on: (a) how strongly in mind was the concept which brought it to mind, and (b) how strongly connected the concepts were. Then the concepts that have been brought to mind bring other concepts to mind, again proportional to their strength. This continues as long as the participant's state of mind keeps changing, i.e., as long as the concepts keep changing their pattern of activation. When the pattern of activations stops changing, the participant's state of mind stops changing and thinking has come to a conclusion.
3.3
The arithmetic of spreading activation
This everyday language account can be translated quite easily into the arithmetic process that was used by the model to simulate the spreading activation process. Only a little arithmetic needs to be specified. First, we need to specify some particular numbers: one set for the connections between concepts, one set for the activations of the concepts before any spreading activation has occurred, and one set for the activations as spreading activation proceeds. The numbers for the connections between concepts have already been elicited from the experts in the form of ratings between +1 and -1, and these we can use as is. For the numbers to represent the activations, the choice is also very easy to make because we can also restrict them to numbers between +1 and -1, and for the numerical activations before any spreading activation has occurred, it turns out they can be selected virtually at random (as long as they are very small and nonzero) without affecting the pattern of activations obtained at the end of spreading activation. The second thing needed to simulate spreading activation is to specify the sequence of arithmetic operations to be performed on those numbers. The simple arithmetic operations of addition, subtraction, multiplication and division are all that are needed. At the beginning of each step, the activations are sent from each concept to itself and to each other concept, multiplied on the way by the numbers for the connections along which they are sent (i.e., the expert ratings). At the end of each step, the activations arriving in each concept are summed algebraically, i.e., by adding the plus values and subtracting the minus ones. Then the pattern of activations is revealed by dividing each concept's total activation by the total activation of the concept with the largest activation.
This makes the pattern of activations easy to see because it always results in the concept with the largest activation having an activation of 1 (since it has been divided by itself) and the other concepts having activations that are the corresponding proportion of the largest. Then that pattern is compared to the pattern at the start of the step and if the pattern is similar enough, the spreading activation process stops there. If there is a change, spreading activation continues for another step, just as before, until the pattern stops changing, when it stops. The end product of this simulation process is a pattern of final activation values, one value for each concept.
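The arithmetic just described can be sketched as a short Python routine. This is a sketch under stated assumptions, not the authors' program: the three-concept `weights` matrix is hypothetical (it is not the experts' matrix), the diagonal is set to the +1 identity relation, "largest activation" is taken to mean largest in magnitude, and the stopping tolerance and the names `spread_step` and `settle` are illustrative choices.

```python
def spread_step(weights, activations):
    """One step: weight, sum, then divide by the largest total (by magnitude)."""
    totals = [sum(w * a for w, a in zip(row, activations)) for row in weights]
    largest = max(totals, key=abs)  # assumption: "largest" means largest magnitude
    return [t / largest for t in totals]


def settle(weights, start, tolerance=1e-6, max_steps=1000):
    """Repeat steps until the pattern stops changing; return (pattern, steps)."""
    current = list(start)
    for step in range(1, max_steps + 1):
        following = spread_step(weights, current)
        change = max(abs(new - old) for new, old in zip(following, current))
        current = following
        if change < tolerance:
            return current, step
    raise RuntimeError("pattern did not settle within max_steps")


# Hypothetical three-concept network (not the experts' matrix); the diagonal
# holds the +1 identity relation of each concept with itself.
weights = [
    [1.0, 0.9, -0.9],
    [0.9, 1.0, -0.54],
    [-0.9, -0.54, 1.0],
]

# Two different sets of small nonzero starting activations.
pattern_a, steps_a = settle(weights, [0.01, 0.02, 0.005])
pattern_b, steps_b = settle(weights, [0.003, -0.001, 0.002])
print(pattern_a, steps_a)
print(pattern_b, steps_b)
```

In this sketch both starting points settle on the same final pattern, which illustrates the claim that the small starting activations can be chosen virtually at random. Repeated multiplication followed by rescaling is the procedure known in numerical linear algebra as power iteration.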
One remaining arithmetical question is: Does the process of spreading activation always stop? The answer is that for practical purposes it does always stop. The qualifier is needed because special circumstances can be arranged to cause it not to stop. However, in order for those circumstances to occur in any particular network, it would be necessary to specify carefully selected and precise values of certain connections and starting activations. This requirement of precision means that if there is any randomness or noise present anywhere in the system, those circumstances are very unlikely to arise. Since real systems always have some sources of randomness, the probability of such special circumstances arising in real systems approaches zero, barring malevolent perversity on the part of nature. This is why it is safe to assume that those circumstances will virtually never arise in practice, and so for practical purposes spreading activation will always stop.
3.4
The end product of spreading activation
The end product of the simulated thinking process is the pattern of activation of the concepts when the spread of activation stops: this pattern is called the "final activation pattern." That final pattern can be represented mentally and in the model in either of two forms. The two representations are explained using Figures 6 and 7. Figure 6 shows the pattern of the final activations for the concepts in the wood burning body of knowledge, listed as the numbers along the left side, and the same numbers are repeated along the top of the figure.³ Figure 6 also shows the other form of the pattern, which is the set of entries shown in the body of the matrix. Each entry in the body of the matrix is the multiplicative product of the activations of its corresponding row and column.
                                                1        2        3        4        5        6
                                            .9397  -1.0334    .8715    .2395   -.9588   -.3056
1. log-when-first-thrown-on-fire   .9397            -.9712    .8819    .2252   -.9009   -.2872
2. produces-significant-heat     -1.0334                     -.9006   -.2476    .9908    .3158
3. absorbs-heat                    .8715                               .2088   -.8355   -.2663
4. burns-at-once                   .2395                                       -.2297   -.0732
5. charcoal                       -.9588                                                 .2993
6. flammable-gases                -.3056
Figure 6. Final activation values and matrix of connections corresponding to final activations for the concepts in the wood burning body of knowledge. See text for explanation.
The numbers along the side and top of Figure 6 are obviously different from the numbers in the body of the matrix; the difference is important in its psychological interpretation, but it is only superficial in two mathematical ways. One mathematical way the difference is superficial is that the entries in the body of the matrix don't bring any additional information into the situation: They are based entirely on the activations along the side and top, of which they are the products. This means that the matrix entries are entirely redundant if one has the activations. The second mathematical way in which the difference is superficial is not as obvious. It arises from a special property of any matrix entries when they are arrived at by multiplying the marginal numbers as in Figure 6. The property is that from the matrix entries alone we can always use spreading activation to recover perfectly the pattern of the marginal numbers that were multiplied together to get them. This means that the activations are redundant if one has the matrix entries. This is not obvious, because it is certainly not generally true in everyday arithmetic that one can always recover, from the product of two numbers, the pattern of numbers that were multiplied to get that product. The truth of this property, for any matrix that was constructed as Figure 6 was, is easy to demonstrate in terms of spreading activation and can also be proven mathematically using elementary linear algebra (see Britton & Sorrells 1998).
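The recovery property can be checked numerically with a short Python sketch. The activation values are the marginal numbers of Figure 6; everything else is an assumption made for illustration: the diagonal entries are taken to be each activation multiplied by itself, the starting values are arbitrary small nonzero numbers, and the function names are hypothetical.

```python
# Marginal numbers (final activations) as listed along the side of Figure 6.
final_activations = [0.9397, -1.0334, 0.8715, 0.2395, -0.9588, -0.3056]


def product_matrix(activations):
    """Matrix whose (i, j) entry is the product of activations i and j.

    Assumption: diagonal entries are each activation times itself.
    """
    return [[a * b for b in activations] for a in activations]


def spread_once(weights, activations):
    """One spreading step, rescaled so the largest-magnitude total becomes 1."""
    totals = [sum(w * a for w, a in zip(row, activations)) for row in weights]
    largest = max(totals, key=abs)
    return [t / largest for t in totals]


connections = product_matrix(final_activations)

# Arbitrary small nonzero starting activations (assumed for illustration).
start = [0.01, 0.02, -0.01, 0.03, 0.005, -0.02]
recovered = spread_once(connections, start)

# The same pattern expressed as proportions of its largest-magnitude member.
largest = max(final_activations, key=abs)
expected = [a / largest for a in final_activations]
print(recovered)
print(expected)
```

A single step suffices because every row of a product matrix is a multiple of the same marginal pattern, so whatever small activations are fed in, the weighted sums come out proportional to that pattern. What is recovered is the pattern of proportions among the activations; the overall scale is fixed by the rescaling step.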
Figure 7. Network of connections corresponding to the final activations for concepts in the wood burning body of knowledge.
The upshot of all this is that the numbers along the side and top of Figure 6 and the entries in the corresponding matrix are equivalent in the sense that if you have either one you can always get exactly the other one. To make this property clear graphically, it is helpful to consider Figure 7, which is simply the network corresponding to the final activations and matrix of Figure 6 (i.e., the activation printed in each node of Figure 7 is the final activation value of that node, as shown along the side and top of Figure 6, and the magnitude and valence of the line representing each connection in Figure 7 corresponds exactly to the appropriate matrix entry in Figure 6). As noted above, Figure 7's representation of the network, like Figure 6's, is redundant in that it provides the same information twice. That is, if the connections were removed, leaving only the activations, the value of each connection could be reconstituted simply by multiplying the activations of the nodes it connects. It is also true, though less obvious, that if the activations were removed, leaving only the connections, the activation pattern could be reconstituted simply by starting the network with random small activations, and then spreading activation: The network would settle in one step, and the activation pattern would be reproduced perfectly. The important psychological difference between the final activations and the connections that are calculated from them has to do with their differing roles in working memory and long term memory. An implication of the hypothesis tested here is that the activations of Figures 6 and 7 are specialized for maintaining thoughts in working memory, while it is the connections of Figures 6 and 7 that are specialized for storing thoughts in long term memory. Further discussion of this implication is deferred to the General Discussion, after the psychological reality of the thought products has been tested.
4.
Testing the model
This brings us to the experiments that were done to test the model. In the experiments, participants who had (or had not) been asked to think about a set of concepts were asked to rate the relatedness between the concepts. The critical comparisons are between: (a) the predicted pattern of connections, which is the end product of the simulated thinking process; and (b) the actual pattern of connections which is produced by the participants who have been instructed to think (i.e., their pattern of relatedness ratings). According to the hypothesis, the expected result of that comparison is that the end product of the simulated thinking process will be found to be present in the pattern of relatedness ratings produced by the participants who were instructed to think, more so than in the participants who were not instructed to think. So the main goal of the experiments is to establish empirically whether the predicted thought product is found more in the participants when they have thought about the body of knowledge.
4.1
General procedure
In the experiments, first the experts read the text and rated the connections between the concepts. That network of connections (i.e., the experts' knowledge structure) was presumed to be the best available estimator of the correct knowledge structure for the text. Then, to make the predictions of the participants' thoughts about the text, the thought process about the text was simulated by spreading activation around the experts' network of connections until the network relaxed to its final activation pattern. The resulting pattern was the predicted product of thinking about an accurate representation of the text. That predicted thought product was obtained before the participants came to the laboratory. When the participants came to the laboratory, they read the texts and took a test on each one. Then some of the participants were asked to think about the body of knowledge described in the text. Finally, each participant took a second test on each body of knowledge, to elicit the connections that characterized their body of knowledge after thinking (or not thinking). Both tests were the same as the experts had taken, i.e., with rating scales like those in Figure 3. The two experiments differed in that Experiment 1 was a between-subjects design in which the thinking took place in one of the groups during a one-week interval between two experimental sessions, while Experiment 2 was a within-subjects design that was completed within a single experimental session, in which the thinking took place for some texts but not for others.
4.2
Analyses
For both experiments, the critical datum is the similarity between (a) the participant's pattern of connections elicited after thinking; and (b) the pattern of connections of the predicted thought product. Also important as controls are the similarities of the Think condition participants' connections (a) to the pattern of connections of the No Think participants; and (b) to the pattern of connections characterizing the original knowledge structure.
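The similarity comparisons described above can be sketched in Python using the index that the Method section of Experiment 1 reports, the Pearson correlation followed by Fisher's z. The two small matrices below are hypothetical and exist only to make the sketch runnable, and comparing only the entries above the diagonal is an assumption based on the symmetry of the matrices.

```python
import math


def upper_triangle(matrix):
    """Entries above the diagonal, row by row (the matrix is symmetric)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def fisher_z(r):
    """Fisher's z transformation of a correlation (undefined for r = +1 or -1)."""
    return math.atanh(r)


# Hypothetical ratings for three concepts; not data from the experiments.
participant = [[1.0, 0.6, -0.3], [0.6, 1.0, -0.9], [-0.3, -0.9, 1.0]]
predicted = [[1.0, 0.9, -0.6], [0.9, 1.0, -0.9], [-0.6, -0.9, 1.0]]

r = pearson(upper_triangle(participant), upper_triangle(predicted))
z = fisher_z(r)
print(r, z)
```

The same two functions would be applied a second time with the experts' knowledge structure in place of `predicted`, giving each participant one transformed correlation per structure.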
[Figure 8: four bar-graph panels titled "Experts' Structure", "Final Activation Pattern", "Superimposed", and "Discrepancies"; each panel lists the 15 concept pairs (12, 13, 14, 15, 16, 23, 24, 25, 26, 34, 35, 36, 45, 46, 56) against an axis running from -1.5 to +1.5.]
Figure 8. Bar graphs representing the connections between pairs of concepts (the pairs of numbers at left correspond to the pairs of concepts, numbered as in Figures 4 and 6). The top two graphs show the connections for the knowledge structure (corresponding to Figures 2 and 4) and the predicted thought structure (corresponding to Figures 6 and 7). The bottom left bar graph superimposes the two top bar graphs. The bottom right bar graph plots the magnitude of the discrepancies between the top two bar graphs.
It is important to note that, in general, the pattern of connections that characterizes thought products like the one shown in Figures 6 and 7 is different from the pattern of connections that characterizes the knowledge structure for the body of knowledge. The difference can be seen by comparing the networks or matrices for the knowledge structure with those for the thought product. The difference is easiest to see in graphical displays like Figure 8, in which the connections are shown as bar graphs. The top left bar graph represents the experts' connections for the wood burning body of knowledge, and the top right bar graph represents the connections for the predicted thought product. The lower left bar graph shows the two top bar graphs superimposed to show their difference, and the lower right bar graph shows directly the difference between the top two bar graphs, i.e., between the connections in the body of knowledge and the connections in the thought product that is calculated from it by simulated spreading activation. The difference between the connections means that when the participant's connections are elicited in the experiments, it is possible to tell how well they match the connections for the predicted thought product, and to compare that to how well they match the connections for the knowledge structure. For both experiments, the hypothesis is that during the thinking period, the participant spreads activation around the network for the body of knowledge and the result is a pattern of final activation values, which is the same pattern as that predicted by the simulation model.
5.
Experiment 1
The purpose of this experiment was to test whether thinking caused an increase of the predicted thought structure. Both the experimental (Think) and the control (No Think) groups read several texts during the first experimental session and took immediate tests on them. At the end of the session, thinking was induced in the Think group participants by asking them to think about the texts during the one week interval between the first and second experimental sessions. The No Think group participants were not given any special think instructions, and they also returned for the second experimental session one week later. Both groups then received a second test on the texts. By the time of the second experimental session, two causal mechanisms potentially had an opportunity to influence the participants' responses. One is forgetting. Forgetting is a causal mechanism to which both the No Think and
the Think groups were subject. Evidence for forgetting would appear as a decline in memory for the body of knowledge and for its thought product, from the test at the first session to the test at the second session. The other causal mechanism was the thinking process, to which the Think group was presumably subject more than the No Think group. Evidence for thinking would appear as an increase in memory for the thought product of the body of knowledge, according to the hypothesis. Note that forgetting and thinking have opposite effects on the predicted thought structure: forgetting tends to cause a decline in memory, while thinking tends to cause an increase in memory. Because the Think group is subject to both causal influences, the results for the Think group can be expected to reflect both forgetting and thinking. Therefore, the expected result for the Think group was either an increase or a smaller decrease in performance for the predicted thought structure, because the negative effects of forgetting would be counteracted by the positive effects of thinking. That is, evidence for thinking would appear as a lesser decline in memory in the Think group for the predicted product of the thinking process, according to the hypothesis. On the other hand, for the No Think group only forgetting was a causal mechanism, so the expected result was a comparatively larger decline in performance, because the negative effects of forgetting would not be counteracted by the positive effects of thinking. Therefore one critical comparison between the Think and the No Think groups was based on how much of a decline in performance was found for the predicted thought structure from the first to the second experimental session. It is important to note that to support the hypothesis, the smaller decline in performance for the Think group could not be general over all aspects of the text, but would have to be limited to the predicted thought product. 
In particular, the hypothesis would not be supported if the thinking manipulation caused the same magnitude of decline for the knowledge structure representing the original body of knowledge. Such a result would indicate that the smaller decline in performance of the experimental group was likely an artifact of the similarity between the predicted thought structure and the knowledge structure (recalling that the former was derived from the latter). In summary, the result expected on the basis of the hypothesis is that the Think group would show a smaller decline in performance for the matrix representing the predicted thought structure, as compared to the No Think group; but there should be no such differential between the groups for the knowledge structure representing the original body of knowledge. This was
tested as a three-way interaction between the Group factor (Think vs. No Think), the Structure factor (the knowledge structure vs. the predicted thought structure), and the Test time factor (first session test vs. second session test).
5.1
Method
Participants
Undergraduates (N = 60) who were volunteers from psychology courses at the University of Georgia participated for extra credit. They were randomly assigned to the Think or No Think groups. They signed up for two 50-minute sessions one week apart and were tested in groups of 30. Four participants failed to complete one or more items on the tests and so were excluded from the analysis.
Conditions
There were two levels of the treatment variable (Think versus No Think). In both conditions the participants read the experimental texts and took a test on each one during the initial session. The treatment was introduced at the end of that session. Random assignment of participants to treatments was achieved by telling the participants in one randomly selected half of the room that they could leave, reminding them of their commitment to return one week later for the second session. This constituted their assignment to the No Think condition, because they had not been told to think about the texts during the succeeding week. The participants in the other half of the room, who had remained, were in the Think condition. They were asked to think about the texts for 2 minutes "several times during the next week". To help them remember to do this, and to remind them of the topics of the passages, each was given an index card to take with them; on the index card was printed for each passage a single word indicating its topic. They were told they could record information on the card about when they did their thinking, but that this was optional, and they were asked to bring the card back with them to the next session. Then they were dismissed in the same way as the No Think group.
Materials
Each participant received a booklet which included four expository texts from the SAT test, each about 250 words long, each typed single spaced on a single page. The texts were on the topics of Bacteria, Climate, Glaciers, and Prostaglandins. Each text was followed by a test in which the five or six most important terms from the text were paired and each pair was presented with a scale like that in Figure 3. The booklets started with a page asking participants not to open the booklet until they were told to do so, followed by a time sheet asking them to write on it the number appearing on a digital clock, followed by the first passage, followed by a time sheet, followed by the test on that passage, followed by a time sheet, followed by the second passage, and so on. The passage-test combinations were arranged in four orders from a Latin square. At the second session the participants received a booklet like the one they had received at the first session except that the passages were not included. The booklets were the same in the Think as in the No Think condition.
Instructions and procedure
At the beginning of the first session, the participants were told they would be reading texts and taking tests on them. They were told they should read each passage until they understood it, and that they would then receive a test on each passage, during which they would not be allowed to look back at the passage. A dummy example of a test item was given on the blackboard along with the scale. Participants were asked to answer each item based on the information in the passage, and were told that the "Very distantly related" end of the scale should be used for concepts that were opposed in some way. They were also told how to use the time sheets. Then they were asked if they had any questions, which were answered, and finally they were told that they could ask questions at any time during the session by raising their hand. Then they proceeded through the booklets at their own pace. At the beginning of the second session the participants were instructed that they would get a second test on each passage.
Analyses For each body of knowledge two knowledge structures can be specified: one is the knowledge structure for the body of knowledge, approximated by the ratings elicited from the expert (for the wood burning example, this corresponds to the structure shown in Figures 2 and 4), and the other is the predicted thought structure (for the example, Figures 6 and 7). Each structure was compared to the participant's knowledge structure. The prediction is that the ratings of the participants in the Think condition will have more evidence of the predicted thought structure than the ratings of the participants in the No Think condition. As a control, each participant's knowledge structure was also compared to the knowledge structure for the body of knowledge, for which the prediction was that there would be no more evidence for that structure in one thinking condition than the other.
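Comparing a participant's ratings with a reference structure can be sketched as a similarity computation over two lists of pairwise ratings. The sketch below assumes the index is a Pearson correlation followed by Fisher's z transformation, which is the index this study reports. The function names and the rating values are hypothetical illustrations, not data from the experiments.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length rating lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def fisher_z(r):
    """Fisher's z transformation, z = atanh(r); undefined when |r| equals 1."""
    return math.atanh(r)

# Hypothetical relatedness ratings for the 10 pairs formed from 5 terms.
participant = [6, 2, 5, 1, 4, 3, 7, 2, 5, 1]
predicted_thought = [7, 1, 6, 2, 3, 3, 6, 1, 5, 2]

r = pearson_r(participant, predicted_thought)
z = fisher_z(r)
print(round(r, 3), round(z, 3))
```

One z value per participant, structure, and test time would then serve as a data point for the group-level analysis.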
B.K. Britton, P. Schaefer, M. Bryan, S. Silverman and R. Sorrells
The evidence for each knowledge structure was assessed by calculating an index of similarity (a) between each participant's ratings and the predicted thought product, and (b) between each participant's ratings and the knowledge structure for the body of knowledge. The index of similarity was the Pearson product-moment correlation coefficient, calculated separately for each subject for each of the two structures. Fisher's z was applied to each resulting correlation coefficient to render the correlations normally distributed. The resulting Fisher z transformed correlations for each participant were the data points entered into a 2 × 2 × 2 analysis of covariance (ANCOVA) with the thinking condition as the between subjects factor (Think vs. No Think) and the within subjects factors of structure (knowledge structure for the body of knowledge vs. the predicted thought structure) and test time (the first test vs. the second test). Covariates were participants' SAT Verbal and Math scores. The predicted interaction between the three factors was that the Think group would show a smaller decline in performance for the matrix representing the predicted thought structure (i.e., relatively more remembering of it), as compared to the No Think group; but there should be no such differential between the groups for the knowledge structure for the body of knowledge.

5.2 Results and discussion
Usage of the index cards At the second session for the first returning group of 30 participants, after the participants had taken the test, the index cards were collected for the 15 Think condition participants. (Because of an error by the experimenter, the cards were not collected from those in the second returning group.) Thirteen of the 15 participants brought back their index cards, and another reported that she had it at home but had forgotten to bring it. Seven of the cards had notations of dates and/or times next to the topics. All the dates were within the interval between sessions. These results provide some evidence that some participants in the Think condition used the cards as instructed, and are consistent with the conclusion that thinking about the texts took place during the interval.
Test results The interaction is shown in Figure 9, plotted as the interaction between thinking condition and structure, with the test time factor shown on the ordinate as the percent retained from the first test to the second test. The interaction between thinking condition, structure, and test time is significant, F(1,52) = 4.25, p < .05. None of the other main effects or interactions were reliable beyond the .15 level. As a covariate the SAT Verbal score was significant, F(1,52) = 5.39, p < .05, but the Math score was not (F

[Figure 9 plot not reproduced. Ordinate: Percentage Retained on Delayed Test. Panels: No Think Condition (N=27) and Think Condition (N=29). Categories in each panel: Experts' Structure (Control) and Predicted Thought Structure.]

Figure 9. Interaction between thinking condition (thinking versus no thinking) and structure (knowledge structure versus predicted thought structure).
6. Experiment 2
This experiment was conducted to extend the conditions under which the hypothesis was tested. It differed from experiment 1 in that the manipulation of thinking was within subjects, with each participant instructed to think about some of the bodies of knowledge but not instructed to think about others. Also, all the reading, testing, and thinking occurred during a single experimental session. Participants were tested in groups; one group (N=36) was asked at the end of the session to write a retrospective report on what they had been thinking about during the thinking period.

6.1 Method
Materials The texts were the same as those used in experiment 1. They were arranged in booklets as in experiment 1, except that each text and first test assigned to the Think condition was followed by a sheet instructing the participant to think about the body of knowledge for between one and four minutes, self-timed with the digital clock, while for each text and first test in the No Think condition those sheets were not included. Then in both conditions the second test was presented. The passages appeared equally often in each condition, and equally often in the different possible orders.

Instructions and procedure The instructions were the same as for experiment 1, except that all the participants were informed they would be thinking about some of the passages and instructed how to self-time the thinking period. After all the participants had finished their booklets, they were asked to write down a description of what they had been thinking about during the thinking period.
Participants Undergraduates (N=108) were tested from the same population as those in experiment 1.
Analyses Given the theoretical basis of the prediction and the positive results of experiment 1, the hypothesis that the Thinking condition would have more of the predicted thought product than the No Thinking condition was tested as a planned comparison.
6.2 Results and discussion
Retrospective reports Of the 36 participants asked for retrospective reports, 34 wrote one or more sentences. The 34 protocols were rank ordered for the amount of thinking about the passages they indicated, independently by two of the investigators (BKB and PS). The judges' rankings were significantly correlated (Spearman rho = .89; p < .01). Excerpts from the protocols suggest the bases for the judgments. The protocol ranked highest by Britton read: "I tried to think about certain terms used in the passages and how they were related." The protocol ranked highest by Schaefer read: "During the time we had to wait I tried to go back through what I read. I tried to picture the essay & write down key thoughts one by one as they were mentioned in the essay ... " The protocols rated just above the median read: "Mind mostly unfocused. I thought, scattered, about the SAT, about what it would be like to see a glacier, about Ceremonial Time, a nonfiction book about Northern USA glacier-country. And I thought about the new terms in each reading- subject-specific words;" and "I thought about the text for a minute, but then I began to think about what I needed to do today and about what I wanted to eat for breakfast." The protocols rated lowest read: "Man this ... is taking forever;" and "I wasn't thinking about the text. I was thinking about my test later today."
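Agreement between two judges' rank orders, as reported above, can be illustrated with a small computation. This is a minimal sketch that assumes no tied ranks and uses the textbook formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)). The six rankings are hypothetical and are not the judges' data.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings of the same items (no ties)."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks given by two judges to the same six protocols.
judge_1 = [1, 2, 3, 4, 5, 6]
judge_2 = [2, 1, 3, 4, 6, 5]

rho = spearman_rho(judge_1, judge_2)
print(round(rho, 3))
```

With tied ranks the shortcut formula no longer holds exactly, and a Pearson correlation computed on the ranks would be the safer choice.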
Test results The mean correlation of the participants in the Think condition with the predicted thought structure was .78, while for the No Think condition it was .63. These differed reliably on a one-tailed t-test, t(107) = 1.86, p < .05, consistent with the hypothesis. In contrast, for the knowledge structure of the body of knowledge, the mean correlation of the participants in the Think condition was .66, while in the No Think condition it was .57; these did not differ reliably, on either a two- or one-tailed t-test, t(107) = 1.51, p > .10. This shows that the significant increase of the predicted thought structure in the Think condition was limited to that structure and was not due to a generalized increase in performance. It should be noted that the Think and No Think conditions in this experiment differed also in the interval between (a) the reading with its accompanying first test on a text, and (b) the second test on that text, with the interval for the Think condition being the same length as the interval devoted to thinking, and the interval for the No Think condition being essentially zero. This raises
the question of whether the significant difference between the conditions in evidence for the predicted thought structure is due to the length of the interval per se or to the events occurring during that interval, i.e., thinking. One way to address this question is to consider the expected results if they had been due only to the passage of time during the interval. This could be tested by including a group that was not thinking but that had an interval as long as that in the Think condition. The expected result for that group would be less remembering for both structures compared to the present No Think group, because the No-Think-in-the-interval group would be subject to more forgetting due to the passage of more time. But the results show that the Think group remembered more than the No Think group, with this significant only for the predicted thought structure. That result is not consistent with the hypothesis that the difference between the groups was solely due to the passage of more time in one group than in the other group.
7. General discussion
The main finding of these experiments is that, when participants were instructed to think, they had more of the predicted product of thinking than when they were not instructed to think. This result is consistent with the predictions of the model and so provides support for it. Further support is provided by the five experiments reported in Britton and Sorrells (1998). Empirical evidence for the psychological reality of these thought products having been presented, the discussion will focus on three issues: the meaning of the products of thought, their implications for memory storage and retrieval, and the implications of the results for the validity of the representation and processing assumptions of the model.

7.1 Meaning of the products of thought
One way to look at the meaning of the products of thought is in terms of their semantic meaning. One way to specify the semantic meaning is by interpreting the arithmetical form of the thought product. One of the arithmetical forms is the pattern of the concepts' final activation values. The goal of the interpretation process is to produce a statement in ordinary language that expresses the semantic meaning of the set of concepts. For example, for the wood burning body of knowledge, the final activation pattern
shown in Figure 6 (and its equivalent network form shown in Figure 7) is the raw material for the semantic interpretation. Two general principles are used to interpret such patterns of activation values. The first principle is that the semantic interpretation of the final activation pattern depends on the relative magnitudes of the activation values. Larger activations indicate more prominence of a concept in the thought product. One way to reflect this in the semantic interpretation is to simplify the situation by considering only the concepts with the largest activations, disregarding the concepts with small activations. The second principle has to do with how the signs are interpreted. The positive and negative signs are interpreted to indicate only that the sets of concepts with opposite signs are contrasted (i.e., opposed) to each other. This interpretive principle can be kept in mind by recognizing that if all the signs are reversed, the interpretation is not changed at all. Applying the first principle to the marginal activation values in Figure 6, the interpretation can be simplified by disregarding "burns-at-once" and "flammable gases" because of their relatively small activation values. Applying the second principle to the remaining concepts, what is found is an opposition between, on the one hand "log-when first thrown on fire" and "absorbs-heat," and on the other hand "produces significant heat" with "charcoal." This composite concept can be interpreted as specifying the contrast between the states of affairs early in the burning process versus late in the burning process. As shown in this example, the end product of interpreting a set of activation values is a single composite concept. The composite concept is constructed by combining the individual concepts with the largest activations with each other in the specific way prescribed by the signs of the final activation values. 
In general, we have found that such a composite concept behaves like other concepts, in that we can use it in our thinking as a single unified idea, as in the example. Another important aspect of the thought product's meaning can be brought out by comparing it to the ideas in the structure for the whole body of knowledge. The comparison shows that the composite idea in the thought product is a simplified, sharpened version of one of the composite ideas in the whole body of knowledge. For this, we need to use the other arithmetical form of the thought product, which is the matrix entries, like those in the top two panels of Figure 8. This can be seen for the wood burning body of knowledge by comparing the idea in the thought product, shown in the pattern of connections in Figure 7, and in the top left panel of Figure 8, to that idea as it appears in the structure for the whole body of knowledge, shown in the pattern of
connections in Figure 2 and the matrix of Figure 4. The difference is that in the structure for the whole body of knowledge, that idea is intermixed with others, while in the thought product it is the only one of importance. This difference can be visualized by comparing the network of Figure 2 with Figure 7. The two networks are virtually identical visually in the bottom part of the figures, which is the location of the concepts included in the composite idea corresponding to the thought product; but they differ markedly in the top part, where the connections in the thought product are depicted by much lighter lines, reflecting relatively unimportant connections. Compared to the knowledge structure for the whole body of knowledge, the thought product is simplified in that there is only a single main idea in the thought product, and sharpened in that the idea is relatively more prominent in the thought product than in the original structure. This simplifying, sharpening property of the thought product is virtually always evident in thought products calculated by the model. This, as well as the composite nature of thought product concepts, can be seen in other examples. Figure 10 shows a text about inflation, Figure 11 shows the corresponding network for that body of knowledge, and Figure 12 shows the network for the thought product. As can be seen by comparing the connections for "government" in Figures 11 and 12, in the structure for the whole body of knowledge the idea of government emphasized its role in printing too much money and so causing inflation; but in the thought product the connections to "government" are much smaller, indicating that the concept's prominence is reduced. Instead, in the thought product, the emphasis is on the contrast between "cause of inflation" and "print money faster" on the one hand, and "solution of inflation" and "print money slower" on the other hand. Here again the simplification and sharpening was toward emphasizing one composite concept by reducing the prominence of other concepts. Again the composite concept can be held in mind as a unit.

Substantial inflation is a monetary phenomenon almost always rising from a more rapid increase in the quantity of money than the output of goods and services. Of course, the reasons for the increase in money may be various. It takes time - measured in years - for inflation to develop; it takes time for inflation to be cured, and there is only one fundamental cure. The rate of increase in the quantity of money must be curtailed. In today's world, governments determine - or can determine - the quantity of money.

Figure 10. Example text about inflation (From Verbal Comprehension subtest of the Armed Forces Qualification Test).

Figure 11. Network of concepts characterizing the knowledge structure corresponding to the inflation text. [Network diagram not reproduced.]

Figure 12. Network of connections corresponding to the final activations for concepts in the inflation body of knowledge. [Network diagram not reproduced.]
To the theorists of the Tactical School, strategic bombardment was first visualized only as a means of destroying enemy civilian morale. The feasibility of direct attack on enemy centers stopped short of population bombing. It was pointed out that Japanese strikes against Chinese cities only strengthened morale. Later, attacks on industrial targets during daylight were favored. The economy of an industrialized nation might be disrupted by disabling just a small number of factories.
Figure 13. Example text about bombardment (From Verbal Comprehension subtest of the Armed Forces Qualification Test).
Concept                             Final activation
early-in-strategic-bombardment       0.5200
destroy-enemy-civilian-morale        0.4692
strengthen-enemy-civilian-morale     0.0693
later-in-strategic-bombardment      -0.4315
attack-factories                    -0.5644

Percent Variance Accounted For = 60%
Figure 14. Final activation values for the concepts in the bombardment body of knowledge.
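The two interpretive principles described earlier (keep only the concepts with the largest activations, and read opposite signs as a contrast) can be applied mechanically to the values in Figure 14. The sketch below is illustrative only: the function name and the cutoff of 0.2 are assumptions, since the chapter speaks of "relatively small" activations without giving a numeric threshold.

```python
def interpret(activations, cutoff=0.2):
    """Return the two contrasted groups of prominent concepts.

    cutoff is a hypothetical threshold for "relatively small" activations.
    """
    prominent = {c: a for c, a in activations.items() if abs(a) >= cutoff}
    one_side = sorted(c for c, a in prominent.items() if a > 0)
    other_side = sorted(c for c, a in prominent.items() if a < 0)
    return one_side, other_side

# Final activation values as listed in Figure 14.
figure_14 = {
    "early-in-strategic-bombardment": 0.5200,
    "destroy-enemy-civilian-morale": 0.4692,
    "strengthen-enemy-civilian-morale": 0.0693,
    "later-in-strategic-bombardment": -0.4315,
    "attack-factories": -0.5644,
}

side_a, side_b = interpret(figure_14)
print(side_a, "versus", side_b)

# Reversing every sign swaps the two groups but leaves the contrast unchanged.
flipped = {c: -a for c, a in figure_14.items()}
flip_a, flip_b = interpret(flipped)
print({frozenset(side_a), frozenset(side_b)} == {frozenset(flip_a), frozenset(flip_b)})
```

The low-activation concept "strengthen-enemy-civilian-morale" drops out, which matches the reading that the unintended consequence is virtually excluded from the thought product.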
A third example is shown in Figures 13 and 14, in which the composite concept that dominates the thought product is the intended consequences of bombardment, with the contrast being between the intention "early in strategic bombardment" of "destroying enemy civilian morale," versus the intention "later in strategic bombardment" of "attacking factories." This composite concept also can be held in mind as a unit. Virtually excluded from the thought product are the ideas in the text describing the unintended consequence: that "early in strategic bombardment" the consequence was to "strengthen civilian morale." The extent of this sharpening and simplification can be quantitatively specified as follows. The thought product's list of final activation values is associated with a value that indicates exactly how prominent in the original structure is the composite concept that is its thought product. For the wood burning body of knowledge, the idea that corresponds to the thought product accounts for 63% of the variance in the original structure (100% of the variance is the maximum that could be accounted for); for the inflation body of knowledge, the figure is 74%; and for the bombardment body of knowledge, 60%. It is the complement of these values (37%, 26%, and 40%, respectively) that expresses the degree of simplification, i.e., how much is left out from the original structure in the thought product (see Note 4). Quantitative claims about the amount of meaning contained in a particular thought about a body of knowledge are not made elsewhere, as far as we know, and are potentially strongly testable predictions of the model. For example, being asked to think about a body of knowledge whose thought product accounts for more of the variance in the original body of knowledge (e.g., the inflation passage) than one whose thought product accounts for less (e.g., the bombardment passage) should allow for more accurate reproduction of the body of knowledge.
That is, the process of thinking changes the body of knowledge in the inflation passage less than it does that of the bombardment passage. These results can be extended from the current tests to more common tasks (e.g., recall, multiple choice, essay). These tasks would show better access to information from the inflation passage, where "better" is defined as more similar to the original passage.

7.2 Implications for memory storage and retrieval
What are the implications of these results for memory structure and retrieval? Working memory is considered first. The percent of variance in the original knowledge structure that is accounted for by the thought product has implications for maintaining the original knowledge structure in working memory. One way to maintain the original knowledge structure in working memory is to hold in mind the entire set of connections. Another way is to maintain only the final activations that represent the thought product. We normally think of working memory as maintaining a set of concepts which are activated to various degrees, not a set of connections; nevertheless it is useful to compare the consequences of holding the entire set of connections versus only the final activations. One consequence of maintaining only the activations is that not all of the body of knowledge will be maintained. In fact, it is possible to specify with some precision how efficacious the thought product would be: how much of the original body of knowledge could be reconstituted in working memory by the thought product alone? The value is just the same as the proportion of variance of the original body of knowledge that is accounted for by the thought product, e.g., 63% for the wood burning body of knowledge, etc. The reconstitution of the original body of knowledge (i.e., of the specified proportion of it) is accomplished in the same way as the thought product's matrix was constructed in Figure 6, i.e., by multiplying the list of final activation values by itself as done there. In other words, an important property of thought product matrix entries like those in Figure 6 is that they approximate the original matrix. Moreover they approximate it to a precisely calculable extent, with that extent given by the proportion of variance that the thought product accounts for. If what is maintained in working memory is the thought product, specified as the pattern of final activations, there is another interesting consequence. First, maintaining it would appear to require much less working memory capacity than maintaining the entire set of connections, for straightforward quantitative reasons.
If n is the number of concepts, the entire set of connections contains n(n-1)/2 elements in its matrix, which is somewhat less than half of n squared; but the list of final activation values contains only n elements. For a limited capacity working memory system, this can be a substantial savings, because as the body of knowledge gets larger the relative proportion of savings increases rapidly. What are the effects of the thought product on longer term memory traces? The usual way we think of storing mental contents in long term memory is in the form of associations (i.e., connections). Such associations have the advantage of being relatively stable without the need for continuous attention. As indicated using Figures 6 and 7, the mathematical relation between activations and the associations calculated from them means that, once a set of activations is arrived at, it is possible to immediately calculate a set of associations that will reproduce
those activations perfectly. So if any configuration of concept activations is in mind, e.g., any end product of thought, it is immediately possible to calculate the associations that will reproduce those activations. So it would be possible, by storing in long term memory only the associations calculated from the thought product activations, to reproduce the thought product activations perfectly. But the results of the experiments do not provide evidence for separate storage of the connections that would reproduce the thought product. Instead, our results indicate that by the time of the retention test, the connections for the thought product have been combined with the connections representing the original knowledge structure as expressed in the text. This is indicated by the finding that the participants' structures that were found in all the conditions in both experiments provided evidence for both the knowledge structure for the text and the predicted thought structure. It appears that the two structures are averaged together. This raises the question: when the simplified and sharpened thought structure ideas are averaged in memory with the original knowledge structure for the body of knowledge, what will the net knowledge structure look like? In general, it appears that the averaging will produce a net knowledge structure in which the ideas contained in the thought product are somewhat more prominent than they are in the original structure, but somewhat less prominent than they are in the thought product alone. The process can be visualized as pulling the knowledge structure for the body of knowledge toward the thought product, with the size of the pull proportional to the relative weighting of the thought product. Another implication for long term memory has to do with mechanisms of forgetting. It is well known that as time passes since learning of a body of knowledge, the memory for it is altered: this is normally attributed to forgetting.
Usually, the alteration is not toward a memory trace with more ideas in it, but toward one with fewer ideas in it; not toward a more complex memory trace, but toward a simpler one. What are the mechanisms of such simplification and reduction? The mechanisms that come into play depend partly on what occurs during the retention interval. Thinking is one thing that can occur. The results of this paper indicate that thinking during the retention interval causes a simplification. The model tested here contributes to research on forgetting by specifying a candidate principle by which the simplification is implemented, and calculating with precision the predicted end product of that simplification process. In the experiments, the psychological reality of the candidate simplification principle was tested successfully by seeking that end product in the participants' responses, and finding it there.
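Two quantitative points from this section can be sketched in code: the storage comparison between n(n-1)/2 connections and n activations, and the blending of the original structure with the thought product. The blend is written here as an entrywise weighted average with a hypothetical weight w. The chapter reports that the two structures appear to be averaged but does not estimate a weight, so w, the function names, and the toy matrices are all assumptions made for illustration.

```python
def storage_counts(n):
    """Elements to hold: (connections among n concepts, activations)."""
    return n * (n - 1) // 2, n

def blend(original, thought, w):
    """Entrywise weighted average: (1 - w) * original + w * thought."""
    return [[(1 - w) * o + w * t for o, t in zip(row_o, row_t)]
            for row_o, row_t in zip(original, thought)]

# The saving from holding activations grows with the number of concepts.
for n in (5, 10, 20):
    connections, activations = storage_counts(n)
    print(n, connections, activations)

# Hypothetical 3-concept connection matrices (symmetric, zero diagonal).
original = [[0.0, 0.8, 0.4],
            [0.8, 0.0, 0.6],
            [0.4, 0.6, 0.0]]
thought = [[0.0, 0.9, 0.1],
           [0.9, 0.0, 0.2],
           [0.1, 0.2, 0.0]]

net = blend(original, thought, w=0.5)
print(net[0])
```

A larger w pulls the net structure further toward the thought product, and w = 0 leaves the original structure unchanged.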
7.3 Validity of the representation and processing assumptions of the model

What are the implications of these results for the validity of the representation and processing assumptions that underlie the model? Any set of representational or processing assumptions in any field depends for its validation partly on their usefulness for understanding and predicting the phenomena of interest. By this criterion, the assumptions used here have been validated successfully in the reported experiments. Another common criterion used to evaluate theoretical machinery of this kind is parsimony. This can be evaluated only in comparison to alternative assumptions. It is easy to imagine alternative representation and processing assumptions; it is often much more difficult to implement and test them. Among those frequently proposed are representation assumptions that label the links between concepts with labels like "instance-of", "causes", etc., and processing assumptions that use unidirectional connections. Models in the information processing tradition have had some success with such representation and processing assumptions, but as far as we know, models using those assumptions have not yet been used to simulate the thinking process to produce predicted thought products for the kinds of bodies of knowledge used here. In our experience, suggestions that the representation scheme used here is inadequate have not been followed by proposals for alternate schemes that the proposer evinces any intention of testing, or even of implementing so as to produce a testable consequence, much less by reports of tested schemes that have even equaled the present one in predictive efficacy. We eagerly await the proposal of testable alternatives. The guiding hypothesis of this paper is that knowledge structures change as a result of being thought about. If they did not change, then there would be no consequences of thinking, and it is generally accepted nowadays that thinking has consequences.
This paper contributes to psychological science by identifying a mechanism of thought, using that mechanism to explicitly predict specific thoughts, and empirically verifying that those thoughts are present after thinking. Since the model is supported, it may be usable as a foundation for further research.
Notes

1. Activations are calculated in a different fashion in Kintsch's construction-integration model. Originally intended to model the processes of comprehension and the selection of propositions to be held in working memory, Kintsch's algorithm does not allow for the computation of negative levels of activation. See text for interpretation of negative levels of activation, and Britton and Sorrells (1998) for results showing the superior fit of the current algorithm as applied to the prediction of thought structures.

2. Thagard's and McClelland and Rumelhart's algorithms for calculating activations produce a similar result to the algorithm used here, and have been shown to simulate psychological processes other than thinking.

3. Each final activation value has been multiplied by the square root of a scaling factor called the eigenvalue, for proper scaling of the multiplications shown in the body of the matrix. This does not affect the pattern of activations. Discussion of the 'percent variance accounted for' is in the General Discussion.

4. The mathematical background for the claims made in this section is best described in terms of elementary linear algebra, in which these values can be calculated because the thought product (i.e., the set of final activation values) is the same as the eigenvector with the largest eigenvalue of the matrix of connections, and that eigenvalue can be directly expressed as the proportion of variance in the matrix that is accounted for by its eigenvector. Britton and Sorrells (1998) provide more mathematical detail.
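Note 4 identifies the thought product with the eigenvector that has the largest eigenvalue of the connection matrix. A minimal sketch of that computation follows, using power iteration in plain Python. Power iteration is a stand-in chosen here for brevity and is not claimed to be the authors' algorithm; the 3-concept matrix (with self-connections of 1) and the function names are hypothetical. The final step scales the eigenvector by the square root of the eigenvalue, as Note 3 describes, and multiplies the list by itself to obtain a rank-one approximation of the matrix. The percent-variance figure is not computed, because the exact convention is not spelled out in this section.

```python
import math

def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def principal_eigenpair(m, iterations=200):
    """Power iteration; assumes one eigenvalue is strictly largest in magnitude."""
    v = [1.0] * len(m)
    for _ in range(iterations):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v

# Hypothetical symmetric connection matrix for three concepts.
connections = [[1.0, 0.8, -0.6],
               [0.8, 1.0, -0.5],
               [-0.6, -0.5, 1.0]]

eigenvalue, activations = principal_eigenpair(connections)

# Scale by sqrt(eigenvalue), then multiply the list by itself (outer product).
scaled = [math.sqrt(eigenvalue) * a for a in activations]
approximation = [[a * b for b in scaled] for a in scaled]

print(round(eigenvalue, 3), [round(a, 3) for a in activations])
```

In this toy case the first two concepts share a sign and the third has the opposite sign, so the pattern would be read as a contrast between the first two concepts and the third.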
References

Anderson, J. A. (1995). An introduction to neural networks. Cambridge, MA: MIT Press.

Anderson, J. A., Silverstein, J. W., Ritz, S. A., & Jones, R. S. (1977). Distinctive features, categorical perception, and probability learning: Some applications of a neural model. Psychological Review, 84, 413-451.

Britton, B. K., & Eisenhart, F. J. (1993). Expertise, text coherence, and constraint satisfaction: Effects on harmony and settling rate. Proceedings of the Cognitive Science Society, 15, 266-211.

Britton, B. K., & Sorrells, R. (1998). Thinking about knowledge learned from instruction and experience: Two tests of a connectionist model. Discourse Processes, 25, 131-177.

Britton, B. K., & Tidwell, P. (1995). The cognitive structure testing system. In P. Nichols, S. Chipman, & R. Brennan (Eds.), Cognitively diagnostic assessment (pp. 251-278). Hillsdale, NJ: Lawrence Erlbaum Associates.

Jonassen, D., Beissner, K., & Yacci, M. (1993). Structural knowledge: Techniques for representing, conveying, and acquiring structural knowledge. Hillsdale, NJ: Lawrence Erlbaum Associates.
SECTION 4
Segmentation in text and text representation
So far, the cognitive processes of text production have hardly been in the focus of attention in this volume. However, in Chapter 12, Schilperoord reports the use of a promising research paradigm by analyzing pauses in dictations as traces of planning processes. He tries to find evidence for the claim that conceptual and linguistic planning processes are independent of each other: planning at the level of paragraphs is not related to planning at the word level, for instance. Interestingly, he finds that the clause level is intermediate, in that pauses at clause level turn out to be affected by pauses at the higher levels and at the same time affect lower levels. The discussion of the relative autonomy of linguistic processes in production is clearly related to Giora & Balaban's ideas on metaphor interpretation (Chapter 4). And although this chapter explicitly focuses on text production, the conception of text representation as dynamic and incremental in nature is very similar to that used in research on text understanding, as exemplified in Chapters 2 and 3, and in Chapter 11.

The final chapter of this book also deals with a discussion of levels within and above the clause. In the very first chapter of this volume, we gave a simple historical overview in which the sentence was replaced by the discourse as the primary object of study. Ever since that shift, it has been unclear what the units of analysis should be: when coherence between text segments is discussed, what segments are we talking about? Where other chapters have often taken the unit of analysis for granted, Verhagen takes up the issue of how to define these units. In Chapter 13 he notes that subordinated structures form a problem for discourse analysis in that their discourse function often is not reflected in their grammatical role.
This concerns very frequent types of constructions like 'to this should be added that his wife is seriously disabled', in which the central part of the information, and thus the proper unit of analysis, is syntactically embedded. He uses the notion 'conceptual independence' to account for this form-function discrepancy. Furthermore, he suggests that units should be analyzed as belonging to one of two message dimensions: coordination and content, notions which are closely related, albeit not identical, to notions like semantic/ideational and pragmatic/epistemic, which are prominently present in Section 2 of this volume.
CHAPTER 12
Conceptual and linguistic processes in text production
Interactive or autonomous?
Joost Schilperoord*
Katholieke Universiteit Brabant
1. Introduction
Despite their rich diversity, most of today's models of speaking or writing assume that conceptualizing and formulating are the essential processes underlying the production of text (see for example: Bereiter & Scardamalia 1987; Garrett 1980; Hayes & Flower 1980; Levelt 1983, 1989; Schilperoord 1996; Van der Pool 1995; Van Wijk 1987). Together, these processes result in a series of motor actions involved in speaking, typing or hand-writing. Language or text production starts as soon as the speaker/writer conceives of an intention to communicate particular information: s/he intends to assert X, or to ask Y, or to warn somebody of Z. Conceptualizing furthermore involves retrieving relevant information and ordering or structuring that information for expression. It also involves bookkeeping and monitoring. The speaker/writer will have to keep track of what s/he has said or written, or, in case of dialogues, what others have said, and s/he will monitor her/his own production.

The main goal of the formulating process is to convert conceptual structures into linguistic structures. This process involves grammatical encoding, that is, finding the appropriate words to express the message, and building syntactic constituents such as noun phrases or prepositional phrases. In addition, formulating involves phonological or, in the case of writing, graphemic encoding (cf. Van der Pool 1995), which results in a plan to articulate or to write down the utterance.

A major issue of language production theories concerns the question whether the processes of conceptualizing and formulating interact in the course of producing language. Different points of view on this matter have been
expressed. In Section 2, we will be more precise about what it means exactly to state that two processes interact, and how such interaction can be demonstrated empirically, so here it suffices just to give a rough impression of these points of view.

It seems that, in general, the implicit assumption underlying interactional models of text production, such as the Flower and Hayes model, is that the various components in fact interact. Bluntly speaking: the assumption seems to be that "if you get your ideas clear", then formulating them will take care of itself. Local operations, such as shaping a sentence form or choosing appropriate lexical forms, can thus be performed more fluently once the writer has formed a clear idea of what it is that he wants to say in a text. Such an assumption presents a clear case of interaction between processes.

On the other hand, according to modular models, such as Levelt's Blueprint model of speaking, interactions between different kinds of processes are ruled out. In fact, the very reason for distinguishing different kinds of processes is the fact that they perform distinct, non-interactive operations. For example, in Levelt's model, the component that handles linguistic processes is a specialist in converting conceptual structures into linguistic structures because it is the only component able to do that. Such criteria for distinguishing specialists express particular points of view as to what kind of theory of language production one wishes to develop. If the aim is to develop a modular theory rather than an interactional one, then the subprocess of grammatical encoding can be made computationally transparent to the extent that it can be considered an autonomous component. After all, if other types of information, such as conceptual information, are allowed to interfere with the syntactic tree-building process, there would no longer be any need to distinguish a distinct syntactic specialist.
Levelt's model thus assumes no interactions between the linguistic and the conceptual specialists.

A major concern of the present paper is to examine to what extent the implications of both kinds of models can be turned into an empirical affair. More in particular, we shall be concerned with production pauses as a possible source of evidence for either one of these two models. Later sections will provide a close look at the pause patterns that emerge in the course of producing language (where the scope of this paper will be restricted to the process of written text production). Since we cannot take an immediate look at what is happening in people's minds when they produce language, thus testing the implications of the models directly, we are in need of some behavioural correlate of the cognitive processes that underlie it. As I intend to demonstrate, pauses present us with such a behavioural correlate.
The research strategy adopted here is to select a set of 'external' variables of which it can be plausibly argued that they will be associated with variation in processes, and then see to what extent conceptual and linguistic processes respond differently to these variables (as testified by statistical analyses of pause patterns). For example, one such external variable is differences between texts. Obviously, texts will differ from each other on dimensions such as length, conceptual import, structural complexity and so on. Such differences will have a certain impact on the conceptual and linguistic processes underlying the text's coming about. Such impacts, in turn, can be assumed to correlate with particular variations in pausing patterns. The analytical challenge, then, is to see whether these pattern variations covary or not. If they do, this would imply interaction between processes; if they don't, this would imply no interaction.

Clearly, this research methodology can be employed successfully only to the extent that one is able to distinguish conceptual processing from linguistic processing on the basis of pause patterns. Inferences about the nature of the processes going on during a pause crucially refer to the location and the length of the pause (Schilperoord 1996). Although the relation between such variables on the one hand, and the type of processing signified on the other is loose, to say the least, some evidence can be found in the literature suggesting that conceptual processes mainly coincide with relatively long pauses, located at important structural transitions in texts, such as transitions between paragraphs, whereas linguistic processes mainly coincide with relatively short pauses at peripheral locations in texts, such as within-clause locations (e.g. Henderson, Skarbek & Goldman-Eisler 1966; Gee 1986; Schilperoord & Sanders 1997).
Hence, if it can be demonstrated that text-related differences (being one of the external explanatory variables) only affect pause times for between-paragraph pauses, but not pause times at peripheral locations (the dependent variables), then this would suggest no interactions between processes.

Three such 'external' explanatory variables will be tested. We start by analysing the effect of text-related differences on conceptual and linguistic processes. Then we will be concerned with the question how within-text fluctuations in conceptual processes influence processing at lower levels. And finally we shall explore how the variable time during the text production process correlates with both conceptual and lower-level processes.

The organization of the paper is as follows. Section 2 first addresses the interaction issue in a more principled vein, and then examines how differences in pause length and pause locations may be responsive to that issue. By way of introducing the empirical part of the paper, Section 3 contains general information on the distribution of pause time in (written) text production. In addition, the research materials will be introduced. Section 4 presents the results of various empirical analyses on pause time variances together with pause locations. Section 5 concludes the paper.

One final introductory remark: the data to be presented here are all restricted to the temporal organization of expert writing, and to a type of text production that is often referred to as the Knowledge Telling mode of composition (Bereiter & Scardamalia 1987). In writing-process research it is often, but wrongly, assumed that expert writing is, first and foremost, problem-solving writing. However, true expert writing is characterized by moving away from problem-solving towards knowledge telling. A true expert is an expert because s/he is able to produce complex texts just by telling what s/he already knows. That is, the expertise of expert writers is restricted to a set of textual genres for which they have at their disposal a fixed discourse model and a sufficiently structured mental knowledge base that allows them to retrieve information smoothly and with great ease, and to order that information according to a more or less fixed arrangement of content categories. As we are mainly interested in writing processes that go on easily and well, it seems wise to confine the research to be reported here to knowledge telling writing.
2. Interaction between processes
The question whether or not conceptual and linguistic processes interact in the course of text production can only be answered satisfactorily if we are able to distinguish conceptual and linguistic processing sources for pausing activity. As this issue has an immediate bearing on the empirical status of pausing data and on the range of possible theoretical interpretations thereof, it ought to be handled with great care. A suitable way of doing so, I think, is to try to be more precise about what it means to say that conceptual and linguistic processes interact. Since modular models of language production obviously express the strongest hypothesis as to the interaction issue, we shall address the matter from this point of view. The basic architecture of a processing module can be outlined as in Figure 1.

Figure 1. A processing module (figure labels: input, output)

A module is a set of production rules that operate 'blindly' on a particular kind of input, and deliver a characteristic kind of output. To perform its action, the module has access to a knowledge base. So if there is conceptual input [DOG], specified as 'count noun', 'singular' and 'already introduced in the discourse', then Levelt's lexico-syntactic module produces the phrase ((the) dog) without asking any questions, only consulting the mental lexicon, being the knowledge base it has access to.

Now, suppose we can distinguish between two processes A and B. Then a modular model would postulate two distinct modules A' and B' dedicated to performing these two kinds of processes. This implies that interaction between two processes A and B boils down to an interaction between the operations of the two modules A' and B'. On the other hand, modules can be said not to interact if, for any pair of modules, different answers have to be provided to the following question (cf. Levelt 1989, p. 16): what types of representations does a module accept as input and deliver as output? If both modules respond to the same kind of input, or produce similar output, then there is no longer any need to consider them distinct modules (they 'merge', so to speak, into one module A'+B').¹ Conceptually speaking, if, for example, it can be proved that choice of words affects the message one wants to express, then the formulator affects the inner workings of the conceptualizer, and hence they interact. So because in Levelt's model the formulator component is considered to be modular, it follows that any form of interaction between conceptual and linguistic processes is ruled out. Let us refer to this state of affairs as the
autonomy-thesis. How can this claim be validated empirically? Note that the module-criterion discussed does not allow the autonomy-thesis to be phrased directly in terms of pause patterning. In fact, Levelt argues that any pause, no matter its length or location, reflects processing that is under the speaker's executive control, allegedly requiring his conscious attention (cf. Levelt 1989, p. 126). Because Levelt's formulator module performs its job automatically, that is, outside the domain of executive control, not demanding any attentional resources, pauses can only be overt reflections of the inner operations of the
conceptualizer, and not of the formulator. Therefore any direct test of the autonomy-thesis based on pausological evidence is out of the question.

But what about the alternative type of model, that is, interactional models? The crucial feature of these models is that interaction between different kinds of processes is considered an essential property of the human language production system. Flower and Hayes' so-called multi-representation thesis explicitly claims that writers gradually convert abstract conceptual structures into increasingly more linguistic structures, the former correlating with larger plans for producing text, the latter (eventually) with sentences and parts of sentences (Flower & Hayes 1984). Such conversion operations are labelled 'planning' and 'translating' in their model, but no principled distinction in terms of input characteristics or knowledge bases is implied here. In fact, their model assumes that there is not a precise distinction to be made.² Let us call this the interaction-thesis.

The features of interactional models (the Flower and Hayes model being a paragon of such models) as discussed so far allow for a gradual distinction between what may be termed 'high' and 'low' level processing, 'central' and 'peripheral' processes, or 'mainly conceptual' and 'mainly linguistic'. Related dichotomies encountered in the literature are 'deep' versus 'shallow' processing (Just & Carpenter 1987), 'distal' versus 'local' planning (Kowal & O'Connell 1987), and 'molar' versus 'atomic' planning (Henderson 1974). In fact, interactional models hypothesize, for example, that thorough conceptual planning facilitates ease and fluency at the lower levels, i.e. at the level of translating concepts into language. Clearly, this state of affairs allows us to rephrase the theoretical implications of interactional models in terms of empirical hypotheses with regard to pause patterning. Various authors have pointed out that the aforementioned dichotomies indeed correlate with the distinction between, for example, long and short pauses, and with different pause locations. Gee (1986), for example, writes:

'In fact, there is now some evidence that the longest pauses of a text correlate well with important discourse breaks in the text. (...) For example, major transitions or breaks in the plot of a story tend to have longer pauses than more minor transitions or breaks.' (Gee 1986, p. 393; see also Butterworth 1975; Henderson, Goldman-Eisler & Skarbek 1966; Sanders, Janssen, Van der Pool, Schilperoord & Van Wijk 1996; Schilperoord & Sanders 1997; Schilperoord & Van der Pool 1997)
These studies thus suggest that temporal parameters can be taken as evidence
for the somewhat loosely implied levels of processing reflected by variances in length and/or location of pauses. The distinction between central and peripheral processes, for example, seems to go along with the distinction between conceptual and linguistic processes. Moreover, this processing distinction can be tracked down at predictable 'places' in texts. In sum, the available evidence suggests that variances in pause times are responsive to the distinction between central processes and peripheral processes, and that they correlate with textual locations: central processes occur at major breaks or major structural transitions in texts, whereas peripheral processes occur within sentences or clauses.

In what follows, the interaction-thesis will be put to the test. The external variables distinguished in Section 1 will be examined as to their impact on both conceptual and linguistic processes in terms of pause pattern variances. For each case, the hypothesis can be statistically rephrased as follows: if conceptual and linguistic processes indeed interact, then the pause pattern variances will covary to a certain extent. If such covariances cannot be produced in any significant way, the interaction-hypothesis must be considered refuted. Whether or not, in that case, Levelt's autonomy thesis stands out as a natural alternative for interactional models is an issue that we will be concerned with in the discussion section.
3. The temporal characteristics of written discourse production
This paper presents the results of a series of analyses of the distribution of pause time in written text production. In order to put the results of the analyses into perspective, it seems appropriate first to provide an impression of how pause time is distributed with respect to various textual locations, and how pause times vary with respect to such locations. We begin by briefly introducing the location types that were analysed (3.1). Then we introduce and describe the research materials that were used (3.2) and the data-analytical model that was constructed (3.3). The results of the analysis are presented and discussed in Section 3.4.

3.1 Locating pauses
Analysing the relation between the distribution of pause time variances and their locations in texts requires us to select a set of cognitively meaningful categories in order to tag pause locations. Psycholinguists have intensively
studied length differences in terms of lexical and syntactic locations of pauses. So, pause lengths have been analysed with respect to transition probabilities between individual words (Lounsbury 1954; Goldman-Eisler 1958), with respect to grammatical constituents (Maclay & Osgood 1959), clauses (Boomer 1965; Ford & Holmes 1978; Holmes 1988; Schilperoord 1997a, b) and sentences (Goldman-Eisler 1968).³ This tradition has been honoured in the present study: an analysis has been conducted of the distribution of pause time relative to words, phrases, clauses and sentences.

Other studies, however, went 'beyond' the sentence (Henderson et al. 1966; Butterworth 1975) and have studied hesitations in larger stretches of spontaneous monologues. These studies suggest the involvement of supra-clausal/sentential units in speech production, that is, units involving several sentences, to which the distribution of pause time is responsive. The question now is what textual categories correlate with such suprasentential units. In the present study, we have chosen to analyse the distribution of pause time relative to paragraph transitions, in addition to the units already mentioned: words, constituents, clauses and sentences. There is ample evidence suggesting the semantic and discourse structural integrity of paragraphs (for example: Ariel 1988; Chafe 1987, 1994; Gee 1986; Longacre 1979). This may be taken as a measure of their psychological significance in text planning. We shall briefly recapitulate these studies.

Gee, in his case study of the production of narrative discourse, finds that important discourse breaks coincided predominantly with longer pauses (Gee 1986, p. 393). In another case study of narrative production, Chafe reports that his narrator paused longer at places at which '(...) significant changes [took place] in scene, time, character configuration, event structure and the like' (Chafe 1987, p. 42).
According to Chafe, such locations are naturally 'associated with a division into "paragraphs"' (ibid., p. 42). Additional evidence for the integrity of paragraphs is provided by Longacre (1979), who argues that '(...) there is good evidence in many languages of the world for paragraph closure (e.g., features of beginning and end) and paragraph unity (...)' (ibid., p. 116). Apart from characteristic ways of closing paragraphs, an important structural (or 'grammatical', in Longacre's own words) element that he mentions is the thematic unity of paragraphs. Finally, Ariel's study on the distribution of referring expressions, such as indefinite descriptions and pronouns, shows the use of these expressions to correlate with the current accessibility status of discourse referents. Full nouns, for example, signal low accessibility of a referent, whereas pronouns or zero-anaphora signal high accessibility. The relevant finding of Ariel concerns the distribution of such referential codes relative to paragraph boundaries:

'(...) we find that initial paragraph position of texts, even ones whose discourse topics have long been established, tends to reintroduce the topic in a full NP form anyway, rather than in pronominal form.' (Ariel 1988, p. 72)
Apparently, memory scope is crucially related to paragraph units. Closing signals for paragraphs result in the release of information from Working Memory, thus necessitating the reintroduction of already established referents. Taken together, these results suggest the psychological reality of paragraphs. Hence, choosing the paragraph as the written counterpart of 'suprasentential units of planning' appears to be justified on plausible grounds. Therefore, pauses in our corpus were tagged for their syntactic position (within phrases, between phrases, between clauses, and between sentences), and also for their position with respect to paragraphs.

3.2 Materials
A corpus of 120 text production processes was sampled. Recordings were made of six professional Dutch lawyers using dictation equipment. Twenty routine texts (mostly letters) were selected from each subject. All texts were 'real life' and have actually been used to serve communicative purposes: to inform clients and colleagues about recent developments in ordinary legal cases. In order to guarantee the real life nature of research materials, dictation processes were recorded in the lawyers' offices at the time subjects would also normally have produced their letters. Recorded materials were transcribed verbatim, including errors, repairs and the like. The software program GIPOS for speech analysis was used for the detection and measurement of pauses. A pause was defined as a silence in the speech stream lasting at least 250 ms. In total, 7820 pauses were sampled, distributed across six subjects, 120 texts and about 23,000 words.

A data base was constructed according to the following procedure. We started by tagging all transitions between words in terms of the location types distinguished. So, each location was labelled either 'between words', 'between phrases', 'between clauses', 'between sentences' or 'between paragraphs', the categories being top-down inclusive. Furthermore, it was determined for each transition whether or not a pause coincided with it. If it did not, this transition was labelled as a 'zero-pause'; if it did, the transition was labelled in terms of the duration of the accompanying pause. This procedure
guarantees that, in estimating the average duration of pauses at a certain location, the effect of zero-pauses can be calculated.⁴ The procedure also allows us to estimate the frequencies of pauses at the distinct location types (the results of these estimations are left out of the present paper as they bear no relevance to the issue at stake here, but see Schilperoord 1996).

3.3 Data model
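Before turning to the model itself, the tagging procedure of Section 3.2 can be sketched as follows. This is a minimal illustration, not the procedure actually implemented with GIPOS: the function name, the data layout (one pair of location type and silence duration per word-to-word transition) and the example values are all assumptions made for the sketch. Only the 250 ms threshold and the five location labels come from the text.

```python
from collections import defaultdict

MIN_PAUSE_MS = 250  # silence threshold stated in the Materials section

def mean_pause_by_location(transitions, include_zero_pauses=True):
    """Average pause duration (ms) per location type.

    `transitions` holds (location_type, silence_ms) pairs, one for every
    transition between two words. Silences shorter than MIN_PAUSE_MS are
    treated as zero-pauses.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for location, silence_ms in transitions:
        duration = silence_ms if silence_ms >= MIN_PAUSE_MS else 0
        if duration == 0 and not include_zero_pauses:
            continue  # average over actual pauses only
        totals[location] += duration
        counts[location] += 1
    return {location: totals[location] / counts[location] for location in counts}

# Invented example data (hypothetical, not taken from the corpus).
example_transitions = [
    ("between words", 0), ("between words", 400), ("between words", 100),
    ("between phrases", 0), ("between phrases", 600),
    ("between clauses", 1200),
    ("between sentences", 2500),
    ("between paragraphs", 8000),
]
with_zeros = mean_pause_by_location(example_transitions)
pauses_only = mean_pause_by_location(example_transitions, include_zero_pauses=False)
print(with_zeros["between words"], pauses_only["between words"])
```

Comparing the two results shows why every transition is tagged: averages computed with and without zero-pauses can differ considerably at locations where pausing is infrequent.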
The analyses aimed at estimating the average pause length for each location type, and pause time variances with respect to the distinct locations. In fact, the present study makes intensive use of these fluctuations in pause length relative to the mean length. Separate estimates were made for within-text and between-text variances. To explain why, consider pauses at paragraph transitions. Obviously, we can estimate the overall mean $\bar{X}_{par}$ for the entire corpus. However, pauses between paragraphs within the same text will vary relative to $\bar{X}_{par}$. This is called within-text variance. In addition, the mean length of these subsets of pauses varies from text to text. This is called between-text variance. As we will see later on, analysing these different sources of variance is an important research tool to learn more about how conceptual and linguistic processes interact in text production. By using a Multi-level program for the analysis of variance, we were able to estimate overall means, within-text variances and between-text variances simultaneously. The data model that was used is given in (1).

$$
\begin{aligned}
Y_{ijk} ={}& \beta_1 PAR_{ijk} + \beta_2 SEN_{ijk} + \beta_3 CLA_{ijk} + \beta_4 CON_{ijk} + \beta_5 WOR_{ijk} \\
&+ \big(\varepsilon_{1ijk} PAR_{ijk} + \varepsilon_{2ijk} SEN_{ijk} + \varepsilon_{3ijk} CLA_{ijk} + \varepsilon_{4ijk} CON_{ijk} + \varepsilon_{5ijk} WOR_{ijk} \\
&\quad + \mu_{10jk} PAR_{ijk} + \mu_{20jk} SEN_{ijk} + \mu_{30jk} CLA_{ijk} + \mu_{40jk} CON_{ijk} + \mu_{50jk} WOR_{ijk}\big)
\end{aligned}
\tag{1}
$$
The variable $Y_{ijk}$ represents a pause at location type $i$ in the $j$-th text of the $k$-th subject. So, model (1) allows us, for each individual pause in the corpus, to estimate the overall mean for pauses at each location type (the so-called fixed parameters $\beta_1$ to $\beta_5$), and both within-text and between-text variances for pauses at each location type (the so-called random parameters, given between parentheses). The random parameters denote location-specific residuals ($\varepsilon_{ijk}$) and text-specific residuals ($\mu_{0jk}$). The first random parameter allows us to estimate variances for all location types (the so-called level 1 variances). In classical test theory this term indicates measurement error and it is assumed that $\varepsilon_{ijk} \sim N(0, \sigma^2_{\varepsilon})$. The second random parameter allows us to estimate text-specific variances for each location type (the so-called level 2 variances).

3.4 Results and discussion
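To make the distinction between within-text and between-text variance concrete, the following sketch splits the variance of pauses at one location type into the two components. It is a simple descriptive decomposition, offered as an illustration only: it is not the multi-level estimation used for model (1), and the function name and the example durations are hypothetical.

```python
def variance_components(pauses_by_text):
    """Split pause-time variance at one location type into two parts.

    `pauses_by_text` maps a text identifier to the pause durations (ms)
    observed at that location type within the text. Returns the grand
    mean, the within-text variance and the between-text variance, each
    variance being averaged over all pauses.
    """
    all_pauses = [p for pauses in pauses_by_text.values() for p in pauses]
    n = len(all_pauses)
    grand_mean = sum(all_pauses) / n
    within = 0.0
    between = 0.0
    for pauses in pauses_by_text.values():
        text_mean = sum(pauses) / len(pauses)
        # Deviations of individual pauses from their own text mean.
        within += sum((p - text_mean) ** 2 for p in pauses)
        # Deviation of the text mean from the overall mean, weighted by size.
        between += len(pauses) * (text_mean - grand_mean) ** 2
    return grand_mean, within / n, between / n

# Invented between-paragraph pauses (ms) for three hypothetical texts.
paragraph_pauses = {
    "text_a": [6000, 8000],
    "text_b": [9000, 11000],
    "text_c": [7000, 13000],
}
grand_mean, within_var, between_var = variance_components(paragraph_pauses)
print(grand_mean, within_var, between_var)
```

In this decomposition the two components add up to the total variance of all pauses around the grand mean, which is what allows the contribution of text-specific differences to be expressed as a share of the whole.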
Table 1 shows the results of the estimates of the mean length of pauses at the distinct location types and the within-text and between-text variances.

Table 1. Estimates of mean times of pauses at five location types, estimates of within text and between text variances (ms).

                                 estimates
locations       fixed effects*   within texts   between texts
paragraphs      8472             19240          12050
sentences       2676             4673           406
clauses         1336             1359           162
constituents    1185             676            019
words           1046             468            003

* chi-square for simultaneous contrasts = 212.5, df = 5, p < .001
Table 1 shows that, on average, the longest pauses occur between paragraphs, then between sentences, then between clauses, and so on. These results thus suggest that pause times differ systematically across the selected location types, all differences reaching the level of statistical significance. Furthermore, pause time variances, represented in the right-hand columns in Table 1, show that there are indeed considerable deviations from the estimated averages, both within and between texts, although between-text differences gradually lose their impact on low-level locations. In the remainder of this paper, these residual estimates will be of special concern. For now it suffices to note that the five selected location types account for over 50% of the observed variance in pause times.

The clear-cut correlation found between a pause's location and its duration warrants the assumption that levels of processing can indeed be inferred from the data. Granting the assumptions discussed in Section 2, conceptual processes appear to take place predominantly at the level of paragraphs, and perhaps at the level of sentences, indicating that text producers tend to engage in conceptual processing especially before a new paragraph and to a somewhat lesser extent before a new sentence. Linguistic processes, such as grammatical shaping and lexical choice, appear to occur mainly at the 'lower' levels of
clauses, constituents and words, indicating that decisions as to syntactic shaping and wording are being made especially before clauses, constituents and words. This is not to imply that we can, for each distinct location, infer a uniquely determined kind of process. It may be that planning at the sentence level differs from planning at the paragraph level in terms of the range, or the level of conceptual abstraction. For example, paragraphs may be planned as to their 'gist' whereas sentences are planned in more conceptual detail. But this is of course mere speculation, as such differences cannot be discerned directly from differences in pause length. So, in conclusion, only a gross distinction between conceptual and linguistic processes appears to be observable from the data.
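The remark that between-text differences lose their impact at lower-level locations can be made visible by expressing between-text variance as a share of the total (within plus between) variance per location type. The sketch below does this for the variance columns of Table 1. It rests on two assumptions: that the printed figures align with the five rows in the order used here, and that both columns are given in the same units, so that the printed digits can be treated as integers without affecting the ratio. The variable names are hypothetical.

```python
# Variance estimates as read from Table 1 (assumed row alignment and
# assumed common units for both columns; the printed '019' and '003'
# are read as 19 and 3).
within_text = {
    "paragraphs": 19240, "sentences": 4673, "clauses": 1359,
    "constituents": 676, "words": 468,
}
between_text = {
    "paragraphs": 12050, "sentences": 406, "clauses": 162,
    "constituents": 19, "words": 3,
}

# Share of the total variance at each location type that lies between texts.
between_share = {
    location: between_text[location] / (within_text[location] + between_text[location])
    for location in within_text
}

for location, share in between_share.items():
    print(f"{location:<13} {share:.3f}")
```

Under these assumptions the between-text share is largest by far for paragraph transitions and close to zero for word transitions, in line with the pattern described above.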
4. Interactions between conceptual and linguistic processes in text production

4.1 Introduction

We now turn attention to the main question in this paper, concerning the interaction between conceptual and linguistic processes in text production. Section 2 addressed the interaction-issue from a theoretical point of view. Here, we start by making explicit what the presence of interaction would stand for empirically, notably, in terms of pause patterns. Consider the data model presented in Figure 2.

Figure 2. Data model
The box in the middle of Figure 2 represents a simplified version of a production model in which only conceptual processes (CP) and linguistic processes (LP) are distinguished. The left hand side depicts a 'production-external' explanatory variable X. For X it can be argued that, in some way or another, it will have some effect on the performance of both CP and LP, so that (2) can be assumed.
(2)  $X \rightarrow CP$
     $X \rightarrow LP$
In addition, these effects of X on CP and LP will be reflected by particular pause time parameters, for example, means and variances. Let us refer to these as $PP_{cp}$ and $PP_{lp}$; the entire system of relationships can then be given as (3).
(3)  $X \rightarrow CP \rightarrow PP_{cp}$
     $X \rightarrow LP \rightarrow PP_{lp}$
Since CP and LP are covert processes, the analyses to be reported will focus on exploring relationships (4).
(4)  $X \rightarrow (\ldots) \rightarrow PP_{cp}$
     $X \rightarrow (\ldots) \rightarrow PP_{lp}$
As a first approximation of 'interactions between CP and LP', we may say that there are no interactions if there is no way of predicting $PP_{cp}$ on the basis of $PP_{lp}$ or vice versa. In other words, interactions are assumed to be present if (4) can be rewritten as the covariation model (5).
(5)  $X \rightarrow (\ldots) \rightarrow PP_{cp} \leftrightarrow PP_{lp}$
The assumption to be tested is that conceptual and linguistic processes run more or less interactively; (5) therefore represents the hypothesis to be tested. The general structure of the next three subsections will be, first, to introduce X as a relevant external variable that has alleged effects on CP and on LP, then to specify how model (4) can be cast in terms of X and in terms of hypothesized pause patterns, and, finally, to test the hypothesis.
4.2 Differences between texts
The first external variable tested is differences between texts. Such differences may be of various sorts, such as conceptual complexity or text length, but for now their precise nature need not concern us, as long as it is reasonable to assume that they exist and may indeed have a certain impact on cognitive processes, especially at the conceptual level. Therefore, we hypothesize that, if conceptual and linguistic processes indeed interact, conceptual difficulties are solved at the level of conceptual processing but will have marked effects on 'lower' levels of processing as well. In terms of pause
patterns, we expect text-related differences to influence pause time variances at the level of paragraphs, but also at lower levels. To test such a hypothesis, we must gain insight into the amount of pause time variance for all distinct location types that can be attributed to between-text factors. In other words, the question is to what extent differences between texts carry over to pause time variances for the various location types. In terms of the level 1 and level 2 sources of variance discussed in Section 3, the analytical question thus boils down to estimating the relative contribution of level 2 variance to the overall variance in pause time (level 1 + level 2) at all distinct locations. One way of obtaining such information is to estimate the so-called intra-class correlation coefficient, called $\rho$ (cf. Cochran 1977). The factor $\rho$ can be estimated by calculating, for each location type, the ratio of the between-text residues to the sum of the within-text and between-text residues (see (6)).
(6)  $\rho_x = \dfrac{u_{0jk} \cdot x_{ijk}}{e_{ijk} \cdot x_{ijk} + u_{0jk} \cdot x_{ijk}}$
Equation (6) thus depicts the level 2 / (level 1 + level 2) ratio for each pause $x_{ijk}$. Both the level 1 and level 2 estimates were already given in Table 1. The interaction hypothesis predicts the intra-class coefficients for paragraphs, sentences, clauses, constituents and words to be more or less similar. Applying (6) to these estimates results in the estimates shown in Table 2.

Table 2. $\rho$-estimates for location types

Location        Estimate
paragraphs      .385
sentences       .079
clauses         .106
constituents    .028
words           .005
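The ratio behind these estimates can be illustrated with a minimal sketch. It assumes that the intra-class coefficient is simply the between-text (level 2) variance divided by the sum of the within-text (level 1) and between-text variances; the function name and the two input values are hypothetical and are not the estimates of Table 1.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Share of the total variance that lies between texts (level 2)."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Hypothetical variance components, chosen only to make the arithmetic visible.
example = intraclass_correlation(between_var=2.0, within_var=6.0)
print(round(example, 3))  # 0.25
```

A coefficient near zero, as reported for constituents and words, would mean that hardly any of the pause time variance at that location type is tied to the text in which the pause occurs.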
Table 2 shows the influence of text-related differences to be clearly present at the paragraph level (.385). Hence, differences between texts have marked effects on the pause time variances at paragraph transitions. Lower, but still meaningful $\rho$-coefficients were obtained for transitions between sentences and clauses, indicating that these levels are also sensitive to differences between texts. On the other hand, there is almost no effect on pause time variance for the 'lower' textual categories 'constituents' and 'words'. In other words, $PP_{lp}$ cannot be predicted on the basis of $PP_{cp}$. Therefore it must be concluded that the intra-class data do not provide any support for the covariance model (5).
Granting that location types mark the difference between 'conceptual' and 'linguistic' processing, these results thus suggest that, from a temporal point of view, linguistic processing goes on independently of text-specific factors such as text length or complexity, which contradicts the interaction hypothesis. The clear break between the level of clauses on the one hand and constituents on the other indicates that text-related problems are solved primarily at the paragraph level, and to a somewhat lesser extent at the sentence and clause levels, but that they leave processing at the constituent and word levels unaffected. At this stage, however, there is not much of a basis for being any more specific about what goes on at the intermediate levels of sentences and clauses. This issue will be addressed in more detail in the next section.
4.3 Interactions between conceptual and linguistic processes within texts
The previous section showed how pause times at different levels vary along with between-text differences. This section considers a second kind of external factor: how do pause times at different levels covary or correlate with each other within texts? Let us refer to these possible covariations as inter-level relations between various levels of processing. There are three logical possibilities as to how pauses at different locations may covary or correlate: they may correlate positively, negatively, or there may be no significant correlation at all. Let us try to cast these possibilities in processing terms. The first possibility entails that if, relative to the average length for some superordinate level $L_x$, pause time at $L_x$ increases, so will pause times at the subordinate levels $L_{x-1}, \ldots, L_{x-n}$. If a text producer uses more time than average to plan a paragraph, he will also use more time to plan the sentences within that paragraph, the clauses within those sentences, and so on. The second possibility represents the reverse situation: if, relative to the average length, pause times increase at level $L_x$, pause times will decrease at the subordinate levels $L_{x-1}, \ldots, L_{x-n}$. If a paragraph is thoroughly planned, there will be less need to plan the constituent sentences, and so on. So these first two possibilities imply interactions between conceptual and linguistic planning, for in both cases some aspect of $PP_{lp}$ (its variance) is predictable on the basis of the same aspect of $PP_{cp}$. The third logical possibility entails that there is a zero or non-significant correlation between variances in pause times at levels $L_x$ and $L_{x-1}, \ldots, L_{x-n}$: deviations from the fixed parameters for different levels are randomly distributed. Cast in processing terms: planning load at the paragraph level may vary, but such variances will have no predictable effect on planning load at lower levels. This statistical possibility is assumed to be the
counterpart of the interaction-hypothesis which predicts either positive or negative correlations. To test the hypothesis, we first need to estimate the inter-level covariances/correlations. Covariances between each pair of levels X and Y can be estimated by using equation (7).
(7)  $\mathrm{Cov}_{xy} = \sum (x_{0jk} - x_{00k})(y_{0jk} - y_{00k}) / n$
Equation (7) states that covariances can be estimated by subtracting the overall means for pause times at levels X and Y from the text-specific means for the same levels for all 120 texts, multiplying the differences, summing the products, and finally dividing the sum by the number of observations (i.e. 120). Correlations between X and Y can then be estimated by using equation (8).
(8)  $\mathrm{Cor}_{xy} = \mathrm{Cov}_{xy} / (\sqrt{\mathrm{Var}_x} \cdot \sqrt{\mathrm{Var}_y})$
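The two estimation steps can be sketched as follows. This is a minimal illustration, assuming that (7) divides the summed cross-products by the number of texts and that (8) is the usual product-moment form (covariance divided by the product of the two standard deviations). The function names and the four text-specific means per level are hypothetical; the study itself used 120 texts.

```python
import math


def covariance(xs, ys):
    """Mean product of deviations from the overall means (divides by n)."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("need two equally long, non-empty sequences")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance scaled by the standard deviations of both levels."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical text-specific mean pause times (seconds) for four texts.
paragraph_means = [4.0, 6.0, 5.0, 9.0]
sentence_means = [2.0, 3.0, 2.5, 4.5]

print(covariance(paragraph_means, sentence_means))
print(correlation(paragraph_means, sentence_means))
```

In this made-up example the sentence means are exactly half the paragraph means, so the correlation comes out at its maximum; real data would of course be noisier.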
Table 3 summarizes the results of the covariance/correlation analysis.

Table 3. Correlations and covariances between levels of processing. Each cell lists the covariance, its standard error (in brackets), and the correlation. In order to be significant, covariances must be at least 1.96 times as large as the associated standard error estimates.

        sen                cla                con                wor
par     1.77 (.43) .801    .978 (.25) .701    .201 (.074) .418   .025 (.035) .145
sen                        .241 (.053) .942   .071 (.016) .808   .014 (.007) .467
cla                                           .162 (.039) .825   .010 (.004) .497
con                                                              .007 (.002) 1.00
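The significance criterion stated in the caption can be checked mechanically. The sketch below is only an illustration: the covariances and standard errors are the values as read from Table 3, while the dictionary layout and the function name are our own.

```python
# Level pair -> (covariance, standard error), as read from Table 3.
TABLE_3 = {
    ("par", "sen"): (1.77, 0.43),
    ("par", "cla"): (0.978, 0.25),
    ("par", "con"): (0.201, 0.074),
    ("par", "wor"): (0.025, 0.035),
    ("sen", "cla"): (0.241, 0.053),
    ("sen", "con"): (0.071, 0.016),
    ("sen", "wor"): (0.014, 0.007),
    ("cla", "con"): (0.162, 0.039),
    ("cla", "wor"): (0.010, 0.004),
    ("con", "wor"): (0.007, 0.002),
}

CRITICAL_RATIO = 1.96


def is_significant(cov: float, se: float, critical: float = CRITICAL_RATIO) -> bool:
    """A covariance counts as significant if it is at least 1.96 standard errors large."""
    return abs(cov) / se >= critical


non_significant = [pair for pair, (cov, se) in TABLE_3.items()
                   if not is_significant(cov, se)]
print(non_significant)  # [('par', 'wor')]
```

On these values only the most remote pair, paragraphs and words, falls short of the criterion.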
Note first that only positive correlations are found for each pair of levels X and Y, no matter how 'far' apart these levels are. For example, variances at the level of paragraphs correlate positively, although only slightly (.145), with fluctuations at the word level. Generally speaking, if pause times at some superordinate level increase with respect to the overall mean, so will pause times at all subordinate levels. Correlations increase as the levels under consideration become closer to each other. Second, Table 3 shows correlations to be highest at adjacent levels of processing; the matrix therefore represents a so-called
simplex structure (Schilperoord 1996). However, although positive correlations between all levels have been obtained, it cannot be immediately concluded that they support the interaction hypothesis. In order to test this hypothesis, we have to explore more precisely the amount of variance at lower levels that can be explained on the basis of variances at higher levels. To see why, consider a simplified model of processing with only three levels, A, B and C. Let it further be assumed that conceptual processes take place at level A, and linguistic processes at level C, and that level B represents some intermediate level of processing. If conceptual processes and linguistic processes did indeed interact, then the interaction hypothesis predicts a 'jump' between levels in that processing at level A affects processing at level C, regardless of what happens at level B. This would be the case if it can be shown that a considerable amount of level-C variance can be explained by level-A variance. If, on the other hand, conceptual processes have no impact on linguistic processes, the relative contribution of A-variances in explaining C-variances will be insignificant. This would indicate that the impact of A-level processes on lower levels is subdued, so to say, at the intermediate level. Hence, the following two models were tested:
Model (1): A + B → C
Model (2): B → C
Model (1) represents the statistical counterpart of the interaction hypothesis, whereas model (2) represents the autonomy hypothesis. Analysing these models can be done by constructing a path-model which specifies the variables of interest (that is, pause time variances at five location types) and all possible relations that may hold between them.
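The contrast between the two models can be made concrete with a small calculation. The sketch below is not the maximum likelihood path analysis reported in this paper; it merely applies the textbook formula for the squared multiple correlation with two predictors. Taking paragraphs as A, sentences as B and clauses as C, with the correlations as read from Table 3, is an illustrative choice, and the function names are hypothetical.

```python
def r2_one_predictor(r_bc: float) -> float:
    """Share of C-variance explained by B alone, as in Model (2)."""
    return r_bc ** 2


def r2_two_predictors(r_ab: float, r_ac: float, r_bc: float) -> float:
    """Squared multiple correlation of C on A and B together, as in Model (1)."""
    if abs(r_ab) >= 1.0:
        raise ValueError("A and B must not be perfectly correlated")
    return (r_ac ** 2 + r_bc ** 2 - 2 * r_ac * r_bc * r_ab) / (1 - r_ab ** 2)


# Correlations as read from Table 3: A = paragraphs, B = sentences, C = clauses.
R_AB, R_AC, R_BC = 0.801, 0.701, 0.942

model_2 = r2_one_predictor(R_BC)
model_1 = r2_two_predictors(R_AB, R_AC, R_BC)
gain = model_1 - model_2

print(f"Model (2) B -> C:     R^2 = {model_2:.3f}")
print(f"Model (1) A + B -> C: R^2 = {model_1:.3f}")
print(f"Gain from adding A:   {gain:.3f}")
```

For this triple the gain from adding level A is small, which is the pattern one would expect if the influence of A on C is largely channelled through B.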
However, both the simplex structure shown in Table 3 and the theoretical considerations given so far indicate that only the maximally restrictive model should be analysed, that is, the model that allows significant relations to exist only between adjacent levels of processing.5 The statistical test itself proceeds by estimating the so-called maximum likelihood parameters, i.e. the set of regression weights holding between the variables. The test aims at fitting the hypothesized model to the data (i.e. the covariance matrix shown in Table 3). If differences between the model and the observed data are small, the model is said to fit the data. The fit of the model is represented as a goodness-of-fit index (gfi), expressing the amount of explained covariance. Obviously, since the minimally restrictive model will fit the data completely (because all possible relations between levels are allowed), the
analytical question boils down to an estimation of the maximally restricted set of parameters which nevertheless fits the observed scores reasonably well. Therefore, we started by analysing a model with only 4 regression parameters. The results of the data-fitting test are presented in Table 4.

Table 4. Estimates of model parameters for adjacent levels $X_n$, $X_{n-1}$, and residual covariances

Estimate            par    sen    cla    con    wor
$\beta_{x,x-1}$            .708   .891   .683   .732
$\psi_x$            1.0    .498   .205   .533   .478

gfi = .985
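How the residual estimates translate into explained variance can be shown with a few lines of arithmetic. The sketch assumes standardized variables with unit total variance (as the residual of 1.0 for paragraphs suggests), so that the explained share at a level is one minus its residual. The comparison with the squared regression weight is our own consistency check and is not part of the reported analysis; the data layout and function name are hypothetical.

```python
# Level -> (regression weight from the adjacent higher level, residual variance),
# as read from Table 4. Paragraphs are omitted: they have no higher level.
TABLE_4 = {
    "sen": (0.708, 0.498),
    "cla": (0.891, 0.205),
    "con": (0.683, 0.533),
    "wor": (0.732, 0.478),
}


def explained_share(residual: float) -> float:
    """Explained variance under the unit-variance assumption."""
    return 1.0 - residual


for level, (beta, psi) in TABLE_4.items():
    print(f"{level}: explained = {explained_share(psi):.3f}, "
          f"squared weight = {beta ** 2:.3f}")
```

Under this assumption roughly four fifths of the clause-level variance is accounted for by the adjacent sentence level, in line with the reading given in the text.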
The gfi estimate in Table 4 indicates that the maximally restricted model fits the data extremely well, as it accounts for .985 of the covariance between location types. Enlarging the model with six additional parameters only adds an extra .015 explanatory power to the gfi, which is next to negligible. Let us therefore concentrate on the regression weights $\beta_{x,x-1}$. As the processing direction can be assumed to be top-down,6 the variance at the paragraph level is entirely determined by the residual term $\psi_x$. The regression weights $\beta_{x,x-1}$ (with x = 2, ..., 5) express the strength of the connections between estimates for all pairs of adjacent levels. Residual variances for all levels are indicated by $\psi_x$ (with x = 2, ..., 5). These parameters express the amount of unexplained variance at each level, granted the values for the regression weights. As Table 4 shows, the variance at the clause level is particularly well explained, as there is only .205 residual variance (hence, only 21% remains unaccounted for). Taken together, the regression weights and the residual parameters allow us to estimate the strength of remote connections by applying equation (9).
(9)  $\beta_{x,x-n} = \beta_{n-1,n} \cdot \psi_n \cdot \beta_{x,n-1} \cdot \psi_{n-1}$
Equation (9) allows us to estimate the amount of explained variance at, say, the level of constituents by considering the joint variances at the level of paragraphs and sentences, and to contrast this estimate with the amount of variance at the constituent level that is explained by the intermediate level of clauses. It then turns out that paragraph and sentence variances only account for 10% of the variance at the constituent level, whereas clause variances account for over 36% of the constituent variances.7 Phrased in terms of processing, what happens at the conceptual level (paragraphs, sentences) does not tell
us much about what happens at the lower, linguistic levels. In other words, processing at these levels seems to go on relatively independently, which can be taken to support the no-interaction hypothesis. However, it would be wrong to assume a sudden switch between levels of processing. As the data indicate, the level of clauses seems to occupy an 'intermediate' level of processing. The clause may constitute such an intermediate level because its structure interacts with the process of assembling conceptual fragments, whereas at the same time it provides increasing constraints on linguistic processing as the process proceeds. Therefore, it represents a level of processing 'below' which influences of conceptual processes are subdued, so to speak, that is, no longer present at the levels of linguistic processing. Hence, the correlation and regression data tested here argue strongly against the validity of the interaction hypothesis. As it seems, processing at the conceptual level goes on more or less independently of processing at the 'lower' linguistic level. The data imply an 'adjacency model' of processing: no 'jumps' across levels occur. Moreover, it seems that the level of clauses acts as an intermediate level 'between' conceptual and linguistic processes. Processing at this level is affected by higher-level processes, and affects lower-level processes (see also Schilperoord 1997a).
4.4 The time-dependent nature of conceptual and linguistic processes
This section considers yet another factor accounting for pause time variances: the variable 'processing time'. In research on writing and discourse production, the variable 'time' has long been neglected as a source of variance in processing. Only recently has its importance been acknowledged (Van den Bergh & Rijlaarsdam 1999; Schilperoord 1995, 1996). Van den Bergh and Rijlaarsdam mention two ways in which the variable 'time' might affect our current theoretical view on text production. First, they stress that cognitive activities might serve widely different purposes if occurring at different points in (real) writing time. Furthermore, they have shown that the relationship between cognitive activities and text quality is not fixed, but instead time-dependent. A process such as 'reading a writing assignment' correlates positively with text quality if it occurs during the early stages in writing, but negatively if it occurs during later stages. Although such a finding runs the risk of belabouring the obvious, it does quite nicely illustrate the relevance of taking the time dimension seriously. Moreover, this approach has revealed time-dependent patterns in cognitive processing that are less obvious.
The real-time approach taken here deviates slightly from that of Van den Bergh and Rijlaarsdam in that we only take into account pause patterns at various levels of processing, whereas in the study referred to above, the database consisted of tagged thinking-aloud protocols. The strategy will be to find out how the variable 'time' might affect processes at different levels, and how such effects might differ from each other. If 'time' is the variable X of model (4) (see Section 4.1), what effects on conceptual and linguistic processes can then be expected, and how can these effects be tracked? The crucial assumption behind the time approach to the study of text production is that in the course of production the cognitive 'circumstances' change gradually as more text has been produced. This might lead to different purposes being served by otherwise identical processes, but also to varying interactions between different processes. How does this play out in the case of routine letter production? Arguably, routine letters are called 'routine' because different exemplars share much of their contents with each other. This will be the case especially at the beginnings and endings of such letters. These parts often contain formulaic language, as they contain standardized opening and closing statements, and fixed pragmatically oriented formulae. What makes a letter unique, on the other hand, is its central part, the part that contains the actual message to the addressee and that constitutes the ultimate reason for the writer to produce the letter. Modelled as a function of production time, we then see an alternation between 'fixed' and 'new' parts within these letters. If the above assumptions hold, this might have an impact at the level of conceptual processing, but not necessarily at the level of linguistic processing.
For example, we might expect pause times at paragraph junctures to increase towards the central letter parts, and then to decrease towards the end of the letter. Such an assumption is based on the idea that producing the 'new' parts of letters will be cognitively more effortful than producing the more or less fixed parts. However, the preceding analyses have pointed out that such effects do not carry over to the 'lower', linguistic levels of processing. Content-related problems are mainly solved at the 'upper' levels and they do not significantly affect lower-level processing (see Section 4.3). So, if some way can be found to model the influence of the 'time' variable, we expect time-bounded influences to be present at the conceptual level, but not at the linguistic level. That is, time-bounded influences at the linguistic level cannot be predicted on the basis of time-bounded influences at the conceptual level. On the other hand, the interaction hypothesis would, somehow, predict
time-bounded effects at all levels to correlate. Although the aforementioned considerations, together with the results of previous analyses, seem to provide good reasons not to assume such correlations, they may in fact appear after all. That is, it may be the case that time-bounded effects at the level of conceptual processes allow us to predict the nature of such effects at the lower levels. The hypothesis tested here therefore specifically entails predictable effects of the variable time to be present at level X, given the observed effects at some superordinate level Y. In order to test this time version of the interaction hypothesis, for each pause the number of words was counted that was produced in one spurt immediately following that pause. These counts were subsequently combined with the location of the pause, the latter indicating the level of processing. These so-called p(ause)/a(rticulation) rates will undoubtedly display considerable variation, and the analytical enterprise now is:
1. to see to what extent such variances can be explained away by the variable 'time';
2. to see whether these variances covary (if indeed they do, this would support the interaction hypothesis).
We started by estimating the p/a variances that occurred between paragraphs, between sentences, between clauses, between constituents and between words respectively. Figure 3 shows the results.
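The construction of the p/a rates can be sketched as follows. The text does not give the exact formula, so the sketch assumes that a p/a rate is the pause duration divided by the number of words in the spurt that follows the pause; the function names and the four observations are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance


def pa_ratio(pause_seconds: float, words_in_spurt: int) -> float:
    """Pause time per word of the spurt that follows the pause (assumed definition)."""
    if words_in_spurt <= 0:
        raise ValueError("a spurt must contain at least one word")
    return pause_seconds / words_in_spurt


def summarize(observations):
    """Group p/a ratios by location type and return (mean, variance) per type."""
    by_location = defaultdict(list)
    for location, pause_seconds, words_in_spurt in observations:
        by_location[location].append(pa_ratio(pause_seconds, words_in_spurt))
    return {loc: (mean(ratios), pvariance(ratios))
            for loc, ratios in by_location.items()}


# Hypothetical observations: (location type, pause in seconds, words in next spurt).
observations = [
    ("par", 8.0, 4),
    ("par", 12.0, 3),
    ("wor", 0.5, 5),
    ("wor", 0.3, 3),
]

for location, (avg, var) in summarize(observations).items():
    print(f"{location}: mean = {avg:.2f}, variance = {var:.2f}")
```

Grouping by location type first keeps the means and residual variances per level separate, which is what the two analytical steps listed above operate on.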
Figure 3. Mean and residual pause/articulation ratios for five types of locations (1 = par, 2 = sen, 3 = cla, 4 = con, 5 = wor)
As could be expected from the earlier analyses, the p/a rates for all location types indeed show considerable variation. In terms of cognitive processing, the data moreover indicate that the 'ease of processing' varies per level. Variances at the level of paragraphs by far exceed variances at the levels of sentences, clauses, constituents and words respectively. Now, what happens if these variances are modelled as a function of (real) production time? This can be estimated by aligning the various paragraphs, sentences, clauses, constituents and words according to their linear order of production (and according to the order in which they appeared in the final text). The results of this analysis are represented graphically in Figure 4.
Figure 4. Residual pause/articulation ratios for five types of locations, modelled as a function of real production time
Real production time is projected on the horizontal axis of Figure 4, whereas the estimated variances in p/a ratios for each location type are projected on the vertical axis. The (somewhat stylized) curves are formed by projecting each individual pause at a particular location type on the time track. As extremely long texts (over 600 seconds real production time) are somewhat exceptional, the curves for paragraphs and sentences gradually become less dense because fewer observations could be made here. The upper region of Figure 4 shows the p/a rates for paragraph pausing as a function of (real) production time. The curve represents almost precisely what we had expected to find: processing efforts increase towards the middle sections of the process/text, and then decrease towards the end of the process/text. It can therefore be concluded that the variable 'time' does have an
impact on processing at the level of paragraphs. Assuming this level to concern mainly conceptual processes, it can be concluded that at this level processing is indeed influenced by the time variable. The same tendency, although to a much slighter extent, can be observed for p/a rates at the sentence level, displaying a slight, but still significant curve. However, as can be easily gleaned from Figure 4, time-bounded effects are markedly absent at the clause level, the constituent level and the word level. These curves are hardly affected by the variable 'time' and only show a non-significant curve: they therefore ought to be taken as representing straight lines. This means that the p/a ratios are stable regardless of their position on the time track. These results imply that no correlations exist as to the effect of time on conceptual processes on the one hand, and linguistic processes on the other (although, admittedly, this conclusion is based on observations made 'by eye'). Time has an effect on conceptual processing, but there is no way of predicting, on that basis, what the effects will be at the linguistic levels of processing. All in all, the results do not provide any kind of support for the interaction hypothesis, and are therefore consistent with the results of the previous analyses.
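Since the comparison of curves was made by eye, a rough numerical counterpart may be useful. The sketch below is not the procedure used in this study: it simply splits production time into three equally wide segments and compares the mean p/a ratio per segment. The function names and both data series are hypothetical.

```python
from statistics import mean


def thirds_profile(points):
    """Mean p/a ratio in the early, middle and late third of production time.

    points: list of (time_in_seconds, pa_ratio) pairs.
    """
    if not points:
        raise ValueError("need at least one observation")
    times = [t for t, _ in points]
    start, end = min(times), max(times)
    width = (end - start) / 3 or 1.0  # avoid a zero width if all times coincide
    bins = [[], [], []]
    for t, ratio in points:
        index = min(int((t - start) / width), 2)
        bins[index].append(ratio)
    return [mean(b) if b else None for b in bins]


def is_inverted_u(profile) -> bool:
    """True if the middle segment lies above both the early and the late one."""
    early, middle, late = profile
    if None in (early, middle, late):
        return False
    return middle > early and middle > late


# Hypothetical series: paragraph-level ratios rise and fall, word-level ratios stay flat.
paragraph_points = [(0, 1.0), (100, 1.5), (250, 3.0), (350, 3.2), (500, 1.4), (600, 1.0)]
word_points = [(0, 0.2), (100, 0.2), (250, 0.2), (350, 0.2), (500, 0.2), (600, 0.2)]

print(thirds_profile(paragraph_points), is_inverted_u(thirds_profile(paragraph_points)))
print(thirds_profile(word_points), is_inverted_u(thirds_profile(word_points)))
```

A three-segment comparison is deliberately crude; a proper test would fit and compare curves per location type, but the sketch shows the kind of contrast between levels that the figure is read as displaying.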
5. Discussion
This paper has sought to find out whether empirical evidence can be produced, based on an analysis of pausing patterns in text production, supporting the claim that in text production conceptual and linguistic processes interact, i.e. the interaction hypothesis. The research strategy that we employed was to select three production-'external' circumstances for which it could be argued that they would cause variation in conceptual and linguistic processes. We first argued that temporal parameters such as 'pause length' and 'pause location' are responsive to the distinction between the two kinds of processes. The three external variables explored were: differences between texts, variation in pause lengths for different location types in texts, and the variation due to the variable 'real production time'. For all three variables it could be demonstrated that they produce distinct pause patterns for both kinds of processes, distinct in that one particular pattern could not be predicted on the basis of the other pattern caused by the same external variable. These results thus do not provide empirical evidence favouring the interaction hypothesis.
Does this mean that the no-interaction hypothesis is the correct one? Given the operationalization of the notion of 'interaction' adopted in this study, it is altogether plausible to conclude that this hypothesis best accounts for the data obtained. For example, given the assumption that cognitive load will increase towards the middle parts of the texts produced, and decrease towards the concluding parts, it could be clearly shown that this only affects conceptual processes, but leaves linguistic processes more or less unaffected. Hence, the latter kind of process, arguably concerned with lexical search and syntactic alignment, does not respond in any predictable way to increased cognitive load. This is not to imply that no processing trouble pops up at this level, but these troubles cannot be predicted on the basis of higher-level conceptual processing features. Granting this to be the main lesson that can be learned from the analyses reported, we should, however, bear in mind some reservations. As is always the case with empirical data, they hardly ever allow one to draw all-or-nothing conclusions. It should be stressed, first of all, that the conclusions drawn here are necessarily restricted to the kind of writing processes that was studied: highly routinized writing by expert adults. Flower and Hayes' multi-representation thesis might very well typify writing processes requiring much more problem-solving and planning activity. Here it seems plausible to assume that the ease of local processing is facilitated to the extent that the content and structure of the to-be-written text are pre-planned. This line of thought is also suggested by Welschen's study on the 'cognitive balance' between higher and lower order processes in children's writing (Welschen 1982). Second, we have seen that the no-interaction hypothesis particularly holds for the highest and lowest levels of processing: paragraphs and words.
At the intermediate levels, however, we do find some interaction between levels. Especially the analysis reported in Section 4.3 points out that the level of clauses appears to take an intermediate position in between 'higher' and 'lower' levels of processing. The general impression one gets is that conceptual and linguistic processes 'merge', so to speak, at the clause level. Clauses represent a level of processing that is affected by higher-level conceptual processes, and that affects lower-level linguistic processes. Clause structures interact with the process of retrieving content from memory and assembling conceptual fragments, as their structural characteristics provide increasing constraints as the process proceeds, whereas, on the other hand, clear interactions between the clause level and lower levels of processing were obtained. Both text-related differences and differences between paragraphs may cause a text producer to put extra effort
into planning the contents and structure of texts. But these conceptual problems are largely solved at the levels of paragraphs and sentences: they do not percolate to the 'lower' processing levels of constituents and words. This does not mean that finding lexical items to express one's thoughts, or aligning them within syntactic frames, goes on effortlessly; it only means that problems that emerge at these levels of processing are solved independently of problems encountered at higher levels. The intermediate position occupied by the level of clauses is further substantiated by an analysis of the development of pause times within clauses, reported in Schilperoord (1997a, b). Here it is demonstrated that both the average pause length and pause time variances gradually decrease within clauses, that is, with each successive pause within a clause structure. This phenomenon does not show up within, say, sentences. We can account for these findings by assuming that, at the conceptual level, producing texts basically proceeds in a clause-by-clause manner (see also Ford 1982; Ford & Holmes 1978). Decreasing pause times within clauses then indicate that conceptual processing becomes increasingly constrained by grammatical factors, such as the obligatory verb-final position in dependent clauses in Dutch. Admittedly, these are rather speculative statements. Indeed, what happens at the level of paragraphs has little to do with what happens at the level of words, as rightly predicted by the no-interaction hypothesis. But much remains to be learned about the intermediate levels, and it seems wise not to assume a strict division between conceptual and linguistic processes.
Notes
* I would like to thank Ted Sanders and Wilbert Spooren and three anonymous reviewers for their comments on an earlier version of this paper. I especially thank one of the reviewers for her or his highly valuable advice with regard to the presentation of the materials in this paper, and for preventing me from a major theoretical pitfall. I can only hope my revisions are as good as her or his advice.
1. If I have understood his book correctly, Fodor (1983) would say that these two modules can no longer be considered to be informationally encapsulated.
2. Although this may be a mere artefact of the fact that the Flower and Hayes model lacks the detailed precision of Levelt's model.
3. See for a detailed summary: Schilperoord (1996).
4. Apart from this analysis, we also estimated the average durations of pauses excluding zero-pauses. In ordinal terms, the results of this analysis were similar to those including zero-pauses (see Table 1), that is, the order of ranking remained intact (Schilperoord 1996, 92). For this reason, Table 1 only reports the 'true' estimates, hence, pause time averages without taking zero-pauses into account.
5. For details concerning the construction of path-models, see Schilperoord (1996, pp. 100 ff).
6. This does not appear to me to be a far-reaching claim. It would, after all, be odd to assume that a writer begins retrieving lexical elements before he has decided on what information should be expressed in, say, a paragraph.
7. The full computations can be found in Schilperoord (1996).
References Ariel, M. ( 1988). Referring and accessibility. Journal ofLinguistics, 24, 65-87. Bereiter, C., & Scardamalia, M. ( 1987). The psychology ofwritten composition. Hillsdale NJ: Erlbaum. Bergh, H. van den, & Rijlaarsdam, G. (1999). The dynamics of idea generation during writing: an on-line study. In M. Torrance & D. Galbraith (Eds.), Knowing Mtat to Write. Conceptual Processes in Text Production (pp. 99-121). Amsterdam: University Press. Boomer, D. S. (1965). Hesitation and grammatical encoding. Language and Speech, 8, 148-158. Butterworth, B. ( 1975). Hesitation and semantic planning in speech. Journal ofPsycholinguistic Research,~ 75-87. Chafe, W. (1987). Cognitive constraints on information flow. In R. Tomlin (Ed.), Coherence and grounding in discourse (pp. 21-51 ). Amsterdam: Benjamins. Chafe, W. ( 1994). Discourse, consciousness, and time. The flow and displacement of conscious experience in speaking and writing. Chicago: Chicago University Press. Cochran, W. G. (1977). Sampling techniques, third edition. New York: John Wiley & Sons. Flower, L., & J, R. Hayes ( 1984). Images, plans and prose; the representation of meaning in writing. Written Communication, 1,120-160. Fodor, J. A. ( 1983). The modularity ofmind. Cambridge, MA: MIT Press. Ford, M. ( 1982). Sentence planning units: implications for the speaker's representation of meaningfull relations underlying sentences. In J, Bresnan (Ed), The mental representation ofgrammatical relations (pp. 797-827). Cambridge MA: MIT Press. Ford, M., & Holmes, V. M. (1978). Planning units and syntax in sentence production. Cognition, 6, 35-53. Garrett, M.P. (1980). Levels of Processing in Sentence Production. In B. Butterworth (ed.) Language Production. Vol I: Speech and Talk. (177-220). London: Academic Press. Gee, J. P. ( 1986). Units in the production of narrative discourse. Discourse Processes, 9, 391-422.
Goldman-Eisler, F. (1958). Speech production and the predictability of words in context. Quarterly Journal of Experimental Psychology, 10, 96-105.
Goldman-Eisler, F. (1968). Psycholinguistics: experiments in spontaneous speech. New York: Academic Press.
Hayes, J. R., & Flower, L. (1980). Identifying the organization of writing processes. In L. Gregg & E. Steinberg (Eds.), Cognitive Processes in Writing (pp. 3-29). Hillsdale, NJ: Erlbaum.
Henderson, A. I. (1974). Time patterns in spontaneous speech - cognitive stride or random walk? A reply to Jaffe et al. 1972. Language and Speech, 17, 119-125.
Henderson, A. I., Goldman-Eisler, F., & Skarbek, A. (1966). Sequential temporal patterns in spontaneous speech. Language and Speech, 9, 207-216.
Holmes, V. M. (1988). Hesitations and sentence planning. Language and Cognitive Processes, 3, 323-361.
Just, M. A., & Carpenter, P. A. (1987). The psychology of reading and language comprehension. Newton, MA: Allyn and Bacon.
Kowal, S., & O'Connell, D. C. (1987). Writing as language behaviour: myths, models, methods. In A. Matsuhashi (Ed.), Writing in real time. Modeling production processes (pp. 108-133). Norwood, NJ: Ablex Publishing Corporation.
Levelt, W. J. M. (1983). Monitoring and self repairs in speech. Cognition, 14, 41-104.
Levelt, W. J. M. (1989). Speaking: from intention to articulation. Cambridge, MA: MIT Press.
Longacre, R. (1979). The paragraph as a grammatical unit. In T. Givón (Ed.), Syntax and Semantics, Vol. 12: Discourse and syntax (pp. 115-134). New York: Academic Press.
Lounsbury, F. G. (1954). Transitional probability, linguistic structure, and systems of habit-family hierarchies. In C. E. Osgood & T. Sebeok (Eds.), Psycholinguistics: a survey of theory and research problems (pp. 88-126). Bloomington: Indiana University Press.
Maclay, H., & Osgood, C. E. (1959). Hesitation phenomena in spontaneous English speech. Word, 15, 19-44.
Matsuhashi, A. (1981). Pausing and planning: the tempo of written discourse production. Research in the Teaching of English, 15, 113-134.
Pool, E. van der (1995). Writing as a conceptual process. A text-analytical study of developmental aspects. Doctoral dissertation, University of Tilburg.
Sanders, T., Janssen, D., Pool, E. van der, Schilperoord, J., & Wijk, C. van (1995). Hierarchical structures in writing products and writing processes. In G. Rijlaarsdam, H. van den Bergh, & M. Couzijn (Eds.), Theories, models and methodology. Current trends in research on writing (pp. 473-493). Amsterdam: UVA-press.
Schilperoord, J. (1995). The distribution of pause time in written text production. In G. Rijlaarsdam, H. van den Bergh, & M. Couzijn (Eds.), Theories, models and methodology. Current trends in research on writing (pp. 263-275). Amsterdam: UVA-press.
Schilperoord, J. (1996). It's about time: temporal aspects of cognitive processes in text production. Amsterdam/Atlanta: Rodopi.
Schilperoord, J. (1997a). Temporele modificatie in clauses; een pauze-analytische studie naar tekstproductie (Temporal modification in clauses; a pause-analytical study on text production). In H. van den Bergh, D. Janssen, N. Bertens, & M. Damen (Eds.), Taalgebruik ontrafeld (Unraveling language use) (pp. 263-275). Dordrecht: Foris Publications.
Schilperoord, J. (1997b). Eenheden van tekstproductie; de clause als domein van temporele modificatie (Units in text production; the clause as domain for temporal modification). Lecture presented at the founding colloquium of the Utrecht institute of Linguistics OTS, 13-2-1997.
Schilperoord, J., & Pool, E. van der (1997). Schrijfpauzes en tekststructuur; naar een cognitief model van real-time tekstproductie (Writing pauses and text structure; towards a cognitive model of real-time text production). Gramma/TTT, 5, 119-137.
Schilperoord, J., & Sanders, T. (1997). Pauses, cognitive rhythms and discourse structure; an empirical study of discourse production. In W.-A. Liebert, G. Redeker, & L. Waugh (Eds.), Discourse and perspective in Cognitive Linguistics (pp. 247-269). Amsterdam/Philadelphia: Benjamins.
Welschen, A. (1982). Formuleervaardigheid en de cognitieve balans bij het schrijven (Formulating skills and the cognitive balance in writing). Tijdschrift voor Taalbeheersing, 4, 131-162.
Wijk, C. van (1987). Speaking, writing and sentence form. Doctoral dissertation, University of Tilburg.
CHAPTER 13

Subordination and discourse segmentation revisited, or: Why matrix clauses may be more dependent than complements

Arie Verhagen
University of Leiden

1. Introduction
All approaches to the structural analysis of texts and discourse have to make assumptions about the smallest units out of which larger pieces of discourse are constructed. A plausible first candidate for the status of "minimal discourse segment" is the grammatical clause, so it comes as no surprise that this assumption has been put forward from the very start of an approach such as Rhetorical Structure Theory (Mann & Thompson 1988). A simple text consists of a series of simplex clauses, connected by particular conceptual relations that make the series of clauses into a coherent text. Thus one naturally wants to take at least all main clauses of a text as minimal segments. Complications arise when clauses other than main ones are also taken into consideration; sometimes one wants to assign such a clause the status of segment, sometimes not. This has also been evident since the beginning of RST. The problem I want to address in this paper is how to give a principled account of the relationship between grammatical subordination on the one hand, and the segmentation of texts into their minimal units on the other.

Let me begin by reviewing explicitly the motivation for denying certain non-main clauses the status of discourse segments. Consider the following example:

(1) They left early; they absolutely wanted to be on time.

This mini-text consists of two segments. The conceptual relation connecting them and holding them together as a textual whole is some sort of causality; a
competent reader will know that the desire to reach a destination on time provides a motivation for leaving early, and thus interpret the contents of the second segment as actually providing the cause of the event described in the first. Obviously, the number of discourse segments corresponds exactly to the number of finite grammatical clauses in (1).¹ Now consider example (2).

(2) They left early(,) because they absolutely wanted to be on time.
If (1) is considered to be a text consisting of two segments, then (2) is one as well. There are again two separately identifiable propositions, and the conceptual relationship between them is the same as in (1). The difference is that the relationship is explicitly marked as causal in (2), whereas (1) lacks such marking; thus, although the interpretation of (2) can be said to be more constrained than that of (1), there is no reason to assign it a fundamentally different status as a text. Calling (1) a text consisting of two segments and (2) a single-clause text, for example, would clearly miss a generalization. As a matter of fact, it is intuitions like these that motivated the idea that this type of 'clause combining' can actually be regarded as the grammaticalization (conventionalized structural expression) of discourse relations (Matthiessen & Thompson 1988).

However, the same kind of considerations (concerning conceptual interclausal relationships) also leads to the conclusion that not all clauses should be considered to constitute discourse segments. Consider examples (3) and (4).

(3) They left early; it is essential that they be on time.
(4) They left early; they think that in that way they will definitely be on time.

With respect to cases like these, one also wants to be able to state a generalization: there is a conceptual relationship of causality in (3) connecting its component sentences in the same way as in (1) and (2), and the same is true for (4). This requires one to assume that both (3) and (4) contain two segments, yet each contains three finite clauses (the part following the semicolon consisting of a main and a subordinate clause). Therefore, as early as Mann and Thompson's original RST proposal, clauses that functioned as subjects (cf. (3)) or complements (cf. (4)) were denied the status of discourse segments.
Matthiessen and Thompson (1988) called the type of subordination exemplified in (2) "clause combining", while the type of subordination in (3) and (4) was characterized as "embedding"; only the former cases are to be considered as grammaticalizations of discourse relations, while the latter are
properly viewed as actual constituents of their host clauses. Although the notions are not really defined in a fully explicit manner, it is intuitively clear what the authors are trying to get at, and this distinction also turns up in later approaches to discourse structure (Pander Maat 1994, p. 3036; Sanders 1992, p. 115/6; Sanders & van Wijk 1996, p. 126/7). However, it should also be noted that the exceptional status of subject and complement clauses is not really explained in this way. This becomes even more problematic when one realizes that there is minimally one more exceptional type of clause: restrictive relatives. Again, the motivation for assigning these a different status is not formulated very explicitly, but it can be made sufficiently clear. Consider examples (5), containing a restrictive relative clause, and (6), with a non-restrictive one.

(5) These schools all appear to have relatively many students who grew up in culturally deprived families.
(6) They shouted at the waiter, who so far did not seem to have noticed them.

One does not want to view (5) as a text consisting of two segments, primarily because there does not seem to be a conceptual relationship between the two clauses making them into a textual whole (i.e. the clause just specifies some property of its head noun, restricting its denotation; cf. below). In other words, one does not want to divide (5) into two segments as indicated in (5)'.

(5)' a. These schools all appear to have relatively many students
     b. who grew up in culturally deprived families.
On the other hand, the non-restrictive relative clause in (6) does have some conceptual relationship with the matrix clause (beyond mentioning a property of its referent): a plausible interpretation could be that the situation mentioned in the relative clause specifies the reason for their shouting at the waiter. Thus one would want to divide (6) into two discourse segments between which a textual relation (in this case, of causality) may be construed; cf. (6)':

(6)' a. They shouted at the waiter,
     b. who so far did not seem to have noticed them.

What is it that restrictive relative clauses, subject and complement clauses have in common which makes them exceptions to the 'rule' that discourse segments correspond to grammatical clauses? This is the question that has to be answered in order to make a start with an explanatory account of the relationship between the two. In the remainder of this paper, I want to propose a number of hypotheses intended as steps in that direction; as my point of departure I will take the analysis proposed in Schilperoord and Verhagen (1998).
2. Conceptual independence and discourse segmentation
Working on the basis of analyses of (non-)restrictive relatives that were developed independently from the issue of discourse segmentation (Daalder 1989; Verhagen 1992, 1996a), Schilperoord and Verhagen (1998) propose a condition on discourse segmentation that can briefly be stated as follows:

(7) Condition on discourse segmentation (conceptual independence): "If a constituent of a matrix-clause A is conceptually dependent on the contents of a subordinate clause B, then B is not a separate discourse segment" (cf. Schilperoord & Verhagen 1998, p. 150).
This condition utilizes the idea that a matrix structure may be dependent for its conceptualization on some subordinate structure (cf. Langacker 1991, p. 436), and that as a consequence, the subordinate structure involved cannot be a separate discourse segment. Thus, it is not so much conceptual dependence of the subordinate structure that makes it inappropriate as a discourse segment, but rather its role in making its matrix structure conceptually independent. This 'shift' is crucial, as we will see shortly. But let me first illustrate the condition by showing how it applies to relative clauses. Consider the restrictive relative in (5) once more:

(5) These schools all appear to have relatively many students who grew up in culturally deprived families.

Notice that the conceptualization of the referent of students is crucially dependent on the contents of the relative clause. The sentence does not say that the schools have relatively many students (and that these grew up in culturally deprived families), but rather that relatively many students grew up in such families. In (6) on the other hand, the conceptualization of the referent of waiter is not crucially dependent on the contents of the relative clause:

(6) They shouted at the waiter, who so far did not seem to have noticed them.
In the non-restrictive interpretation, the denotation of the waiter is determined independently of the relative clause, which then provides some additional information; this sentence does mean that they shouted at the waiter, and that he did not seem to have noticed them so far. Thus the explanation is that a restrictive relative clause is required to complete the conceptualization of some part of another clause, and hence cannot function as a separate discourse segment.

Schilperoord and Verhagen (1998) claim that the same condition explains the exceptional role of subject and complement clauses, i.e. in so far as it is exceptional. The point is that the usual formulation of the exception is not fully adequate. In Mann and Thompson's (1988) original formulation, the claim was that a subject or complement clause was to be considered as "part of its host clause". As long as we take only relatively simple cases of embedding into consideration, that procedure gives the desired segmentation, but problems arise when we apply it to more complicated cases. Such complications actually abound in the material used for the research reported in Schilperoord (1996); (8) is a typical example.

(8) a. Te uwer informatie _merk_ ik nog op dat cliënt _voorziet_ dat het niet eenvoudig _zal_ zijn om snel ander werk te vinden.
    b. Daarbij _komt_ dat zijn echtgenote zwaar gehandicapt _is_ en dat hij een gezin _heeft_ te onderhouden.

    a. "For your information I _note_ that my client _anticipates_ that it _will_ not be easy to find another job fast.
    b. To this _should_ be added that his wife _is_ seriously disabled and that he _has_ a family to care for."
Fragment (8) consists of two sentences (marked a and b); it contains 6 finite clauses (indicated by the underlined finite verbs). Sentence (a) actually consists of a series of embedded clauses, and it is quite conceivable that recursive application of "Mann & Thompson's rule" could handle it, producing the identification of (a) as a single segment, as seems desirable. The real problem is exemplified by sentence (b). Straightforward application of Mann & Thompson's rule would result in it being characterized as a single segment: each of the two subordinate clauses is a subject clause, to be taken as a part of its host clause. However, the segmentation of (8) into two segments (corresponding exactly to the two full sentences a and b) seems undesirable because it makes it impossible to capture the fact that the writer of this fragment, a
lawyer, adduces three arguments in favor of his client's position: the problem of finding another job, the health condition of his wife, and the fact that there is a family to be cared for. The essence of the first of these is contained in the single right-most embedded clause of (a), while the other two points are presented in the two subject clauses of (b); as a result of the way Mann & Thompson's rule is formulated, it is not possible to recognize the fact that there are actually two points being made in (b).²

The condition proposed in (7) is capable of making the relevant distinction. The reason is that the relevant property of conceptual dependence is attributed to the matrix clause rather than to the subordinate one. Notice that the phrase (in Dutch) Daarbij komt ("To this should be added"), as an instruction to add certain pieces of information to a previously established one, is not conceptually complete, in a sense not even interpretable, without the information provided in the subordinate clause. The point is not that this information could only be provided by a clause (the 'subject slot' could also be filled by a noun phrase, for example nog iets anders, "something else"); rather, the point is that in this case, it is a subordinate clause that fulfills this necessary function of making the matrix conceptually independent, so that the subordinate clause does not constitute a separate discourse segment. Now by the same token, one complete clause always suffices for creating a conceptually complete message; the unit of a matrix and the first subordinate clause is never conceptually dependent on a second one. Consequently, all further subordinate clauses can be properly characterized as separate discourse segments,³ so that fragment (8) may be divided into the three segments indicated in (8)":

(8)" a. Te uwer informatie merk ik nog op dat cliënt voorziet dat het niet eenvoudig zal zijn om snel ander werk te vinden.
        For your information I note that my client anticipates that it will not be easy to find another job fast.
     b. Daarbij komt dat zijn echtgenote zwaar gehandicapt is
        To this should be added that his wife is seriously disabled
     c. en dat hij een gezin heeft te onderhouden.
        and that he has a family to care for.

Interestingly, this not only provides us with a principled account of a segmentation that is in accordance with our intuitive understanding of such fragments, it also appears to specify the boundaries of planning units in actual language production (see Schilperoord 1996, Chap. 6, and Schilperoord 1997).
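
The way condition (7) yields the three-way division of fragment (8) can also be stated procedurally. The Python sketch below is only an illustration under simplifying assumptions: the `Clause` annotation, its `embedded` list and the `segment` function are hypothetical names introduced here, and every matrix that embeds a subject or complement clause is assumed to be conceptually dependent in the sense of (7). Under these assumptions a matrix is merged with the first clause that completes it, and each further coordinated subordinate clause becomes a segment of its own.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Clause:
    """A finite clause plus the subject/complement clauses it embeds.

    `embedded` lists coordinated subject or complement clauses in text order.
    This annotation scheme is hypothetical; it is not part of the chapter.
    """
    text: str
    embedded: List["Clause"] = field(default_factory=list)


def segment(clause: Clause) -> List[str]:
    """Return the discourse segments of `clause`, following condition (7).

    A matrix is completed by its first embedded clause, so the two form
    one segment. Every further embedded clause is a separate segment.
    """
    if not clause.embedded:
        return [clause.text]

    first, *rest = clause.embedded
    first_segments = segment(first)
    # The matrix joins the segment that completes it conceptually.
    segments = [clause.text + " " + first_segments[0]] + first_segments[1:]
    for extra in rest:
        segments.extend(segment(extra))
    return segments


# English rendering of fragment (8), annotated by hand.
sentence_a = Clause(
    "For your information I note",
    [Clause("that my client anticipates",
            [Clause("that it will not be easy to find another job fast.")])],
)
sentence_b = Clause(
    "To this should be added",
    [Clause("that his wife is seriously disabled"),
     Clause("and that he has a family to care for.")],
)

fragment_8 = segment(sentence_a) + segment(sentence_b)
for label, seg in zip("abc", fragment_8):
    print(f"{label}. {seg}")
```

On this hand-annotated input the chain of embeddings in sentence (a) collapses into one segment, while sentence (b) splits after its first subject clause, which is the division shown in (8)". The sketch does not decide which matrices are conceptually dependent; that judgment is left to the annotation.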
Returning to the relationship between condition (7) and Mann and Thompson's original procedure, we can note that certain clauses that fulfill the syntactic function of subject or complement must in fact be allowed to be assigned the status of separate discourse segments. Mann & Thompson's rule could not accommodate such cases, but condition (7) does, while preserving the idea that a matrix forms a discourse unit together with a single complement or subject clause; it furthermore provides a generalization over these clauses and the restrictive relatives. I therefore consider it a substantial part of a more explanatory account of the relationship between grammatical and discourse structure: Only a relationship of conceptual dependence between syntactically related clauses is a sufficient condition preventing them from constituting separate discourse segments.

Still, there are some remaining questions, in particular:

(a) What is the reason that certain matrix clauses are not conceptually independent? Do such constructions have anything in common that relates to a specific discourse function, distinct from the discourse function of adverbial clauses (the cases of "clause combining" in terms of Matthiessen & Thompson 1988)?
(b) How do subject and complement clauses differ from restrictive relatives, such that the latter never constitute separate discourse segments?
(c) How can we avoid the grammatically impossible conclusion, suggested by the segmentation in (8)", that in fragment (8) a main clause (segment b) and a subordinate one (segment c) are being coordinated?

I believe that there are interesting, interrelated answers to these questions, which will allow us to further deepen our understanding of relationships between grammar and discourse, in particular of the discourse function of the grammatical phenomenon known as complementation.
3. Dimensions of text interpretation: (Inter)subjectivity
A fundamental aspect of the human capacity for using language is the ability to recognize other entities as essentially like oneself, and to take another person's perspective as one that could be one's own. For one thing, the whole idea of intentionally producing utterances to be recognized as such and to be thereby understood (i.e. linguistic communication), would not make sense without that.⁴ More importantly for my present purposes, this ability is manifested in
the interpretation of linguistic utterances in a very general sense: As soon as some observable phenomena (sounds, marks in stone or on paper, gestures) are recognized as instances of language, this implies that their content is attributed to some subject of consciousness, possibly unknown, but by implication seen as capable of linguistic communication just like the interpreting person him/herself; if an interpreter did not take the signals observed as having been produced as such, they would simply not count as language (possibly still as signs, but then non-intentional ones, i.e. symptoms). Thus the interpretation of discourse may always be seen as not just constructing some understanding of the events and situations depicted in it, but also as coordinating with some subject of conceptualization; the interpretation of linguistic discourse necessarily has both a "content dimension" and an (intersubjective) "coordination dimension".⁵

In view of this inherent, general feature of discourse interpretation, it should come as no surprise that there are several kinds of linguistic elements and constructions that serve to indicate particular features of this coordination dimension, for example modal expressions of different types. In the present context, it seems that this idea is also highly relevant for the semantic characterization of complementation constructions. A natural description of the function of matrix clauses such as My client anticipates ... and It should be added ... is precisely that they do not provide information in the content dimension, but rather in the coordination dimension of the interpretation of the discourse. The first explicitly instructs the reader to construe the informational content (e.g. "finding a new job will be hard") as an anticipation of a particular person. The second provides an instruction by the writer to construe the content information (e.g. "He has a family to care for") as an additional point, paralleling a previous one.
The (more implicit) latter case thus invokes intersubjective coordination between writer and reader, whereas the former expression invokes coordination between the reader and a specific individual mentioned.⁶

Suppose now that we distinguish discourse segments not just linearly, in one dimension, but in two, taking this discussion into account. Then fragment (8) may be represented (somewhat abbreviated) as in Figure 1:

coordination dimension             content dimension
I note that client anticipates:    not easy to find other work fast
Add to this that                   his wife is severely disabled
and that⁷                          he has a family to look after

Figure 1. Text segmentation in two dimensions
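
For readers who prefer a data-structure view, the two-dimensional representation of Figure 1 can be sketched in Python as follows. This is merely an illustrative encoding: the `Segment` type and the two projection functions are hypothetical names introduced here, and the strings are the abbreviated wordings of the figure.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One row of Figure 1 (hypothetical encoding)."""
    coordination: str  # matrix material: how the content is to be construed
    content: str       # the information that is being conveyed


figure_1: List[Segment] = [
    Segment("I note that client anticipates:", "not easy to find other work fast"),
    Segment("Add to this that", "his wife is severely disabled"),
    Segment("and that", "he has a family to look after"),
]


def content_dimension(segments: List[Segment]) -> List[str]:
    """Project the segments onto the content dimension only."""
    return [s.content for s in segments]


def coordination_dimension(segments: List[Segment]) -> List[str]:
    """Project the segments onto the coordination dimension only."""
    return [s.coordination for s in segments]


for line in content_dimension(figure_1):
    print(line)
```

The projection onto the content dimension returns three items, none of which contains matrix material; the matrix clauses appear only in the coordination projection.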
In such a representation, the matrix clauses are not part of segments in the content dimension. For one thing, this immediately provides an answer to question (c) mentioned above: In this dimension there is no coordination of a matrix and a subordinate clause, which allows us to avoid the suggestion to that effect in segmentation (8)". However, a more important question at this point is: Is this just an incidental property of the particular matrix structures in this particular fragment, or is this a manifestation of a more general phenomenon? How general can the procedure be of assigning the content of matrix clauses to the coordination dimension of discourse interpretation?

As a matter of fact, I think such a procedure can actually be fairly general. I would like to suggest that, whereas constructions with adverbial clauses ('clause combining', see Section 1) may be viewed as grammaticalized expressions for rhetorical relations (cf. Matthiessen & Thompson 1988), complementation constructions may be viewed as general grammaticalized expressions for intersubjective coordination (with the lexical content of the matrix clauses and the complementizers providing the specifics). To start, it is interesting to have a look at the set of complement-taking verbs, for example as listed for Dutch in the comprehensive reference grammar Algemene Nederlandse Spraakkunst (ANS, both in the first and in the recent second edition), and especially to see what kind of concepts these verbs express; the subtypes distinguished by the ANS are presented, with a few examples, in Table 1.

Table 1. Semantic types of verbs taking 'direct object clauses' according to the ANS

a  Verbs expressing a statement, question, command, promise, etc., i.e. having a communicative meaning
b  Verbs expressing some form of knowing, believing, supposing, etc.
c  Verbs expressing evaluation [including constructions of the type "find it a pity/strange", etc.]
d  Verbs expressing wish or desire
e  Verbs expressing a way of perceiving
f  Verbs expressing causation

(ANS1, 1984, p. 840-842; ANS2, 1997, p. 1156-1158)
There is clearly a generalization to be made over cases (a) through (e): Such predicates all evoke some mental state or process of a subject of consciousness, and the content of the complement is to be attributed to this subject of
consciousness. In other words, these predicates are all "mental space builders" in the sense of Fauconnier (1994). We could express this generalization in the form of a "constructional schema" (in the sense of Langacker 1991, p. 546 or Goldberg 1995): a construction consisting of a mental space building predicate and a clausal complement means that the content of the subordinate clause is to be attributed to the subject of conceptualization referred to in the matrix clause:⁸

(9) Complement Construction:
    construction form:     [S-a NP [MentalSpacePredicate ....] dat/of [S-b ....]]
    construction meaning:  ATTRIBUTE CONTENTS OF S-b TO REFERENT OF NP
Category (f) is different: these predicates indicate causality, with the complement denoting the result; I believe these can be integrated into the account in a motivated way, but as this is only indirectly related to the issue of segmentation, I will not pursue that matter further here.⁹ In any case, it is clear that evoking, in some specific respect, a mental space for the contents of another clause is a very general function of matrix clauses of complements.

With respect to segmentation it is important to ask if this is also true for other matrix clauses, especially those taking subject clauses (another subtype denied segment status by Mann & Thompson's rule). In fact, I think it is not difficult to see that it is. First of all, one important category of matrix predicates of subject clauses consists of the passive forms of the predicates mentioned in Table 1 (It was argued ..., It has been claimed ..., It can be seen ...), in which exactly the same relation between matrix and subordinate clauses holds as in the active voice. Another class consists of matrix clauses in which a predicate nominal phrase evokes some subjective point of view, i.e. adjectives as in It is clear/puzzling ..., or noun phrases as in It is a problem/question ....¹⁰ Expressions of this type are specifications of a cognitive state with respect to the proposition expressed in the subordinate clause, and thus evoke a conceptualizer entertaining this cognitive state. Thirdly, grammatical subject clauses may be embedded under 'connecting phrases' such as Daarbij komt ... (lit. There-to comes ...; "It should be added", "Additionally") in (8), or Hier staat tegenover ... (lit. Opposite to this stands ...; "On the other hand", "Conversely"). This type can be considered as evoking subjectivity too, albeit in a way that is more implicit than the other ones: such expressions are instructions on how to handle information and therefore imply a subject providing them, rather than explicitly mentioning some cognitive state with respect to a proposition.
The matrix of a complement clause always explicitly specifies a source of subjectivity, while this is not necessary in the matrix of subject clauses. As a consequence it seems that the subjectivity of subject clause constructions is usually interpreted as relating to the producer of the discourse, rather than to some other entity. Consider (10), for example.

(10) Er is echter dringend behoefte aan nieuwe modellen. De twee-relatie is weliswaar een ideaal voor zeer veel homofielen, maar het is duidelijk dat dat dan heel iets anders is dan het traditionele huwelijk.
     However, there is an urgent need for new models. It is true that the two-relationship is an ideal for many homosexuals, but it is clear that this will be entirely different from the traditional marriage.

In interpreting this fragment, a reader will normally ascribe responsibility for the claim that something is clear to the writer of the text. In other words: the matrix of a subject clause is usually taken as a manifestation of speaker/writer subjectivity (i.e. that of a speech act participant), rather than as character subjectivity (cf. note 6), as the clause itself contains no reference to a participant who is the source of the subjectivity. It should be pointed out, though, that this is not an obligatory semantic feature of the construction as such, but a default option given the fact that the construction does not mark a source of subjectivity and the subjective roles of speech act participants are always available for use in interpretation. If the context contains an explicit reference to another subject of conceptualization, then the attribution of responsibility for the claim is easily changed. Suppose that this fragment was a report about someone delivering a speech on types of homosexual relationships; then it might well have been formulated as in (10)':

(10)' Er is volgens de spreker echter dringend behoefte aan nieuwe modellen. De twee-relatie is weliswaar een ideaal voor zeer veel homofielen, maar het is duidelijk dat dat dan heel iets anders is dan het traditionele huwelijk.
      However, according to the speaker there is an urgent need for new models. It is true that the two-relationship is an ideal for many homosexuals, but it is clear that this will be entirely different from the traditional marriage.

Note that the second sentence in this fragment is actually fully identical to the one in (10), but that the opinion expressed in it is now naturally ascribed to the referent of "the speaker" in the previous sentence. But it is a difference between
the matrix of a subject clause and that of a complement clause that identification of the latter's subject of conceptualization is constrained linguistically, whereas the matrix of a subject clause does not necessarily provide such constraints, and is thus the only type that allows for speaker/writer subjectivity without any special markings (complement clause constructions requiring some form of first person marking). The interpretive 'freedom' for the matrix of subject clauses is, in my view, a manifestation of the general property of any instance of language use mentioned at the beginning of this section: it being taken as language implies it being taken as having been intentionally produced as meaningful, and therefore implies the projection of some other cognitive entity like the interpreter. Whatever entity is available for attributing a particular thought in a text to can function as such in the case of subject clauses, but complement clause constructions have the special property that their matrix predicate provides a specific constraint on this attribution.

Still, it is clear that a generalization over the discourse function of the matrix of complement and subject clauses can and should be formulated; the content of such clauses is attributed to some subject of consciousness, explicitly or implicitly specified in the conceptualization of the matrix clause. This can be represented by means of a generalization of the construction in (9), which I will call the "embedding construction":¹¹

(11) Embedding Construction:
     construction form:     [S-a [Predicate ....] dat/of [S-b ....]]
     construction meaning:  ATTRIBUTE CONTENTS OF S-b TO CONCEPTUALIZER IN S-a
The idea now is that it is the construction's meaning that is the basis for the conceptual dependence of the matrix clause on a subordinate one. It specifies that the reader/hearer should engage in cognitive coordination with some subject of conceptualization, and such coordination always takes place with respect to some piece of information; cognitive coordination is never 'void': there is no illocutionary act without propositional content, no assessment without an object of evaluation, no instruction to handle incoming information without such information, in general: no coordination between subjects of conceptualization without some object of conceptualization. If interpreting a text involves the alignment of one's cognitive state with that of another, then such alignment necessarily takes place with respect to some informational content. That is a matter of conceptual necessity; what is a matter of linguistic
convention, on the other hand, is the degree to which this relationship between dimensions of discourse interpretation is 'encoded' in one or more specific words (for example modal adverbs) or constructions (such as (11)). All in all, we now have completed the line of argumentation that allows us to provide an answer to two of the questions formulated at the end of Section 2. First, as regards question (a), the above analysis contains an account of what it is that complement and subject clause constructions have in common, and that explains why the matrix clauses are not conceptually independent: They provide specifications of the coordination dimension, which must be completed by some specification in the content dimension. In a sense, we have thus turned the traditional notion of 'dependent clause' upside down, by showing that it is the matrix clause that is actually conceptually dependent on a subordinate one. Whereas the original rule formulated by Mann & Thompson seemed to imply that it was the subordinate clause that was not independent, we now have reached the conclusion that it is actually the matrix that should be denied the status of separate discourse segment (along with, of course, one subordinate clause). This does not have to conflict with a functional interpretation of the notion of subordination, as soon as it is recognized that matrix clauses function in a dimension of discourse interpretation (that of cognitive coordination with a subject of conceptualization) that is functionally different from the content dimension (that of providing information). Viewing the embedding construction as a grammatical instrument (certainly not the only one) for indicating relationships between the coordination and content dimensions of discourse interpretation allows us to say simultaneously that structurally embedded information is subordinated to something else (viz.
a mental space, usually in some specific way), and that it often still provides the most important information, especially new information. Also, several pieces of information can be subordinated to the same mental space (recall the string of embedded clauses in (8)), without them becoming just constituents of a single discourse segment. Secondly, this approach also provides a basis for an answer to question (b) at the end of Section 2: What is the difference between subject and complement clauses on the one hand and restrictive relative clauses on the other, such that the latter never constitute separate discourse segments? The answer can be formulated in terms of the distinction between the two dimensions introduced in this section: Whereas the specific character of embedding constructions precisely consists in a relationship between these two dimensions, restrictive
relatives by definition always function in the same dimension as their head noun, and thus in the same dimension as their matrix clause.
4. Thematic continuity in the content dimension
Before concluding I would like to present an additional piece of evidence suggesting that discourse analysis may profit from a segmentation procedure that takes the distinction between the two dimensions introduced in the previous section into account (see note 12). This evidence involves certain phenomena of thematic continuity in texts, i.e. indications of how the topic or 'theme' of a particular discourse segment is connected to previous segments. In Onrust, Verhagen and Doeve (1993, Ch. 2), two ways are distinguished in which the initial and final positions of sentences (in Dutch) may contribute to the thematic cohesion of texts. Given two adjacent sentences S1 and S2 in a text, then:

a. when the sentence-initial constituents of S1 and S2 refer to the same piece of information, we have a so-called "constant pattern" (about the same topic, two statements are being made);
b. when the initial constituent of S2 refers to the same piece of information as a constituent that is (more or less) final in S1, we have a so-called "chaining pattern".

This is indicated schematically in Figure 2.

Constant pattern:   [S1 A ...... B] [S2 A ...... C]   (initial A of S1 linked to initial A of S2)
Chaining pattern:   [S1 A ...... B] [S2 B ...... C]   (final B of S1 linked to initial B of S2)

Figure 2. Two patterns of thematic cohesion.
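The two definitions can be made concrete with a small sketch. It is only an illustration, not part of the analytic method of Onrust et al. (1993): each sentence is reduced to a hypothetical list of referent labels, one per constituent in linear order, and the function name `classify_pattern` is an assumption introduced here. Following Figure 2, chaining is taken to link the final constituent of S1 to the initial constituent of S2, and "(more or less) final" is simplified to strictly final.

```python
from typing import List, Optional


def classify_pattern(s1: List[str], s2: List[str]) -> Optional[str]:
    """Classify the thematic link between two adjacent sentences.

    Each sentence is a list of referent labels (a hypothetical annotation),
    ordered from the sentence-initial to the sentence-final constituent.
    """
    if not s1 or not s2:
        return None
    if s2[0] == s1[0]:
        # Both sentences open with the same piece of information.
        return "constant"
    if s2[0] == s1[-1]:
        # S2 opens with what S1 ended on.
        return "chaining"
    return None


print(classify_pattern(["A", "B"], ["A", "C"]))  # constant
print(classify_pattern(["A", "B"], ["B", "C"]))  # chaining
print(classify_pattern(["A", "B"], ["D", "C"]))  # None
```

Since the patterns are optional devices, a result of `None` simply means that neither pattern is used between the two sentences.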
These patterns are not obligatory ones, but when used they do contribute to the cohesion of texts. As they have been defined these notions only apply to the initial and final positions of (complete) sentences. This sometimes restricts the applicability of the notions, giving rise to conflicts with language users' intuitions about textual cohesion. Students, when applying the analytic method of
which these definitions form a part, quite generally treat cases like the following as instances of chaining:

(12) Het gevaar bestaat dat uw klanten door de aanhoudende vertragingen ontevreden worden over uw bedrijf. Wij denken dat dit voorkomen kan worden door te zorgen voor een snellere informatiestroom naar de bezorgafdeling. Een mogelijkheid hiertoe wordt gevormd door ...
"The danger exists that because of the continuing delays, your customers will become dissatisfied with your company. We think that this can be prevented by accelerating the flow of information to the delivery department. One possible option in this respect is ..."

According to the definitions in Onrust et al. (1993), however, this cannot be an instance of any pattern, because the demonstrative anaphor dit is not in an initial position of a sentence. Now note that between the final position of the first sentence in (12) - the part "become dissatisfied with your company" - and the demonstrative in the second sentence is the matrix "We think that" - i.e. information relating to the coordination dimension. If we take this into account, and segment the text as in Figure 3, it is immediately apparent that in the content dimension, the demonstrative is adjacent to its antecedent, so in this dimension we actually do have a chaining pattern.

coordination dimension      content dimension
Het gevaar bestaat dat      uw klanten ... ontevreden worden over uw bedrijf.
(The danger exists that)    (... your customers will become dissatisfied with your company)
Wij denken dat              dit voorkomen kan worden door te zorgen voor een snellere informatiestroom ...
(We think that)             (this can be prevented by accelerating the flow of information ...)

Figure 3. Thematic cohesion in content dimension.

What we may want to propose is that the conditions for patterns of thematic continuity should apply within one specific dimension of discourse representation. Given the distinction between the two dimensions as suggested here, we may say that material that is linearly intervening but relates to another dimension than the antecedent and the anaphor, is 'invisible' to the formation of
patterns of thematic cohesion. In order to see if this adaptation of the patterning conditions would account for the actual use of discourse anaphors in spontaneously produced texts, a search was undertaken in a corpus with text fragments from different genres (see note 13), collecting all instances of the complementizers dat ('that') and of ('whether') that were immediately followed by a demonstrative with an antecedent elsewhere in the text (i.e. not in the same sentence). In this corpus, there were 62 instances satisfying this criterion - thus all of them 'violating', as it were, the thematic continuity conditions as formulated by Onrust et al. (1993). However, taking the distinctions proposed here into account, 39 of them turn into straightforward examples of the chaining pattern (perhaps even 42), and 9 (possibly 10) into examples of the constant pattern. So there are at least 48 out of 62 'exceptional' cases that turn out to be regular ones as an immediate consequence of distinguishing the coordination and content dimensions in the representation of discourse. An example of the most frequent pattern, that of chaining, is given in (13). At the end of one sentence the idea is expressed of the government taking over the entire production machinery. In the linear text, we then get a matrix clause opening a mental space assigned to some economists who used to believe something on theoretical grounds, thus belonging to the coordination dimension (which is indicated by small capitals), and then, as the first element of a new segment in the content dimension, we get the anaphor, referring to the idea at the end of the previous content segment. Such pieces of text are indeed completely natural and unproblematic.

(13) [...] Wanneer wij, in de rug gesteund door de moderne economie, het laissez faire afwijzen, dan staan wij voor de keus tussen twee alternatieven. In de eerste plaats kan de overheid het gehele produktieapparaat overnemen. SOMMIGE ECONOMEN MEENDEN VROEGER OP THEORETISCHE GRONDEN, DAT dit niet tot gevolg kon hebben dat de welvaart op gunstige wijze zou worden verdeeld, maar dit standpunt is thans door de meeste economen verlaten.
"[...] When we, with the support of modern economics, reject the principle of 'laissez-faire', we face a choice between two alternatives. On the one hand, the government could take over the entire production machinery. SOME ECONOMISTS USED TO BELIEVE ON THEORETICAL GROUNDS, THAT this could not lead to an advantageous distribution of wealth, but this opinion has now been abandoned by most economists."
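The proposal that intervening coordination material is 'invisible' to pattern formation can be sketched as a filtering step that precedes the pattern check. This is a minimal illustration under stated assumptions: the `Segment` type, the dimension labels, and the referent labels assigned to the customer-complaint example of Figure 3 are hypothetical annotations introduced here, not part of the original analysis or of the corpus study.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    dimension: str        # "coordination" or "content" (hypothetical labels)
    referents: List[str]  # referent labels, initial to final constituent


def pattern(prev: Segment, curr: Segment) -> Optional[str]:
    """Return the thematic pattern linking two adjacent content segments."""
    if not prev.referents or not curr.referents:
        return None
    if curr.referents[0] == prev.referents[0]:
        return "constant"
    if curr.referents[0] == prev.referents[-1]:
        return "chaining"
    return None


def content_patterns(text: List[Segment]) -> List[Optional[str]]:
    """Check patterns within the content dimension only.

    Coordination segments (matrix clauses such as "We think that") are
    dropped first, so they do not separate an anaphor from its antecedent.
    """
    content = [s for s in text if s.dimension == "content"]
    return [pattern(a, b) for a, b in zip(content, content[1:])]


figure_3_example = [
    Segment("coordination", []),                                   # Het gevaar bestaat dat
    Segment("content", ["customers", "dissatisfaction"]),          # uw klanten ... ontevreden worden
    Segment("coordination", []),                                   # Wij denken dat
    Segment("content", ["dissatisfaction", "information_flow"]),   # dit voorkomen kan worden ...
]

print(content_patterns(figure_3_example))  # ['chaining']
```

Applying the same check to the unfiltered linear sequence would find no pattern, because the matrix clause stands between the antecedent and the demonstrative.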
Fragment (14) contains an example of a constant pattern, which can be analyzed in a similar way.

(14) De EEG-raad van ministers van landbouw heeft maandag in Luxemburg in beginsel overeenstemming bereikt over de methodiek van een regeling voor vlas: er zal een forfaitaire toeslag per hectare worden gegeven [...]. Over het bedrag van die toeslag zal de Europese Commissie nog een voorstel doen. MINISTER LARDINOIS VERWACHTTE WEL DAT deze iets hoger zal worden dan de huidige Nederlandse toeslag [...].
"On Monday, the European council of ministers of agriculture reached agreement in Luxembourg about the method of a regulation for flax: a standard surcharge per hectare will be given [...]. As to the amount of the surcharge, the European Commission will produce a proposal. MINISTER LARDINOIS DID EXPECT THAT this will turn out somewhat higher than the present surcharge in the Netherlands [...]."

Thus there is not only evidence from readers' intuitions, but also from the distribution of discourse anaphors in spontaneously produced texts, that language users treat these devices for cohesion across sentence boundaries in a way that takes the distinction between the coordination and content dimensions into account. This finding thus provides independent support for the proposal to systematically use this distinction in the segmentation of texts.
5. Conclusion
The central claim in this paper is that it is necessary, for an adequate segmentation procedure for natural language texts, to take into account two distinct dimensions of discourse interpretation with respect to which textual fragments may be interpreted. The nature of these dimensions is an immediate consequence of an intrinsic property of interpreting language, viz. that it by definition implies not only processing informational content, but also engaging in cognitive coordination with some entity projected to be responsible for that information. As cognitive coordination in turn presupposes some information to function as object of coordination, the interpretation of expressions in the coordination dimension is not conceptually independent from information in the content dimension. In these terms, constructions with complement and subject clauses have been analyzed as grammatical means for systematically distributing information over these two dimensions; the claim is that this
provides a functional explanation for the condition that in such constructions, the matrix is not conceptually independent, therefore does not constitute a separate discourse segment, but needs at least one subordinate clause to make it conceptually independent (allowing further subordinate clauses to be added as separate segments). This view of conceptual independence as a condition on discourse segmentation is also empirically superior to previous formulations of the conditions on discourse segments.
Notes

1. Under certain analyses, the second segment of (1) might be said to properly contain an infinite clause (the complement of want), but I will not consider that issue in this paper, though I believe that the present approach can ultimately be helpful in clarifying that as well. See Verhagen (1995) for some suggestions.

2. This is not meant to imply that the three arguments are necessarily to be taken as equal. Recall that the issue here is just segmentation, not the assignment of (hierarchical) structure. Thus the grammatical structure of (b) could very well be taken as an indication that the last two arguments are to be taken as constituting a set to be added to the single argument in sentence (a) (cf. below). The point here is simply that the question of segmentation precedes the assignment of structure.

3. Again: only segmentation is the issue here, not the assignment of structural relationships; cf. note 2.

4. A clear exposition of the view of linguistic communication as influencing another person's cognition by displaying the intention to do so can be found in Keller (1995, p. 153ff., 1998, p. 136ff.). It is crucially related to Grice's (1957) notion 'meaningNN' and also occurs, in slightly variable forms, in several other approaches to pragmatics.

5. This distinction is related, but not identical, to distinctions between different domains of use, for example the distinction between epistemic and content domains as proposed in Sweetser (1990), or that between pragmatic and semantic sources of coherence as proposed in Sanders (1992). Cf. Foolen (1996) and Verhagen (1996b, p. 274/5) for some discussion.

6. This difference consists in the distinction between what I called "speaker-hearer subjectivity" and "character-subjectivity" (cf. below; also J. Sanders 1994). In certain areas, such as that of language change, this difference is very important (cf. Verhagen 2000, for an example).

7. In this representation the coordinating conjunction is taken to be an element in the coordination dimension, but this is not crucial. At the moment I have no principled considerations to offer on this point, but I find this representation useful for expository purposes.

8. In a different terminology, a similar insight has been formulated for that-clauses, the most prototypical subclass of complements, by Wierzbicka (1988: 132-140): "... reference to knowledge is present in all sentences with THAT" (p. 137; Wierzbicka in turn cites a few other linguists who have proposed partly similar analyses, notably Bolinger). I believe that the mental space cum construction approach provides a generalization over these and other types of complements (linking the 'space building' feature to the construction and leaving other aspects of the semantics to the lexical specifications of the verbs and complementizers involved), as well as one that allows for integration into a more general theory of perspectivization.

9. The direction of the generalization I would like to propose is that causation is also attribution of the situation denoted by the complement clause to something else, but then to an objective factor (i.e. the cause) rather than to a subjective one. In that perspective, the complementation construction would constitute an example of a particular kind of constructional polysemy. See Verhagen (1996b) for some discussion of this idea, and Foolen (1996) for some criticism.

10. In Dutch, the matrix clauses do not have to contain the pro-form it, for neither category of predicate nominal. Thus Dutch does not only have matrix clauses of the type Een probleem is dat ... ("A problem is that ..."), but also Duidelijk is dat ... (lit. "Clear is that ..."). The parallel between these two types is one reason why in some grammatical traditions, the initial noun phrase in a clause of the type Een/het probleem is dat ... is analyzed as a preposed predicate nominal rather than a subject.

11. As with complement clauses, the matrix of a subject clause may also have a causal relationship with the subordinate clause (cf. The result/reason is ...). As mentioned in note 9, I think a further generalization is possible, so that we actually have constructional polysemy here, but I will not pursue that issue in this paper.

12. There should also be independent grammar-internal arguments for positing a construction such as (11) as part of the grammar of a language. I think such arguments can indeed be provided, at least for Dutch (cf. Verhagen 1996b and Foolen 1996, for somewhat different views). Furthermore, this analysis has consequences for the grammatical characterization of subordination as such. Again, these issues are only indirectly related to the matter of discourse segmentation, so I will not go into them here.

13. The Eindhoven Corpus, in the version available from the Free University in Amsterdam; it is described in Uit den Boogaart (1975) and Renkema (1981).
References

ANS1 (1984). Algemene Nederlandse Spraakkunst. Onder redactie van G. Geerts, W. Haeseryn, J. de Rooij, M. C. van den Toorn. Groningen/Leuven: Wolters-Noordhoff.
ANS2 (1997). W. Haeseryn, K. Romijn, G. Geerts, J. de Rooij, M. C. van den Toorn. Algemene Nederlandse Spraakkunst. Tweede, geheel herziene druk. Groningen/Deurne: Martinus Nijhoff/Wolters Plantyn.
Boogaart, P. C. uit den (1975). Woordfrequenties in geschreven en gesproken Nederlands. Utrecht: Oosthoek, Scheltema & Holkema.
Daalder, S. (1989). Continuative relative clauses. In Norbert Reiter (Ed.), Sprechen und Hören. Akten des 23. Linguistischen Kolloquiums (pp. 195-207). Tübingen: Niemeyer.
Fauconnier, G. (1994). Mental Spaces. Aspects of Meaning Construction in Natural Language. Cambridge: Cambridge University Press. [First edition, 1985, Cambridge, MA: The MIT Press.]
Foolen, A. (1996). Tekstsegmentatie, onderschikking en subjectiviteit. Commentaar op Arie Verhagen. Gramma/TTT, 5, 269-272.
Goldberg, A. E. (1995). Constructions. A Construction Grammar Approach to Argument Structure. Chicago/London: The University of Chicago Press.
Grice, H. P. (1957). Meaning. The Philosophical Review, 66, 377-388.
Keller, R. (1995). Zeichentheorie. Tübingen/Basel: Francke Verlag.
Keller, R. (1998). A Theory of Linguistic Signs. Oxford: Oxford University Press.
Langacker, R. W. (1991). Foundations of cognitive grammar, 2: Descriptive application. Stanford: Stanford University Press.
Mann, W. C., & Thompson, S. A. (1988). Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8, 243-281.
Matthiessen, C., & Thompson, S. A. (1988). The structure of discourse and 'subordination'. In J. Haiman & S. A. Thompson (Eds.), Clause Combining in Grammar and Discourse (pp. 275-329). Amsterdam: John Benjamins.
Onrust, M., Verhagen, A., & Doeve, R. (1993). Formuleren. Houten/Zaventem: Bohn Stafleu Van Loghum.
Pander Maat, H. (1994). Tekstanalyse. Groningen: Martinus Nijhoff.
Renkema, J. (1981). De taal van "Den Haag": een kwantitatief-stilistisch onderzoek naar aanleiding van oordelen over taalgebruik. 's-Gravenhage: Staatsuitgeverij.
Sanders, J. (1994). Perspective in narrative discourse. Tilburg (diss. KUB).
Sanders, T. J. M. (1992). Discourse structure and coherence: Aspects of a cognitive theory of discourse representation. Tilburg (diss. KUB).
Sanders, T. J. M., & van Wijk, C. (1996). PISA - A procedure for analyzing the structure of explanatory texts. Text, 16, 91-132.
Schilperoord, J. (1996). It's about time. Temporal aspects of cognitive processes in text production. Amsterdam/Atlanta: Rodopi.
Schilperoord, J. (1997). Temporele modificatie in clauses; een pauze-analytische studie naar tekstproductie. In H. van den Bergh, D. M. L. Janssen, N. Bertens, M. Damen (Eds.), Taalgebruik ontrafeld. Bijdragen van het zevende VIOT-taalbeheersingscongres gehouden op 18, 19 en 20 december 1996 aan de Universiteit van Utrecht (pp. 263-274). Dordrecht: ICG Publications.
Schilperoord, J., & Verhagen, A. (1998). Conceptual dependency and the clausal structure of discourse. In Jean-Pierre Koenig (Ed.), Discourse and Cognition. Bridging the Gap (pp. 141-163). Stanford: CSLI Publications.
Sweetser, E. E. (1990). From etymology to pragmatics. Metaphorical and cultural aspects of semantic structure. Cambridge: Cambridge University Press.
Verhagen, A. (1992). Patroonsplitsing en zinsstructuur. In H. Bennis & J. W. de Vries (Eds.), De Binnenbouw van het Nederlands. Een bundel artikelen voor Piet Paardekooper (pp. 373-382). Dordrecht: ICG Publications.
Verhagen, A. (1995). Subjectification, syntax, and communication. In D. Stein & S. Wright (Eds.), Subjectivity and subjectivisation: linguistic perspectives (pp. 103-128). Cambridge: Cambridge University Press.
Verhagen, A. (1996a). Sequential conceptualization and linear order. In E. H. Casad (Ed.), Cognitive Linguistics in the Redwoods. The Expansion of a New Paradigm in Linguistics (pp. 793-817). Berlin: Mouton de Gruyter.
Verhagen, A. (1996b). Tekstsegmentatie, onderschikking en subjectiviteit. Gramma/TTT, 5, 249-268, 273-275.
Verhagen, A. (2000). "The girl that promised to become something": An exploration into diachronic subjectification in Dutch. In T. F. Shannon & J. P. Snapper (Eds.), The Berkeley Conference on Dutch Linguistics 1997: the Dutch Language at the Millennium (pp. 197-208). Lanham, MD: University Press of America.
Wierzbicka, A. (1988). The Semantics of Grammar. Amsterdam/Philadelphia: John Benjamins.
In the series HUMAN COGNITIVE PROCESSING (HCP) the following titles have been published thus far or are scheduled for publication:
1. NING YU: The Contemporary Theory of Metaphor. A perspective from Chinese. 1998.
2. COOPER, David L.: Linguistic Attractors. The cognitive dynamics of language acquisition and change. 1999.
3. FUCHS, Catherine and Stephane ROBERT (eds.): Language Diversity and Cognitive Representations. 1999.
4. PANTHER, Klaus-Uwe and Günter RADDEN (eds.): Metonymy in Language and Thought. 1999.
5. NUYTS, Jan: Epistemic Modality, Language, and Conceptualization. A cognitive-pragmatic perspective. 2001.
6. FORTESCUE, Michael: Pattern and Process. A Whiteheadian perspective on linguistics. 2001.
7. SCHLESINGER, Izchak, Tamar KEREN-PORTNOY and Tamar PARUSH: The Structure of Arguments. 2001.
8. SANDERS, Ted, Joost SCHILPEROORD and Wilbert SPOOREN (eds.): Text Representation. Linguistic and psycholinguistic aspects. 2001.