JOURNAL OF SEMANTICS
Volume 2 1983
Reprinted with the permission of the original publisher by
Periodicals Service Company Germantown, NY 2005
Printed on acid-free paper. This reprint was reproduced from the best original edition copy available. NOTE TO THE REPRINT EDITION: In some cases full page advertisements which do not add to the scholarly value of this volume have been omitted. As a result, some reprinted volumes may have irregular pagination.
JOURNAL OF SEMANTICS AN INTERNATIONAL JOURNAL FOR THE INTERDISCIPLINARY STUDY OF THE SEMANTICS OF NATURAL LANGUAGE
MANAGING EDITOR:
Pieter A.M. Seuren (Nijmegen University)
EDITORIAL BOARD:
Peter Bosch (Nijmegen University) Leo G.M. Noordman (Nijmegen University)
REVIEW EDITOR:
Rob A. van der Sandt (Nijmegen University)
CONSULTING EDITORS: J. Allwood (Univ. Goteborg), M. Arbib (U Mass. Amherst), Th. T. Ballmer (Ruhr Univ. Bochum), R. Bartsch (Amsterdam Univ.), J. van Benthem (Groningen Univ.), H.M. Clark (Stanford Univ.), G. Fauconnier (Univ. de Vincennes), P. Gochet (Univ. de Liege), F. Heny (Groningen Univ.), J. Hintikka (Florida State Univ.), G. Hoppenbrouwers (Nijmegen Univ.), St. Isard (Sussex Univ.), Ph. Johnson-Laird (Sussex Univ.), A. Kasher (Tel Aviv Univ.), L. Keenan (UCLA), S. Kuno (Harvard Univ.), W. Levelt (Max Planck Inst. Nijmegen), J. Lyons (Sussex Univ.), W. Marslen-Wilson (Max Planck Inst. Nijmegen), J. McCawley (Univ. Chicago), B. Richards (Edinburgh Univ.), H. Rieser (Univ. Bielefeld), R. Rommetveit (Oslo Univ.), H. Schnelle (Ruhr Univ. Bochum), J. Searle (Univ. Cal. Berkeley), R. Stalnaker (Cornell Univ.), A. von Stechow (Univ. Konstanz), G. Sundholm (Nijmegen Univ.), Ch. Travis (Tilburg Univ.), B. Van Fraassen (Princeton Univ.), Z. Vendler (UCSD), Y. Wilks (Essex Univ.), D. Wilson (UCL).
ADDRESS:
Journal of Semantics, Nijmegen Institute of Semantics, P.O. Box 1454, NL-6501 BL Nijmegen, Holland
Published by the N.I.S. Foundation, Nijmegen Institute of Semantics, P.O. Box 1454, NL-6501 BL Nijmegen, Holland
ISSN 0167-5133 by the N.I.S. Foundation
Printed in the Netherlands
Review article
Pieter A.M. Seuren: J.D. McCawley, Thirty Million Theories of Grammar (p. 325)

Book reviews
Roland R. Hausser & Claudia Gerstner: Benoit de Cornulier, Meaning Detachment (p. 350)
D.E. Over: Brian Loar, Mind and Meaning (p. 347)
Han Reichgelt: Th.W. Simon & R.J. Scholes (eds.), Language, Mind, and Brain (p. 352)
Pieter A.M. Seuren: John Dinsmore, The Inheritance of Presupposition (p. 356)
Herman Wekker: Chr.J. Pountain, Structures and Transformations. The Romance Verb (p. 343)
Ton Weyters: Gillian Brown & George Yule, Discourse Analysis (p. 354)

Publications received (p. 359)
JOURNAL OF SEMANTICS
CONTENTS VOLUME II (1983)

Articles
Thomas Ballmer: Semantic structures of texts and discourses (p. 221)
Janet Mueller Bing: Contrastive stress, contrastive intonation, and contrastive meaning (p. 141)
Dwight Bolinger: Where does intonation belong? (p. 101)
Arda Denkel: The meaning of an utterance (p. 29)
Jurgen Esser: Tone units in functional sentence perspective (p. 121)
Carlos Gussenhoven: A three-dimensional scaling of nine English tones (p. 183)
Roland R. Hausser: On vagueness (p. 273)
Daniel Hirst: Interpreting intonation: a modular approach (p. 171)
D. Robert Ladd: Even, focus, and normal stress (p. 157)
Willem J.M. Levelt & Anne Cutler: Prosodic marking in speech repair (p. 205)
D.E. Over: Constructivity and relational belief (p. 41)
Ragnar Rommetveit: In search of a truly interdisciplinary semantics. A sermon on hopes of salvation from hereditary sins (p. 1)
A.J. Sanford, S. Garrod, A. Lucas, R. Henderson: Pronouns without explicit antecedents? (p. 303)
Peter Sgall: On the notion of the meaning of the sentence (p. 319)
Nigel Shadbolt: Processing reference (p. 63)
L.A. Zadeh: A fuzzy-set-theoretic approach to the compositionality of meaningful propositions, dispositions, and canonical forms (p. 253)
PUBLICATIONS RECEIVED
Trechsel, Frank R., A Categorial Fragment of Quiche. (Texas Linguistic Forum 20). Dept. of Linguistics, The University of Texas at Austin, Austin, 1982.
Utrecht Working Papers in Linguistics 11 (1982).
Van der Auwera, Johan, What do We Talk about when We Talk? Speculative Grammar and the Semantics and Pragmatics of Focus. (Pragmatics & Beyond II: 3). J. Benjamins, Amsterdam, 1981. Pp. vi+121. ƒ 38,- / $ 14,00 (paper).
Verschueren, Jef, On Speech Act Verbs. (Pragmatics & Beyond 4). J. Benjamins, Amsterdam, 1980. Pp. vii+83. ƒ 38,- / $ 14,00 (paper).
Weissenborn, Jurgen & Klein, Wolfgang (eds.), Here and There. Cross-Linguistic Studies on Deixis and Demonstration. (Pragmatics & Beyond III: 2-3). J. Benjamins, Amsterdam, 1983. Pp. 296. ƒ 88,- / $ 32,00 (paper).
Woodfield, Andrew (ed.), Thought and Object. Essays on Intentionality. Clarendon Press, Oxford, 1982. Pp. xi+316. £ 17.50.
PROCESSING REFERENCE
Nigel Shadbolt
JOURNAL OF SEMANTICS, vol. 2, no. 1, pp. 63-98

Abstract
A system of referential description is presented that attempts to represent crucial aspects of the process of performing and understanding referential acts. It is suggested that traditional logical accounts distract our attention from important properties concerning the use of referential expressions. The model proposed is consonant with a growing body of opinion amongst cognitive scientists that generating and interpreting natural language is best explained as a process of constructing cognitive models and procedures that represent and process the content of our utterances. If this position is taken seriously, there is a requirement that the state of language processors is the most important determinant of the mechanics of the referential act. This leads to a process model of reference. The paper also touches on why language is in a sense 'radically opaque' and why this opacity does not consistently lead to failure in communicative acts. The theory predicts that using language is a 'risky' business and that misinterpretation will occur more often than other formal theories predict.

1. Motivation
In this paper I have presented a descriptive system. The task of the system is to provide a terminology to lay out some of the referential possibilities of utterances. Throughout I have continually made remarks about the importance of including the language processor in our theories of natural language. But it will be evident that I allude to the processing metaphor in a way that suggests a processing account is in itself a goal for which the descriptive machinery has been developed. This suggestion is deliberate. Ultimately I do want to provide a processing model of some referential phenomena. This desire results from inheriting a belief in the primacy of the human processor in language. If such a primacy is accepted then a good research strategy will incorporate semantics within processing frameworks. Putting the human processor back into linguistic theories is a central concern of Cognitive Science. Recently Johnson-Laird summed up the kind of overview required.

Logicians have only related language to models in various ways, psychologists have only related it to the mind. The real task is to show how language relates to the world through the agency of the mind. (P.N. Johnson-Laird 1981)

The first stage in such work is to devise a descriptive system to adequately and perspicuously represent the phenomena. The descriptive system I have developed reveals facets of reference too often neglected or not recognized at all. I felt it worthwhile to present the descriptive system and its insights independently of any process implementation.
The referential phenomena I consider are those associated with 'intensionality'. Intensional contexts are traditionally regarded as created by such explicit language terms as the adverbs 'necessarily', 'possibly'; the verbs 'hope', 'seek', 'want', 'believe', 'regret'; or tense-like modal operators such as 'will'. Two conditions are regarded as criterial for such contexts. One is that substitution under identity may fail to preserve truth. The other is that existential generalization may fail. The first condition is exemplified in the sentence pair (1) and (2); the second in sentence (3).

(1) Oedipus thinks he is going to marry Jocasta.
(2) Oedipus thinks he is going to marry his mother.
(3) Oedipus thinks a satyr broke his mirror.

Intuitively, all of the opacity problems, problems of intensional context, involve a failure to link the use of names or descriptions with those objects that normally give them meaning. Language in these situations seems to lose its grip on the things of the world. For language users intensional ambiguities can be viewed as a kind of referential collapse or misconstrual. These remarks notwithstanding, researchers have analysed intensional contexts by taking sentences of natural language and describing them at a putative level of logical form. The sentences of natural language are translated into sentences of a logical language. The sentences of the logic transparently display the 'scope' of the logical constituents which themselves represent natural language terms. Scope consists of principles in formal languages that determine the contribution of constituent expressions to the truth-conditional interpretation of sentences. The problems and ambiguities of intensional contexts in a 'logical' analysis are claimed to arise out of the scope possibilities of the logical constituents (Fodor 1970; Partee 1970, 1974; Montague 1970a, 1970b, 1970c). I think the accounts provided within various logical or formal semantic frameworks are open to a number of objections. And although it is always possible the logic could be 'fixed-up' there comes a stage when by any set of criteria the patching becomes unacceptable. Space does not allow me to go into detailed arguments for and against logical analyses. So I will make the minimal claim that accounts within logical form fail to highlight some crucial aspects of intensionality.

2. Language and models
There are certain conjectures about language and its processing that the descriptive system embodies; it would be worthwhile to make them explicit. None of them are radically new. They have a respectable history within A.I. and Psychology. Moreover, they naturally seem to suggest a processing perspective.
The first claim is that when a processor is involved in generating or interpreting a piece of natural language a model is built based on the states of affairs described through the language. The construction of these models is one of the major components in our understanding the meaning of a piece of language. This view of comprehending or generating language as an endeavour to build 'models in the head' is advocated with increasing frequency (cf. Karttunen 1976; Stenning 1977; Grosz 1979; Fauconnier 1979; Johnson-Laird & Garnham 1980; Kamp 1981; Garrod & Sanford 1982; Seuren 1982). The exact nature of these models often differs but Fauconnier's comments are typical. He talks of the 'topology of discourse processing' and of 'considering meaning as instructions for use'. The crucial feature of the approach is that

"sentences contain instructions for discourse processing; they [...] can set up image spaces within the discourse, introduce new elements in such spaces [...] the sentence is a set of instructions for setting up and referring to the mental constructs which support the organization of discourse". (G. Fauconnier 1979)
A second claim arises naturally out of the one just considered. Plainly, discourse involves separate agents with differing views of the world. Language processors must be able to represent the beliefs and knowledge of other people. Included in a model a processor builds is a view of the other processor's models. Processors have models of the models of other processors. Interestingly, work in developing A.I. Planning Systems recognizes that the system's reasoning and inference procedures must be able to distinguish and handle separate agents' beliefs and knowledge. Furthermore, they must be able to operate on incomplete and sometimes inconsistent information. The systems have to ascribe intentions, beliefs and goals to other agents to explain behaviour, in particular linguistic behaviour (cf. Allen 1979; Allen & Perrault 1979; Sidner & Israel 1981).
"much linguistic behaviour can best be explained in terms of the intentions of the speakers". (Allen 1979)

The intention of an utterance is the attempt to bring about some change by affecting the models of situations which addressees have in their heads. Indeed this intention to discover, contrast or change models of what people believe can be seen as the main spring of linguistic communication. Of course language can be used to state the obvious, almost gratuitously describing the way the world is. But so often our linguistic behaviour is concerned to communicate new facts, elicit new information. We engage in a constant process of modifying our own and other people's views of the world.

Apart from the 'semantic meaning'¹ of the discourse, intended meaning relies on the following beliefs for the speaker and the hearer: beliefs about the current situation; beliefs about each other's beliefs and goals; beliefs about the context of discussion; and even beliefs about their mutual beliefs. For example, the mutual beliefs of A and B would be those beliefs that A and B both believe and furthermore that they both believe that they both believe, and they believe that they both believe that they both believe etc.

Admitting the centrality of belief and intention in understanding behaviour in general and linguistic behaviour in particular is not confined to Psychology and A.I. The philosopher of language Paul Grice (1957, 1968) has repeatedly stressed that an understanding of the meaning of utterances in language will require making reference to the possession by speakers of audience-directed intentions. He explicitly states in his paper 'Utterer's meaning, sentence-meaning and word-meaning' that the intended effect of an utterance is "[...] always the generation of some propositional attitude". (Grice 1968: 59) Very much in the spirit of Grice the American philosopher Dennett (1978) sees intentionality as the fundamental concept behind certain sorts of systems. He argues that intenSional referential phenomena inevitably arise out of language-using intenTional Systems. The connection is so close for Dennett that he talks of intenTional linguistic contexts and idioms (1978: 3). Intentional Systems are systems whose behaviour can be predicted by the ascription to them of beliefs, desires, goals etc. By assuming beliefs in other systems to which we have no privileged or veridical access we encounter all the problems of referential opacity presented say in (4).

(4) Nigel says to Sam: "He believes the pendulum-bob is moving."

So whoever is referred to by 'he' may not know of the object he has the belief about that it is a pendulum-bob, or he may think it exists when in fact it doesn't etc.²

Having got back to the problem of reference it is worth pointing out some often overlooked points. All issues involving reference must involve language processors. Linguistic expressions in themselves do not refer. A definite description like 'The Departmental notice-board' lacks reference unless it is invested with reference through a particular speaker's use of it.

[...] referring is not something an expression does; it is something that someone can use an expression to do. (P.F. Strawson 1950)

[...] reference is a speech act, and speech acts are performed by speakers in uttering words, not by words. (J.R. Searle 1969)

The act of referring crucially depends on two things whose importance it is almost impossible to exaggerate. Firstly, the 'context of utterance'; by 'context' I mean at least the time, the place, the speaker, the immediate focus of interest, the current histories and states of the speaker and addressees. Intimately tied up with context is the second crucial element in the referential act, the intentional state of the language processors.

When I take a noise or a mark on a piece of paper to be an instance of linguistic communication, as a message, one of the things I must assume is that the noise or mark was produced by a being or beings more or less like myself and produced with certain kinds of INTENTIONS. (J.R. Searle 1969)

An important point I do hope to have established is that referential phenomena, including intenSionality, arise out of intenTional acts. The intentional acts themselves issue from Intentional Systems that operate in complex contexts including internal as well as external states. It is worth making the point that these two inseparable aspects of reference, the context and state of a language processor, either speaker or hearer, are implicit concerns of Artificial Intelligence and Computational Linguistics.

The interpretation of an expression depends on who is doing the interpreting; speaker and hearer are considered as distinct interpreters (or processors) each with their own view or conception of the world.... The state of these processors, their condition at a given time plays a crucial role in the analysis of the interpretation of an utterance. (B.J. Grosz & G.C. Hendrix 1978)
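The regress of mutual beliefs mentioned in this section (A and B both believe, both believe that both believe, and so on) can be made concrete with a small sketch. It is an illustration under stated assumptions, not part of the descriptive system: the function names, the tuple encoding of a belief chain and the finite depth bound are all hypothetical, and mutual belief proper is the unbounded case.

```python
from itertools import product

def belief_chains(agents, max_depth):
    """Yield every chain of believers of length 1..max_depth.

    The chain ("A", "B") stands for "A believes that B believes that ...".
    """
    for depth in range(1, max_depth + 1):
        yield from product(agents, repeat=depth)

def mutually_believed(store, agents, proposition, max_depth):
    """Check that `proposition` sits under every belief chain up to max_depth.

    `store` is a set of (chain, proposition) pairs. A finite depth only
    approximates mutual belief, which has no upper bound on nesting.
    """
    return all((chain, proposition) in store
               for chain in belief_chains(agents, max_depth))

agents = ("A", "B")
p = "the pendulum-bob is moving"
store = {(chain, p) for chain in belief_chains(agents, 2)}
print(mutually_believed(store, agents, p, 2))   # True
store.discard((("B", "A"), p))                  # B no longer thinks A believes p
print(mutually_believed(store, agents, p, 2))   # False
```

Removing a single nested belief is enough to break mutuality, which is one way of seeing why a processor's model of the other processor's model matters.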
3. The descriptive system
The diagram in Figure A represents what I have called the Processor-Centric (P-C) standpoint. It is a division of the language processor into functionally convenient components. The partitioning is determined by two considerations: first, that it should embody the various conjectures about language and its processing discussed in the last section; secondly, that the representation should perspicuously represent the referential possibilities of natural language.

Fig. A MAIN COMPONENTS OF THE PROCESSOR-CENTRIC REPRESENTATION
[Figure: boxes labelled General Processor State, Lexical Material Buffer, Primary Discourse Model (with Primary Discourse Objects) and Secondary Discourse Model (with Secondary Discourse Objects).]
Within the general processor state are three distinguished areas. One is the Lexical Material Buffer (LMB). The LMB contains lexical strings generated in, or interpreted out of, discourse. In this area the constituents of natural language exist independently of any referential commitment. The LMB's existence is motivated by the fact that the same item in language can be made to refer in different contexts to different referents, as in the case of 'the Departmental notice-board' (p. 67). Although the contents of the LMB have no rigid referential commitment they are associated³ with information about their syntactic, phonetic and orthographic form. A second distinguished area is the Primary Discourse Model (PDM). Here a processor constructs his understanding of a piece of discourse.
Part of this understanding will consist in beliefs about how the discourse was interpreted by fellow participants. These beliefs constitute an embedded model, the Secondary Discourse Model (SDM). The SDM is the model processors have of their fellow processor's models. It is perhaps worth pointing out that any terminology is going to be misleading by being either too general or too specific. The problem is that in constructing models we can draw on modalities and mediums other than language. We should view the components labelled as the Primary and Secondary Discourse Models in Figure A as parts of general cognitive models, i.e. those parts built on the basis of language. The models built via different mediums will be intimately connected and cross-correlated with each other. Bearing this in mind I will retain my current terminology.

Within the Primary and Secondary Discourse Models I represent objects as elements I have called Primary and Secondary Discourse Objects. They act as pointers to knowledge structures. These Discourse Objects (DOs) can be viewed as loci at which various descriptions converge. The 'locus' is well established in A.I. Knowledge Engineering, and intuitively serves as the 'representational object' about which properties can adhere. The mysterious ontological status of these loci of description or DOs calls to mind the philosophical problem in metaphysics about what could underlie the set of properties that constitute an object. At least in this respect we are no worse off in our internal models than we are about external 'reality'. Certainly we seem to need some kind of a handle on our aggregation of knowledge constituting represented objects. The DOs fulfil this requirement.

Knowledge structures linked to the DOs represent the semantic information the processor associates with a nominal expression being generated or interpreted in the context of the processor's state. I usually refer to the knowledge structures as sets of beliefs. However the Primary and Secondary Discourse Models and their respective Discourse Objects do not require that semantic information be expressed in any particular formalism. There is a fundamental difference between the Primary and Secondary Discourse Objects. Primary Discourse Objects are pointers to a processor's own knowledge about referents (objects, individuals etc.). Discourse Objects in the Secondary Discourse Models are associated with knowledge that constitutes a processor's view of what other participants in the modelled discourse know about the purported referent of the nominal expression. This might differ in any number of ways from what the processor himself thinks to be the case about a referent. Thus Sam might say to Robert something like (5).

(5) Sam says to Robert: "Louisa could get you some tablets."

It is quite possible Robert knows the referent of Louisa in this context to be Sam's wife without knowing that she is a doctor. If Sam appreciated this lacuna in Robert's knowledge he would make this point of 'relevant ignorance' explicit in the information he associates with the SDO that represents Sam's view of Robert's view of the referent of 'Louisa'. To see how the Discourse Objects, Discourse Models, and lexical material link up, consider the utterance by a processor of (6).

(6) Sam says to Robert: "Barry is wearing a tie."
It is assumed the speaker thinks his referential act successful. Thus, he thinks his hearer has construed the proper name in a uniquely referring way. So although Sam and his hearer may know a number of Barrys the occasion of his utterance is such that Sam imagines the name to select a unique individual. Moreover, Sam assumes he and his hearer are not thinking of different objects.

These assumptions are reflected in the descriptive framework as a mapping (called 'ACCESS') from the lexical string 'Barry' within the LMB to a PDO in Sam's PDM (Figure B). The PDO points to the knowledge/beliefs the speaker has concerning the intended referent. Within the system as it stands there is no explicit representation of the intentions that prompt Sam to say anything in the first place. Implicitly, however, differences between what Sam knows about an object and what he thinks Robert thinks about that same object (embodied as information differences in the knowledge structures the relevant PDO and SDO point back to) are partly responsible for motivating the communicative act. The decision to lexicalise in a particular way the knowledge structures a processor wants to talk about is also not considered. Presumably a knowledge structure representing an object can be described from a multitude of perspectives. Thus the object in (6) lexicalised as 'Barry' may be individuated in a context using some of the other information the processor possesses of that object. Another mapping, from the DO in Sam's PDM to another DO in his SDM, is given in Figure B; this mapping is labelled 'COPY'. The SDM is an area indexed for Sam's hearer Robert and the Secondary Discourse Object placed here indicates the speaker's belief that his referential act has evoked the same object for his hearer as for himself.

Fig. B PROCESSOR-CENTRIC MAPPINGS
[Figure: within Sam's General Processor State, the string 'BARRY' in the SLMB is mapped by ACCESS to a PDO in the SPDM, and that PDO is mapped by COPY to an SDO in the SSDM.]
SLMB = SAM'S LEXICAL MATERIAL BUFFER
SGPS = SAM'S GENERAL PROCESSOR STATE
SPDM = SAM'S PRIMARY DISCOURSE MODEL
SSDM = S/HPDM = SAM'S SECONDARY DISCOURSE MODEL = SAM'S VIEW OF HIS HEARER'S PRIMARY DISCOURSE MODEL

The task of the descriptive machinery must be to represent all the referential possibilities of utterances like (6). Before I do this I shall introduce a second representational perspective. So far I have no device for indicating what is actually the case in the world. This includes states of affairs external to the language users as well as the states of other processors. By employing the perspective I have called that of the Omniscient Observer we can notice incongruencies between a processor's DOs and their purported referents. We can also display the whole range of configurations of processors, their internal representations (the DOs) and the purported referents of such representations. The Omniscient Observer's (OO) perspective is a 'God's eye view' of the states of processors engaged in communication and the actual states of affairs and referents they are seeking to describe and locate. In Figure C we have the components of such a representation. The outermost box delimits the state space. This includes processors and their internal states as well as configurations of objects (referents). This descriptive representation assumes the existence of a common shared reality. Solipsists notwithstanding, this seems a reasonable assumption. The state space will be used only to indicate the success or failure of an act of reference. It is therefore only populated with 'objects' and processors. I am not concerned with setting the objects in detailed relations to one another in this space.

Fig. C COMPONENTS OF THE OMNISCIENT OBSERVER REPRESENTATION
[Figure: a STATE SPACE box containing a REFERENT, PROCESSOR 1 and PROCESSOR 2. Each processor has an LMB, a PDM and an SDM (P1/P2PDM for Processor 1, P2/P1PDM for Processor 2), with access and copy mappings inside the processor and a REFER mapping out to the REFERENT.]

A mapping distinguished in Figure C is given between PDOs in processors' PDMs and objects (referents) in the state space external to the processors. This relation I have called REFER. REFER is an indication of what actual object the processor was intending to refer to regardless of whether the nominal expression was appropriate or accurate enough to individuate the intended referent. As Omniscient Observers we can discern the object of the intentional referential act.

4. The possibilities of referential construal
With the apparatus sketched so far let us again consider sentence (6). The possibilities for the referential construal of (6) are given in Figures D I - D V. The referential configurations are all given from the Omniscient Observer perspective.
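The components and mappings described in section 3 can be sketched as a small data structure. This is a minimal illustration, assuming a set of strings is an adequate stand-in for a knowledge structure; the class, field and method names (`DiscourseObject`, `Processor`, `StateSpace`, `utter_nominal`) are hypothetical and are not part of the descriptive system as presented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

@dataclass(eq=False)  # identity-based hashing, so a DO can serve as a dict key
class DiscourseObject:
    """A locus of description: a pointer to a set of beliefs."""
    beliefs: Set[str] = field(default_factory=set)
    copied_from: Optional["DiscourseObject"] = None  # COPY antecedent (SDOs only)

@dataclass
class Processor:
    name: str
    lmb: List[str] = field(default_factory=list)                         # lexical strings
    access: Dict[str, DiscourseObject] = field(default_factory=dict)     # ACCESS: string -> PDO
    pdm: List[DiscourseObject] = field(default_factory=list)             # PDOs
    sdm: Dict[str, List[DiscourseObject]] = field(default_factory=dict)  # SDOs, indexed by hearer

    def utter_nominal(self, string, beliefs, hearer):
        """ACCESS a PDO for `string`, then COPY it into the SDM indexed for `hearer`."""
        pdo = DiscourseObject(set(beliefs))
        self.lmb.append(string)
        self.pdm.append(pdo)
        self.access[string] = pdo
        # The speaker supposes success, so the SDO starts with the PDO's beliefs.
        sdo = DiscourseObject(set(pdo.beliefs), copied_from=pdo)
        self.sdm.setdefault(hearer, []).append(sdo)
        return pdo, sdo

@dataclass
class StateSpace:
    """Omniscient Observer's view: real objects plus the REFER mapping."""
    referents: Set[str] = field(default_factory=set)
    refer: Dict[DiscourseObject, str] = field(default_factory=dict)  # REFER: PDO -> referent

sam = Processor("Sam")
pdo, sdo = sam.utter_nominal("Barry", {"is wearing a tie"}, hearer="Robert")
world = StateSpace(referents={"barry"})
world.refer[pdo] = "barry"  # visible only from the Omniscient Observer perspective
```

Keeping REFER outside the `Processor` class reflects the point that no processor has access to it; only the Omniscient Observer can compare a PDO with the object it is meant to represent.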
Figs. D I - D V OMNISCIENT OBSERVER REPRESENTATIONS OF (6)
[Figures: each shows the STATE SPACE with its REFERENT(s), PROCESSOR 1 (SAM) and PROCESSOR 2 (ROBERT). Each processor has an LMB holding 'Barry', a PDM (SPDM, RPDM) with its PDO and an SDM (SSDM = P1/P2PDM, RSDM = P2/P1PDM) with its SDO. A [+con] feature marks Robert's COPY mapping in D III and D V.]
Notice first that in all cases the configuration of the processor Sam is the same. The point is that immediately after uttering (6) Sam supposes successful reference to have occurred. He certainly has no information to the contrary at this stage.

In processing terms we can tell the following story. If Sam thinks his use of the nominal expression 'Barry' has resulted in successful reference he thinks that Robert has had evoked in his mind a knowledge structure that maps onto the same external object as Sam's own knowledge structure. Sam need not possess detailed information about Robert's knowledge of that referent. Indeed the information he has may be very sketchy. My suggestion is that in the absence of evidence to the contrary Sam assumes that he and his hearer have broadly similar sets of criteria for distinguishing and identifying objects in their shared reality. We assume we see the world and objects in broadly the same way by virtue of our shared biological and cultural endowment. Sam (processors in general) extends this assumption to the point of placing in his developing model of the discourse an item that stands for his view of Robert's view of the object Sam referred to in his referential act.

Our descriptive system represents this common configuration of Sam's supposition of successful reference in the following way. A lexical string in Sam's Lexical Material Buffer ACCESSES a Discourse Object in Sam's PDM. This PDO is the relevant bit of Sam's view of the 'Barry' he intended to uniquely refer to. There is also an SDO placed in a portion of Sam's SDM indexed as his view of Robert's PDM. The SDO (Sam's view of the PDO in Robert's PDM) is linked to the antecedent PDO (which is Sam's own view of the referent he is referring to) by a mapping I have called COPY. Thus, to restate, the Secondary Discourse Object is generated on the assumption that Robert has in his PDM a PDO which relates to the same external referent as Sam's PDO. The SDO so created will generally be able to inherit many of the beliefs associated with its antecedent PDO in Sam's PDM. More of the details of the transmission of knowledge from PDOs to SDOs later.

The OO perspective is also able to 'divine' what the object in the world was that Sam intended to refer to. Sam's representation of this object is the PDO ACCESSed by his use of the nominal 'Barry'. The mapping REFER maps this PDO onto the 'real-world' object it is meant to represent.

Returning to the referential possibilities of (6), using the OO perspective, we can look at a whole range of possibilities. Most of them illustrate situations where the referential act has gone wrong despite Sam supposing it to have been successful. But first the canonical case of successful reference is shown in D I. Here the object Sam intended to refer to has indeed been evoked in Robert's mind as Robert's own representation of the object. But things are not always so straightforward. The first area where the processors might be in disagreement involves the putative uniqueness of Sam's use of the proper name 'Barry'. This is represented in Fig. D II. If Sam thinks that the act of reference has been successful he supposes that his hearer has in mind a knowledge structure that refers to the same external referent as his own knowledge structure. But the context may not be as unambiguous as Sam thinks. His hearer may not know which of a number of possible knowledge structures that could be associated with the nominal 'Barry' he should have chosen; in other words, which of a limited number of Barrys known to Sam and Robert Sam meant. The descriptive system represents this naturally as the absence of an ACCESS mapping. There is no mapping to a PDO for the hearer. Thus there is no knowledge representing an object available. Notice at this stage Robert has suffered referential problems unbeknownst to Sam. Immediately after Sam's utterance of (6) Robert's problems and misconstruals will be unknown to Sam.⁴

The next source of possible difference is whether or not the processors actually are talking about the same referent. Figures D II - D V represent situations in which both processors believe that the context is sufficient to refer uniquely to a particular object. The representations reflect these possibilities by showing both processors ACCESSING a particular knowledge structure in their respective PDMs. The processors also COPY SDOs in their models of each other (their SDMs) to stand for their views of what the other processor(s) knows about the relevant referent. However, in Figures D II and D III we see cases represented where the processors actually are referring to a common object; whilst in D IV - D V they are not.
A third type of incongruence is indicated between Figures D II and D III, and again between D IV and D V. In D III and D V the COPY mappings for Robert, the hearer, are shown with a feature (+con) attached. These are 'constrained' COPY mappings. They represent a situation where the assertion made in the utterance of (6) is not consistent with the knowledge and beliefs evoked in Robert by Sam's use of the nominal Barry in a particular linguistic context. Robert may disagree that Barry is wearing a tie. Notice, he may disagree independently of whether he and Sam are actually referring to the same referent. Thus in D III Robert is represented as disagreeing with the assertion made about a referent they do both have in mind, whilst in D V Robert again disagrees with the assertion though in fact, unbeknownst to them, they are referring to different objects.
Examining the COPY mapping a little further throws more light on the mechanics of the referential act in (6). In saying (6) Sam expects its contents to be news. He assumes before (6) that his and Robert's views of Barry are different at least in respect of the content of (6). Sam is trying to produce an utterance-induced change in Robert's view of Barry. After (6) he thinks this has been done. For the information contained in (6) he assumes no discrepancy between Robert and himself. In the descriptive system this amounts to no [+con] feature on Sam's COPY mapping. However, in Figures D III and D V the constrained [+con] feature on Robert's COPY mapping represents a discrepancy in the beliefs of Sam and Robert, a discrepancy produced by the explicit assertion contained in (6). Sam says that P(b) (i.e. Barry is wearing a tie), whilst Robert sticks to the belief ¬P(b). To prevent a contradiction generated by ¬P(b) being inherited by the SDO representing Robert's view of Sam's view of Barry, a constraint is placed on Robert's COPY mapping.⁵ Although there could be many other differences between the two views of Barry the different processors have, they are not brought into conflict by the utterance. So in this system the differences are not explicitly represented as further constraint features. It might be argued that since our concern is describing referential configurations, details of the incompatibility of one processor's view of an object against another's are irrelevant. And no matter how processors agree or disagree on the attributes of their referents, the important question is whether they are referring to the same object or not. Certainly in a case like (6) this might appear justified. But often the ascription of attributes to objects determines the nominal descriptions used in referential acts. Proper Names are atypical in this respect, but Barry might be described by any number of his putative attributes. So we might refer to Barry as Karen's boyfriend or the rugby-playing semanticist, etc. And if different people held different views on the applicability of these attributes and consequently the descriptions, then using them in referring acts could well lead to misunderstandings and opacity. We shall be looking at just such a case later.
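The effect of the [+con] feature on Robert's COPY mapping can be sketched as a filter on inheritance. This is only an illustration under stated assumptions: the function name `constrained_copy` and the string encoding of P(b) and ¬P(b) are hypothetical and do not belong to the descriptive system.

```python
def constrained_copy(pdo_beliefs, constrained):
    """Return the beliefs an SDO inherits through a [+con] COPY mapping.

    Beliefs listed in `constrained` are withheld, so the SDO cannot
    inherit a belief that contradicts what the other processor asserted.
    """
    return {b for b in pdo_beliefs if b not in constrained}


# Robert's own view of Barry: he holds not-P(b).
robert_pdo = {"named Barry", "not wearing a tie"}

# Robert's view of Sam's view: Sam has asserted P(b), so the conflicting
# belief is blocked from inheritance and Sam's assertion is recorded instead.
robert_view_of_sam = constrained_copy(robert_pdo, {"not wearing a tie"})
robert_view_of_sam.add("wearing a tie")

print(sorted(robert_view_of_sam))
```

Only the belief brought into conflict by the utterance is constrained; any other differences between the two views pass through untouched, as in the text.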
Another complication is that, in natural language, terms included in descriptions and assertions are often vague. Whilst most language terms are generally understood and applied consistently in a language community there are cases like (7):
(7) Sam says to Nigel:
"Barry has blue eyes."
Nigel says to Sam:
"No, they are grey."
Processors might just disagree about when to apply a colour term. One could even imagine a case in which the sense of grey eyes in one processor was similar to the sense of the term blue eyes in the other. Both processors could then be said to hold mutually consistent beliefs. However, at this stage I do not want to consider this additional problem of checking the characterizations of 'predicate' terms in various processors. I only mention its presence.
Finally the problem of checking the veracity of beliefs and assertions can produce a further set of possibilities. It can be argued that the REFER mapping is marked on occasion. The idea is that each processor's referent will (in terms of our God's eye view) either have the property claimed for it by the utterance or it won't. If the referent does have the property the processor claims then the REFER mapping is labelled 1; if it does not then it is labelled 0. Rather than showing all these possibilities as separate diagrams, I have placed small tables alongside the diagrams representing the marked REFER possibilities that could hold for each basic diagram configuration. Thus in Figure D I the only REFER mapping is from Sam and he will either be right or wrong from the God's eye view as to the claim he makes in (6) about the referent he has in mind. In Figure D II since both processors are REFERRING to the same object and Robert has not constrained his COPY for the claim Sam is making, then they are either both right or both wrong from an absolute point of view. In Figure D III Robert does dispute the claim Sam is making in (6) and so either Robert or Sam is right about the veracity of the claim. Finally in Figures D IV and D V they are both REFERRING to different referents, so each of their views about the different objects can either be right or wrong; this gives us four possibilities each for D IV and D V.
5. Two types of intensional context
Having canvassed the full range of referential possibilities for (6) with the OO descriptive framework we can begin to see the complexity of even the simplest of referential acts. The descriptive system also throws up a fact so obvious it is almost always overlooked: there are two sorts of intensional context. Language use always has one type present; the second only occurs if certain lexical items are present. I have called the first implicit intensional contexts and the second explicit intensional contexts. On the grounds that nothing in life is straightforward I shall explain the second type of context first!
In explicit intensional contexts an intensional operator occurs as a lexical item in the linguistic string. These constitute the so-called 'classic' intensional contexts discussed at the beginning of the paper, p. 6*f. Explicit intensional operators can be seen in utterances (8), (9) & (10). In these examples they occur with a second type of intensional operator. This is the covert or implicit intensional context. A number of scholars including Richard Montague recognize the ubiquity of intensionality in natural language. His programme for natural language semantics required a general intensional characterization of the whole language. He did, however, feel that certain elements in language are genuinely extensional. The point to notice is that in the system presented here natural language use always involves implicit intensional operators, thus creating a total blanket of intensionality over language. Implicit intensional contexts arise out of the fact that all 'natural' language use has an origin. It is generated by someone. This amounts to the covert intensional operator 'X says that "..."', 'X writes that "..."' etc. Natural language also has a destination. It has percipients. This supplies another covert intensional context: 'Y hears that "..."', 'Y reads that "..."' etc. Examples (8), (9) and (10) all exhibit these implicit as well as the explicit intensional operators, whilst (6) presents us with a case in which only the implicit intensional context is present.
(8) Sam says to Nigel:
"Robert says your girlfriend just rang."
(9) Sam says to Nigel:
"Robert thinks he saw your girlfriend."
(10) Sam says to Nigel:
"Robert wants to invite your girlfriend out."
I would argue that many of the classic ambiguities of intensional contexts are only the most obvious cases of a much more widespread referential problem. The fact is our descriptions have to be indexed from a point of origin to a point of percipience to make clear who said what to whom. And even when this is achieved a nominal expression used by one processor may not refer to the same referent as it does for another processor (as in some of the configurations of (6)). In this respect all generated or interpreted language has opaque possibilities.
6. Descriptions and referential construal
As an example consider the exchange in (11a)-(11c). It is an exchange in which a logical analysis finds no overt intensional operators, yet opacity is possible. In (11a) Nigel assumes that the description 'Steve's girlfriend' evokes in Alec's mind a knowledge structure which refers to the same object in the world as his own. Alec's reply at (11b) indicates a reciprocal assumption by Alec.
(11a) Nigel says to Alec:
"Do you have an address for Steve's girlfriend?"
(11b) Alec says to Nigel:
"Yes, it is 37 Great King Street."
(11c) Nigel says to Alec:
"Thanks."
Indeed since (11c) indicates that an exchange of information has taken place, both processors in the context of the dialogue of (11) assume they have the same referent 'in mind'. This kind of assumption is a manifestation of one of the Conversational Maxims that operate in language. Such maxims consist in various general principles of cooperative behaviour and have been proposed by scholars such as Grice (1975). One of these appears to be a principle of Local Interpretation, cf. Lewis (1969), Tannen (1979), Brown & Yule (1983). Local Interpretation arises in a number of situations. The relevant one here occurs when conversation is initiated and a referential act performed. Unless the hearer gives information to the contrary, discourse proceeds on the assumption that the participants have the same referent in mind.⁶ So in the sample dialogue (12) the state after (b) implies that both have the same Pub in mind. Subsequently, evidence might reinforce or discredit this assumption. These two possibilities are reflected in the two ways the conversation might be continued, (c) or (d).
(12) a. Nigel: "Have you been to the Pub on Buccleuch Street?"
b. John: "No, but I heard it sells Real Ale."
c. Nigel: "Yes, and very good it is too."
OR
d. Nigel: "Oh, you're thinking of Proctors Bar."
Returning to (11) let us consider the possible set of referential configurations, given as Figures E I - E II. The configurations are again presented from the Omniscient Observer perspective.
[Fig. E I. Omniscient Observer representation of (12). Processor 1 (Nigel) and Processor 2 (Alec) each use 'Steve's g.' to reach a PDO in their own PDM (NPDM, APDM), and each holds an SDO in their SDM (NSDM: SDO P1/P2PDM; ASDM: SDO P2/P1PDM). One REFERENT is shown. State space (P1, P2): (1, 1), (0, 0).]
[Fig. E II. Omniscient Observer representation of (12). The same layout as Fig. E I, but with two REFERENTs shown. State space (P1, P2): (1, 1), (1, 0), (0, 1), (0, 0).]
For all the configurations, E I - E II, the COPY mapping in the two processors is never constrained. This is because at the stage in the dialogue at which these representations are taken neither processor doubts the appropriateness of the description and, recalling the principle of Local Interpretation discussed in previous paragraphs, they both assume that the other has referred to the same external referent. In E I both participants are represented as having the same referent mapped to by their respective REFER mappings. But there are two states of affairs represented in E I: one where the description 'Steve's girlfriend' is 'true', so both REFER mappings would be marked 1, and another where the description is not 'true' and both REFER mappings would be marked 0.
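The REFER markings open to a configuration follow mechanically from whether the two processors share a referent. The sketch below is illustrative only: the function name `refer_states` is an assumption, and it covers just the cases, such as E I and E II, in which neither processor disputes the description.

```python
from itertools import product


def refer_states(same_referent):
    """List the (P1, P2) REFER markings open to a configuration.

    1 means the description holds of that processor's referent and 0
    that it does not, as judged from the Omniscient Observer position.
    """
    if same_referent:
        # One shared referent: the two marks cannot come apart.
        return [(1, 1), (0, 0)]
    # Distinct referents: each mark varies independently.
    return list(product((1, 0), repeat=2))


print(refer_states(True))   # shared referent, as in E I
print(refer_states(False))  # distinct referents, as in E II
```

With distinct referents the two marks are independent, which is why four states of affairs arise rather than two.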
Despite the assumption of Local Interpretation by the processors, the configuration in E II indicates four states of affairs possible when the processors are referring to different objects: one in which the description can be said to be 'true' of both the referents; one in which the description is 'true' only of the object referred to by Nigel; another in which the description is 'true' only of the object referred to by Alec. Finally a situation is represented where both assumptions made by the processors turn out to be wrong: they are referring to different objects, neither of which is at that moment Steve's girlfriend.
7. Doing without Omniscient Observers
In talking, as I did extensively in the last example, about the Omniscient Observer perspective and the 'truth' of the ascription of properties and descriptions to objects there is a problem. In the descriptive system it is manifested in marking the REFER mapping 1 or 0. This representational device implies that it is useful and indeed possible to incorporate an absolute view about the veracity of descriptions and predications. But we do not have the ability to assume a 'God's eye view'. We cannot escape our own subjectivity and fallibility. Some philosophers even agree that the notion of 'absolute truth' holding independent of observers is incoherent. Without wishing to get drawn into the controversies of metaphysics I think it is undeniable that 'the veil of perception' is a part of our nature. The 'veil of perception' is the claim, advanced by Locke (1690), that we can never know objects directly, but only through the representations in our minds. What we think an object and its qualities to be can be confirmed, refuted or modified by our 'observations', but what we in the end say of the object is still what we think and represent it to be. In this sense the human mind can never escape from itself. If a God's eye view of the world is possible at all, it is possible only to God.
At the beginning of the paper I quoted Johnson-Laird. He argued that much of language understanding is best conceived as constructive processing: building models of discourse. At the end of his paper he leaves us with a philosophical worry closely tied up with the realization that the veil of perception can never be torn aside. As Intentional Systems who use language, build models and generate descriptions of the world, we encounter intensionality. There are no irrefragable Omniscient Observers able to look inside processors' heads and determine with certainty what is true of them, their descriptions, predications and the world. No veridical 'God's eye view' exists for cognitive processors. We as humans cannot stand outside ourselves as Intentional Systems. Everything we have is a constructed model of reality, a model our various modalities endeavour to construct.
This suggests an interesting possibility: that we should dismantle the Omniscient Observer representation and make do with our Processor-Centric representations alone. We would have only representations that embody how the processor sees things. I shall now attempt to show how this might be done.
In what follows we have to move beyond the initial utterance of a discourse. And as soon as we do, the process-dependent nature of referential phenomena becomes even more striking. For example, the history of a piece of discourse becomes a crucial element in referential assessment. Lexical strings can change their referents, at one moment referring to one object, at another to some other object. I hope these points will become self-evident as we attempt to reanalyse (6) & (11) solely within a P-C representation.
[Fig. F I. Processor-centric representation of (6). Within Sam's general processor state, 'BARRY' has an ACCESS mapping into SPDM, and a COPY mapping leads to S/RPDM within SSDM.]
If we take the instigator of the referential act as the source of the P-C representation, then to canvass the other referential possibilities revealed in our OO representation the processor must suspect that his referential act has misfired. So, in the case of (6), until Sam receives information that causes him to think things have gone wrong he maintains the standard configuration of Figure F I, which represents supposed successful reference. In Figure F I we have an ACCESS mapping to a PDO (s1) linked to a knowledge structure (set of beliefs) representing the object Sam intends to refer to. There is also a COPY mapping (from a PDO s1 to an SDO sr1) fulfilling exactly the same role as in the OO representation. In successful reference this mapping is not constrained for the novel information 'Barry is wearing a tie' contained in (6a).
But imagine Robert replying to (6a) with (6b). This is a clear indication that Sam's referential act has not had the intended effect on Robert:
(6a) Sam says to Robert:
"Barry is wearing a tie."
(6b) Robert says to Sam:
"Which Barry?"
[Fig. F II. Processor-centric representation of (6). Within Sam's general processor state, 'BARRY' has an ACCESS mapping to s1 in SPDM; S/RPDM within SSDM is shown without an SDO or COPY mapping.]
What has happened here? Well, suppose the knowledge associated with the DO s1 in Figure F I includes some of the following of Sam's beliefs: that the intended referent's name is Barry, that he is a student and that he is currently and 'remarkably' wearing a tie.⁷
Sam has another set of beliefs about an individual: his name is also Barry, he is married and is Director of the School of Epistemics. This other Barry has not yet appeared in Sam's discourse model but he is known about by Sam. Robert's reply in (6b) lets Sam know that Robert has more than one candidate for the referent of Sam's referential act in (6a). Sam could check for himself that alternative candidates are possible, but before he supplies additional disambiguating information a configuration represented by F II holds. Here the SDO sr1 that Sam had placed in the area of his SDM indexed as S/RPDM (Sam's view of Robert's Primary Discourse Model) is removed and the COPY link retracted. We move from the configuration F I after (6a) to F II after (6b). There has been a change in the processor's appreciation of the referential configuration as the discourse proceeds.
Another situation where referential success might be called into question occurs if Robert replies as in (6c).
(6a) Sam says to Robert:
"Barry is wearing a tie."
(6c) Robert says to Sam:
"His wife must have made him."
Let us assume that Sam has the same beliefs about the two Barrys I discussed in the last example. Robert's reply in (6c) might prompt Sam to one of two replies. The first reply might be something along the lines of "I didn't know he was married". This reply is a reflection of Sam's ignorance, in which case he could add to his knowledge of 'Barry the student' and still assume they were talking about the same person. Thus the configuration represented by Figure F I would be maintained. Only when remarks produce direct inconsistency with the beliefs Sam has about the Barry he is talking of will Sam suspect Robert of having another Barry in mind. If he finds on re-checking that there is another Barry of whom it would be:
(a) reasonable that the predicate associated with (6a) should be remarkable,
(b) consistent with what Sam knows about this other Barry to suppose Robert could think it the referent of Sam's referential act,
then Sam would be forced to consider a reanalysis. In this case a number of moves are required to represent the reanalysis Sam has to perform. Descriptively our P-C representation moves from a configuration as in F I to one expressed by F III. The P-C representations have mirrored the required processing in the following way. The realization that they are talking about different people requires that the link between PDO s1 and SDO sr1 is removed and sr1 relabelled, say, sr9.
What the SDO retains is the property of having a proper name Barry. Sam now has to find a PDO to relate to this new SDO sr9. He could create one 'de novo', knowing nothing about this new 'Barry', or he could search for a candidate he already knows about and who he knows Robert already knows about. In any event a type of backwards COPY has been performed, and the DO s9 is now able to be ACCESSED by the nominal 'Barry'.
[Fig. F III. Processor-centric representation of (6). Within Sam's general processor state, 'BARRY' has an ACCESS mapping into SPDM, and the SDO sr9 appears in S/RPDM within SSDM.]
[Fig. F IV. Processor-centric representation of (6). Within Sam's general processor state, 'BARRY' has an ACCESS mapping into SPDM, and a constrained COPY (+con) mapping leads to sr1 in SSDM.]
If Robert disputes the claim made by Sam in (6a) we get into another range of possibilities (for example if Robert replies to (6a) with (6d)). One move open to Sam is to constrain the COPY link he has established to sr1 (Sam's view of Robert's view of Barry). The configuration then moves from the one represented by F I to F IV. If the argument continued and Sam was convinced he was right then he might suspect they were talking about different people. Applying a check for the appropriateness of his remark to another Barry, Sam could be forced to move from F IV to F III. The COPY would not be constrained from s9 to sr9 since I assume Sam accepts Robert's claim as authoritative with regard to this other Barry.
(6a) Sam says to Robert:
"Barry is wearing a tie."
(6d) Robert says to Sam:
"I'm sure he isn't, I've just seen him."
We can now set the OO representations into a correspondence with our P-C representations. Some of the P-C representations pass through a number of stages to capture the full content of the OO ones.
OO REPRESENTATION    P-C REPRESENTATION
D I                  F I
D II                 F I → F II
D III                F I → F IV
D IV                 F I → F III
D V                  F I → F IV → F III
There are a number of points to notice in this reanalysis of (6). Although Sam might recognize his referential act as misfiring, his intention to refer to a certain object does not change; only his discourse model changes. When he detects a possible incongruence he may try and engineer Robert's model back to the one he originally intended to produce, perhaps by uttering something like "I meant the student, not the Director". Secondly, I have only concentrated on the nominal lexical material Barry, not the expressions his wife or him. I have not in this paper been able to consider how and where anaphora has occurred. It is clear that this is an important missing element in the system so far presented. Thirdly and perhaps most interesting of all, I have deliberately made no mention of any mapping into any object external to the processor. There is some support for this view of reference as primarily concerning models and representations and only secondarily the world. It is in this sense that I have not bothered to canvass the set of possible configurations resulting from the REFER tables of the OO representations. In the first place communication is not just gratuitous description of the world. Natural conversation seeks to discover, contrast, alter and build 'views of the world'. Communication is usually about the models of the world people have. But the deeper reason for seeing language and reference as primarily concerning representations and models derives from the 'veil of perception' position, a position I have outlined and generally subscribe to. It suggests an epistemological dualism committed on the one hand to a world of representation to which all human thought and experience is restricted, and on the other to a quite separate mind-independent world somehow causally related to the first.
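The moves between P-C configurations discussed for (6b), (6c) and (6d) can be read as a small state machine driven by the hearer's replies. The following is a minimal sketch under stated assumptions: the event names (`asks_which`, `disputes_claim`, `different_referent`) are hypothetical labels invented for this illustration, and the table records only the moves described in the text.

```python
# Hypothetical event names for the kinds of reply seen in (6b)-(6d).
TRANSITIONS = {
    ("F I", "asks_which"): "F II",           # (6b): hearer finds no unique referent
    ("F I", "disputes_claim"): "F IV",       # (6d): COPY becomes constrained
    ("F I", "different_referent"): "F III",  # (6c): Sam suspects another Barry
    ("F IV", "different_referent"): "F III",
}


def run(events, state="F I"):
    """Return the sequence of configurations Sam passes through.

    An event with no entry in the table leaves the configuration unchanged.
    """
    path = [state]
    for event in events:
        state = TRANSITIONS.get((state, event), state)
        path.append(state)
    return path


print(" -> ".join(run(["disputes_claim", "different_referent"])))
```

Every path starts from F I because the speaker assumes successful reference until a reply gives him reason to think otherwise.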
To complete the dismantling of OO representations, I want to reanalyse (11) within the P-C representation to show how it deals with descriptions. Again alternative configurations to successful reference (Figure G I) will only be apparent in this type of representation when the processor has information that discredits his assumption of referential success.
[Fig. G I. Processor-centric representation of (12). Within Nigel's general processor state, 'STEVE'S GIRLFRIEND' has an ACCESS mapping into NPDM, and a COPY mapping leads into NSDM.]
[Fig. G II. Processor-centric representation of (12). Within Nigel's general processor state, 'STEVE'S GIRLFRIEND' has an ACCESS mapping into NPDM; NSDM is shown without a COPY mapping.]
Certainly a configuration such as G II seems ruled out. After (11c) Nigel cannot suppose that his use of the description Steve's girlfriend has failed to refer to some object for his hearer. But it could turn out that, as discourse unfolds, Nigel comes to suspect that Alec is thinking of a different person. To present the required scenario suppose that all Nigel knows about Steve's girlfriend is that she is an air hostess. Imagine (11) continuing as below:
(11a) Nigel says to Alec:
"Do you have an address for Steve's girlfriend?"
(11b) Alec says to Nigel:
"Yes, it is 37 Great King Street."
(11c) Nigel says to Alec:
"Thanks, do you know which airline she works for?"
(11d) Alec says to Nigel:
"She doesn't, she's a nurse!"
Alec's reply at (11d) stands in direct contradiction to Nigel's belief about the person he thinks is Steve's girlfriend. If he accepts Alec as authoritative he will have to make one of the following moves:
(i) Change his belief that n1 is an air hostess, in which case the configuration represented in G I will still hold.
(ii) Accept the possibility that two people are being talked about and that Steve is going out with both of them, in which case we move from configuration G I to G III.
(iii) Accept the description as holding of another object than the one first claimed; here we move from G I to G IV.
[Fig. G III. Processor-centric representation of (12). Within Nigel's general processor state, 'STEVE'S GIRLFRIEND' has an ACCESS mapping into NPDM, which contains n1 and n9; a COPY mapping links NPDM and NSDM.]
[Fig. G IV. Processor-centric representation of (12). Within Nigel's general processor state, 'STEVE'S GIRLFRIEND' has an ACCESS mapping into NPDM, which contains n1; a COPY mapping links NPDM and NSDM.]
The moves required in terms of our P-C representation for the cases described in (ii) and (iii) are quite complex. In both cases the property of being Steve's girlfriend possessed by n1 and na1 is transferred from na1 to na9. The SDO na9 is created on the ground that Nigel realizes Alec is talking about someone else; na1 is removed and the COPY link retracted. Now there has to be a counterpart to na9 in Nigel's PDM. This is achieved by a backwards COPY. This COPY will not be constrained for the description Steve's girlfriend, since we are accepting Alec as authoritative. So in both cases (ii) and (iii), represented by G III and G IV respectively, we can in the future use the description Steve's girlfriend to directly ACCESS n9.
Notice in G IV that the original PDO n1 cannot be ACCESSED by the description any longer by Nigel. But although the isolated DO in G IV is an object apparently adrift in the Discourse Model, it can be useful. Indeed if Nigel replied to (11d) with (11e) we see strongly the need for exactly such a DO to refer to.
(11e) Nigel to Alec:
"I was thinking of the wrong person."
There are parallels to situations (i) and (ii) where Nigel does not take Alec as authoritative. So we can imagine Nigel
(i') changing his view of na1, the COPY link being constrained for the information about occupation whilst he retains his view of n1. Hence G I will stay the same except for this constraint on COPY.
OR
(ii') Nigel might accept the possibility that two people are being talked about but he only believes his view of who Steve's girlfriend is. However he appreciates that Alec has another view on the matter, so na9 may have the description 'Steve's girlfriend' attached to it but its associated n9 will not. We move from G I to G III where the backwards COPY is constrained.⁸
So what of the 'real truth' or not of the description with respect to the various objects being referred to? Again the interesting fact is that in examples like (11) on a P-C view the description's success in referring depends on the processors' assent or dissent to the application of the description with respect to a representation. What is the case in the world is only secondary to the model-centered view. If neither of them supposes their descriptions are going awry they will imagine that communication has been perfectly successful. Their linguistic behaviour will proceed as if their models matched and reflected exactly how the world is. Other information from the world might eventually find them out, but in terms of the models built on the basis of discourse it all comes down to what the representations inside the processors contain. In this respect we see a big difference in the OO representation and the P-C representations of (11). There is, in the P-C perspective, no concern with the possible ways descriptions are in reality true of the objects referred to.
8. Concluding remarks
I have been talking about the problems of successfully referring to objects and it is worth considering what these objects are. I have made the point that primarily linguistic reference operates on representations. These are the first objects of reference. Only secondarily does the world get hooked onto these representations. But not all representations map into objects in the world cleanly if at all. We might expect, then, that the range of referents open to language is enormous, and so it is.
In (13) I might be questioning a schizophrenic about his distorted beliefs and world views. The existence of the objects has no influence on the linguistic constructions themselves.
(13) Psychiatrist to Patient:
"What does the sword of God think about the crisis?"
Nominals can function in many different ways, effectively evoking many different knowledge structures. Thus in (14)-(16) we have the nominal the book referring to a physical token, a sub-class of a type and the type respectively, although in (16) the nominal could be ambiguous between all these readings.
(14) The book is on the shelf.
(15) The book was finally published by OUP.
(16) The book has shaped history.
Matters can be even more complex than this. Nunberg (1978) describes the range and subtlety of polysemy in language; I am indebted to both his ideas and his examples in this matter. Consider the nominal radio in (17)-(20). In (17) it is used to refer to a physical object. In (18) it refers to a method of transmission. Radio can also be used to refer to an industry as in (19). And there is a use like (20), where reference is to the quality of the product commonly transmitted over radio sets.
(17) John bought the radio.
(18) They got the news by radio.
PROCESSING REFERENCE
(19)
He made a pile out of radio.
(20)
Radio has gone downhill since TV came in.
There are also intriguing cases where the nominal can simultaneously be construed in two ways in an utterance. In (21) & (22) the nominals seem to be construed as both token and type, as in (23), which was heard at Wimbledon 1982.
(21)
The chair you are sitting in is commonly seen in Eighteenth Century interiors.
(22)
The newspaper you are reading has come out against hanging.
(23)
Mark Cox: "Connors has one of the best overheads in the game and he has missed two already."
Speakers' referential ability extends to refer to the objects language is composed of, producing self-reference to various levels of linguistic structure.
(24)
This sentence contains five words.
(25)
'Plato' has five letters.
(26)
'Plato' has six letters in French.
(27)
'Plato' begins with a stop.
(28)
'Shit' may be obscene, but it is easy to say.
All these examples indicate how wide may be the range of information associated with nominals in language. By using a nominal we can refer to all these types of information. What is maintained in all these polysemous nominals is their ability to introduce discourse elements which serve as objects that can be subsequently referred back to. This is part of the reason for my introducing the notion of the Discourse Object, which is ontologically neutral.
The proposals in this paper adopt a different perspective on some classic problems of reference. A crucial element is the treatment of intensional operators as instructions to delimit and separate processors' models of discourse. There is a sense in which the intensional operators have a scopal content, controlling in part which discourse models descriptions can be imported from and exported to. But the proposals contain three elements providing a much richer system than traditional logical accounts:
(1)
introducing covert or implicit intensional operators;
(2)
seeing the origin and point of percipience of nominals as crucial;
(3)
seeing the interpretation of nominals in various discourse models as determined by state, the general state of the processors, the beliefs and knowledge processors have about themselves and other processors.
Whilst other formalisms do not rule out the considerations in (2) & (3), neither do they highlight them. I hope to develop the descriptive system, with its process bias, into a program that models in its own turn certain aspects of the cognitive processes involved in understanding reference, and ultimately perhaps to see the construction of discourse models as rather more than a surrogate for reality.
Edinburgh University Department of Linguistics 15 Buccleuch Place Edinburgh EH8 9NW
Notes
1 Semantic meaning is taken to be the content of individual lexical items and the contribution of these parts of the sentence to the meaning of the whole sentence.
2 Artificial intelligence systems already exhibit a degree of intensionality. At Edinburgh University the Mecho project consists of a set of programs some of which represent the micro-world (applied mechanics) the system knows about. Objects in the micro-world are represented as sets of descriptions. In such a system, if a description is used to talk about an object and the system does not know of the object under this description, we are in the realm of the type of opacity problem exemplified in (4).
3 I am grateful to an anonymous reviewer who pointed out to me what I had failed to make explicit, namely that the view of the lexical material buffer I present is extremely oversimplified. It is certainly not the case, at least in interpretation, that all our interpretative processing is from the bottom up. We often reconstruct the sound we hear in terms of what we expect to hear (cf., for example, Miller & Isard 1963, Warren & Warren 1970).
4 Recently Han Reichgelt and I have grafted additional machinery onto the system, partly because we realized that there is a fundamental asymmetry between speakers and hearers, a realization precipitated by a problem in my analysis of this particular referential failure. It is odd to suppose that in the case of Barry not uniquely specifying an individual, Robert should do nothing. He at least knows that Sam intends to refer to someone. There should therefore be an SDO generated in R/SPDM, as in Figure D, even if it fails to point to any knowledge structure or potentially points to too many.
[Fig. D. Diagram not reproduced. Recoverable labels: STATE SPACE; REFERENT; PROCESSOR 1 (SAM): LMB 'Barry', SPDM, SSDM, SDO; PROCESSOR 2 (ROBERT): LMB 'Barry', RPDM, RSDM, R/SPDM, SDO.]
That a DO is available to Robert is evidenced by possible replies he can make to Sam:
(6b)
Sam says to Robert:
"Barry is wearing a tie."
Robert says to Sam:
"Which Barry?"
(6c)
Sam says to Robert:
"Barry is wearing a tie."
Robert says to Sam:
"Who?"
So even though Robert may have no pre-existing knowledge about a 'Barry', or indeed too many knowledge structures, he does have a unique DO which on this occasion simply fails to point back successfully to a knowledge structure. In a forthcoming paper Reichgelt and I suggest that the canonical form of referential uptake by the hearer reflects the same asymmetry between speaker and hearer (Figure C). We provide supporting evidence from the use of indefinite noun phrases and definite descriptions. It indicates a rather more complex dynamic interchange between speaker and hearer roles than is implicit in the system presented here. However, we attach the additional machinery and analyses to essentially the present system without too much disruption of the explanatory machinery herein.
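The central point of this note, that a hearer holds a discourse object even when reference fails, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions and not the author's program: the class and function names (KnowledgeStructure, DiscourseObject, take_up, reply) are hypothetical, matching a description by name alone is a simplification, and the reply for successful uptake is invented for the example. Only the two query replies come from (6b) and (6c).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KnowledgeStructure:
    """What a processor already knows about one individual (hypothetical)."""
    name: str
    facts: List[str] = field(default_factory=list)


@dataclass
class DiscourseObject:
    """A DO is created on every uptake, whether or not it resolves."""
    description: str
    candidates: List[KnowledgeStructure] = field(default_factory=list)

    @property
    def pointer(self) -> Optional[KnowledgeStructure]:
        # The DO points back only when exactly one structure fits.
        return self.candidates[0] if len(self.candidates) == 1 else None


def take_up(description: str,
            knowledge: List[KnowledgeStructure]) -> DiscourseObject:
    """Hearer's uptake: always yields a DO (assumed: match by name only)."""
    matches = [k for k in knowledge if k.name == description]
    return DiscourseObject(description, matches)


def reply(do: DiscourseObject) -> str:
    """Choose a reply from the state of the DO's pointer."""
    if do.pointer is not None:
        return "Is he?"                    # invented reply for success
    if not do.candidates:
        return "Who?"                      # as in (6c): no structure at all
    return f"Which {do.description}?"      # as in (6b): too many structures
```

In this sketch both failure cases still return a DiscourseObject, which is what allows the hearer to frame a query about it at all.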
[Fig. C. Diagram not reproduced. Recoverable labels: STATE SPACE; REFERENT; PROCESSOR 1 (= speaker): LMB, P1SDM, P1PDM, P1/P2PDM; PROCESSOR 2 (= hearer): LMB, P2SDM, P2PDM, P2/P1PDM.]
5 This amounts in A.I. parlance to the concept of a truth maintenance system. Truth maintenance attempts to make sure that information is added to and taken away from a knowledge base without producing inconsistency. This becomes more of a problem when multiple knowledge bases are allowed to pass information to and from each other, and this is exactly the situation we have in the inheritance of beliefs by embedded models within processors.
6 The operating principle is not always followed or applied by processors even at the outset of conversation. In some contexts, certain (especially comparative) definite descriptions are understood to refer to different referents for the different processors. A good example of this occurs in the dialogue fragment below.
Celtic Fan:
"That is the best team in the league out there."
Rangers Fan:
"Yes, it is."
7 I stress 'remarkably' because, as I have said throughout, communication is not just describing the world. Often, as in this case, it is conveying information the speaker has reason to believe the hearer will find novel.
8 This configuration is interesting since it represents a well known situation much discussed in the literature (Donnellan 1966, Perrault & Cohen 1981, Johnson-Laird & Garnham 1982). Imagine Nigel and Alec at a party. They watch together as water and vodka are poured into two identical glasses, and the water is given to Louisa, while the vodka is given to Gill. Unbeknownst to Alec, Nigel sees Louisa and Gill exchange glasses. Later Nigel says to Alec: "The woman with the vodka is Sam's wife." Here Nigel is referring to an object with a description he knows not to be true of it. Nevertheless he realizes successful reference will occur since his view of Alec's view of Louisa allows such a description to be appropriate.
References
Allen, J.F., 1979: A plan-based approach to speech act recognition. Technical report 137, Dept. of Comp. Sci., University of Toronto.
Allen, J.F. & Perrault, R.C., 1979: Analysing intention in dialogue. Technical report 50, Dept. of Comp. Sci., University of Rochester.
Brown, G. & Yule, G., 1983: Discourse Analysis. Cambridge University Press, Cambridge.
Dennett, D.C., 1978: Brainstorms: philosophical essays on mind and psychology. Bradford, New York.
Donnellan, K., 1966: Reference and definite descriptions. The Philosophical Review 75; 281-304.
Fauconnier, G., 1979: Mental spaces - a discourse processing view to natural language logic. Unpublished paper, Universite de Paris VIII.
Fodor, J.D., 1970: The linguistic description of opaque contexts. Doctoral dissertation, MIT.
Garrod, S.C. & Sanford, A.J., 1982: The mental representation of discourse in a focussed memory system: implications for the interpretation of noun-phrases. Journal of Semantics 1; 21-42.
Grice, H.P., 1957: Meaning. Philosophical Review 66; 377-388. Reprinted in P.F. Strawson (ed.), Philosophical Logic. Oxford University Press, Oxford.
Grice, H.P., 1968: Utterer's meaning, sentence meaning and word meaning. In J.R. Searle (ed.), The Philosophy of Language. Oxford University Press, Oxford.
Grice, H.P., 1975: Logic and conversation. In Cole & Morgan (eds.), Syntax and semantics, vol. 3. Academic Press, New York.
Grosz, B.J. & Hendrix, G., 1978: A computational perspective on indefinite reference. Paper delivered at Sloan workshop on indefinite reference, UMASS.
Grosz, B.J., 1979: Utterance and objective: issues in natural language communication. Technical note 188 SRI.
Johnson-Laird, P.N. & Garnham, A., 1980: Descriptions and discourse models. Linguistics and Philosophy 3; 371-393.
Johnson-Laird, P.N., 1981: Comprehension as the construction of mental models. Philosophical Transactions Royal Society London B 295; 353-374.
Kamp, H., 1981: A theory of truth and semantic representation. In J. Groenendijk, T. Janssen & M. Stokhof (eds.), Formal methods in the study of language. Mathematisch Centrum, Amsterdam.
Karttunen, L., 1976: Discourse referents. In J.D. McCawley (ed.), Syntax and semantics, vol. 7. Academic Press, New York.
Lewis, D., 1969: Convention: a philosophical study. Harvard University Press, Cambridge, Mass.
Locke, J., 1690: An essay concerning human understanding.
Miller, G.A. & Isard, S., 1963: Some perceptual consequences of linguistic rules. Journal of Verbal Learning and Verbal Behaviour 2; 217-228.
Montague, R., 1970a: Pragmatics and intensional logic. In R.H. Thomason (ed.), Formal Philosophy: selected papers of Richard Montague. Yale University Press, New Haven.
Montague, R., 1970b: English as a formal language. In R.H. Thomason (ed.), Formal Philosophy: selected papers of Richard Montague. Yale University Press, New Haven.
Montague, R., 1970c: Universal grammar. Theoria 36; 373-398.
Nunberg, G.D., 1978: 'The pragmatics of reference' reproduced by Indiana University Linguistics Club, Indiana.
Partee, B.H., 1972: Opacity, coreference and pronouns. In D. Davidson & G. Harman (eds.), Semantics of natural language. Reidel, Dordrecht.
Partee, B.H., 1974: Opacity and Scope. In M.K. Munitz & P.K. Unger (eds.), Semantics and philosophy. New York University Press, New York.
Perrault, C.R. & Cohen, P.R., 1981: It's for your own good: a note on inaccurate reference. Technical report 4723 BBN. Reprinted in A. Joshi, B. Webber & I. Sag (eds.), Elements of Discourse Understanding. Cambridge University Press, Cambridge.
Searle, J.R., 1969: Speech acts: an essay on the philosophy of language. Cambridge University Press, Cambridge.
Seuren, P.A.M., 1982: The construction of discourse domains through accumulated increments. Paper delivered at Edinburgh Conference on language, reasoning and inference.
Sidner, C.L. & Israel, D.J., 1981: Recognizing intended meaning in speaker's plans. Draft paper BBN.
Stenning, K., 1977: Articles, quantifiers and their encoding in textual comprehension. In R.O. Freedle (ed.), Discourse processes: advances in research and theory. Ablex, Norwood, New Jersey.
Strawson, P.F., 1950: On referring. Mind 59; 320-344. Reprinted in A.G.N. Flew (ed.), Essays in conceptual analysis. Macmillan, London.
Tannen, D., 1979: What's in a frame? Surface evidence for underlying expectations. In R.O. Freedle (ed.), New Directions in Discourse Processing. Ablex, Norwood, New Jersey.
Waismann, F., 1968: Verifiability. In G.H.R. Parkinson (ed.), The theory of meaning. Oxford University Press, Oxford.
Warren, R.M. & Warren, R.P., 1970: Auditory illusions and confusions. Scientific American 223; 30-36.
JOURNAL OF SEMANTICS
SEPTEMBER 1983 VOL. II - NO. 2
Special Issue on Semantics and Intonation edited by Peter Bosch and Leo G.M. Noordman
EDITORS' PREFACE
Until not too many years ago questions of intonation, and of prosody more generally, occupied at best a marginal place in the disciplines concerned with natural language. And among the topics that were discussed the relationship between intonation and meaning occupied an even more marginal position. If we set one or two notable exceptions aside, one seems to have believed that intonation was a matter of adding some colour to an otherwise dull message, perhaps also of highlighting some of its aspects. But then, the notion of 'meaning' was not in any better shape, and it is not clear whether it would have made much sense to even ask any such question as what the relation between semantics and intonation would be.
This situation has changed. The last twenty years have seen a fast development in natural language semantics, with increasingly more and increasingly explicit models of what 'meaning' in natural language may be taken to be, and these models have tried to incorporate an increasing number of parameters. One of the natural steps was to look again at intonation.
Independently of the development in semantics, there have also been notable developments with regard to prosody, which can no longer be said to be of only marginal interest among the sound aspects of language, overruled by a main interest in segmental phenomena. The last few years have witnessed probably more conferences, publications, and research proposals on prosody than many decades before taken together. Intonation, within these developments, has been one of the notions of a more central interest. Hence, also in this context it was natural to ask again about possible functions intonation parameters may have for meaning or meaning parameters may have for intonation.
Considering also the interdisciplinary interest in these developments in semantic and phonetic research, we thought the Journal of Semantics should take a look at results of recent research on the relation between semantics and intonation and at the direction developments are taking. This is what we are trying to do with this special issue.*)
Leo Noordman, Peter Bosch
*) Replies, squibs, and discussion notes linking up to matters raised in this issue are particularly invited and will be printed in subsequent issues of the Journal.
JOURNAL OF SEMANTICS, vol. 2, no. 2
WHERE DOES INTONATION BELONG?
Dwight Bolinger
Abstract
Though intonation has many ties to the central arbitrary manifestations of human language - to syntax, phonology, and to some extent lexicon - its most intimate connections are with the general scheme of iconic nonverbal communication, particularly the now spontaneous, now simulated or ritualized, gestures of the face, head, hands and body. Its meanings are based on inferences from concepts of up and down - often associated with actual up-down movements in other parts of the gestural complex - plus metaphorical extensions of those concepts. Supposed grammaticizations are contradictory unless seen as intersections of two relatively autonomous systems: word-based language, and intonational and physical gesture. The study uses evidence from the intonation of English.
JOURNAL OF SEMANTICS, vol. 2, no. 2, pp. 101-120
"We need to pay more attention to our biological endowment" (Bailey 1982: 23)
An obvious answer to the question in my title - which answers everything and therefore nothing - is that intonation belongs wherever people have a use for it. 1 It belongs in syntax, because it helps to mark the start and finish of stretches of speech such as clauses and sentences. It belongs in pragmatics because it is the best audible cue we have as to what a speaker is doing with his utterance. It belongs in psychology because it gives a running account of emotion and counts among the symptoms of certain brain disorders. It even belongs in music, because tune and lyric ought to stay close together if they are not to be self-contradictory. But where does intonation really belong? What association can we make that will tell us most about its nature and serve best to illuminate the other associations?
Linguists have come to this question with a severe case of occupational centrism, and have tended simply to take the role of intonation for granted, as a handmaiden to what their chief concerns in language happened to be. Those who had invested part of their careers in tone languages were prone to see intonation as a set of levels that did for sentences what Chinese tone does for Chinese morphology. Those with a stake in more traditional approaches to grammar saw in it the framework of utterance types like questions and statements, and tended to describe it in melodic terms - speech tunes were the units. Transformational grammarians, concentrating on how to predict the form of sentences, reduced intonation to a set of rules for stress and for a long time ignored the melodic side altogether.
It is clear that no coherent view of intonation could develop from these partial approaches that look upon it as a sideshow to supposedly more interesting questions in linguistics. It is also clear that a realization of the fragmentary nature of the approaches would dawn eventually and enable intonation to come into its own as an autonomous field. We are at that point now, with wide-ranging publication and one symposium after another, in various parts of the world. My purpose here is to identify what I believe is the most promising springboard from which to make the leap toward the various fields and applications to which intonation is pertinent.
First a glance at some of the evidence coming out of neurolinguistics and allied research. This has been recently summarized by R.D. Kent (1982) and is pretty persuasive in the connections it makes between intonational disorders and damage to the right hemisphere of the brain. The most significant tie is the one between intonation and emotion, which is close enough to lead investigators to refer to "affective and prosodic processing" in the right hemisphere. The same patients who have trouble with intonation also have trouble with emotional gesturing. And we learn from Ekman's brief summary (1982: 172-173) that recognition of faces is better when emotion is displayed than when the faces are affectively neutral.
(More later on the connection with faces.) The relationship of the articulate parts of language with the opposite hemisphere has been known for more than a century - the delicate phonemic contrasts are localized, as far as anything can be strictly localized, in the left hemisphere, and it is at least suggestive that intonation should be mainly on the other side, with communicative functions that seem to relate to emotion.
Next a curious piece of evidence from psychology. It is well known that young children are able to produce contrasts in pitch that attach to 'new' information and that adults interpret as normal accentuation. Weeks (1982: 165) reports a study of five children aged 1:9 to 2:5 all of whom managed this well. Cutler and Swinney (1980) have found the same, but their word-recognition tasks produced the quite unexpected finding that those same children were not helped in performing the task by the presence of the normal intonational accent in the test sentence. They did just as well without it, in telling the difference, for example, between a sentence like The nurse brought a CLEAN towel and took away the DIRTY one and one like The nurse brought a clean TOWEL and took away the dirty one. It seems that production precedes recognition, the opposite of what is usually expected, and one wonders why.
I think we can guess. The child knows the meaning of the sentences and the meanings of the individual words - otherwise the tests would have given no results at all. Now imagine a child producing a sentence in which one word is more important - more interesting and more exciting - than all the other words. If there is a mechanism connecting pitch and emotion, this should trigger it. The child knows how he feels about the word, but has not yet learned how to interpret the signals coming from other people. The same is apparently true of pitch in its demarcative function: whatever ability children have to produce appropriate falls and rises at separation points is not reflected in their understanding of sentences: they are not confused, as adults are, by inappropriate contours (Bosshardt and Hormann 1982).2 Lock (1980: 103) makes a similar observation about manual gesture: if a child "points at something the mother will (usually) give it to him, but if she points at something he will not give it to her". It is difficult to understand the child's use of pitch if we do not accept some kind of "built-in tie between intonation and affective state" (Pye 1982: 48; writing of accent in the speech of Mayan children). All of this accords with what Redican (1982: 269) reports as "a prevalent view of communication among primatologists", which is that "facial displays and vocalizations take place when the sender is in an emotional state. (Thus, a call does not represent 'fig', but the animal may become highly aroused when it beholds a fig.)"
The neurological and psychological evidence I know only by hearsay, but there is also abundant evidence from linguistics.
When we compare the descriptions of intonation from language to language, covering languages of the most diverse origins, we find resemblances that far surpass anything that could be attributed to chance, and so widely separated in space that they could hardly be the result of diffusion. Read the various papers by L. Dascalu on Romanian intonation 3 and you appreciate the point-by-point resemblances to English; and something approaching these same resemblances turns up in native American languages and languages of Asia and Africa (see Bolinger 1978). We might find similarities even greater if we had better descriptions. Now what does this mean? Is it only a typological accident, such as the fact that all languages have phonemes and some totally unrelated languages have phonemic systems that are remarkably similar? No, for two reasons. First, for the comparison to be valid, all or nearly all languages would have to have similar phonemic systems. Second, the resemblances would have to extend to meanings, for what we find in intonation is not merely a physical comparability but a semantic one. It is as if all languages of the world shared the same basic vocabulary. We can hardly escape the conclusion that intonation must be intimately associated with some primitive mode of communication that comes wired in our nervous systems.
Now the partisans of intonation-as-grammar are perfectly willing to grant that intonation-as-emotion covers a great deal of what can be observed in the behavior of fundamental pitch, and they might even concede that some kind of emotional drive at one time did and even today at certain stages still does prime the system. But they prefer to see intonation in mature speakers as having advanced beyond the babbling stage to the point that it is now just as integral a part of the system as any of its other components. They can point to some considerable successes in relating intonation to the analysis of vocatives, compounds, scope, restrictiveness, and other undeniably grammatical phenomena, and their arguments would be hard to refute because we have too few detailed descriptions of intonation to tell how general those grammatical manifestations of intonation are. It is possible that they are features of one or a very few languages, without any natural base. In other words, it is easy to deny the iconic basis of intonation and to write it up as one more chapter in the arbitrariness of the sign. People do use higher pitches and a wider range when they get excited and a monotone when they are bored, but that - if we take this view - sheds little light on the fact that vocatives are absorbed into the intonation contour that precedes them or that wh questions have falling terminals.
This leaves us only one way to go if we expect to show that no matter how sophisticated intonation appears to be in its counterpoint with other parts of language, it still echoes that primitive cry. It will be necessary to look closely at a reasonable number of constructions and uses where intonation plays a part and to which it has been assigned the role of either definiens or definiendum - and try to see whether the opposing view of intonation as an emotional obbligato makes the phenomena more understandable.
Before getting on with that it will help to broaden the inquiry by looking at that other part of the communicative complex that is even more athletic than intonation and that seems to be connected with it, if the hints from neurology are to be taken seriously. I refer to physical gesture and I am asking whether there are significant parallels between the movement of pitch and the movements of certain highly visible parts of the human body. The pregnant question is whether all these movements, including pitch, are interrelated, and correspond to some kind of atavistic metaphor.
That there is a natural correspondence, from the very earliest stages, between the visual and the auditory side of communication is suggested by some experiments reported in MacKain et al. (1982). Five- to six-month-old infants were tested for the attention they paid to visual gestures accompanying the pronunciation of disyllables that they heard at the same time. They preferred having what they heard match up with the correct mouth gesture rather than with some other synchronized mouth display.
Most interesting from the standpoint of shared function is some recent theorizing by John Ohala regarding the origin of the smile. For Ohala (1983) the smile is an evolutionary product of high pitch as the sign of helplessness: high pitch relates to the small size of infants and defenseless animals and enables them to claim protection. The acoustic effect of smiling is to raise the second formant of voiced sounds, particularly the vowels. The opposite gesture, that of lip-rounding, has a lowering effect. The smiling speaker imitates the high register of the infant's voice and signals nonaggression. Ohala (1983) relates this to a panzootic frequency code "whereby vocalizations consisting of high frequencies signal the vocalizer's apparent smallness and, by extension, his non-threatening, submissive, or subordinate attitude and by which low-frequency vocalizations signal apparent largeness and thus threat, dominance, self-confidence". The code "is an inherent part of human vocal communication (and probably has been for millions of years) .... The frequency code explains the similarities in cross-language and cross-cultural use of the pitch of the voice to mark questions vs. non-questions, to signal different social attitudes . . . , and to refer to things small and large using sound symbolic vocabulary".
The operant phrase in this passage of Ohala's is "by extension". I mentioned an atavistic metaphor; it is defined by the only thing that pitch can do: go up and down. The first extension is the smile itself, which is "up", while the frown is "down", exactly as the thespian mask portrays them.
The up-down metaphor according to Lakoff and Johnson (1980: 57) is among "the central concepts in terms of which our bodies function". It would be surprising if something so fundamental were restricted to the vocal channel with no visual counterpart except the smile. Actually what we find is an interlocking gestural complex pervaded by the up-down metaphor and other extensions of symbolic acts, all in the service of affective communication - and with grammatical implications that are by no means confined to intonation.
The best evidence is the coupling of intonation and physical gesture in the act of communicating. This first caught my attention when I was describing (Bolinger 1946) the reverse accent, which is made to stand out by a drop in pitch to the accented syllable rather than a rise. For example, one can say with assertive emphasis
[Pitch-contour example not reproduced; recoverable text: "... will."]
or one can say with great courtesy or reassurance
[Pitch-contour example not reproduced; recoverable text: "I ..."]
In doing these the head - if it moves at all - tends to follow the line of pitch; to do the opposite is quite difficult. The deferential bow of the head is accompanied by an intonational bow. This sort of covariation is known as "motor equivalence" in biomechanics: the same goal is achieved in the same way by different motor assemblies. The coupling of the up-down of pitch and the up-down of other body parts extends all across the range of intonational contours. We see head movement paralleling pitch movement again in the emphatic rise-and-fall:
[Pitch-contour example not reproduced; recoverable text: "I know:"]
A prolonged high pitch followed by an abrupt drop is paralleled by a prolonged high head position followed by an abrupt downward thrust:
[Pitch-contour example not reproduced; recoverable text: "It's shee:r nonsense"]
Pitch movement and hand movement are in parallel when you say There there: to calm someone down, and you move your hands, palms-down, downward at the same time. Hand-and-arm movement parallels pitch in the case of minor and major clauses, which are marked by narrower and wider excursions of pitch, for example,
If you'll help me / when I get there, // I'll pay you.
If you'll help me, // when I get there / I'll pay you.
But the most sensitive of the couplings is the one between intonational gestures and the gestures of the face. Both have evolved as organs of affective communication and both appear early in the development of the child. Beebe (1982: 170) speaks of the "biologically-ensured fascination of the infant for the human face", and calls attention to the preference that four-day-old infants show for "a regular schematic drawing of the human face over a scrambled drawing". Paralleling Ohala's frequency code there is - at least as far as the primates are concerned - a facial code: "Facial expressions may offer the best, if not the sole example of behavioral comparisons between man and his relatives" (van Hooff 1976: 184; quoted by Redican 1982: 215). Facial-intonational linkage is thus particularly significant for the notion of a gestural complex that includes both the seen and the spoken.
The most obvious instance of linkage is in the posing of questions. Questions have a universal tendency to go up, and at the same time that the pitch goes up, so do the eyebrows. 'Question' and 'surprise' are manifested by a "What's up?" face, which may be reduced in polite society to a mere lifting of the eyebrows but nevertheless remains our best cue to asking. Looking at intonation and facial gesture together, we can say that to the extent that a question sincerely asks, it is expressed by an upness in pitch, in the eyebrows, and at the corners of the mouth. And if arm-and-hand gestures are added, they too will be at the level of the upper torso. When you ask, you leave things up in the air; you anticipate an answer that will settle things, and settling is of course down. The organizing metaphor is up versus down. In terms of Ohala's "dependency", a question is up because the speaker is left hanging on the disposition of a hearer to satisfy it.
It would be a mistake to suppose that the coupling of intonation and physical gesture is either uniform or simple. Facial gesture alone is subject to what Ekman, Friesen, and Ellsworth (1982: 19) term blending, whereby more than one emotion can be shown at the same time; all the more reason to expect, then, that when an extra channel is added, the complications should increase, even to the point of allowing contradictory messages to be sent simultaneously.
The rule is for motion to go in parallel, but two or more gestural modes may be uncoupled and go their separate ways; and this allows us to give our hearer-viewer a richer choice of responses. A common type of uncoupling is again found in questions. If I say
with a straight face, the "down" intonation delivers a statement. But if I raise my eyebrows, look you in the eye, nod my head, and leave my mouth open when I have finished, the result is both a statement and a question: it is a statement presented for confirmation. The conclusion that I draw from this is that our feelings and attitudes - which may be mixed feelings and attitudes - shape certain movements of our body, including those of the larynx, and thus "come through" in the way they affect the carrier wave of articulate speech. Gestures become audible. The gestures of the larynx, modulating the fundamental, and of smiling or frowning lips, modulating the formants, are not
the only ones that we are able to hear. Any gesture that changes the shape of the resonators may become audible - we hear the effect, say, of thrusting the jaw forward, and visualize the gesture that produced it. The gesture of the larynx differs from the others chiefly in two respects: we perceive it directly and not through visualizing some facial expression or bodily stance, and it is more flexible than any other, thanks to the sensitivity of the human ear that enables us to exploit a wide tonal range and to express not only different directions of arousal but different amounts of it. One major impediment stands between this generalization and any further conclusion that the same will hold regardless of language or culture. Despite the resemblances mentioned earlier that cannot be explained without assuming some kind of universality, there remain striking differences between languages and dialects. At this stage of our ignorance we can only repeat the observation of Ekman and Oster (1982: 152) who ask whether people in natural situations actually show the distinctive universal patterns of facial expression, and lament that "these facts are not available for even one culture". It may be that some parts of the system have been grammaticized or lexicalized away from the basic metaphor - perhaps there are stages intermediate between intonation languages and tone languages. A likelier possibility, and one that has been amply demonstrated by anthropologists and ethologists where physical gesture is concerned, is that the display rules differ from culture to culture without necessarily touching the underlying uniformities. An elegant experiment by Ekman and Friesen was able to capture a set of behaviors stripped of the social veneer. As Ekman and Oster report it (1982: 149), "when Japanese and American subjects sat alone watching either a stress-inducing or neutral film, they showed the same facial actions....
However, as predicted by knowledge of the display rules in the two cultures, when a person in authority was present, the Japanese subjects smiled more and showed more control of facial expression than did the Americans". If intonation is basically keyed to emotion, it is logical to suppose that it will be sensitive to social pressures and will be hedged by the same kinds of display rules that hedge facial expression. I believe that this position must now be accepted as a definitive answer to those who contest the fundamental universality and iconism of intonation. But does that supposition cover "grammatical" as well as "emotional" intonation? Should we make a distinction between motivated (ideophonic, phonesthematic, sound-symbolic) and unmotivated (arbitrary, phonological) aspects of intonation? Liberman (1979: 139-158) argues that we should. The sound-symbolic and metaphorically rich rising-falling distinction he feels is "overlaid" on an intonational phonology, which - if I understand him correctly - is capable of representing certain meanings that remain constant throughout their various ideophonic colorings.
While this is a position that can neither be confirmed nor refuted - it rests on disputed evidence of separation - I believe the best procedure at the moment is to pursue the known iconism to the limit and perhaps defer the positing of arbitrary units. Centuries of discussion of grammar and lexicon preceded our modern phonologies. A few decades of more or less casual observation are not enough to underpin a secure phonology of intonation. So I propose that we accept for now the role of intonation as iconic throughout, and hypothesize that whatever arbitrariness there is may neutralize the iconic base here and there but does not succeed in contradicting it. And that the accentual and grammatical applications of intonation are best understood as pragmatic inferences from the underlying metaphor. A test of this idea is a closer look at some of those applications. The focus I suggest is on the concept "intonation of". If we speak of an "intonation of" questions, do we mean that intonation plays a defining role in determining what a question is? Or that being in a question plays a defining role in what the intonation contour can be? Or that neither of these propositions is true, and the encounters between intonation and what on other grounds might be called a question are predictable only in statistical terms and occur because their convergence is convenient for speaker and hearer, given the compatibility of their meanings for some linguistic work that needs to be done? Questions are a good place to start because more has been said and written about the question-nonquestion opposition than about any other topic relating to intonation. The test has to be refined at the outset, because there are questions and questions. The ones that are thought to go up most of the time are those that can be answered with yes or no: Did you see him?, Is it raining?, Has she got the money?
Questions beginning with an interrogative word - the wh questions - more often go down, with or without going up later: How can they do it?, Whose is this?, Where do you live? Take yes-no questions.4 Do they have a predictable intonation? The answer is a resounding no. Probably any intonation that can occur on any other type of utterance can occur on a yes-no question. There are some intonations that are not very good in certain types of contexts. For instance, I would not have pronounced the question I just asked in this way:
[pitch contour: Do they have a predictable intonation?]
This would be normal as an echo question in response to your already having asked me Do yes-no questions have a predictable intonation? but not as a question that comes out of the blue. But if we are allowed to contextualize freely, practically anything goes. The same is true of wh questions. They can fall, rise and fall, fall and rise, rise and fall and rise, or simply rise; for example:
[pitch contours: five renderings of How do you do it?, one for each of the patterns just listed]
All these are possible though the last one would be the least frequent and would probably occur only as an echo question or as a matter of great urgency. If there is no "intonation of" yes-no or wh questions such that if we saw one written we could be quite sure how to say it, is there an intonation that can be called a questioning intonation, that is, one that is used exclusively with questions? The best test case is the intonation most commonly associated with questions, the simple rise. And it is immediately apparent that simple rises occur in many places where no one would think of putting a question mark. If I utter something like
[pitch contour: Had he pleaded guilty, with a simple rise]
and stop there, you could make a fair guess that I was asking a question - there is a broad hint from the syntax, besides the rise. But I could go on:
[pitch contour: Had he pleaded guilty he would have been convicted.]
The conditional clause has the same intonation as the yes-no question. (The two may also share the same eyebrow-raising). Suppose we hedge the definition by saying that what really counts is whether the intonation goes with a complete sentence - obviously if you are in the middle of something you may be caught at a high pitch, and the defining position of the contour is at the end. But that also fails. If you say to me Your daddy is a liar I can retort with a simple rise:
[pitch contour: My daddy wouldn't tell a lie!, with a simple rise]
If it is true neither that intonation defines questions nor that questions define intonation, what is the alternative? Evidently we should be looking not for a defining relationship but for a contributory one. What does a rising intonation contribute to a question? If what it contributes is due to its own autonomous nature, this should be rephrased: What does a rising intonation contribute to anything? From a gestural point of view the answer to that is simple and straightforward. If you sail up in the air, you contribute an up-in-the-airness - and nothing could be more appropriate to the majority of questions: they require an answer to complete the conversational cycle. Similarly the conditional clause requires a main clause to resolve the condition. And that outraged answer My daddy wouldn't tell a lie! goes up because you've got me all excited, I'm keyed up, I haven't yet calmed down - our folkloric use of the up-down metaphor in this context is a clear sign of what the intonation contributes. We find the same impartiality in other intonations, toward the question-nonquestion dichotomy and toward yes-no versus wh. Take the fall-rise in which the accent is obtruded downward. It works equally well on both types of question:
[pitch contours: Don't you believe me? and Why did you lie to me?, each with a fall-rise]
Compare these with the same questions using a rise and a rise-fall:
[pitch contours: Don't you believe me? and Why did you lie to me?]
What we hear in the fall-rise is a sort of conciliatory tone, in the wh example even a trace of ceremoniousness. And if we look at the shape of the curve and the elements that are on it and where they occur, we can see the reason for those effects: the main word, the one that carries the heaviest freight of information, is at the lowest pitch, exactly the opposite of where we would expect to find it on the basis of that earlier observation about the children going up on the word they found most exciting. The speaker reverses the position because he wants to reverse the impression. The effect is the same in non-questions. Statements become more reserved or soothing, as in
[pitch contour: They didn't mean to., with mean at the lowest pitch]
by comparison with the more outspoken, or one could say upspoken
[pitch contour: They didn't mean to., with mean at the highest pitch]
Commands become more subdued, more coaxing, as in
[pitch contour: Come along., with a fall-rise]
versus the whip-cracking
[pitch contour: Come along.]
To conclude our sampling of questions we can say simply that intonation is one of the many cues forming the complex by which we identify an interrogative utterance. To give full weight to the other parts of that complex, I offer a final example in which the only cues are context and the common knowledge that the two speakers share. My friend Jerry comes into the room and says I've just had a great piece of news. I say to him
[pitch contour: You won the sweepstakes., with sweep at the highest pitch]
He replies No, I got the NEH grant that I applied for. My remark is
a declarative sentence, it has a terminal fall, and (this once) I do not eyeball him or give him any other gestural cue, yet it is a question, in that I expect him to confirm or deny. Questions, nonquestions, conditional clauses, and the like make use of terminal intonations, and we have seen enough to judge the contribution of the up-down metaphor as a marker of what is or is not finished - a device of grammatical demarcation, but quite in keeping with the underlying meaning. The same goes for the reverse accent, though that has more to do with mood than with grammar. To make a case for the intersection rather than the communion of intonation and grammar we need a wider selection of instances in which intonation has been supposed or might be supposed to determine the grammatical contrast in some unmotivated way. Take the case of causative versus adversative meaning in absolute adverbial phrases. In the following,
[pitch contours: Strong as he is, I'm afraid of him. and Strong as he is, I'm not afraid of him.]
the first signifies 'because he is strong' and the second the opposite, 'in spite of the fact that he is strong'. Though the intonation makes the distinction, it is not designed to do so. What it is designed to do is distinguish between something 'not new' in the first example ('knowing how strong he is'), and something 'new', or newly asserted ('declare him to be very strong - it makes no difference') in the second. The accent on is is crucial: in the second sentence it has the abrupt drop that characterizes assertions whose import is not to be taken for granted. In terms of the underlying metaphor, the fall expresses finality in both a modal ('confidence') and a terminative sense - here it is the former. The modal sense of 'confidence' (and its opposite) makes the difference in another opposition that looks, on the face of it, almost like a case of lexical tone, and is the subject of an experiment by Nash and Mulac (1980). It involves particularly the verb to think in pairs like the following:
[pitch contours: (1) I thought so. (2) I thought so.]
Subjects had no difficulty in identifying (1) as implying 'and I was right', (2) as implying 'but I was wrong'. Do we have here a case of truly grammaticized intonation? One test is to substitute a full clause for so and supply a context that will compel the reversal of one or the other interpretation. If the result is anomalous, then we have a candidate for grammaticization; if not, the effect may be such as to enable us to detect the direct and not the inferred meaning of the intonation. Suppose we substitute I thought you were going to say that! on the same (1) and (2) intonations (you were going to say is at the lowest pitch and that ends in a rise) and use it as a response to the interlocutor's actually making the remark referred to; this forces the interpretation 'and I was right'. Both intonations are perfectly normal. There is no contradiction in (2); instead, there is an archness about the response - the speaker may chuckle as he says it and go on to add and you didn't disappoint me, you devil, you! The same interpretation would be true if thought were replaced by knew, and knew would by itself make 'but I was wrong' anomalous. Now for a context that makes 'and I was right' anomalous and prescribes 'but I was wrong'. Imagine that the interlocutor has just said I'm really annoyed that you reported me as planning to say that Leslie has dirty habits. Here the other speaker may use (2) as an apology, with the meaning, of course, 'but I was wrong'. If (1) is used, the interpretation still has to be 'but I was wrong', but now we take it as a rather truculent disclaimer: 'It's not my fault if you weren't going to, so don't blame me'. What started out looking like a grammaticization or lexicalization turns out to be the incidental result of adding 'confidence' or 'uncertainty' to the meaning of think.
If confidence is added in a context where 'verification' is at issue, we get 'confirmation'; if it is added in a context where comment on something already verified is intended, we get 'truculence'. And when the upness of (2) intersects with 'verification' we get 'uncertainty' - the speaker is worried and keyed up about his error; when it intersects with something already known, we get
arousal at the service of teasing. Contour (2) has a role in another supposedly grammaticized use of intonation, involving the scope of negation. A sentence like
[pitch contour: All the kids aren't asleep., with the main accent on all and a terminal rise]
is taken to mean 'some of the kids are and some are not', with negation of all ('not all') rather than the predicate ('not asleep'). Weeks (1982: 161) reports a study by Iannucci (1978) showing that children under the age of six take this sentence "literally", to mean that none of the kids are asleep. So the construction has the earmarks of5 a grammatical specialization added fairly late in the child's grammar. But if we look at the contour and at the function of the accent on all we see that the supposed grammatical distinction is a semantic one, based on an inference. The critical features of the contour are the contrastivity of its main accent and the inconclusiveness of its terminal rise. Something about the accented item is left up in the air. The most general use of the contour is to report something involving an entity that is a member of a class where some other member of the class might be expected to be involved instead. For example:
[pitch contours, each with a contrastive main accent and a terminal rise: (1) John wouldn't do that. (2) It's not that she's insensitive. (3) I can see someone out there.]
The hearer is invited to infer in (1) that some other individual of the class to which John is assigned might do it, for (2) that she has some other fault related to insensitivity (the speaker might go on with It's just that she's a little distracted sometimes), and for (3), which is an answer to some such question as Can you see John out there?, that some person of John's class is there, perhaps John himself. The sentence with all is simply an instance of this same general type. Given the tight relationships in the subclasses of quantifiers, all's natural partner is some or a few: the terminal rise raises a question about all, and we infer that partner. But if the rise can be given some other motivation, the "grammaticization" recedes and the ambiguity of the all sentence returns. There are at least three ways of doing this. One is by converting the rise for use as a clause terminal, say
an if clause (the intonation, needless to say, remains the same): If all the kids aren't asleep, we can organize games for the ones that are awake. If all the kids aren't asleep, let's take the whole bunch out for a romp. Another way is to use the rise to mark a reflex question, one that repeats another person's nonquestion: All the kids aren't asleep? I would have expected at least some to be asleep by now! All the kids aren't asleep? Let's just be glad that some of them are! A third way is to invoke 'accent of power' rather than 'accent of interest' (see Bolinger 1983). Alongside the heightening of pitch that goes with something new, interesting, exciting, we find a kind of general arousal not tied semantically to particular words but to the utterance as a whole, even though there is no way to manifest it except on particular words. An example is the keyed-up admonition
[pitch contour: People won't accept that!]
where the speaker does not intend anything like 'but animals may'. The accent on people is there for the initial impact. So we can have the repeated plaintive whine in the following, in answer to the question Don't you think you ought to hang in there in spite of everything?
[pitch contours: God only knows. My friends have all abandoned me. My job's gone glimmering. They all won't speak to me.]
and the speaker finishes his lament with What is there to live for? Despite the intonation, in this context They all won't speak to me does not imply 'some will'. Given these other avenues of interpretation, it is no wonder that children have trouble with quantifier negation, until they develop a degree of mathematical sophistication.6
I believe it is safe to say that the "intonation of" quantifiers is an intersection of independent systems, not a grammaticization. As a final illustration of the autonomy of intonation I offer something involving, on the intonational side, merely a relative weighting of accents of interest. It is a case that has stimulated much dispute, and the examples most often cited are the contrasting pair John has orders to LEAVE. John has ORDERS to leave. The first signifies 'John is supposed to depart', with focus on the departing; the second, 'John is carrying orders (to be deposited somewhere or with someone)', with focus on the orders. At the same time there is a difference in the syntax: in the first, to leave is intransitive, and modifies orders; in the second, to leave is transitive and takes orders as its implied direct object. (Other statements are available, but this will do). That is, such is the syntax so long as nothing is contextually presupposed. Supply a context, and the meanings can be reversed: I thought that what's-his-name was supposed to take his orders with him. - No, John has orders to LEAVE. Wasn't John merely complying with a request to leave? - No, John has ORDERS to leave. The syntax explains nothing, and we come down to a relative weighting of interest. It is obvious why leave is highlighted in the first sentence. It is almost as obvious why it must be less prominent than orders in the second sentence, and its position does the rest: the usual thing is for further accents to be played down, or to be absent, after the last main accent (this much is part of the intonational syntax of English). Relative weight can be readily seen as we substitute verbs having different degrees of interest within their contexts: John has bread to eat, money to spend, a job to do, a book to read, stories to tell, wheels to grease, a hole to dig.
John has a palimpsest to decipher, a student to discipline, a cancer to extirpate, a wall to undermine. As soon as one gets away from conventional objects having conventional actions tied to them (bread is naturally to be eaten, money is for spending, etc.), the action takes on more importance. And it may have more importance than the noun even in some conventional settings: John has time to burn.
John has lots to do. John has something to say. And the weighting may easily vary: (1) John has a life to LIVE. (2) John has a LIFE to live. (3) John has things to DO. (4) John has THINGS to do. In (1), the interest is in the verve of living; in (2), it is assumed that living is what one expects of a life. In (3), the interest is in John as a doer; in (4), things are weighing on him. Liberman's image of the overlaid function is appropriate, in reverse. The formalizations - ritualizations would be a better term still, since they are never wholly arbitrary - are laid on top of an immemorial system of gestural communication, of which the up-down of pitch, with its rich metaphorical associations, is only a part. To understand intonation we must establish it on its own foundations, along with its natural congeners in the rest of nonverbal communication, and then, in its fulness, bring it back to grammar to study the interactions. And meanwhile integrate it with facial and bodily gesture. Cartoonists know what the corners of the mouth are for and why that patch of fur over the eyes was left behind after the hirsute adornment disappeared from the rest of the upper face. It will help linguists to see that intonation is most at home in that companionship.
Dwight Bolinger
Palo Alto, California
Notes
1 This paper is adapted from a lecture delivered 8 November 1982 at the Linguistics Section of the New York Academy of Sciences.
2 I owe this reference to Dr. Anne Cutler.
3 For example, Dascalu 1974 and 1979.
4 Defined as utterances to which a response of yes or no would be appropriate (though not necessarily sufficient).
5 See Bolinger (1982: 526-528) for discussion and further references.
6 The most we can say in this matter of scope is that there has apparently set in some trace of lexical stereotyping independent of intonation. If in the last example we substitute all of them for they all, giving
[pitch contour: All of them won't speak to me.]
it is hard to avoid the anomaly of 'some will' in spite of the ironclad context that forbids that interpretation. It is not as if the postposition of all affected the outcome in other contexts using this intonation, that is, we find The kids (They) all aren't asleep just as convincingly favoring 'some are awake' as all the kids or all of the kids. But a preceding all seems to invite quantifier negation regardless of intonation: (i) All (of) the young men didn't volunteer, (ii) The young men all didn't volunteer. - (i) is easily 'some did' with any intonation; (ii) is 'none did' unless the intonation makes an issue of all by leaving it in suspense. The same comments can be made for both (of) the young men and the young men both.
References
Bailey, Charles-James N., 1982: On the Yin and Yang nature of language. Karoma Publishers, Ann Arbor.
Beebe, Beatrice, 1982: Micro-timing in mother-infant communication. In: Key, pp. 169-195.
Bolinger, Dwight, 1946: Thoughts on yep and nope. American Speech 21; 90-95.
Bolinger, Dwight, 1978: Intonation across languages. In: Joseph Greenberg (ed.), Universals of human language, Vol. 2, Phonology; 471-524. Stanford University Press, Stanford, California.
Bolinger, Dwight, 1982: Intonation and its parts. Language 58; 505-533.
Bolinger, Dwight, 1983: Affirmation and Default. To appear in Folia Linguistica 7, special issue, "Prosody".
Bosshardt, H.G. & Hörmann, H., 1981/1982: Der Einfluss suprasegmentaler Information auf die Sprachwahrnehmung bei 4- bis 6-jährigen Kindern. Archiv für Psychologie, 1981/1982, 134(2); 81-104.
Cutler, Anne, and David Swinney, 1980: Development of the comprehension of semantic focus in young children. Paper at Fifth Annual Boston University Conference on Language Development, October.
Dascalu, Laurentia, 1974: On the "parenthetical" intonation in Romanian. Revue Roumaine de Linguistique 19; 321-348.
Dascalu, Laurentia, 1979: On the intonation of questions in Romanian. Revue Roumaine de Linguistique 24; 35-44.
Ekman, Paul (ed.), 1982: Emotion in the human face. 2nd edition. Cambridge University Press, Cambridge.
Ekman, Paul, Friesen, Wallace V. and Ellsworth, Phoebe, 1982: Conceptual ambiguities. In: Ekman, pp. 7-21.
Ekman, Paul and Oster, Harriet, 1982: Review of research, 1970-1980. In: Ekman, pp. 147-173.
Iannucci, David E., 1978: The acquisition of quantifier dialects by children. Paper at Fourth Annual Berkeley Linguistics Society meeting, February.
Kent, R.D., 1982: Brain mechanisms of speech and language with special reference to emotional interactions. Unpublished paper.
Key, Mary Ritchie (ed.), 1982: Nonverbal communication today: current research. Mouton, Berlin etc.
Lakoff, George and Johnson, Mark, 1980: Metaphors we live by. University of Chicago Press, Chicago.
Liberman, Mark, 1979: The intonational system of English. Garland, New York.
Lock, Andrew, 1980: The guided reinvention of language. Academic Press, New York.
MacKain, Kristine S., Studdert-Kennedy, Michael, Spieker, Susan and Stern, Daniel, 1982: Infant intermodal speech perception is a left hemisphere function. Status Report on Speech Research, Haskins Laboratories, SR-71/72. Pp. 125-130.
Nash, Rose and Mulac, Anthony, 1980: The intonation of verifiability. In: Linda Waugh and C.H. van Schooneveld (eds.), The melody of language. University Park Press, Baltimore. Pp. 219-241.
Ohala, John J., 1983: Cross-language use of pitch: an ethological view. To appear in: Phonetica.
Pye, Clifton, 1982: Mayan telegraphese: intonational determinants of inflectional development in Quiche Mayan. Unpublished paper.
Redican, W.V., 1982: An evolutionary perspective on human facial displays. In: Ekman, pp. 212-280.
Van Hooff, J.A.R.A.M., 1976: The comparison of facial expression in man and higher primates. In: M. von Cranach (ed.), Methods of inference from animal to human behaviour. Aldine, Chicago. Pp. 165-196.
Weeks, Thelma, 1982: Intonation as an early marker of meaning. In: Key, pp. 157-168.
TONE UNITS IN FUNCTIONAL SENTENCE PERSPECTIVE
Jürgen Esser
Abstract
The phonological structure of the tone unit in terms of only one tonic element per tone unit is discussed in (1.1) and related to Brazil's theory of proclaiming and referring tones. It is argued that certain claims of Brazil's theory are too strong (1.2). When describing discourse functions it is necessary to recognize, besides the 'given'/'new' dichotomy, a distinction between 'foregroundworthy' and 'less foregroundworthy' elements (2.1). This leads me to formulate two kinds of information presentation: rising and falling communication (2.2). It is also argued that there is a need for a distinction between a semantically and a formally defined theme. This makes it possible to incorporate the Prague notions of objective and subjective order of theme and rheme into the theory (2.3). Finally, it is shown that the concept of rising and falling communication can be used to differentiate different reading and text styles (2.4).

JOURNAL OF SEMANTICS, vol. 2, no. 2, pp. 121-139
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011

1 Tone unit structure
1.1 Setting the problem
Quite recently one of the current theories of functional sentence perspective, Halliday's theory of textual organization in terms of information units, has been questioned by various scholars. In a nutshell, the theory states the following, cf. e.g. Halliday (1970: 354): Every text in spoken English is organized in information units. Every information unit is expressed as one tone unit (tone group). Every tone unit has one tonic element (nucleus, focus, sentence accent) which may be preceded or followed by non-tonic elements (examples taken from Brown et al. 1980: 28):
(1) (What's happened today?) Daddy washed the CAR
(2) (What's happened to the car?) Daddy WASHED it
(3) (Who's washed the car?) DADDY did
The tonic element (CAR, WASHED, DADDY) is taken to mean the marking by the speaker of what he assumes to be 'new': "not in the sense that it cannot have been previously mentioned, although it is often the case that it has been, but in the sense that the speaker presents it as not being recoverable from the preceding discourse" (Halliday 1967: 204). Anything which follows the tonic element in a tone unit is 'given': that part of the message which the speaker offers as anaphorically or situationally recoverable. What precedes the tonic element may be 'given' and/or 'new' (Examples taken from Quirk et al. 1972: 940. Capitals mark the tonic syllables, the tonetic stress marks indicate the tones, i.e. pitch changes, associated with the tonic syllables.):
(4) (What's on today?) We're going to the RACes ('new')
(5) (What are we doing today?) We're going to the RACes ('given' / 'new')
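The labelling rule just stated can be written out as a small sketch. It is only an illustration of the rule as summarized here, not a procedure proposed by Halliday or by any of the authors cited; the names `ToneUnit`, `tonic_index` and `information_status` are hypothetical, and the tonic is assumed to be identified by word position rather than by syllable.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToneUnit:
    """One tone unit with exactly one tonic element (hypothetical encoding)."""
    words: List[str]   # the words of the tone unit, in order
    tonic_index: int   # position of the word that carries the tonic


def information_status(unit: ToneUnit) -> List[Tuple[str, str]]:
    """Label each word following the rule summarized in the text:
    the tonic is 'new', whatever follows it is 'given', and whatever
    precedes it is left open as 'given and/or new'."""
    if not 0 <= unit.tonic_index < len(unit.words):
        raise ValueError("tonic_index must point at a word of the tone unit")
    labels = []
    for i, word in enumerate(unit.words):
        if i < unit.tonic_index:
            status = "given and/or new"   # not decided by the tonic placement
        elif i == unit.tonic_index:
            status = "new"
        else:
            status = "given"
        labels.append((word, status))
    return labels


# Example (2) from the text: Daddy WASHED it
example_2 = ToneUnit(words=["Daddy", "washed", "it"], tonic_index=1)
for word, status in information_status(example_2):
    print(f"{word}: {status}")
```

The sketch deliberately leaves the pre-tonic material undecided, since the theory itself says that context, not the tonic placement, settles whether it is 'given' or 'new', as the contrast between (4) and (5) shows.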
Applying Halliday's theory to spontaneous speech, Brown et al. (1980) encounter problems: "They arise both from the difficulty of assigning tone group boundaries in a principled way and from identifying the location of tonics" (p. 46). In their instrumentally-aided auditive analysis they find it more convenient to recognize two-peaked contours (i.e. tone units with two tonic elements) - and not, as Halliday's theory suggests, one-peaked contours. Furthermore, they deviate from Halliday's theory by not determining tone unit boundaries phonologically but phonetically by measured pauses. Therefore, Brown et al. speak of "pause-defined units". The descriptive problem of deciding whether a piece of discourse is to be analyzed as one two-peaked contour with no tone unit boundary in between or as two separate one-peaked contours is also dealt with by Taglicht (1982). He quotes the following example from Quirk et al. (1972: 941), who follow the basic tenets of Halliday's theory.
(6) William WORDSworth is my favourite English POet
Leaving the two possible domains of 'new' ('William Wordsworth1 if contrasted with John Keats, 'Wordsworth' if contrasted with William Shakespeare) out of our consideration, it is interesting to note that Taglicht recognizes a similar description problem as do Brown et al. 122
JS, vo. 2, no. 2
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
(4)
TONE UNITS IN FUNCTIONAL SENTENCE PERSPECTIVE
He (1982: 214) first regards examples like (6) as "compound units with a 'major' falling nucleus followed by a 'minor' rise", but then states the following problem (ib.): "It is sometimes uncertain whether a particular utterance contains such a sequence of major nucleus plus minor nucleus, or a single nucleus with a distributed fall-rise contour. Apart from this, it is a matter of dispute whether such compound tone-units should not be regarded rather as sequences of two tone units, each with one nucleus, as proposed by Fox (1973), and also by Brazil (1978)". Example (6) thus lends itself to the following three analyses:
(i) compound unit with a major falling nucleus followed by a minor rising nucleus, i.e. a two-peaked contour in the terminology of Brown et al. (due to the intermediate status of poet in (7), tonic syllables in the following three examples are indicated by tonetic stress marks only and not, additionally, by capitals; // marks the end of tone units)
(7) William Wordsworth is my favourite English poet //
(ii) simple tone unit with a single fall-rise nucleus
(8) William Wordsworth is my favourite English poet //
(iii) two tone units, one with a falling and one with a rising nucleus
(9) William Wordsworth // is my favourite English poet //
In what follows, I want to argue that the differences among these three analyses are not only notationally but also theoretically relevant. The first decision we have to make is whether our notation should only render the phonetic properties or allow for phonological abstraction. Since the history of intonation research over the last three decades has shown that phonetic clues alone lead to no adequate systems of description, the demand today is for a phonological approach, cf. e.g. Brazil et al. (1980: 2). According to the current phonological approach to intonation, which is reflected in Halliday's theory and also expressed e.g. by Hirst (1977) and Esser (1979), the tone unit should contain only one tonic element. In a phonological analysis the question is not exclusively 'What do I hear?' but also 'With which systematic structure can a given piece of discourse be associated?'.¹ Ladd (1982: 206) has convincingly criticized the phonetically established two-peaked contours proposed by Brown et al.: "Obviously, any string of words bounded by pauses has an F0 [fundamental frequency, the acoustic correlate of pitch] contour; but there is simply no reason to suppose that F0 contour to be a phonological unit, any more than to suppose the string of words so bounded to be a syntactic unit. [...] I see no justification - and certainly no theoretical discussion of the assumption that any F0 contour bounded by pauses must have
phonological status. Yet that assumption is the major reason for BC&K's [Brown et al.] difficulty in identifying tonics in their data, since the contours on their pause-defined units mostly have two major accentual peaks, and tone groups are supposed to have only one".
Applying the phonological approach to example (7), the intonation patterns expressed in (10) and (11) are possible systematic structures with which (7) can be associated for purposes of identification. (For reasons which will become clear later, I separate the choice of the tonic element, which is underlined [rendered here as _word_], from tone, both of which are conflated in the tonetic stress marks of the British tradition, which were used so far. The arrows for tone also mark tone unit boundaries.)
(10) William _Wordsworth_ is my favourite poet ↗
(11) William _Wordsworth_ ↘ is my favourite _poet_ ↗
The appropriate phonological interpretation has to rely on the semantics of the utterance. We therefore have to consider possible meaning differences between (10) and (11).
1.2 Brazil's theory of proclaiming and referring tones
The intonation pattern of (11), consisting of one tone unit terminated by a fall and one tone unit terminated by a rise, plays an important role in Brazil's theory of discourse intonation. We shall therefore look at examples (7) to (11) in the light of Brazil's new theory. Brazil et al. (1980: 13 f.) start with the following set of examples (// marks here also the beginning of a tone unit):
(12) // when I've finished Middlemarch [fall-rise] // I shall read Adam Bede [fall] //
(13) // when I've finished Middlemarch [fall] // I shall read Adam Bede [fall-rise] //
(14) // I shall read Adam Bede [fall] // when I've finished Middlemarch [fall-rise] //
(15) // I shall read Adam Bede [fall-rise] // when I've finished Middlemarch [fall] //
According to the theory "the function of the fall-rise tone is to mark the experiential [propositional] content of the tone unit, the matter, as part of the shared, already negotiated, common ground, occupied by the participants at a particular moment in an ongoing interaction. By contrast, falling tone marks the matter as new" (Brazil et al. 1980: 14 f.). Later the falling tone is assigned the general function of 'proclaiming' and the fall-rise tone that of 'referring'. It will be noted that Brazil's notions of 'common ground' or 'referring' and 'new' or 'proclaiming' correspond to Halliday's 'given'/'new' dichotomy (although
Halliday relates this dichotomy to the location of the tonic element and not to tone). Therefore (12) and (14) are possible answers to
(16) What will you do when you've finished Middlemarch?
and (13) and (15) are possible answers to
(17) When will you read Adam Bede?
The intonation pattern of (12) and (15), consisting of one tone unit terminated by a (fall-)rise and a following tone unit terminated by a fall, is a long-recognized contour in the description of functional sentence perspective, cf. Daneš (1960: 48), Halliday (1967: 204), Esser (1979: 45 ff.). I call this contour, which singles out the 'given' information (theme, see below), a thematic rise. In the notation already introduced in connection with (10) and (11), (12) and (15) yield:
(12') When I've finished _Middlemarch_ ↗ I shall read _Adam Bede_ ↘
(15') I shall read _Adam Bede_ ↗ when I've finished _Middlemarch_ ↘
The intonation pattern of (13) and (14), consisting of one tone unit terminated by a fall and one tone unit terminated by a (fall-)rise, has also been described before in the literature. Leech & Svartvik (1975: 173 f.), who in a way anticipate Brazil's theory, write: "We tend to use a falling tone to give emphasis to the main information in the sentence, and a rising tone (or, with more emphasis, a fall-rise tone) to give subsidiary or less important information, i.e. information which is more predictable from the context". One of their examples, in their notation, is
(18) | I saw your brother | at the game yesterday. |
       MAIN                SUBSIDIARY
It has to be pointed out, however, that the identification of textual functions ('given', 'common ground', 'referring', 'subsidiary' vs. 'new', 'proclaiming', 'main') by means of tone and not - as in Halliday's theory - by the location of the tonic element leads to difficulties. The first concerns the generally acknowledged basic function of the rising tone, which is "non-assertive or continuative" (Schubiger 1958: 11) and which may "suggest politely that a (confirmatory) comment would be welcome" (Quirk et al. 1972: 1045). This latter interpretation of the rising tone holds particularly well for sentence (18), which is meant to initiate a dialogue. Given the basic function just outlined, it follows that a speaker can also be 'assertive' and not 'polite' by using a falling tone without changing the distribution of 'given' and 'new' elements. There is no reason why sentences (12) and (15) could not be pronounced with a rising tone so as to politely invite a comment
nor why sentences (13) and (14) could not be pronounced in a less polite, assertive way by using a fall. The second difficulty concerns the tone unit boundary after the falling tone in (11), (13), (14) and (18). The separate tone unit after a falling tone within one utterance renders the second tone a somewhat independent afterthought, cf. e.g. Kingdon (1958: 79, 105). This interpretation holds e.g. for Leech & Svartvik's (1975: 174) example
(19) | It was snowing | when we arrived |
in which the last tone unit with the rising tone indicates "subsidiary information added as an afterthought" (ib.). It should be noted now that, according to Brazil's theory, the second tone unit in (19) should be assigned the function 'common ground', 'referring' ('given'). This interpretation is, however, incompatible with the notion of afterthought, since the afterthought adds something not thought of in the first place. Sentence (19) - with when we arrived as afterthought in a separate tone unit - could be an appropriate answer to
(20) What was the journey like?
but not to
(21) What was the weather like when you arrived?
It follows from this that the second tone unit in (19) must be 'new', 'proclaiming' and not - as Brazil's theory suggests - 'given', 'common ground', 'referring'. (To be 'given', when we arrived would have to follow the tonic element snowing in one tone unit.) Brazil's referring and proclaiming functions are too specific to base a general discourse theory on them. There is yet one more argument to be considered in connection with the intonation pattern of (11), (13), (14), (18) and (19). It is well known that there are classificatory problems as to whether an utterance is to be classed as a simple tone unit with a fall-rise or as two tone units, one with a fall and one with a rise, cf. e.g. Kingdon (1958: 79 f.). As regards the second possibility, Kingdon writes (p. 80): "This combination is of frequent occurrence when limiting clauses or phrases are added as afterthoughts to straightforward statements, as an afterthought is usually added after a break". The first possibility is described as follows (ib.): "A Tune III [falling-rising] should have its main emphasis on the falling element, a semantic homogeneity and no break". With regard to our crucial examples (7), (8) and (9) - the interpretations of (6) - we can safely conclude that (11) represents semantically the afterthought interpretation - which is very limited in its range of application and probably not intended by (6).
(6) William WORDSworth is my favourite English POet
(11) William _Wordsworth_ ↘ is my favourite English _poet_ ↗
The notations of (7) and (8) must now be regarded as notational variants of (10). They are both meant to indicate one tone unit, one tonic element and rising tone:
(10) William _Wordsworth_ is my favourite English poet ↗
This follows from the "main emphasis on the falling element" in the last quotation and from a general observation made by Kingdon earlier in his book (p. 33): "[...] if a falling tone rests on a word which is at a distance from the end of the utterance there is a tendency for some later word to be given a low rising tone, thus changing the utterance from Tune II [falling] to Tune III [falling-rising]". The (low) rising tone in the examples here under discussion has the same basic function mentioned earlier ('non-assertive' or 'continuative'). This is corroborated by Kingdon's explaining the change from Tune II to Tune III by reference to the following factors (p. 33): "to introduce an insinuation and so avoid making a bold statement, then an instinct against having a long level tail (which tends towards obscurity), and a wish to soften the tune and make it sound less didactic". Finally, it should be noted that our notation allows for a separation of the two functions 'choice of the tonic element' and 'tone', which are conflated in the British system of tonetic stress marks (for a similar criticism cf. Standop 1982: 7). As I have shown elsewhere (Esser 1977: 349), the fall on the tonic in a rising tone unit is an automatic feature. Therefore the complex tone 'fall-rise' is a confusing, redundant notion.
The foregoing discussion has shown that Brazil's theory needs certain modifications:
(i) To avoid an afterthought interpretation of examples (13) and (14) these should phonologically be regarded as consisting of one tone unit and therefore be rewritten in our notation as
(13') When I've finished _Middlemarch_ I shall read Adam Bede ↗
(14') I shall read _Adam Bede_ when I've finished Middlemarch ↗
In a "bold", "didactic" statement it is also possible to choose the falling tone
(13") When I've finished _Middlemarch_ I shall read Adam Bede ↘
(14") I shall read _Adam Bede_ when I've finished Middlemarch ↘
(ii) The idea that utterance-final choice of tone contributes to the marking of 'given' and 'new' has to be abandoned. Also (12) and (15) could end in a non-assertive, polite rise.
(iii) The holistic interpretation of rising tone units as referring and of falling tone units as proclaiming has to be abandoned. Only rising tone units which are not utterance-final can generally be said to have a referring function.
To sum up the results of our discussion of tone unit structure and
functional sentence perspective so far, we can identify two basic intonation patterns which can be used to implement the 'given'/'new' distinction expressed in examples (12) to (15), rewritten as (12') to (15') below.
Thematic rise:    (GIVEN) _GIVEN_ ↗ (NEW) _NEW_ ↘/↗
Non-final tonic:  (GIVEN) (NEW) _NEW_ (GIVEN) GIVEN ↘/↗
Brackets indicate optional elements, of which there may be more than one; the utterance-final tone can be rising or falling, see above. It will be convenient to repeat Brazil's examples in our notation, add suitable contexts, and correlate them with the two intonation patterns:²
(12') (What will you do when you've finished Middlemarch?) When I've finished _Middlemarch_ ↗ I shall read _Adam Bede_ ↘ (Thematic rise)
(13') (When will you read Adam Bede?) When I've finished _Middlemarch_ I shall read Adam Bede ↗ (Non-final tonic)
(14') (What will you do/read when you've finished Middlemarch?) I shall read _Adam Bede_ when I've finished Middlemarch ↗ (Non-final tonic)
(15') (When will you read Adam Bede?) I shall read _Adam Bede_ ↗ when I've finished _Middlemarch_ ↘ (Thematic rise)
There is, however, one intonation pattern relevant to the functional sentence perspective which does not occur in the examples quoted from Brazil et al. so far. This is the intonation pattern used in examples (4) and (5) which is conveniently called 'end-focus', cf. Quirk et al. (1972: 938). As we already noted above, end-focus does not indicate a 'given'/'new' distinction among elements which precede the tonic element. The basic pattern is the following:
End-focus:        (GIVEN) (NEW) _NEW_ ↘/↗
I repeat examples (4) and (5) in our new notation:
(4') (What's on today?) We're going to the _races_ ↘
     'new'
(5') (What are we doing today?) We're going to the _races_ ↘
     'given' / 'new'
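The tone-unit-level statements made so far (the tonic element is 'new', what follows it in the same tone unit is 'given', and what precedes it is left open) can be written out as a minimal sketch in Python. The representation of a tone unit as a word list plus a tonic index, and the names `ToneUnit`, `pattern` and `information_status`, are assumptions introduced purely for illustration; they are not part of the notation used in this paper, and the thematic rise, which involves two tone units, is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToneUnit:
    """Hypothetical representation: the words of one tone unit and the tonic's index."""
    words: List[str]
    tonic: int  # index of the word carrying the tonic element


def pattern(unit: ToneUnit) -> str:
    """Name the tone-unit-level pattern by the position of the tonic."""
    if unit.tonic == len(unit.words) - 1:
        return "end-focus"
    return "non-final tonic"


def information_status(unit: ToneUnit) -> List[str]:
    """Label each word: the tonic is 'new', what follows it is 'given',
    and what precedes it may be either, so it stays 'given/new'."""
    labels = []
    for i in range(len(unit.words)):
        if i < unit.tonic:
            labels.append("given/new")
        elif i == unit.tonic:
            labels.append("new")
        else:
            labels.append("given")
    return labels


# Example (4')/(5'): tonic on the last word.
races = ToneUnit(["We're", "going", "to", "the", "races"], tonic=4)
# Example (14'): tonic on "Adam Bede", followed by recoverable material.
adam_bede = ToneUnit(
    ["I", "shall", "read", "Adam Bede", "when", "I've", "finished", "Middlemarch"],
    tonic=3,
)

print(pattern(races), information_status(races))
print(pattern(adam_bede), information_status(adam_bede))
```

The label 'given/new' for pre-tonic words reflects the point that end-focus does not settle the status of elements preceding the tonic; deciding between the two readings needs the context question, which this sketch deliberately leaves out.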
If we now come back to the criticism of Halliday's theory, reported at the beginning of the present paper, it must be stated that in the light of the critical discussion of Brazil's theory, Halliday's basic tenets prove to be sound and applicable.
2 Discourse functions
2.1 Foregroundworthy information
I now want to suggest certain modifications of the theory summed up in the preceding section. Although it is generally acknowledged that the intonation pattern with non-final tonic can be used to differentiate between 'given' and 'new' information ('given' elements follow the 'new' tonic element), this intonation pattern is also used to mark a distinction in all-'new' sentences, cf. the following examples taken from Allerton & Cruttenden (1979: 53):
(22) The _tap_'s leaking ↘            ['empty' verb]
     'new'
(23) My _keys_'ve disappeared ↘       [verb of disappearance]
     'new'
(24) The _car_ broke down ↘           [verb denoting a misfortune]
     'new'
Also the intonation pattern with a thematic rise, which can be used to differentiate between 'given' and 'new', can mark a distinction in all-'new' sentences; cf. the following example taken from Chafe (1976: 36) rendered in our new notation:³
(25) (What happened at the meeting?) They elected _Alice_ ↗ _president_ ↘
     'new'
What, then, is the reason for the distinctions expressed intonationally in (22) to (25)? For their examples, Allerton & Cruttenden point out the particular semantic classes to which the verbs involved belong. But besides this they offer a more general explanation (p. 53): "Thus when we say The TAP'S leaking both the tap and its leaking are in some sense new, but the important thing in the speaker's mind is to get the attention of the listener focused on the tap".⁴ This means that there is a kind of textual organization in spoken discourse, expressed by intonation, which is independent of the 'given'/'new' distinction. I therefore propose to express the aspect of 'getting the attention of the listener focused on something' in terms of the contrast 'foregroundworthy' vs. 'less foregroundworthy'.⁵ In an abstract notation, the phonological and the meaning contrast can be expressed as follows (X and Y are elements of a tone unit, the final tone is left unspecified):
_X_ Y                    vs.   X _Y_
'X foregroundworthy'     vs.   'X less foregroundworthy'
A similar opposition can be seen if (25) is contrasted with
(26) (What happened at the meeting?) They elected Alice _president_ ↘ (End-focus)
     'new'
Here, Alice is less foregroundworthy in comparison to (25). We therefore postulate in addition the following opposition:
_X_ ↗ _Y_                vs.   X _Y_
'X foregroundworthy'     vs.   'X less foregroundworthy'
It follows from these observations that a foregroundworthy element (e.g. X) can be marked by two distinct intonation patterns (for purposes of later reference I use small Roman figures):
(i)   _X_ ↗ _Y_   (cf. Thematic rise)
(iii) _X_ Y
A less foregroundworthy element (e.g. X) can also be expressed by two distinct intonation patterns:
(ii)  X _Y_
(iii) _Y_ X
2.2 Rising and falling communication
I take the two intonation patterns (i) and (ii) as two variants of one kind of information presentation which I call 'rising communication'; by contrast, the intonation pattern (iii) expresses 'falling communication'. My concept of rising and falling communication is derived from Henri Weil's notions of 'ascending accent', which is another term for
end-focus, and 'descending accent', which is another term for a tone unit with non-final tonic; see the discussion of Weil's theory in Adjemian (1978: 258). While end-focus [intonation pattern (ii)] pertains only to tone unit level it has an analogue in terms of rising communication above the tone unit level, i.e. at the level of clauses or sentences, as expressed by the thematic rise [intonation pattern (i)]. This analogue is called the 'principle of resolution' by Quirk et al. (1972: 791): "As rising and falling-rising tones have implications of non-finality, the effect of this sort of pattern is to build up a continuing sense of anticipation, which is at last 'resolved' by the finality of the falling tone". My term 'rising communication' is now meant to combine under one cover term the notions of end-focus [intonation pattern (ii)] and resolution [intonation pattern (i)] on the ground that these formal properties, in terms of information presentation, have the general function of marking right-bound elements as foregroundworthy and additionally often as 'new'; see below. 'Falling communication' pertains only to tone unit level and has the general function of marking right-bound elements (i.e. elements which follow the tonic in a tone unit) as less foregroundworthy and additionally often as 'given'.⁶ The interrelation of rising and falling communication on the one hand and foregroundworthy and less foregroundworthy elements on the other can be summed up as follows:
                         X is foregroundworthy    X is less foregroundworthy
rising communication     (i)   _X_ ↗ _Y_          (ii)  X _Y_
                         (ii)  Y _X_
falling communication    (iii) _X_ Y              (iii) _Y_ X
We witnessed rising communication in the above examples (12'), (15'), (25) and (26); it is also there in the following examples (27), (28) and (29). Falling communication was witnessed in the above examples (13'), (14'), (22), (23) and (24); it can also be seen in (27); examples (27) to (29) are adapted from Quirk et al. (1972: 953).
(27) (Who did you give the water to?) It was the _dog_ I gave the water to ↘
     It was the dog: 'new', rising communication; I gave the water to: 'given', falling communication
(28) (Which dog bit you?) It was the dog I gave the _water_ to ↘
     It was the dog: 'given'; I gave the water to: 'new'; rising communication
(29) (Which dog bit you?) It was the _dog_ ↗ I gave the _water_ to ↘
     It was the dog: 'given'; I gave the water to: 'new'; rising communication
Before I go on to deal with the patterning of 'given' and 'new' and rising and falling communication, a few comments on the interrelation of the intonation patterns (i) and (ii) are appropriate. In our discussion of two-peaked intonation contours in sections 1.1 and 1.2 we only dealt with differences between intonation patterns (i) and (iii) but not between (i) and (ii). The problem which arises here is that in a phonological (not phonetic) description of English intonation there is no justification for a rhythmic level between word stress and sentence accent (= tonic element). Such a level is, however, introduced by Brazil et al. (1980) in the guise of 'prominence'. They do not see tone unit structure simply as a unit of one or more words with one tonic element. Carrying on the phonetic description of tone units of the British tradition they postulate the following (p. 40):
proclitic segment    tonic segment      enclitic segment
he was               GOing to GO
that's a             VEry TALL STO      ry
it was a             WED                nesday
According to their theory (p. 39 f.) "the tonic segment begins with the first prominent syllable [...] and ends with the last prominent syllable, the tonic". Functionally and theoretically it is, however, by no means clear what a prominent syllable is. Is the first sentence in their set of examples to be understood as
(30) He was going to _go_ ↘
where lexical word stress is marked in addition to the tonic element? Or is it to be understood as
(31) He was _going_ ↗ to _go_ ↘
where going is made foregroundworthy by means of intonation pattern (i) in contradistinction to (30), where (ii) is realized? The second interpretation seems to be more likely since (p. 40): "The distribution of prominence [...] depends upon the speaker's apprehension of the state of convergence he shares with the hearer. More precisely, it represents his assessment of the relative information load carried by particular elements in his discourse". This seems to be a vague description of what in Halliday's theory is meant by 'new'. Since Brazil et al. have, however, narrowed the semantic range of the phenomenon so described to 'proclaiming' and given it a very particular tone unit structure (cf. section 1.2), they have to resort to the vague concept of prominence; for a more detailed discussion of word stress, sentence accent and rhythm see Esser (1979: chapter 6). Returning now to example (31) we must state that this is a clear example of 'level stress'. (On level stress see Pilch 1970 and Esser 1979: chapter 6.1.) The difficulty of deciding between level stress [intonation pattern (i)] and end-focus [intonation pattern (ii)] arises for various reasons. First, the information presentation, i.e. rising communication, is the same, cf. (28), (29) and
(30), (31). Second, mere final position will cause a peak to be perceived as more prominent than those preceding in a series, cf. Ladd (1982: 207). Third, the decision will depend on the speed with which the sentence is uttered.⁷
The interplay of foregroundworthy and less foregroundworthy elements, rising and falling communication and 'given' and 'new' yields six types (A to F) of discourse organization.
                                               'given'    'new'
foregroundworthy         rising com. (i)       A          B
less foregroundworthy    rising com. (ii)      C          D
less foregroundworthy    falling com. (iii)    E          F
In the following examples, braces indicate those parts of the sentences which are analyzed in terms of the feature combinations symbolized by the types A to F.
Type A:
(32) (We're having peas for dinner tonight) {_Peas_} ↗ I can't _stand_ ↘
     'given', foregroundworthy, rising communication
Type B:
(33) (What happened at the meeting?) They elected {_Alice_} ↗ _president_ ↘
     'new', foregroundworthy, rising communication
Type C:
(34) (Which dog bit you?) {It was the dog} I gave the _water_ to ↘
     'given', less foregroundworthy, rising communication
Type D:
(35) (What happened at the meeting?) They elected {Alice} _president_ ↘
     'new', less foregroundworthy, rising communication
Type E:
(36) (Tell me something you can't stand!) _Peas_ {I can't stand} ↘
     'given', less foregroundworthy, falling communication
Type F:
(37) (What happened?) The _car_ {broke down} ↘
     'new', less foregroundworthy, falling communication
A stylistic analysis in terms of the six types of discourse organization was carried out in an illustrative way in Esser (1981) and on a larger scale in Esser (1983).
2.3 Theme and rheme, semantically and formally defined
I now want to consider the six types of discourse organization A to F in the light of the traditional theory of functional sentence perspective as proposed by the Prague School of Linguistics. It is well known that the term 'theme' is positionally defined in Halliday's theory: it denotes the first element in a clause. By contrast, the classical definition, as proposed by Mathesius, combines the positional aspect with a semantic one. Firbas (1964: 268) translates the relevant passage from a Czech paper by Mathesius, who defines theme as "that which is known or at least obvious in the given situation, and from which the speaker proceeds [in his discourse]". Indeed, the semantic aspect ('given') prevails in most definitions of theme and its terminological counterpart rheme ('new'). But even if theme and rheme are restricted to their semantic meanings ('given' and 'new') it will be difficult to maintain the terminological pair in a precise definition in view of the six types of discourse organization outlined in the foregoing section. If we want
to relate the phenomena there described to the terminological pair theme/rheme in a consistent way we have to decide whether we want to make the semantic criterion ('given' or 'new') or the formal one (rising or falling communication) decisive. I think one should adhere to the current practice and define theme (in the Prague theory and not in Halliday's) in terms of 'given' and rheme in terms of 'new'. In terms of our six types of discourse organization theme relates to 'given' and foregroundworthy elements or 'given' and less foregroundworthy elements, cf. (32), (34) and (36); rheme relates then to 'new' foregroundworthy elements and 'new' less foregroundworthy elements, cf. (33), (35) and (37).
We are now in a position to relate our six types of discourse organization to another terminological pair of the Prague School, the objective and the subjective order of theme and rheme, cf. Mathesius (posthumously 1975: 83 f.):
objective order:   Theme ('given') before Rheme ('new')
subjective order:  Rheme ('new') before Theme ('given')
The objective order is expressed in our types A and C of discourse organization, the subjective order in our type E. In the all-'new' sentences (33), (35) and (37), exemplifying types B, D and F respectively, it is, however, impossible to speak of an objective or subjective order since we have here only 'new' information. To be able to apply the useful terminological pair of objective and subjective order of theme and rheme also to types B, D and F, we have to allow, in all-'new' sentences, for a formally defined theme which disregards the semantic feature 'given' and defines by way of analogy to types A, C and E the braced elements in (33), (35) and (37) as 'formal themes'. In the unmarked cases (32), (34) and (36) the semantically defined theme and the formally (intonationally) defined theme overlap. The following table summarizes the semantic and formal definitions of theme and rheme:
          semantic    formal
Theme     'given'     concerning utterance element X
Rheme     'new'       concerning utterance element Y
                      (i) _X_ ↗ _Y_    (ii) X _Y_    (iii) _Y_ X
2.4 Stylistic considerations
The formal definition of theme and rheme in terms of the three intonation patterns is particularly interesting in those cases where in concrete texts or long, complex sentences 'given' and 'new' are not easily definable in terms of recoverable information. Furthermore, our concept of rising and falling communication can be used as a style marker of written texts which are read aloud and of spoken texts. To show differing reliefs in the presentation of information, which are due to the reading style and the text structure in terms of cohesion (cf. Halliday & Hasan 1976), I finally want to compare two written texts which were read aloud. Example (38) is a piece of broadcast radio news.
(38) Some sixty British holidaymakers ↗ have arrived home ↗ from the earthquake-area ↗ of South Jugoslavia ↘ More than three ↗ hundred tremors ↗ have been recorded ↗ since Sunday's major ↗ earthquake ↗ which left two hundred ↗ dead ↗ and a thousand ↗ injured More than eighty thousand people ↗ have spent another cold ↗ night ↗ in the open ↗ for fear of tremors ↘
In (38) there are only two instances of falling communication: tremors (line 3) and people (line 5). These items, which are 'given' and less foregroundworthy, exemplify type E (semantic and formal themes). In the first sentence of (38) we have 'new' before 'given' (typical of news and advertising) under rising communication: both the 'new' elements Some sixty British holidaymakers have arrived home and the 'given' elements from the earthquake-area of South Jugoslavia are presented under rising communication. In the first sentence, the first and second tone unit exemplify type B, the third and fourth type A. The even rising communication can be seen as a style marker of the broadcast radio news.
Example (39) is quite different (the written text is taken from Buck (1971: 74), the intonation notation renders the reading of an English native speaker):
(39) The Decline ↗ of a Suburban Symbol ↘ Scenery along trolleybus-routes ↗ was to a considerable extent ↗ inherited from tramway-days ↘ but where the trolleybus had its own characteristic type of scenery ↗ was in its extensions ↗ beyond the old tram routes ↗ into the newer ↗ outer areas of the town ↘
Here we have four instances of falling communication (trolleybus-routes and tramway-days are taken to be forestressed compounds which do not count as falling communication): extent (line 2 f.), characteristic type of scenery (line 4), the old tram routes (line 5) and areas of the town (line 6). Extent is 'empty'; the other three elements are 'given' and less foregroundworthy and thus exemplify type E. Their relative frequency and length render them a salient style marker of this text portion.
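The type assignments used in this comparison can be made explicit in a short Python sketch. It is an illustration only: the encoding of each analyzed element as a pair of intonation pattern and information status, and the names `TYPES`, `describe` and `tally`, are assumptions made for the example, and the two lists contain only those elements of (38) and (39) that are explicitly classified in the discussion, not a full annotation of either text.

```python
from collections import Counter

# Assumed encoding of the six-way table:
# (intonation pattern, information status) -> type of discourse organization.
TYPES = {
    ("i", "given"): "A", ("i", "new"): "B",
    ("ii", "given"): "C", ("ii", "new"): "D",
    ("iii", "given"): "E", ("iii", "new"): "F",
}


def describe(pattern: str, status: str) -> dict:
    """Spell out the feature combination behind one type."""
    return {
        "type": TYPES[(pattern, status)],
        # Patterns (i) and (ii) are rising, (iii) is falling communication.
        "communication": "falling" if pattern == "iii" else "rising",
        # Only pattern (i) marks the analyzed element as foregroundworthy.
        "foregroundworthy": pattern == "i",
    }


def tally(elements) -> Counter:
    """Count how often each type occurs in a list of annotated elements."""
    return Counter(TYPES[(pattern, status)] for pattern, status in elements)


# Hand-entered from the discussion: first sentence of (38) (B, B, A, A)
# plus 'tremors' and 'people' (E, E).
news_38 = [("i", "new"), ("i", "new"), ("i", "given"), ("i", "given"),
           ("iii", "given"), ("iii", "given")]
# The three 'given' instances of falling communication named for (39).
extract_39 = [("iii", "given")] * 3

print(tally(news_38))
print(tally(extract_39))
print(describe("iii", "given"))
```

A tally of this kind only becomes a style marker once every tone unit of a text has been annotated; the partial lists above merely show how the feature combinations map onto the types A to F.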
TONE UNITS IN FUNCTIONAL SENTENCE PERSPECTIVE Jurgen Esser Department of English University of Erlangen, FRG
Notes
to relate the accentuation of tap to the speech-act function 'explanation'. It seems, however, to be hard to make such a link between choice of nuclear element and speech-act function in a principled way.

5 A note on terminology may be in order here. I do not want to use the term focused because "focus of information" (Quirk et al. 1972: 937) is too general in its meaning and can also cover 'new'. "Foregrounded" (Leech 1969: 57) could not be used either because this term has a special stylistic meaning. The asymmetrical opposition 'foregroundworthy' vs. 'less foregroundworthy' - instead of 'more foregroundworthy' vs. 'less foregroundworthy' - is asymmetrical in the same sense as 'large' vs. 'not so large' is in comparison to 'large' vs. 'small'. It seems possible that Halliday's second component in his
1 Like in segmental phonology, the notion of 'systematic structure' can be explicated by making use of minimal pairs. Since the placement of one nucleus in one tone unit can be shown to be distinctive in some cases (e.g. I've inSTRUCtions to leave vs. I've instructions to LEAVE), it is assumed that the placement of the nucleus is a systematic choice which is recognized to be relevant in general and serves as structural information in the interpretation of phonetic data, even if the data themselves are not unequivocal. Similarly, phonetic continua, e.g. along the voice-dimension as between /p/ and /b/, are made discrete and unequivocal in non-minimal-pair situations by, among other features, the structural information that can be explained in terms of minimal pairs. See in this connection the fundamental distinction made by Pilch (1976: 68 f.) between the phonetic models of speech interpretation and the phonemic model.

2 It should be noted that the phonological rise fulfils different functions in (12') and (13'). In (12'), utterance-medial, it is part of the structural description of a tone unit whose function it is to single out the 'given' information (thematic rise). In (13'), utterance-final, the rise has the modal function discussed in connection with example (18) above.

3 Chafe marks in his notation primary stresses only, They elected Alice president, and no tone unit structure. Given the one-nucleus-per-tone-unit principle as stated above, his example has to be interpreted as in (25).

4 This explanation by Allerton and Cruttenden only seeks to account for the choice of tap as a tonic (nuclear) element. At the present stage of the theory, overall speech-act functions are not yet considered. E.g., looking at the speech-act function of B's answer in the dialogue A: What's that noise over there? B: The TAP'S leaking one could try
References

Adjemian, Christian, 1978: Theme, rheme, and word order: From Weil to present-day theories. Historiographia Linguistica 5; 253-273.
Allerton, D.J. & Cruttenden, A., 1979: Three reasons for accenting a definite subject. Journal of Linguistics 15; 49-53.
Brazil, David, 1978: Discourse Intonation II. University of Birmingham, Birmingham.
Brazil, David, Coulthard, Malcolm & Johns, Catherine, 1980: Discourse Intonation and Language Teaching. Longman, London.
Brown, Gillian, Currie, Karen & Kenworthy, Joanne, 1980: Questions of Intonation. Croom Helm, London.
Buck, Timothy, 1971: Modern Phonetic Texts for Foreign Students of English. Hueber, Munich.
Chafe, Wallace L., 1976: Givenness, contrastiveness, definiteness, subjects, topics and point of view. In: Charles N. Li (ed.), Subject and Topic. Academic Press, New York. Pp. 25-55.
Daneš, František, 1960: Sentence intonation from a functional point of view. Word 16; 34-54.
Esser, Jürgen, 1977: Review of Leech & Svartvik 1975. IRAL 15; 347-350.
Esser, Jürgen, 1979: Englische Prosodie: Eine Einführung. Gunter Narr, Tübingen.
Esser, Jürgen, 1981: On the analysis of complex sentences: A study in the cohesion of spoken English. In: Jürgen Esser & Axel Hübler (eds.), Forms and Functions. (Papers in General, English and Applied Linguistics presented to Vilém Fried on the occasion of his sixty-fifth birthday.) Gunter Narr, Tübingen. Pp. 163-174.
definition of 'new', 'presented as not being recoverable from the preceding discourse' - cf. the quotation at the beginning of section 1.1 - was intended to allow in a similar way for a subjective element as does my notion of 'foregroundworthy'. In any case, it seems appropriate to keep this notion apart from the basic notion of 'new' in terms of recoverable information.

6 It must be stressed that the terms 'rising communication' and 'falling communication' are independent of the utterance-final choice of tone; see the abstract summations on pages 12 and 13 above and examples (27) to (29) below; in (27), the combination of falling tone and falling communication is not necessary. The premodification rising in rising communication is meant to highlight the 'suspense-creating' nature of intonation patterns (i) and (ii). The 'suspense-creating' nature of intonation pattern (i) lies in its cataphoric (progredient) function, that of (ii) in the expectation of a nuclear element. The premodification falling in falling communication is meant to draw attention to the lack of suspense associated with (the final part of) the intonation pattern (iii).
Esser, Jürgen, 1983: Untersuchungen zum gesprochenen Englisch: Ein Beitrag zur strukturellen Pragmatik. Gunter Narr, Tübingen.
Firbas, Jan, 1964: On defining the theme in functional sentence analysis. Travaux Linguistiques de Prague 1; 267-280.
Fox, A., 1973: Tone sequences in English. Archivum Linguisticum 4; 17-26.
Halliday, M.A.K., 1967: Notes on transitivity and theme in English, Part 2. Journal of Linguistics 3; 199-244.
Halliday, M.A.K., 1970: Functional diversity in language as seen from a consideration of modality and mood in English. Foundations of Language 6; 322-361.
Halliday, M.A.K. & Hasan, Ruqaiya, 1976: Cohesion in English. Longman, London.
Hirst, Daniel, 1977: Intonative Features: A Syntactic Approach to English Intonation. Mouton, The Hague.
Kingdon, Roger, 1958: The Groundwork of English Intonation. Longman, London.
Ladd, D. Robert, 1982: Review of Brown et al. 1980. Language 58; 204-208.
Leech, Geoffrey N., 1969: A Linguistic Guide to English Poetry. Longman, London.
Leech, Geoffrey N. & Svartvik, Jan, 1975: A Communicative Grammar of English. Longman, London.
Mathesius, Vilém, 1975: A Functional Analysis of Present Day English on a General Linguistic Basis. Ed. by Josef Vachek. Mouton, The Hague; Academia, Prague.
Pilch, Herbert, 1970: The elementary intonation contour of English. Phonetica 22; 88-111.
Pilch, Herbert, 1976: Empirical Linguistics. Francke, Munich.
Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey & Svartvik, Jan, 1972: A Grammar of Contemporary English. Longman, London.
Taglicht, J., 1982: Intonation and the assessment of information. Journal of Linguistics 18; 213-230.
Schubiger, Maria, 1958: English Intonation: Its Form and Function. Max Niemeyer, Tübingen.
Standop, Ewald, 1982: Das Jones-Gimsonsche Intonationsmodell nebst einigen grundsätzlichen Bemerkungen zum Verhältnis von Phonetik und Aussprache. LAUT (Linguistic Agency University of Trier) Series B, Paper No. 77.
CONTRASTIVE STRESS, CONTRASTIVE INTONATION AND CONTRASTIVE MEANING 1
Janet Mueller Bing
Abstract
Articles on syntax or semantics frequently refer to contrastive stress. For example, Lasnik (1969), Chomsky (1971), Jackendoff (1972), and Sag (1976) all note semantic differences which are said to result from contrastive stress used somewhere in the sentence. These writers all assume that there is such a thing as contrastive stress which can be characterized as a departure from a 'normal' stress pattern. This assumption, however, has been periodically challenged. Among others, Pike (1945), Bolinger (1961b), Schmerling (1976) and Ladd (1978) have all argued either that there is no such thing as contrastive stress or that the term should be redefined. Some of the confusion has arisen from the name "contrastive stress". It is, of course, possible to have contrastive meaning without having contrastive stress, as Bolinger (1961b: 106) illustrates: (1) He didn't buy a Ford, he bought a Plymouth.2 There are a number of ways to show or imply a contrast, including contrastive intonation, which will be discussed below. Conversely, despite its name, what is commonly referred to as contrastive stress does not function only to signal contrastive meaning. (2) My real secret is what's at the other end. W'l what's at the other end? - United Airlines Cargo.
JOURNAL OF SEMANTICS, vol. 2, no. 2, pp. 141-156
Although the term "contrastive stress" has been incorrectly used in the past to describe default accent, contrastive intonation, or a combination of the two, English does have contrastive stress. However, contrastive stress is not used primarily to show a contrast. It is possible to have contrastive meaning without contrastive stress and it is also possible to have contrastive stress without contrastive meaning. The fact that English has contrastive stress constitutes further evidence for the existence of an unmarked or normal stress pattern.
Bolinger (1982: 24) discusses this sentence claiming that the accent on at is an "accent of affirmation". 3 Only in a very abstract way could (2) be judged semantically contrastive. Contrastive stress can be used to show a contrast, of course, but the presence or absence of a semantic contrast is not the deciding factor. In this paper I will show that the term contrastive stress has been incorrectly used to describe stress patterns which are part of the predictable or 'normal' stress pattern. For example, the following sentences are representative of those sometimes cited as having contrastive stress.

(3) Can Willy do SIXTY pushups? (Jackendoff, 1972: 234)
(4) I said to report the trouble, not broadcast it. (Bolinger, 1971: 107. Quoted without accents.)
(5) Even a two-year-old could do that. (Schmerling, 1976: 49)
(6) Harry ate the bagel. (Sag, 1973: 234)
(7) Senator Eastland didn't grow cotton to make money. (He grew tobacco.) (Lasnik, 1969)

I will argue that although some of these sentences signal contrastive meaning, none of the sentences (3) - (6) have contrastive stress. Having shown what contrastive stress is not, I will then argue that it is exactly what it is often assumed to be, a deviation from the norm. Although they do not always imply a contrast, the following sentences all contain contrastive stress.

(8) John hit Bill, and then George hit him. (Akmajian and Jackendoff, 1970: 124)
(9) The bills were not large but there were a great many of them. (Bolinger, 1982: 7)
(10) Of course, it's the movie to see this year.

Because contrastive stress will be defined as a deviation from 'normal' stress, it cannot be described as a particular stress configuration because in each case the 'normal' stress pattern is different. For this paper I will be assuming the hypothesis that there is a normal stress pattern, despite arguments against this hypothesis such as those found in Bolinger (1961b; 1972) and Schmerling (1976). 4 In assuming that there is a normal stress pattern, I am also assuming that factors introduced by the discourse should be formalized as part of this unmarked system; that is, the unmarked system is not determined solely on the basis of grammatical structure. The stress assignment rules I will be using are proposed in Liberman and Prince (1977: 257). These include the following versions of the Nuclear Stress Rule and the Compound Stress Rule: (11)
In a configuration [_C A B _C]:
a. NSR: If C is a phrasal category, B is strong.
b. CSR: If C is a lexical category, B is strong iff it branches.

These rules can assign prominence to structures which are the output of the phrase structure and transformational components of the grammar, and assign relative rather than absolute values, as in (12). In (12) R is assigned to the root of the tree, and lexical stress has not been indicated since it is not relevant to the discussion.
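The two rules in (11) can be sketched as a small labelling procedure over binary trees. The sketch below is illustrative only: the `Node` class, the function names, the bracketing chosen for Willy can do sixty pushups, and the compound law degree requirement are all assumptions made for the example, not structures given in the text. It labels each pair of sister nodes weak/strong and then finds the terminal that is not dominated by any weak node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                       # a word, or a category name for non-terminals
    left: Optional["Node"] = None    # A in the configuration of (11)
    right: Optional["Node"] = None   # B in the configuration of (11)
    phrasal: bool = True             # True: phrasal category (NSR); False: lexical (CSR)
    strength: str = "R"              # "R" for the root, otherwise "S" or "W"

    def branches(self) -> bool:
        return self.left is not None and self.right is not None


def assign_prominence(node: Node) -> None:
    """Label the daughters of every branching node as S or W."""
    if not node.branches():
        return
    a, b = node.left, node.right
    # NSR: in a phrasal category B is strong.
    # CSR: in a lexical category B is strong iff it branches.
    b_strong = True if node.phrasal else b.branches()
    a.strength, b.strength = ("W", "S") if b_strong else ("S", "W")
    assign_prominence(a)
    assign_prominence(b)


def primary_stress(root: Node) -> str:
    """Follow strong daughters down from the root to a terminal."""
    node = root
    while node.branches():
        node = node.left if node.left.strength == "S" else node.right
    return node.label


def willy_sentence() -> Node:
    # Assumed right-branching structure: [Willy [can [do [sixty pushups]]]]
    np = Node("NP", Node("sixty"), Node("pushups"))
    vp = Node("VP", Node("can"), Node("VP", Node("do"), np))
    return Node("S", Node("Willy"), vp)


def compound() -> Node:
    # Hypothetical left-branching compound: [[law degree] requirement]
    inner = Node("N", Node("law"), Node("degree"), phrasal=False)
    return Node("N", inner, Node("requirement"), phrasal=False)


if __name__ == "__main__":
    for tree in (willy_sentence(), compound()):
        assign_prominence(tree)
        print(primary_stress(tree))
```

Under these assumptions the phrasal tree places primary stress on its rightmost word and the compound on its leftmost, which is the relative (not absolute) prominence the rules are meant to capture.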
The pattern of prominence on (12) is often cited as the normal or unmarked stress pattern, and is the stress pattern sometimes represented as (13). (13)
Willy can do sixty pushups.
In (12) the primary stress is the node not dominated by any weak nodes; the rules for associating intonation patterns with stress patterns will associate the nuclear tones with this syllable; for this reason, primary stress can usually be perceived on the basis of rises and falls in pitch. 5 For discussions on the relationship between stress and intonation see Liberman (1975), Pierrehumbert (1979) and Bing (1979). It is in comparison to the unmarked stress pattern represented in (12) and (13) that the stress pattern in (14) is generally judged to be contrastive. (14)
Willy can do sixty pushups.
This judgment ignores the fact that no sentence can ever be completely divorced from a context. When a context is not given, a reader or listener supplies one. It is quite obvious that (14) has been removed
(12)
from some context and would be unlikely to be used at the beginning of a discourse. It is possible to imagine several situations in which a speaker could walk into a room and announce (13), but it is improbable that (14) would be used except in a situation in which the subject of pushups had somehow already been introduced into the discourse, as in (15):

(15) A: Willy will probably pass the running test, but I doubt that he can do the required fifty pushups.
B: Willy can do sixty pushups.

It is worth noting that once put in a context in which it might possibly occur, sentence (14) no longer requires a diacritical mark as it does in (14). In (15B) the destressing of pushups results in a speaker or reader normally adding prominence 6 to sixty. In the context of (15A), the stress pattern of (15B) is the normal pattern, and is not contrastive.
(16)
Default accent often appears to be contrastive stress because an entire phrase and not simply a single word may be destressed, with the resulting default accent falling some distance away from the deaccented phrase. For example, the stress pattern on (17) is default accent when the sentence occurs in a context such as (18).

(17) Willy can do sixty pushups.
(18) A: Who else can do sixty pushups?
B: Willy can do sixty pushups.

In this case, can do sixty pushups refers back to the idea introduced in (15A), and Willy receives stress by default.
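The shift of prominence by default in (15B) and (18B) can be made concrete with a short sketch. It is a minimal illustration under stated assumptions: "refers back to previous discourse" is reduced to a hand-set `given` flag, a given node is labelled weak and its sister strong, and where neither or both sisters are given the Nuclear Stress Rule applies (the text does not say what happens when both are given, so that fallback is an assumption). The class, the function names, and the bracketing of the sentence are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    given: bool = False      # assumed flag: node refers back to previous discourse
    strength: str = "R"      # "R" for the root, otherwise "S" or "W"


def label_tree(node: Node) -> None:
    """Label sisters S/W: a given right sister is weak, otherwise the NSR applies."""
    if node.left is None or node.right is None:
        return
    a, b = node.left, node.right
    if b.given and not a.given:
        a.strength, b.strength = "S", "W"   # stress falls on A by default
    else:
        a.strength, b.strength = "W", "S"   # Nuclear Stress Rule
    label_tree(a)
    label_tree(b)


def primary_stress(root: Node) -> str:
    """Return the terminal reached by following strong daughters from the root."""
    label_tree(root)
    node = root
    while node.left is not None and node.right is not None:
        node = node.left if node.left.strength == "S" else node.right
    return node.label


def build(pushups_given: bool = False, predicate_given: bool = False) -> Node:
    # Assumed structure: [Willy [can [do [sixty pushups]]]]
    np = Node("NP", Node("sixty"), Node("pushups", given=pushups_given))
    inner_vp = Node("VP", Node("do"), np)
    vp = Node("VP", Node("can"), inner_vp, given=predicate_given)
    return Node("S", Node("Willy"), vp)


if __name__ == "__main__":
    print(primary_stress(build()))                      # no context supplied
    print(primary_stress(build(pushups_given=True)))    # context like (15A)
    print(primary_stress(build(predicate_given=True)))  # context like (18A)
```

With nothing marked as given the sketch stresses the final noun; marking pushups as given moves the stress to sixty, and marking the whole predicate as given moves it to Willy, without any stress being added.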
The phenomenon of destressing 'old' or 'given' information is well-known, but it was not until the Liberman-Prince metrical hypothesis that it became possible to formalize rules of destressing. It is the fact that sixty, which is a weak node in (12), becomes strong because of the destressing of pushups that causes Ladd (1978) to label the stress on sixty 'default accent'. The metrical representation of (14) is (16).
(19) [Metrical tree with root R; the diagram is not reproduced in this copy.]
(21)
Harry wants a VW, but his wife would prefer an American car.
Default accent is not limited to sentences with a single focus. At first glance, sentence (4), repeated as (22), gives the impression of having contrastive stress, since the semantic idea of contrast is expressed and because there are two foci. (22)
I said to report the trouble, not broadcast it.
Despite the two foci and the overt contrast, there is no contrastive stress on (22) any more than there is on (1), repeated here as (23): (23)
He didn't buy a Ford, he bought a Plymouth.
In (23) the heaviest stress falls on the final noun of each clause as predicted by the nuclear stress rule. Similarly, the stress pattern on (22) is not contrastive. Like (17), sentence (22) is not likely to be used to initiate a discourse. In order to interpret this sentence, the reader must assume that 'the trouble' has either been previously referred to or is obvious to both speaker and listener. The prominence on report in (22) is therefore another example of default accent and not of contrastive stress. Schmerling (1976: 49) cites sentence (5) as an example of a sentence which "must have contrastive stress". Sentence (5) is repeated as (24) Even a two-year-old could do that.
Despite the fact that the phrase, "refers back to previous discourse" requires a more extensive definition than space will allow here, 7 it is possible to formalize the idea of default accent as (20). (20) Any node that refers back to previous discourse is weak (W). Ladd's very useful discussion of accent by default contains many examples which show that referring back to previous discourse does not require that the deaccented word or phrase has been mentioned explicitly. For example, the stressing of car is quite clearly by default in the following example from Ladd (1978: 83):
A number of factors, including the presence of the pronoun that, suggest that (24) is another example of default accent rather than contrastive stress. Primary stress does not fall only on a noun phrase following even, as the following example indicates. (25)
A: You expect me to make lunch? B: Use your head, dummy. Even a four-year-old can open a can of tuna.
Assuming the stress assignment rules in (11), the assigned pattern of prominence for the second sentence in (25B) would be: 8 (26)
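[Metrical tree (26) for "Even a four-year-old can open a can of tuna"; the diagram did not survive in this copy. Among the surviving node labels, tuna is marked S.]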
In the context of (27A), however, prominence shifts to the word four by default. (27)
A: I can't get this can open. B: Use your head, dummy. Even a four-year-old can open a can of tuna.
Since opening the can of tuna is already in the discourse, stress will fall on four by default. The resulting pattern of prominence is (28), with the crucial nodes encircled.
The pattern of prominence on (5), repeated below as (29B), is similar to that of (27B) and is likely to occur in a similar context.

(29) A: I can't seem to get this machine to work.
B: Just push the button, dummy. Even a two-year-old could do that.

Schmerling uses this sentence, which she claims is "necessarily contrastive", to argue against the notion of a normal or unmarked stress pattern. However, in (29B) the stress pattern seems to be the result of default accent rather than contrastive stress.
(30) Harry ate the bagel.

Although bagel would receive prominence by the Nuclear Stress Rule in any case, it is still marked as stressed. There are several ways to gain this additional prominence. The first would be the additional prominence on the stressed syllable which, among other things, would increase the degree of rise and fall of the pitch on the nuclear syllable, giving (32) rather than the 'normal' (31).

(31) Harry ate the bagel.
(32) Harry ate the bagel.

The type of emphasis indicated in (32) would not be contrastive stress, but would be a gradient variation of the same stress pattern as (31). This type of difference is gradient in the sense defined in Bolinger (1961a) and discussed in Ladd (1978, Ch. 4). The interpretation that Sag gives this sentence, given in (33), and his subsequent discussion, indicate that what he may have been trying to indicate in (30) was not the increased prominence indicated on (32), but the use of a different intonation contour, the falling-rising contour shown in (34).

(33) (the bagel, λx [Harry, λy (y ate x)])
(34) Harry ate the bagel. (But not the eggs.)
The implication in (34), which I have indicated, is that Harry did not eat something else in the set to which the bagel belongs, presumably some set of foods. The use of the falling-rising contour does signal
In addition to default accent, there are other prosodic phenomena which are sometimes called contrastive stress. Sentence (6), repeated as (30), is a type of sentence with additional prominence indicated. The additional prominence is sometimes shown by the use of italics or capitalization.
a contrast of this sort, but it is important to note that the contrast is signalled by a difference in intonation and not by a difference in stress.
(36) I don't like the blue tie. (But I like the red one.)
The contrastive interpretation in (35) and (36) is the result of the intonation rather than the stress pattern. This becomes clearer when the intonation pattern is changed without change in the stress pattern. It is quite unlikely that either sentence (37B), with a falling contour, or (37C), with a falling-rising contour, would be used 'out of the blue'; rather, they would occur in a situation such as that indicated in (37A):

(37) A: What do you think of these? (Holds up several ties)
B: I like the red tie.
C: I like the red tie.

In both (37B) and (37C) the prominence on red is due to the destressing of tie, which means that red has been given primary stress by default. Although (37B) and (37C) both have the same stress pattern, (37C) with the falling-rising contour has a contrastive interpretation. This is not because of contrastive stress, but is the result of a combination of default accent and 'contrastive intonation'. When the falling-rising contour is used on a longer sentence, such as (7), repeated as (38), the impression of contrastive stress is very strong.

(38) a. Senator Eastland doesn't grow cotton to make money.
b. Senator Eastland doesn't grow cotton to make money.

Because of the implied contrast in (38b), it is often assumed to have contrastive stress on cotton. In fact, just the opposite phenomenon has taken place. Two other phrases, Senator Eastland and to make
The falling-rising contour is frequently confused with contrastive stress, particularly when it is combined with default accent. This 'contrastive intonation' has been widely discussed in the literature. Jones (1966) and Ladd (1978) are among those who have noted that the falling-rising contour is sometimes given a contrastive interpretation. Ladd (1978: 153) notes that the contrast is with other members of an implied set. "The meaning of fall-rise is thus something like focus within a given set." (Italics in original.) This interpretation occurs only in sentence-final position, of course, and in non-final position would be interpreted differently. In Bing (1979) I argued that if the original sentence having this contour is affirmative, the implication (or implied contrast) is usually negative, and vice-versa.

(35) I like the red tie. (But I don't like the blue one.)
money have been destressed. In a brand-new context such as (39) these phrases would not be destressed and prominence (indicated by the intonation contours) would occur on all three phrases because the speaker does not presuppose that any element of the sentence refers back to a previous part of the discourse.

(39) Reliable sources revealed today that Senator Eastland doesn't grow cotton to make money.
(40) Reliable sources revealed today that Senator Eastland doesn't grow cotton to make money. The Senator claimed that by maintaining his farm, he is able to employ 20 workers who otherwise would be on welfare.

The contrastive interpretation found on (40) but not on (39) is the result of the falling-rising contour and not of the stress pattern. Similarly, the absence of the falling-rising contour on (41) explains why (41) has the same stress pattern as (38) with no contrastive meaning.

(41) A: What does your senator do to supplement his income?
B: Senator Eastland doesn't grow cotton to make money. (i.e. he makes money on government subsidies.)

The idea that prominence must be the result of stressing rather than destressing is so common that sentences such as (38) are automatically assumed to have contrastive stress. A careful analysis of these sentences, however, sometimes reveals that the contrast is due to intonation rather than to stress. Although sentences (3) - (7) are representative of what is often called contrastive stress, I have attempted to show that, at least in certain contexts, they are examples of default accent, contrastive intonation, or more commonly, a combination of the two. Most have stress patterns which can be accounted for either by standard stress
In Bing (1980) I argued that in sentences such as (39), the unmarked stress pattern is the sentence with multiple foci and that sentences such as (38) are the result of destressing rather than of adding contrastive stress. In (38) it is the combination of destressing and the falling-rising contour which gives an impression of contrastiveness. However, with the falling-rising contour it is not even necessary to destress in order to achieve a contrastive interpretation. It is possible to have an implied contrast even when none of the phrases are destressed, as (40) illustrates.
rules such as (11) or by discourse factors as formalized in (20). Contrastive stress, on the other hand, is not predictable by rule, nor is it the result of destressing. The clearest cases of contrastive stress are those where stress occurs on words such as anaphoric pronouns that are normally destressed by (20). In the following sentences, which do not have contrastive stress, the pronouns are unstressed as one would expect them to be.

(42) Sam called Mary a Republican, and then she laughed at him.
(43) Sam called Mary a Republican, and then she insulted him. (Lakoff, 1971: 333)
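(44) [Metrical tree for "she insulted him"; the diagram did not survive in this copy. Of its labels, only S on insulted remains.]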
If, like Lakoff, the speaker considers calling someone a Republican an insult, it might be possible to attribute the stress on him to default accent in (45), but since both pronouns are usually stressed with this interpretation, then stress must somehow be added. (45) Sam called Mary a Republican, and then she insulted him. (46) . . . and then she insulted him. Given the current assumptions about how intonation contours are associated with metrical trees (proposed in Liberman, 1975), it is not possible to represent (45) without some extension of the theory such as that proposed in Bing (1979, Ch. 5). It is beyond the scope of this paper to explore the problem here. At this point it is possible only to point out that for (45) stress must be added, and cannot be accounted for by default accent. Because of this deviation from the expected stress pattern, the contrastive stress on (45) signals the well-known vice-versa effect noted in Akmajian and Jackendoff (1970: 124) and elsewhere. (47) John hit Bill, and then George hit him.
The assumption in (43) must be that it is no insult to be called a Republican. The destressing of the pronouns results in a pattern of prominence indicated in (44)
(48) John hit Bill, and then George hit him.

In (47) him refers to Bill, but in (48) it refers to John. This change of reference on pronouns when they receive contrastive stress is quite consistent, since the 'normal' case for anaphoric pronouns, or any anaphoric part of the sentence, is the case where destressing has taken place. If there were no expected or unmarked stress pattern, there would also be no unexpected or marked stress pattern, and one would not expect to find the consistency of interpretation found on sentences (43) - (48). Interestingly, in this clear case of contrastive stress, the stress pattern does not signal a contrast, but rather a departure from an expected interpretation.
(49) My real secret is what's at the other end. Wl what's at the other end? - United Airlines Cargo! (= sentence (2))

In sentence (9), repeated here as (50), prominence also falls on a normally unstressed preposition.

(50) The bills were not large but there were a great many of them.

Bolinger (1982: 2) discusses these and other examples and suggests that they are 'accents of affirmation' which 'insist on the truth-value of the whole clause.' He notes (p. 4) that this accent 'tends to be specialized on items that are normally unaccented.' (Italics added.) In this case, as with anaphoric pronouns, there must be some rule-governed or expected behaviour in order for accent (or stress) to be a grammatical signal which can be interpreted as an accent of affirmation. Sentences (49) and (50) are clear cases of marked or contrastive stress as I have defined it. Yet in neither case is there a contrast or even an implied contrast, except in the abstract sense in which affirmation is a contrast to negation. Another example of the use of contrastive stress to obtain a marked interpretation is (10), repeated as (51).

(51) Of course, it's the movie to see this year.

In contexts which allow the interpretation, the contrastively stressed
The contrastive stressing of anaphoric pronouns is not the only use of contrastive stress. In addition to pronouns, most function words receive relatively little prominence. In the following example from Bolinger (1982: 6), the word at should be stressless both because it is a function word and because "what's at the other end" has been mentioned in the previous sentence.
the consistently has the special interpretation, 'the only'. In this case, as well, contrastive stress is used without any implied contrast. The fact that a word or phrase gets some sort of special interpretation does not, however, necessarily mean that it is contrastively stressed. The difference between (52) and (53) is frequently attributed to contrastive stress, but is, in reality, due to the fact that the sentences have different intonation patterns.

(52) We can't serve any wine at the party.
(53) We can't serve any wine at the party.
Although the clearest cases of contrastive stress are instances where it falls on words that are rarely stressed, it is not only function words which may be contrastively stressed. In certain contexts, content words clearly have contrastive stress. Bolinger (1982: 7-8) provides a good explanation and a number of good examples:

(54) Though function words - by being inherently colorless - outnumber content words as affirmation-carriers, content words may be rendered colorless by obviousness in the context, e.g. through repetition, which normally causes them to be deaccented. They may then be re-accented, under the conditions described here, for affirmation . . . The Bureau of Study Counsel wants this bulletin posted. I guess we'd better put it on the board. In fact, it's exactly the right time to put it up. (The accent could just as well go on to.)

(55) Here is an example of a noun accented for affirmation, from a conversation between two people looking for the sales counter atop the Empire State Building. One said It isn't here - it's on the other side. The other replied, with a high rise-fall on the last word . . . All right, let's go to the other side. (Bolinger, 1983: 9)

In (54) put receives stress despite the fact that the verb had been
3S, vol. 2, no. 2
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
For most people the interpretation of (52) with the falling contour is 'We can serve no wine1, and for many people (53) has a different interpretation, 'We can't serve just any wine1. Both sentences, however, have the same stress pattern. The different interpretations are due to differences in intonation. The falling-rising contour, which I have called contrastive intonation, can result in this special interpretation when used with sentences containing any, but this is not contrastive stress. •"-•
CONTRASTIVE STRESS, INTONATION, MEANING introduced in the previous sentence. In the final sentence of (55) side receives stress despite the fact that the other side had just been mentioned. It is the fact that the speaker chose not to stress go in this sentence that is noteworthy. By choosing to put contrastive stress, Bolinger's accent of affirmation, on side, the speaker is signalling that some special interpretation is intended. Of the sentences (8) -(10) which I have claimed exemplify contrastive stress, none are used to make a contrast, except on a very abstract level. However, in the following sentence from Bolinger (1961b: 101) there is an overt contrast. (56) "This whiskey", said O'Reilly, sampling spirits that claimed to be from his homeland, "was not exported from Ireland; it was deported".
There is general agreement that (56) does exemplify contrastive stress, but it is important to note why the stress pattern on the word exported in (56) is contrastive. This is an example of contrastive stress not because a contrast is made, but because the speaker has chosen to deviate from an understood norm to stress part of a word which would normally not be stressed. In (56) the speaker has chosen to stress the first syllable of exported in order to emphasize a contrast, but it is the deviation from the norm and not the contrast which makes this an example of contrastive stress.

The position which I have argued for up to this point is that although much of what has been called contrastive stress in the past can be shown to be something else, there still remain examples of stress patterns that can best be accounted for as clear deviations from a normal or unmarked stress pattern. It is these deviations from the norm that should be considered contrastive stress. For those who argue that there is no unmarked stress pattern, and therefore no such thing as contrastive stress as I have characterized it, the alternative seems to be the claim that certain words are made more prominent because they are informationally more important. Bolinger (1972: 633) summarizes his views about the unmarked stress pattern as follows:

My position was - and is - that the location of sentence accents is not explainable by syntax or morphology . . . I have held, with Holtzen 1956, that what item 'has relatively stronger stress accent in the larger intonational pattern is a matter of information, not of structure' . . .

For example, in explaining why nouns normally receive more prominence than adjectives, Bolinger (1958: 69) states:

I have said that in the nature of things nouns are less predictable than attributes. This is true of common nouns and common attributes.

He does, however, concede the following:

Perhaps sheer frequency has led or is leading to a partial fossilization of the stresses in adjective-noun phrases; the grammatical binding of once-free combinations is nothing new in language. But it has not gone far enough to free us from taking stock of the pressure of information.
It is, in fact, this very "fossilization of the stresses" that results in the unmarked or normal stress patterns, and rules such as (11) describe these unmarked patterns. Without the expected stress patterns there would be nothing noteworthy about a sentence such as "He has a lean and hungry look" (Bolinger 1958: 68). In addition, the cases of contrastive stress discussed above cannot occur without the assumption of an unmarked stress pattern. The interpretation of contrastively stressed anaphoric pronouns, for example, makes no sense in a framework where accent is only "a matter of information, not of structure". It seems far more reasonable to assume that the atypical semantic interpretation in sentences such as (45) and (48) is signalled by an atypical or contrastive stress pattern. Similarly, Bolinger's accents of affirmation, as he points out (1982: 4), fall on items "that are normally unaccented".

Until some satisfactory explanation for the special interpretations of sentences such as (8), (9) and (10) can be found, the evidence seems to be in favor of the hypothesis that English does have contrastive stress, and by implication normal or unmarked stress as well. Although current attempts to describe the unmarked system have not yet achieved that goal, this is not necessarily a reason to abandon the task, particularly since there seem to be some fairly clear examples of deviations from the norm, which I have called contrastive stress. Previous arguments against contrastive stress have correctly pointed out that not everything called contrastive stress necessarily is. I have attempted to show that default accent, contrastive intonation, and a combination of the two should not be considered contrastive stress. In addition, I have attempted to show that there can be a contrast without contrastive stress, and, more importantly, there can be contrastive stress without a contrast. Once contrastive stress has been distinguished from other prosodic phenomena, it is possible to define it as what linguists often assume it to be, a deviation from a normal stress pattern. Linguists have continued to use the term contrastive stress despite repeated attempts to show that it does not exist. Bolinger (1965: 101) admits that "One of the most durable concepts in American linguistics has been that of 'contrastive stress'". I would predict that this concept will continue to endure, and for good reasons.
Notes

1 I would like to thank John Broderick and Charles Ruhl for their helpful suggestions on an earlier version of this paper. I must assume responsibility for any shortcomings and confusions in this version.
2 I have quoted examples from a number of articles by Bolinger, and in some cases, including this example, I have done so without indicating accent, rather indicating stress by italics. Bolinger (1961b) distinguishes between contrastive stress and contrastive accent.
3 Bolinger (1958) argues against the hypothesis which I am assuming, that stress and intonation are separate and interdependent systems. He claims that rather than two interdependent systems of stress and intonation, there is a single system of accent. My reasons for not accepting this hypothesis are given in Bing (1979). Bolinger's accents of affirmation "insist on the truth-value of the whole clause".
4 Many of the counter-arguments against a normal stress pattern correctly point out that a number of apparently non-contrastive sentences cannot be accounted for with only the Nuclear Stress Rule and the Contrastive Stress Rule. In Bing (1979) and in a forthcoming paper, "Formalizing the Stress Rules in English", I discuss shortcomings of the stress rules as presently formulated in Liberman and Prince (1977) and show how these rules can be revised to account for some apparent counter-examples. However, the fact that there is general agreement on which examples are exceptions to current formalizations of the unmarked stress pattern is itself evidence that an unmarked system exists, despite current failures to account for it.
5 The fact that changes in pitch are the primary cue for stress is one of the reasons for the accent hypothesis.
6 I am using prominence to indicate something very similar to what Bolinger calls accent, that is, the combined effects of pitch, intensity, duration and vowel quality. In Bing, 1979, Ch. 5, I argue against the hypothesis that a sentence may contain only one primary or sentence stress. I argue that examples such as (8) and (10) have more than one primary stress.
7 There is an extensive literature on given/new and old versus new information. The bibliography in Chafe (1976) contains numerous references on this issue.
8 In Bing (1979) I argue that sentences such as (26) would have two intonation phrases each dominated by R rather than the pattern of prominence indicated in (26). Since this is not directly relevant to the topic under discussion I have used the rules in (11) with no modification.

References

Akmajian, Adrian and Jackendoff, Ray, 1970: Coreferentiality and stress. Linguistic Inquiry 1; 124-126.
Bing, Janet Mueller, 1979: Aspects of English prosody. University of Massachusetts, Amherst, Dissertation. (Distributed by Indiana University Linguistics Club.)
Bing, Janet Mueller, 1980: The given/new distinction and the unmarked stress pattern. NELS Proceedings XI. University of Massachusetts, Amherst.
Bolinger, Dwight, 1958: A theory of pitch accent in English. Word 14; 109-149. Reprinted in: Bolinger 1965, pp. 17-55.
Bolinger, Dwight, 1961a: Generality, gradience, and the all-or-none. Mouton, The Hague.
Bolinger, Dwight, 1961b: Contrastive accent and contrastive stress. Language 37; 83-96. Reprinted in: Bolinger 1965, pp. 119-127.
Bolinger, Dwight, 1965: Forms of English: accent, morpheme, order. In: I. Abe and T. Kanekiyo (eds.), Harvard University Press, Cambridge, Mass.
Bolinger, Dwight, 1972: Accent is predictable, - if you're a mind-reader. Language 48; 633-644.
Bolinger, Dwight, 1982: Affirmation and default. Unpublished paper.
Chafe, Wallace L., 1976: Givenness, contrastiveness, definiteness, subjects, topics and point of view. In: Charles N. Li (ed.), Subjects and topics. Academic Press, New York.
Chomsky, Noam, 1971: Deep structure, surface structure and semantic interpretation. In: D. Steinberg and L. Jakobovits (eds.), Semantics, an interdisciplinary reader. Cambridge University Press, Cambridge.
Jackendoff, Ray, 1972: Semantic interpretation in generative grammar. MIT Press, Cambridge, Mass.
Jones, Daniel, 1966: The Pronunciation of English. 4th ed. Cambridge University Press, London.
Ladd, D. Robert, 1978: The structure of intonational meaning. Indiana University Press, Bloomington.
Lakoff, George, 1971: Presupposition and relative well-formedness. In: D. Steinberg and L. Jakobovits (eds.), Semantics, an interdisciplinary reader. Cambridge University Press, Cambridge.
Lasnik, Howard, 1969: Analysis of Negation in English. MIT dissertation.
Liberman, Mark, 1975: The Intonational System of English. MIT dissertation. (Distributed by Indiana University Linguistics Club.)
Liberman, Mark and Prince, Alan, 1977: On stress and linguistic rhythm. Linguistic Inquiry 8; 249-336.
Pierrehumbert, Janet, 1979: Intonation synthesis based on metrical grids. Proceedings of the American Acoustical Society, June 1979.
Pike, Kenneth, 1945: The Intonation of American English. University of Michigan Press, Ann Arbor, Michigan.
Sag, Ivan, 1976: Deletion and Logical Form. MIT dissertation. (Published by Garland, New York, 1980.)
Schmerling, Susan F., 1976: Aspects of English sentence stress. Univ. of Texas Press, Austin.
EVEN, FOCUS, AND NORMAL STRESS
D. Robert Ladd
Abstract
The two traditional schools of thought about sentence-accent placement - the 'Nuclear Stress Rule approach' and the 'semantic highlighting approach' - do not really offer competing accounts of the same phenomenon, but emphasize different aspects of the overall problem. A good starting point for integrating these two approaches is provided by recent work by Gussenhoven. His work makes crucial use of the notion that the focus of a sentence may extend over several constituents and may be divided up into one or more accent-bearing domains, within which accent placement is structurally specified. By extending this notion beyond Gussenhoven's original use of it, we arrive at a fundamental distinction between 'information chunking' (which depends on 'given/new', semantic weight, etc.) and focus (which is a syntactic phenomenon). These represent two separate (though interrelated) functions of accent, and have distinguishable effects on where accents are located. Past descriptions emphasize one or the other of these functions, and can be reconciled with each other if the distinction between the two functions is recognized.

JOURNAL OF SEMANTICS, vol. 2, no. 2, pp. 157-170

1 'Focus' and 'normal stress' are undoubtedly two of the most ill-defined and most argued-about concepts in the literature on accent placement, and I should emphasize right here that in using the terms I am not declaring my allegiance to any particular school of thought. At the same time, however, it seems to me that focus and normal stress are also two of the most inevitable concepts in the literature on accent placement: sentences mostly say something new or make some point - that is the focus - and with certain accent locations the focus is specified only very broadly - that is normal stress. I feel that if we seriously explored the intuitions that underlie these two notions, we would have a chance of breaking out of the hardened theoretical positions on either side of the placement debate. Let us consider those positions briefly. In one corner we have the traditional 'normal stress' view that every well-formed sentence has a structurally defined location for a single primary stress. 'Contrastive stress' is governed by different principles and simply overrides the syntactically determined stress placement. It will be convenient to refer to this as the NSR view, after Chomsky and Halle's Nuclear Stress Rule, though of course the NSR represents only a formalization of a much older general approach (Newman 1946, Chomsky and Halle 1968, Bresnan 1971, 1972).
In the other corner we have what we may call the 'highlighting' view. This holds that speakers put certain words into intonational relief in order to highlight them or focus on them: accents are meaningful wherever they occur, and 'normal stress' merely results from the conjunction of expected word orders and contexts. This view is perhaps most often associated with the work of Bolinger (e.g. 1961, 1972), but it also corresponds roughly to the views of the Prague School (e.g. Danes 1967), and has recently found considerable favor among experimental phonologists and phoneticians (Brown 1983, Nooteboom, Kruyt, and Terken 1980, Nooteboom and Terken forthcoming).

Somewhere in the middle of the ring we find a number of recent investigators (e.g. Schmerling 1976, Ladd 1980, Fuchs 1980, Bing 1980, Gussenhoven forthcoming) who have worked at reconciling the two views just sketched. Most of what I say here will be put in terms suggested by Gussenhoven's work, which must surely be reckoned the most successful and comprehensive of these recent attempts. Gussenhoven acknowledges that pragmatic and contextual considerations affect a speaker's decision where to accent a sentence, but he rejects the hypothesis that accent represents a simple choice between highlighting and not highlighting any given word. Rather, he assumes that the speaker's choice is made at a more abstract level, in the assignment of a semantic feature [+ focus] to semantic constituents, and is thus part of the speaker's decision of 'what to say' rather than 'how to say it'. He presents a considerable amount of evidence that the link between focus and accent is not a one-to-one highlighting, but a 'realization', by surface features such as accent, of abstract semantic specifications. Like the link between other abstract features (e.g. features of case) and the surface elements that realize them, the connection between focus and accent is governed by a set of well-defined and to some extent structure-dependent principles. Gussenhoven's case is best illustrated by his discussion of 'minimal focus' on the 'polarity' of the sentence, where the lexical content and grammatical relations are all contextually given, and only the truth or falsehood of the proposition is at issue (as in He DOES play golf or But the house isn't ON fire). He shows that the accent placement in such cases depends on independently motivated semantic/pragmatic distinctions - such as the distinctions between 'counterassertive' and 'counterpresuppositional' (Dik et al. 1982) - and on language-specific accent placement rules (Gussenhoven forthcoming, Sec. 8 and 9). By contrast, his discussion of cases that are more like 'normal stress' is less satisfactory. His specification of the feature [± focus] is sometimes counterintuitive and suspiciously circular, and he is frequently forced to resort to ad hoc categorizations like 'event sentence' and 'adverb of proper functioning' to make his rules work out right.

My goal in this essay is to work through Gussenhoven's analysis of focus and normal stress and to try to show where I think he goes wrong. As a heuristic device I propose to try adding even to data sentences, since the presence of that word sharpens up intuitions about focus as nearly nothing else can. Now the interaction between even and accent placement is probably most often thought of in terms of 'contrastive stress'. For example, the presence of even in a sentence is sometimes said (in traditional terms) to 'require contrastive stress', as in Even a TWO-year-old could do that. Alternatively, when even is in its most neutral position, preceding the main verb, 'contrastive stress' may be used to signal the pragmatic presuppositions associated with even, as in Jennifer even SPEAKS Classical Chinese (the implication is that most people know nothing about Classical Chinese, and those that do only read it but do not speak it).

But contrastive stress is merely focus on a single word or constituent. Even works the same way when the focus encompasses several constituents, or the whole sentence; this is normal stress. If we say She even speaks Classical CHINESE, we may of course be focusing 'contrastively' on Chinese (e.g. Classical Chinese as opposed to Classical Greek), or on Classical Chinese (e.g. Classical Chinese as opposed to Lithuanian or Tamil). But we may also be focusing on the whole verb phrase, with the implication that (e.g.) she not only plays jazz shakuhachi and runs the mile in 4'45", but speaks Classical Chinese as well. The presence of even makes it clear that there are different ways to interpret the focus of this sentence, which in turn makes us aware that there is some sort of special relationship between focus and normal stress (discussed most thoroughly by Jackendoff 1972, Ch. 6). Any adequate description of accent placement must take such facts into account.

Gussenhoven's rules for normal stress can be summarized as follows:

i. The focus of a sentence is composed of one or more focus domains, each of which is accented on its syntactic head. (In order to avoid the terminological confusion between 'focus of a sentence' and 'focus domain', I will henceforth replace Gussenhoven's 'focus domain' by 'accent domain'.)

ii. For the most part, each major semantic constituent within the focus constitutes a separate accent domain and is therefore accented.
Gussenhoven identifies the major semantic constituents as Arguments, Predicates, and 'Conditions' (complements and adverbials of various sorts).

iii. However, in certain circumstances two such major constituents can form a single domain and receive a single accent: a Predicate can combine with an Argument (in which case the Argument is accented), and certain Conditions can combine with a Predicate (in which case the Predicate is accented).

The application of these principles can be illustrated with a few simple examples. In She even SPEAKS Classical Chinese, only speaks is the focus; there is a single accent domain and a single accent. The situation is similar in the 'contrastive' interpretations of She even speaks Classical CHINESE (i.e. where Chinese or Classical Chinese is the focus). In the normal stress version, where the whole verb phrase speaks Classical Chinese is the focus, we observe the combination of two semantic constituents (Predicate speaks plus Argument Classical Chinese) into a single accent domain, with the accent placed by rule on the Argument.

If the focus consists of several constituents, multiple accents are required, as in She even studied Classical CHINESE at HARVARD. In this sentence, neither at Harvard nor (studied) Classical Chinese alone is the focus, but the whole fact of having studied Classical Chinese at Harvard. The sentence would be appropriate in a discussion of other remarkable biographical facts about the subject - say, that she grew up in Brazil, or once shook hands with Helmut Schmidt. As predicted by Gussenhoven's rules, the focus is divided up into two domains (Predicate/Argument studied Classical Chinese plus Condition at Harvard) and each receives its own accent. This correctly predicts that if the sentence had only a single main accent, the focus would be narrow and the pragmatic force somewhat different. That is, She even studied Classical CHINESE at Harvard could be used in a discussion of things various people had studied at Harvard, while She even studied Classical Chinese at HARVARD would be appropriate in a context concerned with places where one might study Classical Chinese.

Even such everyday data as these illustrate the superiority of Gussenhoven's rules to the two older approaches. Sentences like She even speaks Classical CHINESE pose a problem for the highlighting approach, because they point very clearly to the existence of structure-dependent principles for accent assignment in cases where large constituents are in focus. (Insisting, as the highlighting approach is forced to do, that Chinese is somehow the most important or informative word in the verb phrase merely begs the question.) Conversely, sentences like She even studied Classical CHINESE at HARVARD pose a problem for the NSR approach, because they point very clearly to the necessity of being able to assign more than one accent in cases of normal stress. (Again, insisting that at Harvard is somehow a 'separate prosodic phrase' which therefore gets its own primary stress merely changes the terminology without answering the basic question.) Both types of sentences are covered quite naturally by Gussenhoven's rules.
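The way rules (i)-(iii) separate domain formation from accent location can be made concrete in a small sketch. The Python below is an illustrative assumption and not Gussenhoven's own formalization: the names Constituent, accent_for_domain and accents are hypothetical, the grouping of constituents into accent domains is supplied by hand (the rules as summarized here do not say how context decides it), and the accent-bearing word of each constituent is simply stipulated.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Constituent:
    role: str  # "Argument", "Predicate" or "Condition"
    text: str  # the words of the constituent
    head: str  # word assumed to carry the accent (stipulated by hand)


def accent_for_domain(domain: List[Constituent]) -> str:
    """Return the accented word of one accent domain."""
    # A domain consisting of a single constituent is accented on that constituent.
    if len(domain) == 1:
        return domain[0].head
    roles = sorted(c.role for c in domain)
    if roles == ["Argument", "Predicate"]:
        accented_role = "Argument"   # rule iii: Predicate + Argument -> Argument
    elif roles == ["Condition", "Predicate"]:
        accented_role = "Predicate"  # rule iii: Condition + Predicate -> Predicate
    else:
        raise ValueError(f"no merger rule for roles {roles}")
    return next(c.head for c in domain if c.role == accented_role)


def accents(focus: List[List[Constituent]]) -> List[str]:
    """One accent per accent domain within the focus (rule i)."""
    return [accent_for_domain(domain) for domain in focus]


studied = Constituent("Predicate", "studied", "studied")
chinese = Constituent("Argument", "Classical Chinese", "Chinese")
harvard = Constituent("Condition", "at Harvard", "Harvard")
speaks = Constituent("Predicate", "speaks", "speaks")

# She even studied Classical CHINESE at HARVARD: two domains, two accents.
print(accents([[studied, chinese], [harvard]]))  # ['Chinese', 'Harvard']
# She even SPEAKS Classical Chinese: the focus is the Predicate alone.
print(accents([[speaks]]))  # ['speaks']
```

Changing only the hand-supplied grouping changes the predicted accents: leaving at Harvard out of the focus, as in accents([[studied, chinese]]), yields the single accent on Chinese. The structural part of the sketch stays fixed while the contextual decision about what is in focus, and how it is grouped, is made elsewhere.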
Furthermore, Gussenhoven is able to incorporate many of Bolinger's observations about contextual and lexical influences on accent placement by allowing the constraints on domain formation to be relaxed under certain circumstances. This is the case, for example, with infinitive complements of the sort seen in I even have all these OFFPRINTS to file or He even wanted some Persian CARPETS to EXPERIMENT with (cf. Gussenhoven sec. 7; Bolinger 1972). Bolinger (and the highlighting approach in general) argues that file is unaccented because it is relatively 'predictable' or adds little independent information to the sentence. Gussenhoven accepts the argument that such considerations can affect accent placement, but formally he analyzes the lack of accent on file by saying that under appropriate contextual/pragmatic circumstances the infinitive complement and the Argument can form a single domain. Gussenhoven's explanation of this phenomenon is thus comparable to Bolinger's, but his formal statement of the phenomenon itself is different: Bolinger is talking about words having or not having an accent, whereas Gussenhoven is talking about the formation of accent domains within the focus. Unlike Bolinger's analysis, in other words, Gussenhoven's explanation does not require us to say that file is somehow 'predictable' from offprints, but only that filing offprints (unlike experimenting on Persian carpets) is a familiar enough activity that the two elements can be combined in a single accent-bearing information chunk. Once that combination has occurred, the fact that the noun is accented rather than the infinitive is a purely structure-dependent consequence of combining them.

This may seem like a trivial distinction. I would argue that it is not trivial but subtle. Its importance can be appreciated more fully if we consider the accent placement in Your COAT'S on fire! and There's a SPARK on your coat! These examples force the highlighting approach into the unenviable position of claiming that coat is somehow more contextually salient than fire but that spark is more contextually salient than coat. What Gussenhoven's approach permits us to say is the following: the two sentences each form a single information unit, with a single accent. The fact that they can be treated as single information units is due to a variety of poorly understood factors of contextual salience, informativeness, etc., of the sort that the highlighting approach has always emphasized. But given that fact, the location of the accent is structurally specified. By assuming that accents apply to accent domains - information chunks - rather than to individual constituents considered independently of one another, Gussenhoven's approach avoids the empirical embarrassment of having to make implausible claims about the relative contextual salience of fire and coat and spark.
Returning to the examples with the infinitive complements, it can be seen that Gussenhoven's analysis has a further advantage over the highlighting approach, namely that it correctly describes the possible focus interpretations in these cases. First, it correctly predicts that I even have all these OFFPRINTS to file is ambiguous between a reading where the focus includes to file (the 'normal stress' reading) and one that focuses only on offprints (e.g. 'not only do I have to file a year's worth of correspondence and three unsuccessful grant proposals, but all these offprints as well'). Second, Gussenhoven's analysis correctly predicts the existence of a difference between He even wanted some Persian CARPETS to EXPERIMENT with and He even wanted some Persian carpets to EXPERIMENT with. The former ('normal stress') puts wanted some Persian carpets to experiment with on a par with things like spent $3000 on old comic books or threw out all his dirty clothes and bought new ones instead of doing laundry; the latter contrasts to experiment with and things like as an investment or to put in his basement. In both these cases, the highlighting analysis treats only the connection between deaccenting and 'predictability' or 'lack of informativeness'. Gussenhoven's analysis, in comparison, brings out the fact that these contextual factors also affect focus interpretation, and treats the deaccenting not as a direct consequence of those factors but as a consequence of the division of the focus into accent domains.

With the simple provision that the focus of the sentence is divided up systematically into accent-bearing chunks, it seems to me that Gussenhoven has transformed the empirical question of accent placement in a potentially very productive way. The older approaches ask: "What determines where accents go?" For Gussenhoven, the location of the accents is a relatively low-level consequence of domain formation; his question is rather: "What determines the division into accent domains?" This changed formulation brings out what is sensible and what is misdirected in both the earlier approaches. The observations of writers like Bolinger can be interpreted as being concerned primarily with the conditions under which two semantic constituents may combine into a single accent domain, while the main emphasis of traditional normal stress rules is the structural specification of accent placement once domains with more than one major constituent are formed. Gussenhoven's model thus embodies an implicit explanation for the inadequacies of both the NSR and the highlighting approaches, namely that they try to describe two distinct phenomena with a single set of rules or principles. Unfortunately, though, Gussenhoven fails to make the most of his own insight. There are several points in his analysis where, in my view, he describes things in terms of well-defined structural distinctions that should actually be treated as involving the more probabilistic constraints on domain formation. The remainder of the paper is devoted to backing up this criticism.
The data that give Gussenhoven the most trouble are cases in which major syntactic constituents are unaccented. We have already discussed two such cases: She even studied Classical CHINESE at Harvard, and I even have all these OFFPRINTS to file. In the first of these, which would be usable only if Harvard were specifically under discussion, Gussenhoven would analyze at Harvard as [-focus]. In the second, which is usable even if filing things is not explicitly a topic of conversation, Gussenhoven would treat to file as [+focus] but allow it to form part of a larger domain with offprints because of pragmatic/contextual considerations. There is still a third comparable case, exemplified by He even left the DOOR open. The intended focus interpretation is the normal stress reading (i.e. 'not only did he spill juice on the tablecloth and forget to make his bed, but he left the door open as well'). Here Gussenhoven argues that open forms part of the underlying semantic constituent Predicate (i.e. leave ... open, parallel to throw ... away or take ... out), and therefore by definition cannot constitute a separate domain; by rule the Argument door is accented when it combines with the Predicate.

Taken separately, Gussenhoven's explanations for these three cases are not unreasonable, and there is justification for all three within the internal workings of his overall system. At the same time, however, all three seem to exemplify what the highlighting approach might call 'deaccenting for predictability' - that is, in all three cases we are dealing with contextual influences on accent placement. Gussenhoven acknowledges these factors in the analysis of offprints to file, but in the other two cases he talks in terms of structural constraints on domain formation and accentability. That is, at Harvard is specified as [-focus] and is therefore structurally unable to bear an accent; open is treated as part of the 'semantic constituent' Predicate and is therefore structurally incapable of forming a separate domain. Yet considering that the structural definitions that Gussenhoven provides for [+focus] and 'semantic constituent' are vague at best, it seems unwise for his rules to depend so critically on these presumed categories and distinctions. The most consistent and unified treatment of these three cases within Gussenhoven's general framework, it seems to me, is to generalize the explanation applied to offprints to file. Specifically, we might say that the unaccented constituent at issue (at Harvard, to file, open) is unaccented because it is combined with another constituent in a larger accent domain that is accented somewhere else. This explanation does not depend crucially on the exact specification of [-focus] elements or on the exact identity of semantic constituents; it is enough to say that 'contextual givenness' (as in at Harvard) is one of the influences on domain formation, and that syntactic constituents that are semantically or syntactically closely bound to each other (such as leave and open) tend to combine in single accent domains.
Gussenhoven's approach, in other words, can be developed into the strong hypothesis that accent depends on two distinct types of factors with distinguishable effects. An adequate description of accent placement should thus consist of two clearly delimited parts: (1) a part that describes the probabilistic, context-influenced constraints on the formation of accent domains or information chunks, and (2) a part that describes the structural principles governing the location of accent once accent domains have been formed. The first part is the proper repository of all the observations of the highlighting approach (and in particular of the recent experimental research by Brown and Nooteboom et al.), while the second will accommodate e.g. Jackendoff's work on focus and normal stress, and things like Gussenhoven's rule that a domain consisting of an Argument and a Predicate is accented on the Argument. However, strengthening Gussenhoven's model in this way will call for the modification of at least two of his basic assumptions: (1) the relationship between focus and 'givenness' must be reconsidered, and (2) accent domains must be assumed to have hierarchical structure. The next section of the paper is devoted to a brief discussion of these two topics.
4.1 Gussenhoven assumes that semantic elements that are contextually given - like Harvard in the example above - are by definition [-focus] and hence ineligible for accent. On the face of it this is not an unreasonable assumption. It is more or less consistent with the notion of deaccenting for predictability, and fits well with a good deal of well-known 'deaccenting' data (cf. Ladd 1980 Ch. 3). Thus:
A: What about FRED?
B: I don't LIKE Fred.
However, there are two problems with the simple assumption that Fred, being contextually given, is automatically [-focus]. The first is that givenness is not an all-or-none attribute like [+focus]. Prince 1981, for example, has proposed a taxonomy of discourse 'statuses' that shows very clearly that givenness is a matter of degree. The experimental work of Brown and Nooteboom et al. has shown that there are certain reasonably well-defined circumstances in which accents are placed on words that are clearly contextually given. If [-focus] is to be defined in terms of givenness, much detail remains to be filled in. The second problem with equating givenness and [-focus] is that it puts serious strains on any intuitively useful definition of focus (by which I mean any definition that accounts for data involving even). Specifically, it makes it difficult to account for cases of what I have termed 'default accent' (Ladd 1980), which arises when constituents are (so it appears) given, but also [+focus]. My original example was the following:
A: Has John read Slaughterhouse-FIVE?
B: No, John doesn't READ books.
I pointed out that the accent on read is not 'contrastive' - i.e. it does not focus narrowly on read as opposed to write or burn or sell - but falls where it does by default, to signal the contextual givenness of books. With the help of even, it can be seen that the difference between 'contrastive stress' and 'default accent' is primarily a question of how much of the sentence is in focus. Consider the following two constructed dialogues:
(Context: Two writers talking about a third one)
A: I hear John's been having a really unproductive spell since his last novel.
B: Yeah, it's really bad - he doesn't even READ books anymore.
(Context: An enthusiastic young student and a jaded older one, talking about a charismatic professor)
A: Prof. Smith is so incredibly knowledgeable and literate - he gave an incredible analysis of Ulysses in class today.
B: Are you kidding? He doesn't even READ books anymore.
In the first of these, the contrast between reading and writing books is explicit in the context, and it may reasonably be said that the focus associated with even is read. Books, which is obviously in the context because of the speaker's profession and the reference to John's last novel, is thus [-focus]. In the second dialogue, the contrast implied by the even is not between reading books and writing them, but between reading books and the various other activities that scholars are expected to engage in - writing papers, giving lectures, etc. The focus is thus read books. Books is again in the context because of A's mention of Ulysses and of how literate Prof. Smith is, but it cannot be said that it is [-focus] without affecting the analysis of the pragmatic force of even.
In my opinion, these examples show that elements can be part of the focus even when they are given. The focus of a sentence - operationally defined as that portion of the sentence that might be associated with words like even - probably always contains elements that are new or explicitly contrastive, but it may contain other material as well. There is no reason to equate focus with newness or informativeness: focus is related to syntax in a fairly well-defined way, while newness or informativeness is a question of poorly understood features of discourse organization. The key to understanding sentence accent is to recognize that signalling focus, and dividing up new and old information, are two distinct functions.
Research such as Brown's or Prince's or Nooteboom et al.'s thus says little about focus, but rather provides information about the treatment of discourse entities in the formation of accent domains. So interpreted, this research need not - and should not - be viewed as somehow discrediting the notion of grammatical effects on accent placement. Grammatical specifications of [+focus] are among the structural determinants of accent placement within domains; 'newness' as such plays a role only in constraining domain formation.
4.2 Let me return to the examples just discussed above. In He doesn't even READ books anymore, I suggested that one possible reading of this has books as part of the focus. This contradicts Gussenhoven's claim that books is [-focus], and in so doing eliminates the basis for his explanation of the accent placement. In my view, cases such as these argue for a more elaborate structure to accent domains and a somewhat richer set of rules governing accent placement within domains. Specifically, I would argue that books is unaccented in this case because it combines into an accent domain with read. This, of course, raises a problem for Gussenhoven's system, because if Predicate and Argument are combined in a single domain, the accent is supposed to fall on the Argument - which is precisely what does not happen here. The solution to this problem is to be found in the recent theoretical work on metrical structure (Liberman and Prince 1977, Selkirk 1980, etc.). Domain formation is not (as Gussenhoven would have it) linear concatenation, but the joining of two sister constituents in a hierarchical structure in such a way that one of the two must be relatively stronger than the other. Domain formation necessarily involves the subordination of one element to the other; the specification of which is subordinated depends on a set of structural features that must include not only 'Predicate' and 'Argument', but also discourse statuses such as 'given'. In the example just discussed, the unmarked case (and the one covered by Gussenhoven's rules) would be
      /\
     w  s
   read books
(s=strong, w=weak)
The marked case arises when books is contextually given:
      /\
     s  w
   read books
The similarity of this proposal to my discussion of deaccenting and default accent will be apparent, as will the need for a more careful study of the whole topic. It seems to me, however, that conceiving of domain formation in this way should eventually enable us to take the notion of 'relative strength' implied by the Liberman-Prince-Selkirk metrical theory and give it an explicit interpretation in terms of the pragmatic force of accent patterns.
By way of exemplifying some of the foregoing suggestions, I wish to consider briefly the question of accent placement in sentences with a non-pronominal subject and an intransitive predicate, like Your coat's on fire or Truman died or Jesus wept. These have been the subject of a considerable amount of discussion in the last ten years, most of it revolving around the question of why 'normal stress' in some cases appears to be on the subject and in other cases on the predicate (Bresnan 1972, Schmerling 1976, Allerton and Cruttenden 1979, Ladd 1980, Fuchs 1980, Bardovi-Harlig ms.). Adding even shows that when such sentences are accented only on the subject, the focus interpretation is ambiguous between focus on the subject only and focus on the whole sentence. For example, His REFRIGERATOR even stopped could focus on refrigerator (e.g. 'he blew so many fuses in the house that not only the clocks and the record player stopped, but even the refrigerator, which was on a separate circuit, stopped as well'); or it could focus on the whole event (e.g. in a list of misfortunes that were awaiting Bill when he returned from his vacation - his plants had died, his basement had flooded, and his refrigerator had stopped as well). To the extent that 'normal stress' refers to the location of the accent when two focused constituents are combined in a single domain, then it is clear that normal stress in these cases is on the subject. But this means only that if subject and predicate are treated as a single domain, then the predicate will be subordinated to the subject. The existence of many contrary cases, which has given the NSR approach such trouble, simply proves the difficulty of stating precisely the constraints on the formation of accent domains.
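The strong/weak labelling of sister constituents proposed in 4.2 can be made concrete with a small sketch. This is only an illustration under simplifying assumptions, not a statement of Gussenhoven's rules or of metrical theory: the names Constituent and form_domain are hypothetical, givenness is reduced to a yes/no flag although it is really a matter of degree, and the sketch says nothing about when two constituents fail to combine into a single domain (as in TRUMAN DIED).

```python
from dataclasses import dataclass


@dataclass
class Constituent:
    text: str
    role: str            # "Predicate" or "Argument"
    given: bool = False  # assumption: givenness simplified to a boolean


def form_domain(left, right):
    """Join two sisters into one domain; label each 's' (strong) or 'w' (weak)."""
    if left.given != right.given:
        # Marked case: the contextually given sister is subordinated.
        strong = right if left.given else left
    else:
        # Unmarked case: the Argument is strong.
        arguments = [c for c in (left, right) if c.role == "Argument"]
        # Assumption: with no Argument present, default to the right-hand sister.
        strong = arguments[0] if arguments else right
    return [(c.text, "s" if c is strong else "w") for c in (left, right)]


read = Constituent("read", "Predicate")
print(form_domain(read, Constituent("books", "Argument")))
print(form_domain(read, Constituent("books", "Argument", given=True)))
```

The first call corresponds to the unmarked tree (read weak, books strong), the second to the marked tree in which books is contextually given.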
Contrary cases include cases where the predicate is long or semantically complex or unexpected, such as The TRAFFIC light just turned purple or The RENT'S EXCESSIVE; cases where the predicate gives a defining characteristic of the subject, such as My BROTHER is a GEOLOGIST; and cases where the subject is 'topicalized', i.e. somehow contextually inferrable. The best illustration of this last type was the distinction Schmerling reported between JOHNSON died and TRUMAN DIED (1976: 41ff.). The reports of the deaths of two former U.S. presidents, as she heard them, differed contextually in that Johnson died relatively unexpectedly, whereas Truman's terminal illness had been in the news for several days and hence was in some sense more in the context. Within the modification of Gussenhoven's approach that I have been developing here, all of these would be treated as cases of contextual influences on domain formation: the predicate is prevented (by its greater unexpectedness, etc.) from being subordinated into a single domain with the subject.
Once again, the distinction between this explanation and the simple highlighting approach may be difficult to see, and it is worth discussing a couple of examples in a little more detail to make the difference clear. Consider first Schmerling's examples TRUMAN DIED and JOHNSON died. As Schmerling says (1976: 42):
Bolinger's theory would appear to suggest [...] that the mention of Truman in the relevant context should have suggested "death" and, therefore, that died [...] should not be stressed. On the other hand, the mention of Johnson in the relevant context should not have suggested "death" any more than anything else one might have wanted to say about him, and therefore died [...] should be stressed. Bolinger's theory would thus appear to predict stress contours opposite to the ones which actually occurred.
The explanation I would suggest locates the predictability on Truman, not on anything to do with dying; this seems to be consistent with the fact that Prince's discussion of given/new statuses refers to entities in the discourse. Because he had been in the news, Truman was more 'inferrable' or less 'unused' (to use Prince's terms) than might otherwise have been the case. Consequently, died was relatively stronger in the sentence and less suitable for subordination in a single domain with Truman. In effect, both constituents are accented because one of them is less informative than normal. Obviously, nothing in a simple highlighting approach would lead us to expect this result. While this interpretation may seem far-fetched, it appears to be supported by the following attested example: I SAW Ron HARRIS today. Ron Harris was a former fellow student of the speaker and addressee, who had returned to campus to defend his thesis. The conversation had been about him for a while, but the topic had changed when the sentence was uttered. Ron Harris was thus not a completely new discourse entity, but was not immediately in the context, either (in Prince's terms, Ron Harris was probably 'inferrable', rather than either 'unused' or 'evoked'). Had the sentence been the first mention of Ron Harris in the conversation (Prince's 'unused'), it would have been accented I saw Ron HARRIS today, i.e. with the predicate subordinated to the object. Had Ron Harris been the immediate topic of conversation (Prince's 'evoked'), it would have come out I SAW Ron Harris today. In the actual context, because the informativeness of Ron Harris was as it were weakened, neither predicate nor object could appropriately be subordinated to one another, and the speaker (as in TRUMAN DIED) put accents on both constituents in order to emphasize that one of them was contextually inferrable.
6
I am aware of having raised more questions here than I have answered. In a sense, that was my goal - to propose that the endlessly repeated questions of accent placement be asked in a new way. Jackendoff-style observations about normal stress and focus, and Bolinger-style observations about highlighting and relative information value, have been made often enough, and the theoretical weaknesses of both points of view have been thoroughly exposed. But the debate continues inconclusively because neither side has been asking the right questions. To escape from the stalemate, we must attempt to integrate recent insights into discourse structure and recent advances in prosodic theory with syntactic data about accent and focus. It seems to me that Gussenhoven's notion of domain formation, modified along the lines hinted at here, provides the needed new perspective.2
Notes
1 I am grateful to Carlos Gussenhoven for much discussion and correspondence on the issues dealt with here. However, I must emphasize that even my presentation of his rules is not in his original form, but is my own interpretation. He is in no way responsible for any distortions of his original intent that may be contained here.
2 After this paper was completed, I discovered that comparable phenomena exist in Danish and have been discussed in somewhat similar terms by Jørgen Rischel in "On Unit Accentuation in Danish - and the Distinction between Deep and Surface Phonology", to appear in Folia Linguistica, vol. XVII (1983).
References
Allerton, D.J. and Cruttenden, A., 1979: Three Reasons for accenting a definite subject. Journal of Linguistics 15; 49-53.
Bardovi-Harlig, K., (ms.): A Comment on Comment and Bresnan's Topical Stress. Unpublished paper, Univ. of Chicago.
Bing, J.M., 1980: The given/new distinction and the unmarked stress pattern. North-East Linguistic Society XI; 13-21.
Bolinger, D.L., 1961: Contrastive Accent and Contrastive Stress. Language 37; 83-96.
Bolinger, D.L., 1972: Accent is predictable - if you're a mind-reader. Language 48; 633-644.
Bresnan, J., 1971: Sentence stress and syntactic transformations. Language 47; 257-281.
Bresnan, J., 1972: Stress and Syntax: a reply. Language 48; 326-342.
Brown, G., 1983: Prosodic Structure and the Given/New Distinction. In: A. Cutler and D.R. Ladd (eds.), Prosody: Models and Measurements. Springer, Heidelberg. Pp. 67-77.
Chomsky, N. and Halle, M., 1968: The Sound pattern of English. Harper and Row, New York.
Daneš, F., 1967: Order of elements and sentence intonation. In: To Honour Roman Jakobson (Mouton, The Hague), pp. 499-512. Reprinted in: Bolinger (ed.), Intonation. Penguin Books. Pp. 216-232 (1972).
Dik, S. et al., 1980: On the typology of focus phenomena. GLOT 3; 41-74.
Fuchs, A., 1980: Accented subjects in 'all-new' sentences. In: Wege zur Universalien-Forschung (Festschrift für Hans-Jakob Seiler). Narr, Tübingen. Pp. 449-461.
Gussenhoven, C., (forthcoming): Focus, mode, and the nucleus. To appear in: Journal of Linguistics.
Jackendoff, R.S., 1972: Semantic Interpretation in Generative Grammar. MIT Press, Cambridge.
Ladd, D.R., 1980: The Structure of intonational meaning: evidence from English. Indiana University Press, Bloomington.
Liberman, M.Y. and Prince, A., 1977: On stress and linguistic rhythm. Linguistic Inquiry 8; 249-336.
Newman, S.S., 1946: On the stress system of English. Word 2; 171-187.
Nooteboom, S.G., Kruyt, T. and Terken, J., 1980: What speakers and listeners do with pitch accents: some explorations. In: T. Fretheim (ed.), Nordic Prosody II. Tapir, Trondheim.
Nooteboom, S. and Terken, J., (forthcoming): What makes people omit pitch accents: an experiment. To appear in: Phonetica.
Prince, E.F., 1981: Toward a taxonomy of the given/new distinction. In: P. Cole (ed.), Radical Pragmatics. Academic Press, London.
Schmerling, S.F., 1976: Aspects of English sentence stress. University of Texas Press, Austin.
Selkirk, E.O., 1980: On the role of prosodic categories in English word stress. Linguistic Inquiry 11; 563-605.
INTERPRETING INTONATION: A MODULAR APPROACH
Daniel Hirst
Abstract
Intonation provides an apparent counter-example to the claim made by proponents of the Extended Standard Theory of generative grammar that there is no direct interaction between phonology and semantics. In the light of recent work on phonological representations, a phonological analysis of intonation is proposed which breaks down an intonation contour into two component parts: the phonological structure and the underlying tone sequence. It is suggested that while the phonological structure is partly determined by the syntactic structure, the tonal sequence is assigned freely in the phonology. A further possible contribution to the intonation is the existence of tonal morphemes in the form of floating tones. It is argued that while the description of English intonation is simplified if we assume that there is a tonal emphatic morpheme, a similar analysis for "interrogative" intonation cannot be correct. It is suggested finally that the claim that phonology and semantics do not directly interact, rather than being disproved by the facts of intonation, provides an essential clue to the composite syntactic/semantic/pragmatic nature of intonation meaning.
JOURNAL OF SEMANTICS, vol. 2, no. 2, pp. 171-181
1. Approaches to intonation meaning
Studies of intonation, particularly those concerned with the way in which intonation contributes to the meaning of an utterance, are traditionally classified into a number of categories, i.e. syntactic, attitudinal, pragmatic, information structure approaches etc. It is an interesting fact in itself that despite the reductionist nature of much of this work, despite the fact that it is carried out on the basis of often radically different assumptions as to the nature of the inquiry, each of these approaches to the problem constitutes a legitimate way of dealing with (at least some of) the facts and is capable of revealing genuine insights. A somewhat more pessimistic corollary would be that no one of these approaches is capable, by itself, of providing the whole picture. This seems a not unreasonable conclusion: one is reminded of the Buddhist story of the blind men who described an elephant as variously like a tree-trunk, a snake, a wall, a huge leaf or a spear, depending on which part of the elephant they had touched. Given the complex nature of the way in which intonation contributes to meaning, it becomes imperative to discover principles by which we can break down this complexity into simpler and more readily understandable aspects. Such an approach, as has often been emphasised, will necessarily be highly modular, the surface complexity of linguistic observations being determined by the interaction of a number of autonomous sub-components.
Within the Revised Extended Standard Theory of Generative Grammar, as presented for example in Chomsky (1981), the components of the grammar PHONOLOGY, SYNTAX and SEMANTICS are specifically assumed to be organised as in (1):
(1)
              SYNTAX
             /      \
      PHONOLOGY    SEMANTICS
in such a way that there is no direct interaction between the phonological component and the semantic component. Any contribution of the phonology to the overall "meaning" of the sentence would consequently, on this analysis, be assumed to be a function of the way in which the sentence is used, rather than part of the linguistic content of the sentence itself (i.e. pragmatic meaning rather than semantic meaning). This is, naturally, intended to be an empirical claim. The case of intonation and its contribution to meaning, however, would seem at first sight to be a counter-example to this claim, since surely this is one case where phonological items receive a direct semantic interpretation. This point of view has been expressed recently by Ronat (1982) for example. In fact, however, if we look more closely at the implications for the analysis of intonation of (1), I shall claim that far from being disproved by the facts of intonation, (1) provides a fundamental insight into the different ways in which intonation contributes to meaning. Forgetting (1) for the moment, the essential problem is to account for the fact that a given "sentence" (taken in the sense of a sequence of words with an associated syntactic structure) can be pronounced in several different ways, each of which conveys some aspect of the meaning of the sentence as it is used in discourse. If we take the propositional content of the sentence as given, we then need to specify how this combines with the meaning of the intonation. To do this, however, we need to be able to specify the nature of the phonological primitives involved.
2. The phonology of intonation
The phonological model I shall assume here incorporates a number of proposals which have been made in the recent literature concerning phonological structures (for more details cf Hirst 1983). The essential characteristics of this model are as follows:
a. the terminal elements of a phonological representation are segments of two distinct types: T-segments defined by the features [HI] and [LO], and P-segments defined by the features [SYLL], [CONS], [COR], [LAB] etc.
b. T-segments and P-segments are linked by means of category symbols such as Syllable, Foot, Phrase etc. constituting a phonological representation.
Within such a framework, the elements which contribute to the intonation meaning of a sentence are on the one hand the way in which the sentence is mapped onto the phonological structure, and on the other hand the T-segments which are assigned to this structure. Note that while this analysis corresponds in some ways to traditional notions of 'stressing' and 'intonation' respectively, this does not in any way imply that the T-segments do not themselves contribute to the identification of the phonological structure of the sentence. Indeed the evidence seems to indicate that the T-segments provide the primary evidence for the listener in identifying the phonological structure (cf Faure, Hirst & Chafcouloff 1980). I shall have little to say here about the mapping from syntactic structure to phonological structure (for some interesting recent proposals see Selkirk 1978, 1981, forthcoming, Nespor & Vogel 1982, Culicover & Rochemont 1983) but it will be obvious that although the two structures are not isomorphic, structural differences in syntax will often represent potential differences in the phonological structure.
In this case, since the syntactic structure is available for semantic interpretation, there is no violation of (1) above. More crucial for this hypothesis will be cases where a difference in T-segments corresponds to a meaning difference. The T-segments could be introduced into the phonological representation in at least two conceivable ways. First of all they could be introduced from the lexicon: this is presumably the case in lexical tone languages where a lexical entry can be taken as consisting of a sequence of P-segments together with a sequence of T-segments. The other possibility would be for the T-segments to be introduced freely in the phonology. I have suggested (Hirst 1983) that a core system for (British) English intonation can be defined by assigning tones in conformity with a template:
(2)
          F
   [+HI]     [-HI]
   [-LO]     [+LO]
defining the tones on the stress-foot F, and with a separate template: (3)
          P
   [-HI]     [αHI]
   [+LO]     [-αLO]
specifying the tones assigned directly to the phonological phrase P. Since English is not a lexical tone language, the T-segments cannot be part of the lexical entry for English words, or, looked at another way, the sequence of T-segments in a lexical entry for an English word must be null. There is, however, another possibility, namely, that the T-segments constitute by themselves a lexical entry in which, this time, it is the sequence of P-segments which is null. Such a situation is in fact attested in a number of African tone languages. Welmers (1959) cites the case of Jukun (Takun dialect, Eastern Nigeria) where "the replacement of any tone by high is a morpheme signalling the 'hortative' construction" (p. 8). In Babete (West Cameroun), Hyman & Tadadjeu (1976) describe an "associative marker" which is realised as a "tone-raising on the prefix of the second noun". A well documented example of such "floating tones" is to be found in Bambara (dialect of Bamako, Mali) where a floating low tone is the only phonetic manifestation of the definite determiner (cf Bird 1966, Bird, Hutchinson & Kante 1977).
3. Tonal morphemes in English
Although the English lexicon does not allow lexical entries containing both T-segments and P-segments, there seems to be no a priori reason why we should exclude the possibility that some morphemes might consist of such floating tones. Such an analysis has in fact been proposed for English by Leben (1976) to account for terminal rise in so-called "comma intonation" sentences although it is not quite clear whether Leben actually intends the "comma-intonation marker" to be represented as a constituent of the surface syntactic structure, a solution which seems to have no other syntactic justification than to generate the required intonation patterns. In the rest of this paper, I examine two other candidates which might be analysed as morphemes realised as floating tones, and the implications of this analysis for semantic interpretation.
3.1 Interrogative intonation
Template (3) accounts for the basic distinction between falling and rising intonation patterns as a free choice in the phonology, implying that this distinction is not open to semantic interpretation. An alternative analysis would be to assume that the final rise commonly associated with questions in English as in a great number of other languages is in fact the phonological manifestation of a question-morpheme. Here, unlike Leben's "comma-intonation marker" mentioned above, it seems fairly uncontroversial to assume that questions are syntactically marked by the presence of an interrogative morpheme. In a number of languages this morpheme is phonetically realised (Polish "Czy", Japanese "ka", Welsh "A", Bambara "wa" etc.). One particularly interesting case is that of the Cameroun language Basaa (Moreton & Bot Banjock 1975) where the interrogative morpheme is generally realised as the low-toned particle /€/ as in
(4)
a. Me fjkon "I am ill"
b. Me fjkonfe "Am I ill?"
What is particularly interesting in the Basaa example is that when the preceding word finishes in an open syllable, the vowel of the particle / / assimilates to that of the vowel it follows so that we find: (5)
a. Me TjtT "I gave"
b. Me ijtif "Did I give?"
so that in fact Basaa turns out to be a case of a language where a large number of sentences are distinguished by a final high tone for the declarative and a final fall for the interrogative, a situation which must be extremely rare among the world's languages which practically always seem to prefer the opposite solution (cf Bolinger 1978, Cruttenden 1981). The interrogative particle in Basaa is clearly very close to becoming a pure floating tone like the Bambara definite marker or the Babete associative marker. A similar analysis for the rising intonation associated with questions in English would be attractive but this solution cannot be the right one for the simple reason that questions identified only by rising pitch in English can be shown not to be syntactic questions at all. Consider the following examples:
(6)
a. Did he buy something?
b. Did he buy anything?
(7)
a. He bought something.
b. *He bought anything.
where (7b) is unacceptable. The crucial fact is that even when the sentence is pronounced with a rising "question" intonation, the sentence is still unacceptable: (8)
a. He bought something?
b. *He bought anything?
This seems to show conclusively that the sentences in (8), despite the rising intonation (or the question-mark in writing), are not in fact syntactic questions and hence cannot be analysed as containing a question-morpheme. The alternative analysis, then, is for the final high tone to be assigned freely in the phonology as I suggested above by means of template (3). The consequence of this analysis for the semantics of rising intonation is that if the modular approach outlined in (1) is correct, a sentence with a final high pitch cannot be semantically different from a sentence with a final low pitch. This, however, is precisely what we should be led to expect from examples (6) to (8). Note furthermore that while it is apparently the case that in a great number of languages a final high pitch is used to indicate that the speaker expects an answer, there is to my knowledge no well-documented case of a language where a question without a final high pitch (or conversely a statement with a final high pitch) would be considered syntactically ill-formed. Even a language like Italian which has no overt marker for yes-no questions does not restrict falling intonation to statements or rising intonation to questions.
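The free choice embodied in the phrase template can be illustrated with a short sketch. It assumes that template (3) is to be read as a low tone [-HI, +LO] followed by a tone [αHI, -αLO], where α is either + or -; the function names phrase_template and tone_label, and the shorthand H and L for high and low tones, are introduced here only for illustration.

```python
def phrase_template(alpha):
    """Expand the phrase template for one value of alpha ('+' or '-')."""
    if alpha not in ("+", "-"):
        raise ValueError("alpha must be '+' or '-'")
    minus_alpha = "-" if alpha == "+" else "+"
    first = {"HI": "-", "LO": "+"}            # fixed first tone: [-HI, +LO]
    second = {"HI": alpha, "LO": minus_alpha}  # variable tone: [aHI, -aLO]
    return [first, second]


def tone_label(tone):
    """Shorthand: [+HI, -LO] is written H and [-HI, +LO] is written L."""
    return {("+", "-"): "H", ("-", "+"): "L"}[(tone["HI"], tone["LO"])]


for alpha in ("+", "-"):
    print(alpha, [tone_label(t) for t in phrase_template(alpha)])
```

On this reading the only difference between the two outputs is the final tone, high or low, which is the choice that the text treats as free in the phonology and hence as closed to semantic interpretation.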
3.2 'Emphatic' intonation
In an earlier analysis of English intonation (Hirst 1977) I suggested that one of the elements which constitute the intonation of a sentence is an abstract feature which I referred to as "emphasis" and which can be interpreted in one of two ways, either as an adverbial equivalent to "extremely", "to a great extent" (emphasis for intensity) or as implying a contrast with another sentence (emphasis for contrast). Assuming that there is an emphatic marker E present in syntactic structure, it will have an effect on the phonological and semantic interpretation of this structure. The phonology can be accounted for by:
(9)
Emphasis interpretation (phonology): In the mapping from syntax to phonological structure, after a marker E, group together all the following feet into a single foot.
Thus, for example, a structure like
(10a) [P It was a E [F lovely] [F meal]]
or
(10b) [P We E are [F having a] [F nice] [F time]]
will be converted into something like:
(11a) [P It was a E [F [F lovely] [F meal]]]
and
(11b) [P We E [F are [F having a] [F nice] [F time]]]
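The regrouping stated in (9) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list encoding (words as strings, feet as nested lists, the marker as the string "E") and the function name `apply_emphasis` are hypothetical and not part of the analysis, and the sketch groups all material after E, including an unfooted word such as "are" in (10b).

```python
E = "E"  # emphatic marker; this string encoding is an assumption of the sketch


def apply_emphasis(phrase):
    """Apply rule (9): wrap everything after the first E in one outer foot.

    `phrase` is a list whose items are unfooted words (str), the marker E,
    or feet (lists of words or of further feet).
    """
    if E not in phrase:
        return list(phrase)  # non-emphatic structures are left unchanged
    i = phrase.index(E)
    following = list(phrase[i + 1:])
    if not following:
        return list(phrase)  # nothing after E to regroup
    return list(phrase[: i + 1]) + [following]


ex_10a = ["It", "was", "a", E, ["lovely"], ["meal"]]
ex_10b = ["We", E, "are", ["having", "a"], ["nice"], ["time"]]

print(apply_emphasis(ex_10a))  # nested structure corresponding to (11a)
print(apply_emphasis(ex_10b))  # nested structure corresponding to (11b)
```

The outer list plays the role of the phrase P, and the single new inner list is the outer foot to which the tonal template would then apply.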
We can now assume that the phonology will assign T-segments in conformity with the template (2) to the outer foot only, thus accounting for the fact that 'primary stress' or 'intonation centre' occurs on "lovely" and "are" respectively in these sentences, rather than on "meal" and "time" as would be expected in the non-emphatic version of these sentences. This cannot be the whole story, however, since the emphasised word can be the last word of the sentence, as in:
(12a) The coffee's E awful.
Applying (9) to (12a) will result in
(12b) [P The [F coffee's] E [F [F awful]]]
which, once the T-segments have been assigned, will not be tonally distinct from the non-emphatic version of (12):
(13) [P The [F coffee's] [F awful]]
It does, however, appear to be the case that (13) is tonally distinct from (12), since (13) is generally described as having a low falling pitch movement on the final word, which is quite different from the high falling pitch movement observed on (12). This difference is even more clearly marked in the case of a final rising intonation, so that contrastive emphasis on "yours", for example, in
(14) Is it yours?
results in a falling-rising intonation which is clearly different from the simple rising pattern observed on the non-emphatic version.
Suppose now that the emphatic marker E is not simply an abstract marker but corresponds to a tonal morpheme consisting of a floating high tone (i.e. [+HI -LO]). Suppose furthermore that when two high tones follow each other the second is interpreted as being higher than the first. Nothing more need be added to extend the core intonation system defined by (2) and (3) to include emphatic intonation patterns, since the tonal sequence on the final foot will now be as follows:
(15)      non-emphatic    emphatic
     a.   H L L           H H L L
     b.   H L H           H H L H
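The two suppositions behind (15) can be illustrated with a short Python sketch. The numeric heights are purely an illustrative assumption (L = 0, H = 1, an H directly after another H one step higher); the function names are hypothetical, and nothing here stands in for the production model cited in the text.

```python
def add_emphasis(tones):
    """Prefix the floating high tone contributed by the marker E."""
    return ["H"] + list(tones)


def relative_heights(tones):
    """Illustrative heights: L = 0, H = 1, and an H directly after an H
    is one step higher than the H before it."""
    heights = []
    for i, tone in enumerate(tones):
        if tone == "L":
            heights.append(0)
        elif i > 0 and tones[i - 1] == "H":
            heights.append(heights[-1] + 1)
        else:
            heights.append(1)
    return heights


for plain in (["H", "L", "L"], ["H", "L", "H"]):
    emphatic = add_emphasis(plain)
    print(plain, relative_heights(plain), emphatic, relative_heights(emphatic))
```

On these assumptions the emphatic variants start from a higher peak than their non-emphatic counterparts, which is the kind of difference the text describes between a high fall and a low fall.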
Experiments in speech synthesis on the basis of patterns generated from (15), using a production model I have described elsewhere (Hirst 1981, 1983), give quite satisfactory results. Suppose, then, that we adopt the solution of analysing emphasis as a tonal morpheme in English: how will this affect our semantic interpretation? The answer is obviously that, in accordance with (1), if the emphatic marker is present in the syntactic structure, it will be visible to the rules of semantic interpretation. Presumably the emphatic marker will have two distinct lexical entries corresponding to the intensive and contrastive interpretations, although this point is no more crucial than for any pair of homophonous morphemes. The intensive interpretation is quite straightforward. The contrastive interpretation is a little more subtle. In my earlier analysis I assumed that the contrastive marker was in fact conditioned by an enriched underlying structure, so that, for example, a sentence like
(16) John doesn't E read books.
where E cannot be interpreted as expressing intensity, was assumed to be derived from an underlying form like
(17) John doesn't read books John [V Δ ] books.
where the second half of the sentence contains a dummy verb [ Δ ]. The contrastive intonation was then assigned to (17) and the second half of the sentence (now recoverable due to the contrastive intonation) deleted by an operation called "reduction". This analysis seemed to capture the semantic interpretation of (16) quite correctly, since by reconstructing (17) the listener can supply whatever value for the dummy variable seems most appropriate from the context. Note, however, that since the listener needs to reconstruct (17) from the context, there is no need to assume that (17) was ever an underlying form for (16) at all. If instead we interpret E as a contrastive morpheme, as I suggest above, we can assume that the semantic interpretation of (16) derives something like (17) in quite a straightforward way. It is worth noting that some languages possess an overt contrastive morpheme which seems to function like the tonal morpheme I propose for English. Thus in Bambara (Bird, Hutchinson & Kante 1977) a sentence like
(18a) Muso fila bɛ John fɛ 'John has two wives'
can be modified by the emphatic morpheme "de", giving
(18b) Muso fila de bɛ John fɛ 'John has TWO wives'
or
(18c) Muso fila bɛ John de fɛ 'JOHN has two wives'
4. Conclusion
It seems, then, that despite appearances, the facts of intonation cannot be taken as counter-evidence for the modular approach to linguistic description exemplified in the Revised Extended Standard Theory of generative grammar. The organisation of the grammar as in (1) seems, on the contrary, to lead to the right predictions concerning which aspects of intonation are to serve as input to the rules of semantic interpretation, and which are to be interpreted pragmatically.
References
Bird, C., 1966: Determination in Bambara. Journal of West African Linguistics 3; 5-11.
Bird, C., Hutchinson, J. & Kante, M., 1977: An Ka Bamanankan Kalan: Introductory Bambara. Indiana University Linguistics Club, Indiana.
Bolinger, D., 1978: Intonation across languages. In Greenberg et al. (eds.), Universals of Human Language. Vol. 2: Phonology. Stanford University Press, Stanford. Pp. 471-524.
Leben, W.R., 1976: The tones in English intonation. Linguistic Analysis 2(1); 69-107.
Moreton, R. & Bot Banjock, H.M., 1975: Manuel d'Initiation au Basaa. (2nd edition). Douala.
Nespor, Marina & Vogel, Irene, 1982: Prosodic domains of external sandhi rules. In Van der Hulst & Smith (eds.), The Structure of Phonological Representations. Part 1. Foris, Dordrecht. Pp. 225-255.
Ronat, Mitsou, 1982: Logical form and prosodic islands. Journal of Linguistic Research 2(1); 33-48.
Selkirk, Lisa, 1978: On prosodic structure and its relation to syntactic structure. Ms. distributed by IULC.
Selkirk, Lisa, 1981: On the nature of phonological representations. In Myers, Laver and Anderson (eds.), The Cognitive Representation of Speech. North Holland. Pp. 379-388.
Selkirk, Lisa, (forthcoming): Phonology and Syntax: the Relation between Sound and Structure. MIT Press.
Welmers, W., 1959: Tonemics, morphotonemics and tonal morphemes. General Linguistics IV(1); 1-9.
A THREE-DIMENSIONAL SCALING OF NINE ENGLISH TONES Carlos Gussenhoven
JOURNAL OF SEMANTICS, vol. 2, no. 2, pp. 183-203
Abstract
Dissimilarity judgements were obtained from 44 naive native speakers of British English on the semantics of nine synthetic English nuclear tones, presented pairwise on five carrier-sentences. The aim of the experiment was to test the hypothesis that the nine tones form a linguistic paradigm of three sets of three tones, each of which set represents a semantic continuum from 'special' to 'routine'. A three-dimensional scaling analysis was carried out on the data to see to what extent the configuration predicted on the basis of the hypothesis matched the one actually obtained. The similarity turned out to be very satisfactory for the tones in the sets 'fall' and 'fall-rise', but poor in the set 'rise'. The results are reported in such a way as to enable other researchers to test alternative hypotheses concerning the relationships between the tones.
There have been a number of proposals concerning the paradigm of nuclear tones of British English (e.g. O'Connor & Arnold 1973 [1961], Halliday 1967 [1963], Crystal 1969, Brazil 1975). To greater or lesser extents, these proposals attempt not simply to list a number of nuclear tones, but also to postulate relations between certain tones of the type 'these two tones are variants of each other'. There has been no attempt, however, to collect exhaustive experimental evidence for any one of these proposals. Rather, the presence or absence of relations between tones is rendered plausible by means of appeals to (assumed) shared intuitions concerning the functions of the tones, and by considerations of phonetic similarity (cf the frequently claimed connection between the 'rise-fall' and the 'fall'). It is not surprising that this should be so. As Pierrehumbert observes (1980: 60), we cannot, unfortunately, ask informants to tell us whether two intonation patterns belong to the same category or to different categories, because this question 'overtaxes the native speaker's powers of introspection'. However, linguistic experiments need not have the straightforward design envisaged in Pierrehumbert's statement. There are other ways of getting at the structure of linguistic paradigms besides asking subjects to tell us what they are. In Gussenhoven (forthcoming) a structure of the nuclear tone paradigm is argued for which may be more amenable to experimental testing, by virtue of the fact that the tones are arranged in a coherent network of relationships that allows the prediction of differences in degree of association between pairs of tones. In this proposal, there are three tone categories: the fall, the fall-rise, and the rise. Within all three categories, modifications of basic tone variants are possible, each modification adding a semantic constant (morpheme) to the meaning of the basic tone. A tone category is thus a 'free' morpheme (like nouns, say) and a modification a 'bound' morpheme (like a diminutive suffix, for instance). One modification is 'delay', another 'stylisation'. Delay causes the meaning of a tone to acquire a measure of 'special-ness' or 'non-routineness', and is phonetically realised by delaying the association of the tone with the text, for example by one syllable. Stylisation, by contrast, causes the tone to acquire a measure of 'routineness' (Ladd 1978). If we restrict our attention to these two modifications, there are in each of the three basic tone categories therefore three variants, which are spaced on a single dimension of 'routineness': the unmodified variant is positioned in the centre, flanked by its delayed and stylised counterparts, as in Figures 1 and 2. Observe that these Figures imply that the delayed variant and the stylised variant in a tone category are 'further apart' than either of the above and the unmodified variant. However, because we are dealing with tone categories, the difference between one category and the next (between 'fall' and 'fall-rise', say) is equivalent to any other such difference (say, between a 'fall' and a 'rise').
Fig. 1 Structure of nuclear tone paradigm (tone numbers in brackets)
             non-routine (delay)      neutral (unmodified)   routine (stylisation)
fall         delayed fall (1)         fall (2)               stylised fall (3)
fall-rise    delayed fall-rise (4)    fall-rise (5)          stylised fall-rise (6)
rise         delayed rise (7)         rise (8)               stylised rise (9)
The delayed fall and the delayed fall-rise are the 'rise-fall' and 'rise-fall-rise' of other descriptions. The delayed rise had not previously been noted in the literature. The stylised fall (also known as the 'call contour') and the stylised rise (also known as the level tone), as well as the meaning 'routine' for these tones, are taken from Ladd (1978). The stylised fall-rise, again, is new in the sense that its occurrence and function had not previously been discussed, although the phonetic possibility for such a tone is noted by Pierrehumbert (1980: 115). The illustration of the tone she gives, however, is in fact more like a high, narrow-range fall-rise (called a half-completed fall-rise in Gussenhoven, forthcoming) than the one given in Figure 1 (the sixth tone). Instead of asking subjects to say whether two tones belong to the same or to different categories, the hypothesized tone paradigm in Figure 1 permits the experimenter to ask informants to give an estimate of how closely any two tones agree in meaning, since estimates of semantic proximities can be subjected to a multidimensional scaling analysis, whose outcome can be compared to the hypothetical configuration suggested by Figure 1 and given in Figure 2. Unlike the investigations by Uldall (1972) and Owen (1980), in which Osgood semantic differentials were applied to a number of intonation patterns and a factor analysis was made of the scores with a view to discovering the attitudinal attributes of those patterns, our investigation was hypothesis-testing, rather than exploratory in nature. Multivariate techniques like factor analysis and multidimensional scaling, however, yield information that might typically be used by the experimenter as an aid to the discovery of some kind of structure in data about which no prior hypotheses were entertained. Thus, the configuration resulting from a three-dimensional analysis of the nine tones will enable us to recognise a certain resemblance between that obtained configuration and the theoretical one given in Figure 2, but it will not enable us to test the hypothesis that the latter configuration is in fact the one used by native speakers in a similarity judgement task.
It is, however, possible to make comparisons between informants' responses to one tonal contrast and their responses to another: we can compare perceived semantic differences between pairs of tones about which the structure in Figure 2 makes predictions, and test these difference scores for significance. Figure 2 is a gross idealisation. It is not in fact claimed, for example, that the distance between the stylised variants of any two tones equals the distance between their delayed variants, or even that distances between tone categories per modification should actually be equal. It is rather that the configuration predicts certain inequalities. These inequalities are of seven types. They are listed below, with the number of actually occurring inequalities of each type given in brackets. (See Figure 1; '1 modification' stands for the distance between two adjacent tones in a tone category (e.g. delayed fall and fall), and 'a tone' for the distance between tones in different tone categories that agree in modification (e.g. delayed fall and delayed rise). An example of the first inequality would thus be (delayed fall, fall) < (delayed fall, stylised fall), or equivalently 1,2 < 1,3; cf. Appendix.)
1. 1 modification < 2 modifications (6)
2. a tone < a tone + 1 modification (24)
3. 1 modification < a tone + 1 modification (24)
4. a tone + 1 modification < a tone + 2 modifications (12)
5. 1 modification < a tone + 2 modifications (12)
6. a tone < a tone + 2 modifications (12)
7. 2 modifications < a tone + 2 modifications (12)
(total 102)
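The counts in this list can be reproduced by a small enumeration. The sketch below rests on an assumption that is not stated in the text: an inequality d(X,Y) < d(X,Z) is counted whenever the two pairs share a tone X and Y lies "between" X and Z in the paradigm of Figure 1 (Y belongs to the category of X or of Z, and its modification lies between theirs). The coordinate encoding and all function names are hypothetical; the reading is offered only because it happens to yield the seven types with the published frequencies.

```python
from collections import Counter
from itertools import product

# Each tone is (category, modification): categories 0-2 = fall, fall-rise, rise;
# modifications 0-2 = delayed, unmodified, stylised (tone number = 3*c + m + 1).
TONES = list(product(range(3), range(3)))


def difference(a, b):
    """Describe the predicted distance between two tones in the paper's terms."""
    parts = []
    if a[0] != b[0]:
        parts.append("a tone")
    mods = abs(a[1] - b[1])
    if mods:
        parts.append(f"{mods} modification" + ("s" if mods > 1 else ""))
    return " + ".join(parts)


def between(x, y, z):
    """Assumed criterion: y differs from x and z and lies between them."""
    if y in (x, z):
        return False
    same_side = y[0] in (x[0], z[0])
    return same_side and min(x[1], z[1]) <= y[1] <= max(x[1], z[1])


def predicted_inequalities():
    """Count inequalities d(x, y) < d(x, z) by type."""
    counts = Counter()
    for x, y, z in product(TONES, repeat=3):
        if x != z and between(x, y, z):
            counts[(difference(x, y), difference(x, z))] += 1
    return counts


counts = predicted_inequalities()
for (smaller, larger), n in sorted(counts.items(), key=lambda kv: kv[0]):
    print(f"{smaller} < {larger}: {n}")
print("total:", sum(counts.values()))
```

If a different selection criterion was actually used, only the `between` function would need to change; the rest of the enumeration is independent of it.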
Fig. 2 Hypothetical three-dimensional configuration of nine English tones.
Such comparisons will enable us, for example, to respond to Gunter's complaint (1982) that an analyst's claim that the 'call contour' is a variant of the fall would appear to be as legitimate as a claim that it is a variant of the fall-rise, if the only requirement for such claims is that it should satisfy the particular analyst's preference.
For the synthesis of the stimuli, canonical versions of the nine tones on suitable carrier sentences were needed. The following were selected as carrier sentences:
1. Is it a unicorn?
2. Do you need a paperclip?
3. From Paddington
4. On Saturday
5. /ə'pɑːsəsə/
Two of these are syntactic questions, referred to from now on as 'Questions'. They were included because it was thought that, since intonation cannot cause a functional shift from 'interrogative' to 'affirmative' in syntactic questions, but can cause such a shift in declarative sentences, they could be regarded as neutral carrier sentences where intonational meaning is concerned. The syntactically different structures 3 and 4 are referred to as 'Statements'. In addition, a nonsense carrier sentence was included (cf Uldall 1972).
The structures were chosen so as to make natural occurrence of only a single accent possible. This accent fell on the antepenult, so as to allow the F0 configurations of the tones to spread over three syllables and thus be maximally recognisable, also in the case of delayed tones. A male speaker of British English read the five carrier sentences with the nine tones three times. These 135 utterances were recorded on magnetic tape and run through a Frøkjær-Jensen Pitch Computer 1400, which produces, in real time, synchronous intensity and speech wave-form traces and a nearly-synchronous periodicity trace, in addition to a time trace. The records were used to establish segment durations of all segments. In a number of instances of V+nasal C combinations, segment durations could not be established owing to the absence of obvious differences in intensity and speech wave form between the segments. In most of these cases, measurements were supplemented from spectrographic records produced by a Kay Sonograph. Mean segment durations were established per carrier sentence-tone combination. Durations from the nuclear vowel onwards appeared to vary with tone. These were rounded off to the nearest 10 msec, the maximum time sample in the synthesis program to be used. Consonants appeared to vary less, and less consistently, than vowels, and frequently a mean duration over all tones was established for these post-nuclear consonants. Pre-nuclear segments did not appear to vary greatly in duration, and for these, too, mean durations were established per carrier sentence across all nine tones. Subsequently, some normalisation of the segment durations that did clearly vary was achieved by ensuring that segment duration differences between any two versions of a carrier sentence did not deviate disproportionately from the mean difference between corresponding segments established over the five carrier sentences. The effect of this ad-hoc normalisation (and rounding) can be illustrated by plotting total durations of the last three syllables of the nine versions of each carrier sentence as measured (Figure 3a) and as used for the synthesis of the stimuli (Figure 3b).
Fig. 3 Durations of the last three syllables of all versions (a) as measured and (b) as used for synthesis. (Each panel plots the nine tones for the five carrier sentences, labelled unicorn, Paddington, paperclip, Saturday and the nonsense item, on a vertical scale from 600 to 1000.)
One version of each carrier sentence was synthesized, with the segment durations that were to be used for the version with the (unmodified) fall. The synthesis program used was the Speech Imitation Device (SID), a modified version of the synthesis-by-rule program described in Holmes et al. (1964), available in the Phonetics Department of the University of Edinburgh. The fairly poor segmental quality of these initial versions was improved by trial and error, aided by selective inspection of spectrograms. After more satisfactory segmental versions had been produced, eight copies were made, in which segment durations were altered according to the values established earlier. This ensured that all nine versions of each carrier sentence would be identical, except for the duration and F0 of the last three syllables.
The records obtained from the FJ Pitch Computer were insufficiently detailed for establishing F0 contours. Therefore, one utterance of each carrier sentence-tone combination (i.e. one third of the material) was analysed with the help of a pitch extraction program based on Gold & Rabiner (1969), incorporated into the ILS package in the location mentioned above. The numerical output of this analysis (periodicity values for every 10 msec sample) was plotted by hand on linear graph paper, and these traces were supplied with segment boundaries as obtained from the earlier analysis. Mean values of F0 turning points were established and their relative timings in the syllable estimated. The mean range for falling movements of delayed and unmodified falls and fall-rises was 178-75 Hz, the mean peak for final rises of fall-rises was 160 Hz, the mean range of the delayed rise was 75-170 Hz, and that of the unmodified rise 75-185 Hz. The two plateaus of the stylised fall averaged 150 and 100 Hz, those of the stylised fall-rise 148 and 80 Hz, while the plateau of the stylised rise averaged 123 Hz. These values were rounded off to the nearest programmable F0 step (6 or 7 Hz). Pre-nuclear stretches were made the same for all 45 stimuli: a weakly falling slope of 100-87 Hz. After trial synthesis with the values obtained, the first plateau of the stylised fall and fall-rise was raised to 156 Hz, the peaks of the falling movements to 187 Hz, and the peak of the final rise of fall-rises to 174 Hz. Moreover, longish plateaus of stylised tones were interrupted by a 10 msec step-up or step-down of one programmable step (6 Hz) somewhere along the way, in an attempt to get rid of the unnatural effect of perfect monotony. These measures seemed to improve the acceptability of the stimuli, and the final result was quite satisfactory. Figure 4a gives measured F0 traces for a representative set of tones (those for From Paddington) and Figure 4b gives the contours used in the synthesis, with durations as for this carrier sentence. F0 values for time samples between two specifications are calculated by SID by linear interpolation. These values are indicated by straight lines in Figure 4b, instead of by a histographic representation giving actual values for each 10 msec time sample.
Fig. 4 A representative set of nine measured F0 contours (a) and F0 contours used in the synthesis of the stimuli, with segment durations as for 'From Paddington' (b).
3. The experiment
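The linear interpolation between F0 specifications mentioned in connection with Figure 4b can be sketched as follows. This is not the SID program: the function name, the list-of-pairs input format and the 130 msec duration in the example are assumptions made for illustration, and rounding to the programmable F0 steps is left out. Only the 10 msec sample size and the 100-87 Hz pre-nuclear slope come from the text.

```python
def interpolate_f0(specs, step_ms=10):
    """Fill in F0 values between specified points by linear interpolation.

    `specs` is a list of (time_ms, f0_hz) pairs with strictly increasing
    times that are multiples of `step_ms`. Returns one (time_ms, f0_hz)
    pair per time sample, including the final specification.
    """
    contour = []
    for (t0, f0), (t1, f1) in zip(specs, specs[1:]):
        t = t0
        while t < t1:
            fraction = (t - t0) / (t1 - t0)
            contour.append((t, f0 + (f1 - f0) * fraction))
            t += step_ms
    contour.append(specs[-1])
    return contour


# Pre-nuclear slope of 100-87 Hz, spread over a hypothetical 130 msec.
for time_ms, f0_hz in interpolate_f0([(0, 100.0), (130, 87.0)]):
    print(time_ms, round(f0_hz, 1))
```

Plotting the specified points joined by straight lines, as in Figure 4b, conveys the same information as listing the value of every 10 msec sample in this way.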
The nine versions of each carrier sentence were combined pair-wise, such that every combination occurred once and every version occurred as frequently in first as in second position. This resulted in 36 stimulus pairs per carrier sentence, or 180 pairs in all. The test tape was recorded in real time, as the SID program made it possible to program pauses between stimuli and stimulus pairs. These were, respectively, 2 secs and 9 secs long. Each stimulus pair was preceded by a warning signal, while a longer pause (17 secs) was programmed between blocks of ten pairs. Mean stimulus duration was .96 secs. The total duration of the test tape was thus 180 x ((2 x .96) + 2 + 9) + 17 x 9 secs, or 41 minutes, which made it necessary to split the test into two halves. Stimulus pairs were distributed over the two halves so as to make them maximally comparable: each half contained 36 occurrences of each carrier sentence, 20 occurrences of each tone, and either 2 or 3 occurrences of each combination of tones, ordered so as to avoid clusterings of either tone or carrier sentence. In each half, moreover, the stimulus pairs included for one of the two carrier-phrases in the categories 'Questions' and 'Statements' complemented those included for the other carrier-phrase, so that all the cells in the matrix for each category were represented in each half.
3.1 Presentation of the test tape
The two test-halves were presented through high-quality headphones to 24 female and 21 male judges, who were naive native speakers of British English. They were recruited from the student population of Edinburgh University, and were paid for their services. The score sheet of one male judge had to be discarded for technical reasons, so that in effect judgements were obtained from 44 judges, of whom 22 listened to the first half and 22 listened to the second half, i.e. each subject judged only one half of the materials. Before doing the test, they received an oral instruction on the nature of their task, were played recordings of each of the synthesised carrier sentences (with unmodified falls), and were allowed to do three trial pairs. Their instruction was to decide whether they thought the two stimuli in each pair differed little or a lot in meaning. They recorded their judgements on five-point scales. Written versions of the carrier sentences were printed against each five-point scale, both to serve as identificatory labels and to obviate problems of intelligibility. Judges reported favourably on the quality of the stimuli, and (somewhat surprisingly to the experimenter) on the ease with which the task could be done.
3.2 A 'phonetic' test
The instruction to our subjects was to rate the stimuli for semantic difference. Although the two modifications alter the shapes of the unmodified tone trajectories in similar ways (roughly, delaying the crucial movement and stretching it, respectively), there is no a priori reason why phonetic differences between stimuli should equal semantic differences. If they do not, it would clearly be desirable to demonstrate that the judges did in fact respond to the semantic effects of the tones rather than to their physical characteristics. In order to be able to compare the semantic judgements with phonetic judgements, the 36 pairs of nonsense stimuli (i.e. /ə'pɑːsəsə/) were randomised, and recorded on a cassette tape.
Twelve phoneticians with various linguistic backgrounds (most of them either native speakers of English or of Dutch, and if not the former, always with extensive knowledge of English) were asked to rate these 36 stimulus pairs for phonetic similarity. They were given score sheets with five-point scales, similar to those used in the semantic experiment. Judges listened to the stimuli individually for as long as they wished, playing the tape on a portable cassette recorder. A number of judges commented on the difficulty of comparing disparate phonetic attributes like 'falling pattern' with 'mid level', and said that, for that reason, they could not help feeling dissatisfied with their performance. These data are referred to as 'Phoneticians' below.
4. Results
Table 1 gives the raw-score matrices for Questions, Statements, Nonsense sentence and 'Phoneticians'. To the cells in the Nonsense-sentence matrix each of the 44 subjects contributed one half of the data, while to the first two matrices each subject contributed a complete matrix, representing the summation of his scores for the two carrier sentences in each of the categories Questions and Statements. To the Phoneticians matrix, of course, each of the twelve subjects contributed a complete matrix. All inequalities predicted by the model in Figure 1 were tested by means of Wilcoxon matched-pairs signed-rank tests for Questions, Statements and totals of Questions and Statements separately. Since for the Nonsense sentence only half a matrix was available per subject, these data were not included in this analysis. The Appendix lists z-scores and significance levels for all comparisons.
Table 1 Difference scores for Questions, Statements, Nonsense sentence and Phoneticians.
Questions
       (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)
(2)     94
(3)    137  125
(4)     75  134  153
(5)    145   92  145  119
(6)    125  127  102  118  100
(7)    133  148  158  104   94  128
(8)    103  128  170  120  136  136   79
(9)    174  134  121  150  130  137  138  123
Statements
       (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)
(2)    101
(3)    132  125
(4)     95  149  130
(5)    136  107  131  105
(6)    146  136  113  131  103
(7)    132  154  148   95  120  111
(8)    139  163  163  132  128  152   90
(9)    136  135  122  141  151  128  131  159
Nonsense sentence
       (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)
(2)     43
(3)     90   57
(4)     44   65   65
(5)     70   50   66   58
(6)     73   61   63   56   53
(7)     62   69   87   64   56   57
(8)     70   72   87   58   73   77   48
(9)     64   65   54   82   83   71   76   89
Phoneticians
       (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)
(2)     28
(3)     50   44
(4)     24   41   37
(5)     43   28   41   32
(6)     47   29   31   35   24
(7)     39   42   49   25   31   30
(8)     37   41   51   32   41   43   25
(9)     47   42   31   51   43   46   40   40
For the summed data, 71 out of the 102 inequalities appeared to show significant differences (at least at the 0.05 level, one-tailed), while a further 17 had the right sign, though they did not reach a significant level. Of the remaining 14 that had the wrong sign, 4 were significant at the 0.05 level, two-tailed. Of these 14 differences that turned out to have the wrong sign, 13 involved comparisons that included a tone in the category 'rise'. The reasons why some of the results run counter to our predictions would appear to be that (1) the delayed rise and the unmodified rise are perceived as roughly equally different from the stylised rise, and (2) the delayed rise is perceived as much closer to the stylised fall-rise than would be expected on the basis of our model. The inequalities fared better in the Questions than in the Statements: in the Questions 14 inequalities appeared to have the wrong sign, in the Statements 23. Note also that the inequalities of the type 'a tone + 2 modifications > a tone + 1 modification' would appear to suffer from a ceiling effect: differences of both types are relatively large (see Appendix). It should be observed that the 102 inequalities listed in the Appendix are a mathematical hotbed of dependencies, in the sense that with the establishment of some of them, others are necessarily true. The reason why all of them are given is that they are independent in the sense that each one of them can be regarded as a self-contained mini-experiment, testing the hypothesis that tone X is perceived as more akin to tone Y than to tone Z. In answer to Gunter's query, for example, it is now possible to say that the 'call contour' is perceived as more akin to the fall than to the fall-rise, both in Questions and in Statements, at an overall significance level of .04. Or one could look at how Brazil's claim (1975) that the (high) rise is an intensified variant of the fall-rise stands up to our judges' responses: his classification predicts that the distance between the rise and the rise-fall (intensified variants of the fall-rise and the fall respectively) is smaller than that between the rise and the fall (difference of tone plus a difference in state of 'intensification'), whereas our proposal predicts the reverse. The z-score of -3.11 corresponds to a significance level of .002, two-tailed, in our favour. Other researchers may be interested in other comparisons.³
In order to compare the observed perceptual behaviour of our subjects with the theoretical model of Figure 2, a non-metric three-dimensional scaling was performed on the three proximity matrices obtained for Questions, Statements and Nonsense utterance (INDSCAL). The analysis yields a single configuration of objects (tones, in our case) based on all matrices entered, as well as dimension weight indices for each matrix (Young
A THREE-DIMENSIONAL SCALING OF NINE ENGLISH TONES
[Figure 5: three-dimensional plot of the nine tones, not recoverable from the source; the only surviving axis label reads 'NON-ROUTINE'.]
Fig. 5 Derived stimulus configuration in three dimensions.
('fall vs rise') than by the first ('routine'), whereas the opposite holds for the Question matrix (cf. the horizontal axis in Figure 6). The Statement matrix also emphasizes the 'fall vs rise' dimension relative to the 'fall-rise vs other' dimension, while for the Nonsense sentence the opposite is true (cf. the vertical axis in Figure 6). The behaviour of the Statement data relative to the other matrices is probably due to the fact that a greater difference in communicative effect is achieved by substituting a rise for a fall in non-interrogative sentences than in other sentence types, and listeners are therefore inclined to perceive the difference between these tones as greater. In order to compare the results of the 'semantic' experiment with those obtained in the experiment in which phoneticians rated the stimuli for phonetic similarity, a non-metric three-dimensional scaling analysis (ALSCAL) was performed on the matrix obtained in the latter experiment. The derived configuration closely resembled
[Figure 6: plot not recoverable from the source. Horizontal axis poles: ''fall vs rise' less than 'routine'' and ''fall vs rise' more than 'routine''. Vertical axis poles: ''fall vs rise' less than 'fall-rise vs other tones'' and ''fall vs rise' more than 'fall-rise vs other tones''. Surviving point labels: Question, Nonsense.]
Fig. 6 Dimension weight indices of three matrices on the 'fall vs rise' dimension relative to the other two dimensions.
the one given in Fig. 5. The (Pearson) correlation coefficient between the 36 inter-stimulus distances in the two configurations was .91 (p < .01). Although this high coefficient already indicated that asking naive subjects to rate the tones on partly meaningful and partly non-meaningful carrier sentences for semantic similarity, and asking non-naive subjects to rate the same tones on non-meaningful carrier sentences for phonetic similarity, amounted to doing the same thing, the inter-stimulus distances of both configurations were correlated with the theoretical distances in our model, to see if the 'semantic' configuration resembled the model better than did the 'phonetic' configuration. To this end, the distance of one modification was assumed to be 1, while the distance between two tones with the same modification was assumed to be 1.3. This latter figure was arrived at by dividing the mean raw score for a difference between two tones with the same
modification by the mean raw score for a difference of one modification, which ratio was taken to be a rough estimate of the relative perceptual distances concerned. The distance of two modifications was then 2, while a distance of a tone plus one modification was taken to be the hypotenuse of a right-angled triangle with sides 1 and 1.3, and the distance of a tone plus two modifications that of a triangle with sides 2 and 1.3, i.e. 1.64 and 2.39 respectively. The correlation coefficient between the 36 inter-stimulus distances in the 'semantic' configuration and the model distances was .54 (p < .01) and that between the 'phonetic' distances and the model distances .57 (p < .01), which coefficients are not significantly different.
5 Discussion
Clearly, these deviations have a phonetic explanation. The delayed rise and the stylised fall-rise both involve a plateau and a rising movement late in the final syllable, and the stylised rise, as a level tone, is very different from tones with rising movements. When collecting proximity judgements from phoneticians, our hope was to be able to underscore the finding that emerged from our analysis of the dimension weight indices (see Figure 6), i.e. that the naive subjects had responded in a non-phonetic fashion, and had given functional, semantic ratings to our stimuli. This hope was vain: the two groups of subjects did not respond differently to the stimuli, in spite of the different instructions. This was clear from the good correlation between the inter-stimulus distances in the two derived configurations, as well as from the fact that there was no difference in correlation between these two sets of distances on the one hand and the hypothetical distances in our model on the other. This result does not, strictly speaking, demonstrate that the naive subjects based their judgements on phonetic considerations. The only thing we can say is that the experiments did not reveal that semantic differences between English
The experiment described above was undertaken to lend plausibility to a proposal for the structure of the English nuclear tone paradigm. It was clearly very successful where the six tones in the categories 'fall' and 'fall-rise' are concerned. Of the 36 inequalities predicted by the model for these tone categories only one had the wrong sign (1,5-1,6), and 26 of them were significant (p < .05). The three tones in the category 'rise' did not pattern in the derived configuration in the way we expected: the positions of the delayed rise and the unmodified rise are reversed, while the stylised rise is uncomfortably far removed from the other two rise variants. These two deviations could be attributed to (1) the fact that the stylised rise, a level tone, was perceived as rather different from both the delayed and the unmodified rise, instead of being perceived as closer to the latter than to the former, and (2) the fact that the delayed rise was attracted by the stylised fall-rise.
tones differ from phonetic differences between them.
A comparison of the dimension weight indices for the three matrices that were entered into the INDSCAL analysis suggests that the declarative carrier sentences elicited relatively large difference judgements between rises and falls. This was interpreted as an artefact of the extra communicative effect created by the interchange of these tones. For the assessment of semantic attributes of tones it would therefore seem to be appropriate to use syntactic questions as carrier sentences, which are impervious to such shifts from 'question' to 'statement' as result from the substitution of a fall for a rise. If, to end on a speculative note, the number of tones had been increased at the expense of the number of carrier sentences, and the modification 'half-completion' had been included in our material to produce three further tones, we might have ended up with a more orderly arrangement of the tones in the category 'rise'. The modification 'half-completion' produces tone variants that would intuitively seem to be positioned between 'unmodified' and 'stylised' (Gussenhoven, forthcoming). The half-completed rise has a brisk rise to mid level in or immediately after the nuclear syllable and a mid level postnuclear stretch, and could thus be regarded as a phonetic hybrid of the unmodified rise and the stylised rise. As such, it might have had a corrective effect on the positions of the rise variants.
The fact that the correlation between our derived 'semantic' stimulus configuration and the hypothetical configuration is low (r = .54) should not be interpreted to mean that there is a poor resemblance between model and observations. The correlation coefficient was only meant to be seen in relief against the coefficient between the model and the 'phonetic' configuration. A simple visual inspection of Figures 2 and 5 is, we would suggest, a better way of assessing the similarity between the two configurations. It should also be observed that our model distances were inordinately restrictive. There is, for example, no reason why the distance between 'delay' and 'unmodified' should equal the distance between 'unmodified' and 'stylisation', or indeed why these distances should be the same when measured between variants in different tone categories. Stylisation is, for instance, further removed from 'unmodified' in the case of the fall than in the case of the fall-rise: although the structure postulated in Figure 1 does not imply equality here, these distances were made equal in the model in Figure 2. To make a comparison, the subjective semantic difference between He killed her and She died may well be greater than that between He tripped her and She fell, which in turn may be greater than that between He made her laugh and She laughed, in spite of the fact that in all three cases the same semantic difference of 'causation' is involved.
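For reference, the model distances under discussion follow from just two figures given in the analysis: 1 for one modification and 1.3 for a difference of tone, combined as the sides of a right-angled triangle. The sketch below recomputes them; the function and constant names are assumptions introduced here for illustration and do not come from the original analysis.

```python
import math

ONE_MODIFICATION = 1.0  # distance of one modification, as assumed in the text
TONE_DIFFERENCE = 1.3   # distance between two tones with the same modification


def model_distance(tone_differs: bool, n_modifications: int) -> float:
    """Model distance between two tones under the two assumptions above."""
    across_tones = TONE_DIFFERENCE if tone_differs else 0.0
    along_modifications = n_modifications * ONE_MODIFICATION
    # Hypotenuse of a right-angled triangle with these two sides.
    return math.hypot(across_tones, along_modifications)


cases = {
    "1 modification": (False, 1),
    "2 modifications": (False, 2),
    "a tone": (True, 0),
    "a tone + 1 modification": (True, 1),
    "a tone + 2 modifications": (True, 2),
}
for label, (tone_differs, n_mod) in cases.items():
    print(f"{label}: {model_distance(tone_differs, n_mod):.2f}")
```

The two composite cases come out at 1.64 and 2.39, the values quoted in the text. Because the same two constants are reused for every tone category, the sketch also makes visible the equal-spacing restriction that the paragraph above criticises.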
Appendix
Wilcoxon matched-pairs signed-rank tests are listed for all inequalities predicted by the model, for Questions, Statements and summed data for Questions and Statements separately. The first column gives (absolute) z-scores, the second gives information about the sign of the difference and the significance level. In this column, blanks indicate that the difference between total scores has the wrong sign, while exclamation marks indicate that such a difference is significant at p < .05, two-tailed. Where the sign is right, significance levels are specified if these reach at least p = .05, one-tailed, 'ns' indicating that this level is not reached.
2 modifications
1 modification Statement
Question 1,2 2.3 4,5 5,6 7,8 8,9
3.25 1.00 0.28 1.5 4.5 1.6
.000 ns
ns
.000 .05
5,3 8,3 9,2 6,2 5,1 5,3 4,2 8,1 8,3 9,2 7,2 8,4 8,6 7,5 9,5
5,2 5,2 8,2 8,2 8,2 8,2 8,5 8,5 8,5 8,5
4.76 4.68
.000 .000
2.7
: .05
1.67 0.86 1.53 1.14
ns
3.6
4.19 1.33 2.97 3.89 0.86
.004 .000 .000 ns
.002 .000 ns
3.6
.000
2.31 4.74
.01
0.6
ns .04
1.74
.000
1.6
0.14 3.9
0.59
3.9
4.14 0.55 1.83 2.17 3.25 1.94 2.1
0.3
2.59
3.6 1.02 1.62 2.72 4.85 0.95
.000 ns .05 .004 .000
Total
Statements
Questions 1,4 1,4 1,7 1,7 7,4 7,4 9,6 9,6 6,3 6,3 9,3 9,3 5,2 5,2
Total .004 ns .009 .002 .002 1
a tone
a tone + 1 modification
1,5 4,2 1,8 7,2 7,5 8,4 9,5 8,6 6,2
2.70 0.58 2.36 2.05 2.89 2.50
1,3 1,3 4,6 4,6 7,9 7,9
ns ;
2.45 1.9 3.4
1.08
.000 .000
5.57
ns .04 .02
1.84 2.45 0.88 3.35 0.95 1.83 3.36 4.02 4.54
.000 .03 .02 .07 .03
.000 ns
2.5 2.6 0.4 3.4
.006 .005
2.12 2.54 2.3 0.74 0.3 2.02 0.78 1.73
.02
.000 .006 ! ns .02
5.3
.007 ns
.000 ns .03
.000 .000 .000
1.8 3.9 4.7 0.7
.04
4.42 3.11 4.76
.000 .001 .000
1.5 0.8
0.56 1.65 3.3
.04
.000 .000
0.88
.000 .000 ns
ns ns
.05 1 ns
1 modification
a tone + 1 modification
Statements
Questions 1,5 2,4 1,8 2,7 2,6 3,5 2,9 3,8 4,2 5,1 5,7 1,8 6,2 5,3 6,8 5,9 7,2 8,* 8,3 9,2 9,5 8,6
3.73 3.64 0.67 4.04 0.27 1.8 0.8 3.9 1.26 2.17 2.5 0.18 2.53 3.44 3.26 2.52 4.68 2.34 1.56 3.22 3.76 0.85 0.58 0.23
.000 .000
ns
.000
ns
.04
ns
.000
ns
.02 : ns .006 .000 .000 .006 .000 .01 ns .000 .000
ns ns ns
Questions 1,5 2,6 1,8 2,9 3,5 2,4 4,8 5,9 6,8 5,7 3,8 2,7
1.77 0.1 3.6 1.2 0.84 1.99 2.35 2.02 0.75 3.39 1.5 0.93
a tone + 1 modification
1,2 1,2 2,3 2,3 4,5 4,5 5,6 5,6 7,8 7,8 8,9 8,9
2.99 4.24 3.0 2.8 2.12 2.3 2.44 2.29 5.08 4.21 2.1 2.03
ns ns ns
.001 .000 .003
ns
.006 .003 .02 .000 .000 .000 .000 .003 .000
ns •
Statements 0.8 1.2 0.19 0.J3 0.07 1.49 1.04 0.82 2.93 0.88 1.43 0.62
ns .000 ns ns .02 .01 .02 .000
ns
4.19 4.47 2.47 4.6 0.82 1.75 1.16 4.46 3.6 3.07 0.78 1.87 3.66 3.77 4.14 4.03 5.21 4.41 3.15 4.3 3.37 0.71 0.12 0.45
.000 .000 .006 .000 ns .04 ns .000 .000 .001 .04 .000 .000 .000 .000 .000 .000 .000 .000 .000
ns
Total ns ns ns :
0.48 0.72 2.31 1.24 0.61 0.15 3.03 0.64 3.03 1.91 1.96 0.38
ns .01
ns ns ns
.001
ns .03 ns
1 modification
Questions 1,6 1,9 3,4 3,7 4,3 4,9 1,6 7,6 7,3 7,6 1,9 4,9
.003 .000 .002 .000
a tone + 1 modification
a tone + 2 modifications
1,6 1,6 1,9 1,9 3,4 3,4 4,9 4,9 6,7 6,7 3,7 3,7
2.81 3.65 2.94 4.12 0.9 0.5 0.87 3.13 3.68 2.77 1.51 2.5 2.8 2.12 3.56 3.76 4.38 3.68 2.84 3.57 0.43 2.25 0.76 0.48
8,1 7,5
1,2 1,2 1,2 1,2 2,3 2,3 2,3 2,3 4,5 4,5 4,5 4,5 5,6 5,6 5,6 5,6 7,8 7,8 7,8 7,8 8,9 8,9 8,9 8,9
Total
Statements .002 .000 .002 .003 .001 .01 .008 .01 .000 .000 .02 .02
3.53 3.26 0.4 2.33 2.12 3.3 3.34 0.55 4,26 1.8 1.94 1.9
Total .000 .000 ns .01 .02 .000 .000
ns
.000 .03
3.79 4.5 2.41 3.5 3.18 3.93 3.74 1.91 5.36 3.84 0.53 1.0
.000 .000 .08 .000 .000 .000 .000 .03 .000 .000
ns ns
a tone +
2 modifications
a tone
Questions 1,6 3,4 7,3 7,9 6,7 4,9 7,3 9,1 6,1 4,3 4,9
6,7
1,4 1,4 1,7 1,9 4,7 4,7 9,3 9,3 6,3 6,3 6,9 6,9
1,6 4,3 1,9 7,3 4,3
6,1 4,9 7,6 7,3 1,9 7,6 4,9
*
1,3 1,3 1,3 1,3 4,6 4,6 4,6 4,6 7,9 7,9 7,9 7,9
4.07 3.17 1.46 0.5 1.42 4.12 2.24 1.31 3.23 1.72 1.32 1.2
.000 .000 .02 ns .007 .000 .000 .008 .007 .000 ns
4.94 5.0 2.32 1.06 2.83 4.91 3.87 2.53 3.84 4.21 1.74 1.78
.000 .000 .01 ns .003 .000 .000 .006 .000 .000 .04
2 modifications
2 modifications Questions
Statements
0.94 1.83 1.12 2.25 3.26 0.65 2.69 0.77 2.08 1.04 1.39 1.03
1.24 0.03 0.55 1.62 0.11 1.11 0.9 1.78 1.52 0.39 1.57 1.07
.000 .001 ns ns ns .000 .03 ns .000 .04 ns
a tone
4.1 5.1 2.2 1.3 2.4 3.76 3.5 2.3 2.4 4.8 1.2 1.2
Totals
Statements
.04 ns .01 .000 ns .004 ns .02 ns ns
Total ns ns .05 ns ns ns ns ns
0.27 1.32 1.07 2.39 2.38 1.48 2.91 0.72 2.46 1.21 2.06 1.76
ns ns ns .01 .01 ns .002 .007 ns 1 .04
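The z-scores tabulated in this Appendix come from Wilcoxon matched-pairs signed-rank tests. As a rough illustration of how such a z-score arises, the sketch below implements the usual large-sample normal approximation. The ratings are made-up five-point scores, not the experimental data; the function name `wilcoxon_z` is an assumption, and no correction for ties is applied, since the source does not say how ties were handled.

```python
import math


def wilcoxon_z(first, second):
    """z-score of a Wilcoxon matched-pairs signed-rank test (normal approximation)."""
    # Pairs with a zero difference are dropped, as is conventional.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all pairs are tied")
    # Rank the absolute differences; tied values share the average of their ranks.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    # Sum of the ranks belonging to positive differences.
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)  # no tie correction (assumption)
    return (w_plus - mean) / sd


# Hypothetical ratings by eight judges for two tone pairs (illustrative only).
pair_a = [4, 5, 3, 4, 5, 4, 3, 5]
pair_b = [2, 3, 3, 2, 4, 1, 2, 3]
print(f"z = {wilcoxon_z(pair_a, pair_b):.2f}")
```

A positive z here means that the first pair tended to receive the larger difference ratings; the Appendix lists absolute values and records the direction separately in the second column.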
Notes
1 I thank Ellen Bard for her contribution in the design stage of this experiment, Margo van Eyck for helping with the processing of the results, Toni Rietveld and Carel van Dijk for their discussion of statistical matters, and Leo Noordman for his comments on an earlier version of this report.
2 The literal text of their instruction was: 'What you are about to hear are 90 pairs of intonation patterns. Your task will be to compare the intonation patterns in each pair, and to decide whether they express very nearly the same meaning, or very different meanings. For each pair, there is a set of five boxes on your answer sheet. If you think that the two intonation patterns have very similar meanings, you tick the first box; if you think they have very different meanings, you tick the fifth box. Boxes 2, 3 and 4 can be used for intermediate contrasts. Only put one tick per pair of intonation patterns, and only put your ticks inside the boxes. (Follow illustrations and trial pairs).'
3 I refrain from giving significance levels for comparisons about which Figure 1 makes no predictions. For anyone interested in estimates for other comparisons, a rough rule-of-thumb would appear to be that if the summed differences between scores in the Question and Statement matrices exceed 33, there is a fair chance that the difference would reach a significance level of .05, two-tailed.
References
Brazil, D., 1975: Discourse intonation 1. English Language Research, Birmingham University, Birmingham.
Crystal, D., 1969: Prosodic systems and intonation in English. Cambridge University Press, Cambridge.
Gold, B. and Rabiner, L., 1969: Parallel processing techniques for estimating pitch periods of speech in the time domain. JASA 46; 442-448.
Gunter, R., 1982: Review of D.R. Ladd, The Structure of intonational meaning: Evidence from English. (Bloomington, Indiana, 1980). Language in Society 11; 297-307.
Gussenhoven, C., (forthcoming): A semantic analysis of the nuclear tones of English.
Halliday, M.A.K., 1967: Intonation and grammar in British English. Mouton, The Hague. (Incorporating a revision of 'The tones of English', Archivum Linguisticum 15; 1-28. 1963).
Holmes, J.N., Mattingly, I.G., and Shearme, J.N., 1964: Speech synthesis by rule. Language and Speech 7; 127-139.
Ladd, D.R., 1978: Stylized intonation. Language 54; 517-540.
O'Connor, J.D., and Arnold, G.F., 1973: Intonation of colloquial English. (2nd edition, first published 1961). Longman, London.
Owen, E., 1980: Intonation and meaning. Journal of Psycholinguistic Research 9; 23-39.
Pierrehumbert, J.B., 1980: The phonology and phonetics of English intonation. MIT dissertation.
Uldall, E., 1972: Dimensions of meaning in intonation. In: D. Bolinger (ed.) Intonation. Penguin Books, Harmondsworth. Pp. 250-259.
Young, F.W., and Lewyckyj, R., 1979: ALSCAL-4 user's guide. 2nd edition. Chapel Hill: Psychometric Laboratory, University of North Carolina.
PROSODIC MARKING IN SPEECH REPAIR
Willem J.M. Levelt and Anne Cutler
Abstract
1. Some determinants of intonational marking in self-corrections
At least two people are in trouble when a speaker interrupts the flow of speech in order to make a self-correction. The first person is the speaker himself,* who apparently became aware of some unclarity or error in what he just said. The second person is a listener who is confronted with an abrupt break, and with the task to find out whether what is going to follow is just a continuation, as after a mere hesitation, or whether it is a repair of something said previously. In the latter case, moreover, she has to find out what the reparandum is, and to replace it by the appropriate items in the correction. This will be called the listener's continuation problem.
* For ease of reference we will in the following treat the speaker, i.e. the trouble maker, as male and the listener, i.e. the victim, as female.
JOURNAL OF SEMANTICS, vol. 2, no. 2, pp. 205-217
Spontaneous self-corrections in speech pose a communication problem; the speaker must make clear to the listener not only that the original utterance was faulty, but where it was faulty and how the fault is to be corrected. Prosodic marking of corrections - making the prosody of the repair noticeably different from that of the original utterance - offers a resource which the speaker can exploit to provide the listener with such information. A corpus of more than 400 spontaneous speech repairs was analysed, and the prosodic characteristics compared with the syntactic and semantic characteristics of each repair. Prosodic marking showed no relationship at all with the syntactic characteristics of repairs. Instead, marking was associated with certain semantic factors: repairs were marked when the original utterance had been actually erroneous, rather than simply less appropriate than the repair; and repairs tended to be marked more often when the set of items encompassing the error and the repair was small rather than when it was large. These findings lend further weight to the characterization of accent as essentially semantic in function.
LEVELT & CUTLER
Cutler (1983), on the other hand, dealt almost exclusively with prosodic aspects of spontaneous self-corrections. Following a suggestion of Goffman (1981), Cutler drew a major distinction between repairs that are prosodically marked and those that are unmarked. In an unmarked repair "the speaker utters the correction on, as far as possible, the same pitch as the originally uttered error" or trouble item. Amplitude and relative duration of the repair item also closely mimic the trouble item. A correction is marked when the prosody of repair item and trouble item differ. Hence, the notion is a relational one; it is not necessarily the case that a high-pitched correction is marked, or that a low-pitched one is unmarked. Marking can be accomplished by a noticeable increase or decrease in pitch, in amplitude, or in relative duration. Cutler's analyses showed that, in her corpus of repairs, corrections of phonetic errors are always unmarked; only lexical errors are frequently marked. However, even lexical errors are unmarked in 62% of the data. What, then, determines whether a lexical correction will be marked or not? There are two possible sets of determinants. The first set will be called syntactic. These are properties of the repair such as the extent to which the interruption is delayed, and the amount of previously uttered material which is repeated in the repair. Interruptions may occur early, i.e. within the trouble item or immediately after it, as in (1), or they may be delayed by one or more syllables, as in (2):
(1) Well, let me write it back - er, down, so that ...
(2) ... what things are this kid - is this kid going to say incorrectly?
How does the speaker deal with the trouble he created for himself and for the listener? Different aspects of this question were treated in previous papers by the present authors. Levelt (1983) analysed the different sources of trouble, or occasions for making a correction, and related them to the ways in which the speaker restarts. It appeared that speakers put severe constraints on the ways in which they make the correction. They not only signal to the listener what sort of trouble they had encountered, but they also give unambiguous cues for the listener to solve her continuation problem. The cues analyzed in that paper were of a syntactic and lexical character. Syntactically, it turned out, the original (interrupted) utterance and the repair relate to one another very much like two conjuncts in a coordination. This guarantees semantic interpretability of the repair, given the original utterance. With respect to lexical cues, they play a significant role in relating the first word of the repair proper to a corresponding place in the original utterance, the place where the repair has to be "inserted". The paper, however, did not analyze potential intonational cues, in spite of the fact that the 957 repairs in the corpus were tape recorded.
Independently of this, after interruption the speaker may instantly introduce the repaired element, as in (1) and (2), or may retrace to an earlier element, as in (3):
(3) I cannot work out where I ran over - ran across that other name
Prosodic marking may, then, serve as a way in which the speaker can indicate to the listener that he has delayed his interruption, or that he is retracing, or that he is making a fresh start, etc. The listener, in her turn, may use such cues to solve her continuation problem.
(4) ... to a dark brown crossing - T-crossing
There were different types of crossing in the domain of discourse, and "T-crossing" is a further specification. There are other forms of correction for appropriateness: a demonstrative can be replaced by a definite description ("from there, from the blue node ..."), a definite article by an indefinite one ("a line to the yellow disc, to a yellow disc"), etc. We might conjecture that the speaker would be more concerned to draw the listener's attention to a repair replacing an error, i.e. completely wiping out the previous version of the utterance, than to a repair which merely elaborates or expands upon the previous version. That is, if marking is a way to signal rejection, we would expect corrections for error to be more marked than corrections for appropriateness. Within the category of error repairs, there is at least one further dimension which might be relevant to the speaker's marking decision, namely the size of the semantic domain in which error and repair contrast. This can be conceived of as being at a minimum when error and repair are antonyms, as in (5):
(5) Left to green - er, right to green
Other such pairs in our corpus are "horizontal/vertical", "up/down", etc. However, the corpus also includes many cases in which the error is replaced by a word from a more general semantic field, as in (6):
The second set of potential determinants for marking can be called semantic. Marking could be used by the speaker to express a semantic relation between the repair and the reparandum. The most obvious semantic dimension on which repairs differ is whether the trouble item and the repair are compatible or incompatible; that is, was the trouble item actually an error, which must be replaced by a correct version of the intended message, or was the trouble item simply not the most appropriate possible word for the context, so that the repair does not so much replace it as further elaborate upon it? This latter type of repair will be referred to as an appropriateness repair; an example is given in (4):
(6) Right of that is green - oh, blue
In this case the speaker was describing patterns consisting of colored nodes, which were connected by either vertical or horizontal black lines. There were 11 different colors involved. It is possible that speaker and listener are mutually aware of the number of alternatives to the trouble item in the domain of discourse, and the larger the number of alternatives, the smaller the sensed degree of opposition, hence the less contrastive it is to mark the repair. We would then expect to find more marking in cases like (5) than in cases like (6).
In her paper, Cutler could not find a systematic relation between marking and syntactic factors. There was, moreover, no clear indication that prosodic marking of lexical repairs was due to semantic determinants. Cutler suggested, however, that analysis of a more extended sample of corrections might reveal effects which could not be discerned in her data. The present paper provides such an analysis. It is based on the sub-sample of 412 lexical corrections in Levelt's repair corpus for which the tape quality was good enough to make a judgment of intonational marking. This sample is indeed large enough to reach more definite conclusions with respect to the determinants of marking in spontaneous self-repairs.
2. Corpus, judgments of marking, and ways of analysis
The corpus of self-repairs is extensively described in Levelt (1983), to which the reader is referred. Here it suffices to say that the repairs were obtained in an experiment where each of 53 native adult speakers of Dutch described 53 visual patterns, consisting of colored nodes, connected by black arcs (see above). The average number of repairs per subject was 18.1, with a standard deviation of 10.3. The lexical repairs in the corpus were called "lexical" because the trouble item was a single lexical item. Examples (4), (5) and (6) are English translations of lexical repairs from the corpus. The two authors of the present paper independently judged each of the 412 lexical repairs for intonational (un)marking. The criterion was as described above: is the prosody of the trouble word roughly the same as the prosody of its correction, or is it different? Here "prosody" refers to pitch, amplitude and duration, since variation in any of these can constitute marking (usually, of course, they vary together). After the judgments were completed, they were compared
In summary, there are two semantic dimensions which may be of relevance for the analysis of intonational marking in spontaneous self-corrections: is the intended correction for error or for inappropriateness, and, if it is for error, is the replaced element one of a smaller or a larger set in the domain of discourse?
between the authors, and it turned out that there was agreement on 299 items, i.e. 73%. This is reasonable, given the fact that one of the authors is not experienced in making prosodic judgments, and the other one is not a native speaker of Dutch. We decided to be ruthless, and to limit the further analyses to the 299 cases where we agreed. The marking values were added to the (computerized) codes which were already available for these repairs (cf. Levelt, op. cit.). These involved various syntactic and semantic aspects of the corrections, among them those mentioned in the previous section. It was, finally, easy to compute the distribution of intonational marking for different levels of the hypothesized determinants. The next two sections will discuss the results for syntactic and semantic determinants, respectively.
3.1 Delay of interruption
Examples (1) and (3) above were cases where the speaker interrupted the flow of speech immediately after the trouble item; examples (
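Table 1 Distribution of intonational marking over the three categories of interruption

interruption            within trouble item   immediately after   delayed   total
marked correction               26                   64              44      134
unmarked correction             23                   87              55      165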
3. Syntax and marking
Table 1 gives the distribution of intonational marking over the three categories of interruption, described above. A chi-square test shows no significant differences between these categories. A slight tendency for corrections after interruption within the trouble item to be more marked than those after interruptions following the trouble item - either immediately or delayed - has an obvious semantic explanation, to which we shall return in the next section. Here, one can safely conclude that speakers do not use intonational marking to tell the listener whether or not the trouble item occurred just before interruption of the flow of speech, or earlier.
3.2 Retracing
(8) ... and it ends then in a black - rather, in a purple ball
Here the speaker prepares for the trouble element ("black") by retracing to the beginning of the prepositional phrase in which it occurred. There are also other ways for a speaker to restart (cf. Levelt, op. cit. for details), but they are so infrequent in the present sample that we can refrain from discussing them, and classify them as "other". It should be noticed that this categorization ignores such interjections as "er", "rather", etc. The repair proper is often preceded by "editing expressions" of this sort. We will return to them in the next section. Do speakers use intonational marking to inform the listener about the type of restart they are making? One might conjecture, for instance, that instant repairing is the default case: the listener assumes that the first word of the repair proper is the replacement for the trouble item. If the speaker retraces, however, it would be helpful to mark the focussed element which is to replace the trouble item.

Table 2 Intonational marking in repairs with different ways of restarting

way of restarting       instant   retraced   other   total
marked correction          75        53         6     134
unmarked correction        92        57        16     165
There are different ways for a speaker to restart after interruption. Examples (1) and (2) above were cases where the speaker introduced a replacement for the trouble item instantly, as the first word of the correction. The same is true for examples (4) through (7). Example (3) was a case where the speaker restarted at an element which preceded the trouble item in the original interrupted utterance. Such retracings are quite frequent in the corpus; another example is (8):
The relevant data for answering this question are presented in Table 2. It gives the distribution of marked and unmarked corrections over the categories of instant repair, retraced repair and "other". Here again, a chi-square test revealed no significant differences between the categories. Speakers do not use intonational marking to tell the listener what sort of restart they have chosen to make. It should, finally, be added that there is nothing in the data to suggest that particular ways of restarting are more marked under particular conditions of delay, neither is there any interaction between delay, restarting, and semantic type of correction (error versus appropriateness) with respect to prosodic marking.
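The text reports only that the chi-square tests for Tables 1 and 2 were not significant, without giving the statistics. As a hedged check, the sketch below recomputes a Pearson chi-square from the cell counts in the two tables. The helper `chi_square` is an illustrative name; 5.991 is the standard .05 critical value for 2 degrees of freedom, and nothing here is claimed to reproduce the authors' own computation.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


CRITICAL_05_DF2 = 5.991  # chi-square critical value at p = .05 for 2 degrees of freedom

tables = {
    # rows: marked, unmarked corrections; columns as in the respective table
    "Table 1 (delay of interruption)": [[26, 64, 44], [23, 87, 55]],
    "Table 2 (way of restarting)": [[75, 53, 6], [92, 57, 16]],
}
results = {name: chi_square(cells) for name, cells in tables.items()}
for name, (statistic, dof) in results.items():
    verdict = "significant" if statistic > CRITICAL_05_DF2 else "not significant"
    print(f"{name}: chi-square = {statistic:.2f}, df = {dof}, {verdict} at .05")
```

Both statistics fall well below the critical value, in line with the conclusion drawn in the text.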
So far, the present analysis confirms the findings of Cutler (op. cit.): there is no indication that the difference between marked and unmarked lexical corrections has anything to do with the interruption-and-restart structure of the repair. Let us now turn to the second possibility, semantic determinants of intonational marking.

4. Semantics and marking

4.1 Error or appropriateness

It was discussed above that there are two major classes of reasons for a speaker to interrupt and repair his utterance. The utterance can, in the first place, contain a straightforward error. This is the case for examples (1) through (5), (7) and (8) above. The error can, still, be of different sorts. Lexical errors are often referential misnomers, such as "green" for blue (cf. (4)), "left" for right (cf. (5)), or "over" for across (cf. (3)). In these cases the substituted word has an obvious semantic relation to the intended word. Other types of lexical error are also possible - for instance, where the relation between the error and the intended word is one of form rather than of meaning; but in the present corpus of lexical corrections, almost all cases of error are of a referential sort. (Further kinds of error - syntactic, morphological, phonetic, prosodic - are beyond the scope of this paper.)

The second main reason for making a repair is that the utterance is not fully appropriate, given the context in which it occurs. In (6) the word used is too vague, given the set of contextual alternatives. This is especially often the case when demonstratives are used, as in (9):

(9) And right of that one - of that purple ...

Also, an otherwise correct word is sometimes replaced because it does not match previously used terminology. A speaker may, for instance, decide to replace a static verb by a verb of motion, because he is giving a dynamic description of the spatial network, i.e. in terms of an imaginary tour. An example of such an appropriateness repair is given in (10):

(10) If you go up one, there's - er, you come to yellow

Here the static "there is" is replaced by the dynamic "come", though the speaker could have completed the original static utterance.
Only corrections for error involve rejection of the reparandum, and this is often marked by the editing expression the speaker uses before making the repair proper. The explicit denial "no" (nee), for instance, occurs almost exclusively in corrections for error (Levelt, op. cit.). The repair is therefore an act of contrasting. This is not so in the case of correcting for appropriateness. There is no rejection, but rather specification of the reparandum. Here, what was said is confirmed, and this is often apparent from the editing terms speakers use as interjections. In the Dutch repair corpus, "dus" (literally "thus", "therefore"; the English contextual equivalent for the present repairs is "that is") is frequently used in corrections for appropriateness, but never for error repairs. Repairing for appropriateness is an act of elaboration.

It should be remembered that a correction was defined as marked when the repair differed prosodically from the reparandum. Do speakers apply prosodic differentiation when they are in the act of contrasting, rather than when they are in the act of elaborating? This can be tested by analysing the marking distributions for appropriateness and error repairs. Table 3 gives the results.

Table 3
Intonational marking in repairs for error and in repairs for appropriateness

correction for           error   appropriateness   total
marked correction         121          13           134
unmarked correction       108          57           165

It shows a highly significant (p < .001 by chi-square test) difference in marking between the two types of repair. Of the corrections for error 53% are marked, whereas corrections for appropriateness receive marking in only 19% of the cases. Hence it may be concluded that a main function of intonational marking in spontaneous self-repairs is to reject by establishing contrast.
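The percentages and the significance level quoted for Table 3 follow directly from its four cell counts. The sketch below is an illustration rather than the authors' own procedure: it assumes an uncorrected 2 x 2 chi-square computed with the usual shortcut formula, and the function and variable names are invented for the example.

```python
def marking_rate(marked, unmarked):
    """Proportion of corrections that are prosodically marked."""
    return marked / (marked + unmarked)


def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Table 3 counts as (marked, unmarked) per repair type.
ERROR = (121, 108)
APPROPRIATENESS = (13, 57)
CRITICAL_001_DF1 = 10.828  # tabulated critical value, alpha = .001, df = 1

error_rate = marking_rate(*ERROR)
appropriateness_rate = marking_rate(*APPROPRIATENESS)
chi2 = chi_square_2x2(ERROR[0], APPROPRIATENESS[0], ERROR[1], APPROPRIATENESS[1])

print(f"error repairs marked: {error_rate:.0%}")
print(f"appropriateness repairs marked: {appropriateness_rate:.0%}")
print(f"chi2 = {chi2:.1f}, exceeds .001 critical value: {chi2 > CRITICAL_001_DF1}")
```

The rates round to 53% and 19%, and the statistic (about 25.5 with one degree of freedom) lies far above the .001 critical value, in line with the reported p < .001.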
The higher rate of marking in error corrections also explains the slightly higher occurrence of marking in repairs with interruptions within the trouble item which was observed in the previous section (cf. Table 1). Such interruptions occur almost exclusively in correction for errors, not in repairs for appropriateness (for reasons explained in Levelt, op. cit.). Two points are left to be explained. The first one is why only 53% of the corrections for error are marked, given the fact that all of them presumably involve rejection of an item in the original utterance. This issue will be dealt with in the next section. The second point is why there is still 19% marking in appropriateness repairs. This will be taken up first.

The 13 marked corrections for appropriateness in our sample are very heterogeneous in character, and for most cases we have not been able to find an explanation for the marking that occurred. One subject marked every repair of either type. Another subject marked the same correction in one case but not in a second case (these were repairs where "blanco", i.e. blank, was replaced by "wit", i.e. white). Other cases in this set were, for instance, "door" - "rechtdoor" ("on" - "straight on"), "vanuit" - "door" ("from" - "through"), and the unusual case "het" - "een" ("the" - "a"). No uniform pattern emerges from these cases.

4.2 Number of alternatives

In order to explain why not all repairs for error are intonationally marked, a further partitioning of these errors should be considered. Earlier we suggested that, dependent on the context of discourse, speaker and listener may be mutually aware of the set of alternatives to the trouble item that caused the speaker to interrupt speech. The sense of contrast should be higher if this set is small, such as in case of antonyms and the like. The conjecture can be made that these cases especially will induce a speaker to mark the contrast by intonation.

It is possible to test this conjecture for the present corpus of repairs. A comparison can be made between two classes of error repairs. The first class consists of color name repairs; there are 119 of them among the 229 corrections for error. For these trouble items the set of alternatives is known: speaker and hearer knew that there were 11 different colors in the patterns described. The second class contains the repairs for directional terms. There are 61 of these in the sample. The directional terms almost always came in pairs: "left" - "right", "up" - "down", "horizontal" - "vertical". Since there were only four possible directions in the patterns, the maximum number of contextual alternatives at a particular choice point was four. The set of alternatives is, therefore,
substantially smaller for directional expressions than for color names. Does this correspond to a difference in the amount of marking?

Table 4
Intonational marking in repairs for color and in repairs for direction

correction for           color   direction   total
marked correction          59       44        103
unmarked correction        60       17         77
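As a rough check on the figures in Table 4, the chi-square statistic for a 2 x 2 table can be recomputed with the standard shortcut formula. The cell labels a, b, c, d are introduced here for illustration only; the direction-marked cell is taken to be 44 (the value implied by 61 direction repairs of which 17 are unmarked), and no continuity correction is assumed.

```latex
\[
\chi^2 = \frac{N\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)},
\qquad a = 59,\; b = 44,\; c = 60,\; d = 17,\; N = 180
\]
\[
\chi^2 = \frac{180\,(59 \cdot 17 - 44 \cdot 60)^2}{103 \cdot 77 \cdot 119 \cdot 61}
       = \frac{180 \cdot 1637^2}{57\,571\,129} \approx 8.38
\]
```

With one degree of freedom the tabulated critical values are 6.63 at the .01 level and 10.83 at the .001 level, so a statistic of about 8.4 corresponds to a difference that is significant at .01 but not at .001.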
Table 4 presents the marking data for these two classes of error repairs. The difference is in the expected direction and significant (p < .01 by chi-square test): only half of the color word repairs are marked, but 72% of the direction term repairs. This supports the notion that there is more intonational marking for smaller sets of contextual alternatives to the trouble item.

5. Discussion

How far have we proceeded in finding an answer to the question why some lexical repairs are intonationally marked and others are not? Cutler's (op. cit.) data showed a "marking rate" of 38% for lexical repairs. The sample analyzed in the present paper has a rate of 45%. It was shown, first, that syntactic factors, i.e. the interrupt-and-restart structure of the repair, played no role in marking. But a word of caution is in place here. Though marking, in the sense of prosodic contrast, is apparently not used for this purpose, it is possible and even likely that intonation does play a role in the solution of the continuation problem. If, for instance, a speaker makes an unmarked retracing, i.e. repeating elements that occurred before the trouble item, the listener could use the correspondence in intonation contour for identifying the part of the original utterance with which the repair proper overlaps. The other obvious cue here is the lexical identity of the repeated words (cf. Levelt op. cit.). Such lexical identity is not present in instant repairs, where the first word of the repair proper replaces the trouble item. In the absence of such a lexical joint between repair and original utterance, the listener may very well use intonational cues to match the repair to the trouble item. But notice that such a match exists only for unmarked repairs; in the marked case the prosody of the items to be matched is different. In other words, if intonation is used in this way for solving the continuation problem, intonational marking would be likely to interfere.
A second type of factor, however, showed a clear relationship with intonational marking. Corrections for error were marked in 53% of the cases, whereas corrections for appropriateness reached a mere 19%. The first, but not the latter of these repair types involves rejection of what was said before. Marking, it was argued, is used to express rejection. It was further found that marking is even more frequent if the number of contextual alternatives to the rejected item is small, i.e. if the contrast acquires the character of opposition. Corrections for directional terms ("left" versus "right", etc.) show a marking rate of 72% in the present sample.

One might, for the sake of theoretical clarity, wish to distinguish between degree of opposition and number of contextual alternatives. The degree of opposition is the exclusiveness of the repair vis-a-vis the trouble item. If the task of the speakers involved just four different colors (instead of eleven), and these colors had been purple, pink, orange, and yellow, the number of color alternatives would have been the same as the number of directional alternatives. Nevertheless, the degree of opposition might still have been less, since the colors are sensed as fairly similar, whereas the four directions are highly exclusive. The present data do not allow us to make a choice between these two notions, but we would conjecture that it is the sensed degree of opposition or exclusiveness, rather than the size of the set of contextual alternatives per se, that primarily underlies intonational marking.

In fact such a conjecture, it will be seen, fits well with what we consider to be the function of the prosodic marking of repairs in the context of prosodic structure in general. We will argue that marking a repair is, in effect, accenting it. In prosodic theory, accent is defined simply as the assignment of prosodic prominence to one element or part of an utterance; it is not defined in terms of how the prominence is realised. That is to say, accent is an abstraction; in an actual utterance it can be realised in a variety of ways. Accented words are usually longer and louder than unaccented words, higher in pitch or with more pitch movement, but they need not be - in appropriate circumstances accent can be realised by a noticeable decrease in amplitude, in pitch, etc. In other words, the definition of prosodic marking which we gave above is remarkably similar to a definition of accent.

What factors determine the placement of accent in an utterance? Although syntactic rules can be formulated which will correctly predict accent placement in neutral (acontextual) utterances, such rules only account for the default case; semantic factors will always override the syntactic. In actual utterances, in context, the placement of accent overwhelmingly reflects the semantic structure of the utterance (Cutler and Isard, 1980; Ladd, 1980). If marked repairs are accented repairs, it is little wonder that we found marking to be determined primarily by semantic rather than syntactic influences.

Moreover, the case of prosodic marking of lexical repairs allows an even closer comparison with the function of accent. A lexical repair consists in the replacement of a single lexical item by another, virtually without exception one of the same form class, in the same syntactic context. Accenting of two lexical items of the same form class which are embedded in identical syntactic contexts occurs frequently in speech; it is said to express contrast, as in (11):

(11) First we WROTE it, then we reVISED it

Again it seems in this context hardly surprising that when a speaker wishes to emphasize the contrast between a repair item and the original trouble item which occupied its syntactic slot, he would mark it or accent it.
Finally, the interpretation of marking as a manifestation of accent allows us retrospectively to account for the finding of Cutler (op. cit.) that marking is applied only to errors at the lexical level or above, never to phonetic errors. The smallest unit to which contrastive accent can be applied is a morpheme, as in (12), in which prefixes are contrasted:

(12) An INcrease in pitch but a DEcrease in amplitude

Thus when the element to be repaired is below the morphemic level, as in (13) in which only a single sound is corrected, the appropriate environment for the assignment of accent is not available:

(13) Well it'll all have to be unsiled - unsigned

To apply accent to the word as a whole would be to mislead the hearer into thinking that one word was to be contrasted with another, whereas the desired contrast is in fact only between sounds. One sound cannot be contrasted with another by the application of accent; thus phonetic errors cannot be marked. Prosodic marking in speech repair, therefore, conforms to general constraints on the prosodic structure of language.

Acknowledgements

This research is a European Psycholinguistics Association collaborative project. The second author thanks the Max-Planck-Institut für Psycholinguistik, Nijmegen, for a Visiting Scholarship, during the tenure of which the work described in this paper was carried out. The authors are grateful to John Hawkins and Ewald Lang for useful discussions.
References

Cutler, A., 1983: Speaker's Conceptions of the Functions of Prosody. In: A. Cutler and D.R. Ladd (eds.), Prosody: Models and Measurements. Springer, Heidelberg.
Cutler, A. and Isard, S.D., 1980: The Production of Prosody. In: B. Butterworth (ed.), Language Production. Academic Press, London.
Goffman, E., 1981: Radio Talk. In: E. Goffman (ed.), Forms of Talk. Blackwell, Oxford.
Ladd, D.R., 1980: The Structure of Intonational Meaning. Indiana University Press, Bloomington.
Levelt, W.J.M., 1983: Monitoring and Self-repair in Speech. Cognition, forthcoming.
Obituary

Professor Dr. Hans Hörmann, one of the consulting editors of this Journal, died on March 28th 1983 in Bochum. The news of his death has deeply moved all those who knew him personally or through his numerous articles and books. With his death, the German tradition of Psychology of Language loses one of its most eminent representatives. The international psycholinguistic tradition loses a scientist whose unique vision it was to connect European language research with modern philosophical, linguistic, and psychological attempts to investigate language and language use in context. Hörmann leaves behind a scientific heritage for all who concern themselves with the study of language.

The editors
Editorial Announcement

The editors are pleased to announce that, as from Volume 3, Nr. 1, the Journal will be produced and published by FORIS PUBLICATIONS, P.O. Box 509, 3300 AM DORDRECHT, the Netherlands. As is the usual practice, all matters regarding subscriptions and dispatch will henceforth be taken care of by the Publisher. The conditions of subscription will not be changed. All relevant information will be provided on the inside of the back cover of the issues of Volume 3. The editors hope that this new development will contribute to the flourishing of semantics as an empirical, interdisciplinary study, and will be for the good of the Journal.

The Editors
SEMANTIC STRUCTURES OF TEXTS AND DISCOURSES

Thomas Ballmer

JOURNAL OF SEMANTICS, vol. 2, no. 3/4, pp. 221-252.

Abstract

The paper starts out with a comparison of the structural and the dynamic approach. It is maintained that when considering texts and discourses as full blown linguistic entities, a structural and emically abstracting approach would not be sufficient. A text grammar has to account for dynamics, and specifically for the dynamics of context change. Such a grammar has to fulfill a number of requirements, specifically some concerning its formalizability. As an example of such a formal approach the Context Change Logic for a solution of the Bach-Peters paradox is proposed. In a second and third part of the paper the missing lexical basis of Logical Language Analysis is criticized. A programme is then presented to give formal logical semantics of natural language a solid linguistic basis. The topology of the semantic space of natural language is developed. This is achieved by reference to a comprehensive study of 21.000 German verbs, 13.000 German adverbs and an indefinite number of nouns. A last part of the paper demonstrates how the semantic space of natural language impinges upon text and discourse structures. The expressive power of language is seen to be performed and severely restricted by the lexico-semantic findings presented in the paper.

1. The structural and the dynamic approach

Only recently has the historic stream of linguistics taken consideration of texts and discourses as an object of study. It seems as if linguistics has finally got to the point of being able to deal with full blown human communication. More traditional studies of language were always of a clearly restricted nature - cf. phonology, morphology, or sentence oriented syntax, semantics, and pragmatics; in these fields only fragmentary linguistic objects were considered. This is no longer the case for text and discourse analysis, especially when including the variably relevant context factors. Being now obliged to take into account, so to speak, everything, a radical change occurs regarding the domain of linguistics. Whereas sentential linguistics - and clearly more so lexical, morphemic, and phonemic linguistics - could rely upon naturally given limitations, text and discourse analysis cannot.
Is it surprising then that such a situation influences matters rather drastically for linguistics as a theoretical and empirical field, and particularly its methodology?

The Structural Approach
1. look out for a structure
2. collect items, data
3. establish a corpus
4. extract order & find structure
5. reduction to a basic set of items

Fig. 1 The Structural Approach exhibits, roughly speaking, 5 stages of investigation: (1) a heuristic stage, (2) a collecting stage, (3) a systematizing stage, (4) a categorizing stage and (5) a justificatory stage.

Let me explain what I mean. As long as more or less well defined (if to some degree preconceived) restrictions are accepted, like the ones we have just mentioned (with respect to sentential and even smaller scale linguistic entities), an exclusively structural approach would do. It allows us to describe and structurally explain the observed facts. When we consider texts and discourses as full blown linguistic entities in a full blown contextual environment, however, a structural and emically abstracting approach is insufficient. Each text or discourse is a temporally developing entity. Details of lesser or no relevance at some point of the text or discourse may matter later on and gain importance. This means, in fact, that there is no natural lowest limit of relevance, and hence an emic approach must fail: minimal details below the threshold of relevance that fall through the meshes of the system may become crucial at a later point and lead to palpable effects. In extreme cases, random fluctuations may lead to irreversible historical consequences: hesitation phenomena, idiosyncrasies, or arbitrary mistakes may influence and even determine the temporal development of the text or discourse, and in some - admittedly rarer - cases the language system itself. Thus we have clear cases of nonstructural phenomena, i.e. phenomena reaching beyond the purely structural limits.

Moreover texts and discourses are interactive entities. They operate upon the physical and mental states of the people communicating. This opens another area of non-structural phenomena. In brief, texts and discourses constitute, also from the point of view of linguistic theorizing, a transition point, namely the transition
point between the individual and the collective, between the random and the systemic, between the dynamic and the static, between the non-structural and the structural. This situation requires special care. Although we cannot here now disentangle this matter in sufficient detail we shall at least try to clarify as much as is useful for the moment.

The common conception in linguistics is to look out for structures, to collect data, to arrange them in an orderly manner, and to reduce them to a basic set of items, paradigms, rules, deep structures, and the like. This is called the structural approach. Its linguistic foundations go back to de Saussure and to Bloomfield at the beginning of this century, and it conforms largely to the axiomatical and logical thinking that has gained ground in the last eighty years in some branches of mathematics, in mathematical and philosophical logic, and in analytic philosophy of language.

At the opposite end stands another type of thinking. It does not ask for the static, stable, and time-invariant things, but for temporal, (possibly irreversibly) changing and dynamically interacting, things. As I see the situation, text and discourse analysis, by bringing in the new elements of temporality, relevance-amplification, and dynamics, require the inclusion of this type of thinking, which otherwise is more commonly practised by engineers and scientists. The questions are then not so much 'What is the structure?' and 'How are the structures related?' but rather: 'Given these structures, how did they arise, why are they stable, how and under what circumstances do they disappear?' These questions pertain to the evolution, stability, and decay of structures. The answers are not sought on the level of the structures themselves but on the level of the underlying dynamics. This non-structural approach is called a dynamic approach.²

The Dynamic Approach
evolution    process    decay

Fig. 2 The Dynamic Approach has to deal with three phases: (1) an evolutionary phase, (2) a process phase, and (3) a decaying phase. The Dynamic Approach presupposes a structural approach from a heuristic and descriptive point of view, but the Dynamic Approach also explains why the structure is as it is.

These introductory remarks may shed some light upon the kind of problem we have to consider when we deal with texts and discourses. We should be able to account for both sides of the coin: the structural and the underlying dynamic phenomena. What does this mean for the treatment of texts and discourses? What are the structural and what the dynamic phenomena we have to investigate?
The following table in Fig. 3 gives a survey of some of the more prominent problems to be solved when analysing a text or discourse. This situation would look rather more intricate and cumbersome, if considered in sufficient detail. If looked upon from due distance, however, we see that the overall dynamics is relatively simple. An adequate understanding of this overall dynamics will lead eventually to an appropriate textgrammatical description.

Chain of Communication
(1) initial context: internal and external states (of speaker and hearer); especially: needs and goals
(2) synthesis (of text or discourse): using innate & learned strategies
(3) behavior: moving legs, arms, hands, the face and speech organs (larynx, tongue, etc.); especially production of sounds
(4) transmission
(5) influence: affecting the perception organs
(6) analysis (of text or discourse): hearing, analysing, categorising, parsing, interpretation, reflection, assessment
(7) resulting context: new internal and external states (of speaker and hearer); especially: needs and goals

Fig. 3 This chain of communication is the basic element for more extended chains of communication. It starts out from (1) an initial context, bearing in itself the potential of future actions, processes etc., which result in the synthesis of linguistic expressions (2) and (observable) physical behavior of sound or graph production (3); the acoustic or graphic shapes are transmitted (...
There is, first, an initial context (or input context), which includes the physical and mental states of the participants (cf. Fig. 4). This context produces, on the grounds of physico-biological needs or mental goals, and by means of innate as well as learned strategies, processes and, more specifically, actions like the moving of legs, arms, hands, the face, and the speech organs. For our purposes, we concentrate upon what is relevant for the production of texts and discourses. Primordially this is the production of sounds (and, if you wish, of gestures and mimicry). The sounds, by reaching the participants, change the initial context. The physical and mental states are altered by the sound-producing process emanating from the initial context and the strategies used, and a resulting context is produced.

The core of the chain of communication
initial context    resulting context

Fig. 4 A presentation of the basic features needed for a logical analysis of an elementary chain of communication (as used e.g. in Context Change Logic, cf. Ballmer 1972, 1975, 1978).
The sounds are analysed, categorized, parsed, interpreted, reflected upon, accepted or rejected, and thus create new needs and goals. These new needs and goals may in turn keep the process of producing utterances going on. The process of iterated context change leads to texts and discourses. Texts and discourses, on the basis of the universal context change mechanism, thus exhibit an action/non-action architecture (cf. Fig. 5). The phase of non-action is a phase of potential action, of preparation of a (next) action. It is the start for a new action, and thus a state of need and goal formation. A single action thus has the form of quiescence (initial need and goal relaxation), inchoation, activity/duration, termination (need and goal relaxation), and final quiescence (which may be identical to the initial phase of a new action). This leads to a circular structure, which allows for repetition and thus forms the basis for even larger units. On this level a new mode of context change can be attained and thus a new nonaction/action 'hat' (cf. Fig 6.4, below) can be created. The co-ordination of (iterated) actions on all these levels of magnitude constitutes a text or discourse.
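The idea of iterated context change can be made concrete with a toy model. What follows is only an illustrative sketch and not Ballmer's Context Change Logic: it assumes that a context is a simple dictionary of beliefs plus a turn counter, and that each utterance is a function from contexts to contexts. All names (`assert_fact`, `run_discourse`, the context keys) are hypothetical.

```python
from functools import reduce
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
ContextChange = Callable[[Context], Context]


def assert_fact(key: str, value: Any) -> ContextChange:
    """Hypothetical utterance type: a context change that records one belief."""
    def change(context: Context) -> Context:
        beliefs = dict(context["beliefs"])  # copy, so the input context is kept intact
        beliefs[key] = value
        return {**context, "beliefs": beliefs, "turns": context["turns"] + 1}
    return change


def run_discourse(initial: Context, utterances: List[ContextChange]) -> Context:
    """Iterate context change: each resulting context is the next initial context."""
    return reduce(lambda context, change: change(context), utterances, initial)


initial_context: Context = {"beliefs": {}, "turns": 0}
discourse = [assert_fact("door", "open"), assert_fact("door", "closed")]
resulting_context = run_discourse(initial_context, discourse)
print(resulting_context)
```

In this toy setting a discourse is simply the left-to-right composition of the context changes of its utterances; the later utterance overrides the earlier belief, while the initial context itself remains unchanged.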
Temporal Sequence of Actions
(axis: potential & actual activity; labelled phases: initial need & goal formation, action, intermediary need and goal formation, action, need & goal relaxation (end of action, virtual), action, action, final relaxation (end of text, sleep, death))

Fig. 5 The iteration of linguistic actions leads to a certain dynamics of the potential and/or actual activity which the participants engaged in these actions undergo.

A text grammar has to account for this kind of context change dynamics. The question is now, of course, how we can link this requirement to what is known from the more traditional approaches of sentential linguistics. Let us try. What we would like to have is a grammar that conforms to the following desiderata:
1. It should be formal and model-theoretically interpretable.
2. It should relate contexts to the sounds produced (utterances, sentences, texts, discourses), and the sounds (in the appropriate context) with their induced context changes (thus leading to new contexts).
3. It should obey (as much as possible) a (modified) Fregean principle: the context change of an utterance should be reducible to the context changes of its parts (and vice versa). (Context change includes change of information and hence this modified version of the Fregean principle implies the standard version only with respect to meaning. The essential claim of such a requirement is that an algorithmic representation is the ultimate goal.)
4. It should relate syntax and semantics-pragmatics in a smooth and coherent way.
5. It should relate the lexical part and the transformational part of the grammar in a smooth and coherent way.

Each grammar distinguishes between a lexical part and a set of rules (basic or transformational). The question is where to draw the boundary. In fact, it does not really matter. But there is one extreme answer, which allows us to see the relevant problems very clearly (cf. Ballmer 1978a). This is the decision to construct a grammar with (essentially) one rule (of synthesis and analysis) and to account for the rest in the lexicon. To render this extremely lexical approach interesting for linguistics requires an elaborated theory of the lexicon. This is what the bulk of this paper is devoted to. For those sceptical of the lawlikeness of grammatical rules, we should mention a statement of Bloomfield's according to which all idiosyncrasies of a language belong in the lexicon. The fact that there is just one rule (of analysis and synthesis) corresponds to the view that there is just one kind of context-change mechanism. This mechanism may be seen as biologically realized as the (human) brain or the genes, or physically as the causal interaction of processes.

As an example of how problems at sentence-, text-, and discourse-level are treated by such an approach, Appendix I takes a look at the well-known case of the Bach-Peters Paradox.

A proper treatment of texts and discourses must take into account a considerably long list of static as well as dynamic phenomena. The list we are giving below distinguishes static, kinematic and dynamic properties.
Properties of Texts and Discourses

1. Static Structural Properties
phonetic level: sounds and their properties, form characteristics of: intonation, stress, timbre, mood-indications, ..., distinctive features.
phonemic level: phonemes, sound patterns, paradigms, rules.
morphophonemic level: morphonological rules.
morphological level: morpheme inventory, paradigms, rule fragments.
lexical level: word lists, rule fragments of word formation (affixation, suffixation, prefixation, derivations), categorization; interpretation of words, meaning assignment, illocutionary potential, ...
lexico-syntactic level: syntactic categories, lexical categories, translation rules.
syntactic level: parts of speech, syntactic patterns, rules (base, transformational), nuclear sentences, paraphrasing rules, surface structures, shallow structures, deep structures, hypersentences, ..., translation rules, interlingua.
semantic level: metaphysics, kinds of entities, combinatory/relational meaning, Fregean principle, propositions, illocutionary force (potential), formal/informal interpretation, logical form, influence.
syntactic-semantic level: syntactic interpretation rules, translation from syntactic forms to logical forms.
pragmatic level: contexts, relations between contexts, context structures, context change, context influence, manipulation, failure/success of action.

2. Kinematic Properties
temporal development: running off of procedures: synthesis/analysis
procedure types: search procedures, storage procedures, ordering procedures, retrieval, influence, approximation, paraphrasing, translation, problem solving, context use, context determination, context selection, context change, presupposition balance, topic comment, book-keeping, referencing, predication, ...

3. Dynamic Properties
need and goal dynamics, cognitive dynamics, linguistic dynamics, behavioral dynamics, interaction dynamics
biogenetic, historic, single discourse, turn,
With respect to these properties of texts and discourses it is possible to describe the creation of needs and goals, the step by step synthesis of utterances and their physical realization as sounds, the flow of words and sentences, and the context change dynamics induced, after the analysis of the incoming sound pattern (cf. Figs. 3, k, 5). 3 Physical and mental states are more or less deliberately altered, controlled by the initial needs and goals. Specifically changes of attention may be induced, of reference, beliefs, knowledge, needs, goals, obligations, and other (propositional) attitudes, and of linguistic parameters such as meaning assignments and even grammatical properties (e.g.1 categorization, paradigms) (cf. the notion of 'Conceptual Behaviorism in Ballmer/Brennenstuhl 1980).
2. The Missing Lexical Basis: Criticizing Logical Language Analysis The Fregean approach, in itself a typically structural approach, leads to an essential problem, which is independent of whether the Fregean principle is formulated in its traditional meaning-related form or in our context-dynamic form. The problem has rather to do with the reductive character of the principle. Meanings or effects of discourses, texts, and sentences are reduced to the meanings of words or morphemes by means of rules or strategies. This is certainly not trivial, but the Fregean analysis stops before having reached the end. 3S, vol. 2, no.
229
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
macro/micro history
need-goal creation, relaxation, satisfaction strategic dynamics: plans, goals, needs, ..., illocutionary effects, success, failure, speech-act dynamics turn-taking, expressions, enaction, interaction, discourse and text ontogenetic, dies (daily course of events), sentence, clause, text, phone, dynamics.
Any formal analysis, be it traditionally meaning-oriented, i.e. structural, or (more adequately for discourse analysis) dynamical, is ineffective when it comes to the analysis of constants, i.e. predicate constants corresponding to lexical items such as verbs, nouns, ad-forms. In German, for instance, there are (according to Mater) 20,000 verbs, or at least about 13,000 non-dialectal standard verbs. There are also 13,000 ad-forms and many more nouns (the exact number depending on where the line is drawn between standard and terminological expressions). Thus a Fregean analysis of texts gets blocked at the level of single words. And without also understanding the single words, as far as practical comprehension is concerned, all Fregean analysis yields nothing. It does no more than reduce something unknown (the meaning of a given text, say) to something else which is unknown (the meaning of the words occurring in the text).
So what we need is a semantics for single words. Why don't we then simply use an axiomatic approach or some technique of lexical decomposition? At first sight, there does not seem to be a crucial objection to such a way of proceeding. On closer observation, however, an essential difficulty emerges. It seems virtually impossible to provide an empirically valid axiomatization or a coherent system of lexical decomposition. First, the number of predicate constants to be treated is beyond what can be handled 'in one afternoon'. Secondly, the relations among these constants are so manifold that it would be difficult to begin with such a method of formalizing at just any arbitrary point in the lexicon; any progress is rendered virtually impossible by the permanent revisions which have to be made in the course of applying such a procedure to the entire lexicon. Thirdly, we cannot see the wood for the trees, i.e. it is utterly impossible to judge what are relevant distinctions in the lexicon and what are less important modifications. Words all look the same. Over and above this, a formalization of such a bulk of empirical material should fulfil the following requirements:
1. It should start with the relevant distinctions first (approximation correctness).
2. It should lay the base for a complete formalization (basis for completion); in other words, it should embody a basis for approximation.
3. It should be extendable and refinable (extendability, refinability); in other words, it should allow for a completion of the approximation.
In sum, a formalization of such a bulk of empirical material should not be ad hoc but should be conducted in a clear and transparent manner which can be examined, criticised and improved step by step.

The situation just sketched speaks against a combinatory and also against an (exclusively) feature-based approach applied either to single, isolated words or to overly small lexical fields. This is confirmed also by the fact that the meaning of single words is opaque in the sense that it cannot be read off from their form (arbitrarite). In contrast to a combinatory approach we may try a holistic, relational approach, using meaning relations not very different from those Lyons (1969) uses. But other questions remain: how can we handle such a large number of constants, where should we start, what is relevant, what less relevant? A procedure which I will describe immediately ensures a solution to these problems.
The leading question is: how could we order the meanings of words optimally? A second, no less important, question is: is there a specially distinguished set of words which may allow us to organize the rest of the thesaurus once we have found its semantic structure?

If there is such a set, this would be of great help. The problem of finding the semantic structure of the entire lexicon would be drastically simplified. As it turns out, the set of verbs is a good candidate for such an approach. Syntactically, verbs play a central role in the sentence, as is documented clearly, for instance, in dependency and valency grammars. Verbs determine sentence patterns and provide case frames. Historically, verbs form a relatively stable word class; the number of verbs and their meanings remain relatively time-invariant as compared to nouns and ad-forms. And last, but not least, semantically and logically speaking, verbs are the predicates 'par excellence'. So why not start a semantic analysis of the lexicon on the basis of verbs?

3. The semantic structure of the entire lexicon

Our task is thus to bring to the fore the semantic structure of all verbs, i.e. the verb thesaurus. In order to do this we should try to look at the verbs with half-closed eyes, so to speak. We should, in other words, try to get rid of minor differences in meaning and we should collect verbs which are similar in meaning. More technically speaking, we establish groups of verbs which stand in a close meaning similarity relation. Starting with about 8000 non-prefixed standard (German) verbs, the result of this procedure is a set of about 1300 similarity groups (categories), which can be arranged again in about 40 to 50 similarity groups of similarity groups (models). Figs. 6.1 - 6.6 illustrate the procedure and its results. The categories in a model and the models among themselves can be brought into a 'logical' order with the aid of a second meaning relation: presupposition (not exactly in the Strawsonian sense). This results in a three-dimensional meaning structure. The dimensions are aktionsart (to be carefully distinguished from aspect), degree of influence and intensity.
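The two-step procedure (grouping by meaning similarity, then ordering by presupposition) can be sketched as follows. This is an illustration under strong assumptions, not the procedure actually used for the verb thesaurus: similarity judgments are treated as if they were transitive (so groups are connected components), presupposition is treated as an acyclic relation, and the judgments fed in are hypothetical. The function names are invented for this sketch.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def similarity_groups(items, similar_pairs):
    """Group items into categories: connected components of the similarity judgments."""
    parent = {item: item for item in items}

    def find(item):
        # Follow parent links to the representative, compressing the path on the way.
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for a, b in similar_pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for item in items:
        groups[find(item)].add(item)
    return sorted(sorted(group) for group in groups.values())


def presupposition_order(presupposes):
    """Order categories so that whatever a category presupposes comes before it."""
    return list(TopologicalSorter(presupposes).static_order())


# Hypothetical similarity judgments over verbs taken from Fig. 6.2.
verbs = ["anrühren", "berühren", "betupfen", "ernten", "pflücken", "sammeln"]
judged_similar = [("anrühren", "berühren"), ("berühren", "betupfen"),
                  ("ernten", "pflücken"), ("pflücken", "sammeln")]
categories = similarity_groups(verbs, judged_similar)

# Assumption for illustration: each phase of a model presupposes the preceding one.
phases = {"Reach out": {"Desire"}, "Touch": {"Reach out"}, "Grasp": {"Touch"}}
ordered = presupposition_order(phases)
print(categories)
print(ordered)  # ['Desire', 'Reach out', 'Touch', 'Grasp']
```

Applying the same grouping step once more to the categories would give the models, in the sense of "similarity groups of similarity groups".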
(a) 8000 Noncomposite Standard Verbforms
(b) 1300 Categories of about 10 Verb readings each
(c) 40 to 50 Models of about 30 Categories each
(d) 1 Verb Thesaurus Structure with 40 Models and 1300 Categories

Fig. 6.1 The 8000 non-composite (German) standard verbs (a) are analysed and grouped according to similarity of meaning into 1300 categories (b), and these in turn are grouped into 40 to 50 models (c). Ordering of the categories and the models results in structure (d).

Examples of verb categories (i.e. categories of quasi-synonymous verbs)

Berühren
anrühren      jd1 etw2       (handle)
berühren      jd1 etw2       (touch)
betupfen      jd1 etw2       (dab)
rankommen     jd1 an etw2    (reach)
treffen       jd1 etw2       (meet)
tuschieren    jd1 etw2       (affect)

Sammeln
abholzen      jd1 etw2       (deforest)
abreissen     jd1 etw2       (tear off)
ernten        jd1 etw2       (harvest)
mähen         jd1 etw2       (crop)
pflücken      jd1 etw2       (pick)
roden         jd1 etw2       (grub)
sammeln       jd1 etw2       (gather)

Fig. 6.2
Examples of Models
Begin - Expand - Run off - Decrease - End
Not Exist - Come into Existence - Be - Vanish - Have Existed
Be Conceived - Be Born - Grow up - Live - Grow old - Die - Be Dead
Wish - Want - Plan - Do - Perfect - Finish - Succeed/Fail
Desire - Reach out - Touch - Grasp - Be Grasped - Let Go - Withdraw
Want - Speak - Hear - Think - Respond - Hear - Be Satisfied

Fig. 6.3
Fig. 6.4 The hat structure of verb models (labels: intensity of the subject's activity; aktionsart; durative).

Fig. 6.5 The verb thesaurus structure (qualitatively).
The dimension of Eingriffsgrad (degree of influence)
Model codes, in the order printed: SV, VO, EX, LE, PW, HD, FB, BT, GR, TR, TZ, BE, NG, TA, SPA, EM, EN, IA, DC.
Subject types: sth, anm, sb.
Row labels, in the order printed: states of affairs; processes; existence and relations; life; happening; actions (spontaneity); self movements; ambulance; grasping; transport; manipulation; modification; give and take; transaction; speech; emotive; enactive; interactive; discourse (symbolic).
Groupings: control (scalar, vector, tensor); control of objects (scalar, vector, tensor); change of objects (scalar, vector, tensor); distant grasping and transaction (scalar, vector, tensor, abstract).

Fig. 6.6
With only two semantic relations (meaning similarity and presupposition) and the semantic competence of the users of the language to judge these relations, it is possible to find a semantic structure (not only for the verbs but in fact for the entire lexicon). It can be shown that the nouns, as typical subjects or objects of the verbs in question, are organized in a quite analogous way. The result is a closely corresponding meaning space. The semantic structure of the ad-forms (adverbs and adjectives) is more difficult to find but can be shown to be related to the meaning space of the verbs too.
Number of verbs in the underlying models (numerical fine structure of the verb thesaurus). Vertical axis: number N of verbs in a model (scale mark: 1000); horizontal axis: the models, ordered by Eingriffsgrad.

Fig. 7 The number of verbs of each model is arranged with respect to each such model (the models are ordered along the axis of Eingriffsgrad, i.e. degree of influence). It can be observed that a quite articulate structure appears: the shells (VO, processes; HD, action; GR, grasping; BA, modification; DC, discourse) of the verb thesaurus.
We cannot go into the details of how to set up these meaning spaces or of how to justify them linguistically and psycholinguistically. Nor is it possible to discuss all the consequences and applications to which such a semantic organization of the predicate constants of a language leads. Instead, we shall concentrate for the remainder of this paper on some consequences relevant for text and discourse analysis.
Basic and fundamental verbs

The basic verbs:
gelten          (hold true)
ablaufen        (proceed, evolve)
geschehen       (happen)
existieren      (exist)
bestehen aus    (consist of)
verursachen     (cause)
wahrnehmen      (perceive)
wollen          (want)
versuchen       (try)
berühren        (touch)
benutzen        (use)

The fundamental verbs:
gelten          (SV - VO)
bestehen        (EX - ER)
versuchen       (HD)
berühren        (GR)
benutzen        (BE)

Fig. 8 These verbs constitute the basis or fundament upon which (with the aid of ad-forms and nouns) the entire lexicon is built.

We have used the atomic predicates and their semantic relations (similarity and presupposition) to build up a semantic space. We have exploited, in other words, the kind of information which the users of a language experience as fundamental enough to be assigned particular constants, i.e. lexical items. Looking at the meaning space of verbs in more detail, we find that the number of verbs in the categories and models varies quite considerably. A systematic arrangement of these numeric values leads to a numerical fine analysis (see Fig. 7). We find that certain concepts are better lexicalized than others. This conforms to the following principle of relevance:

The Principle of Relevance
In the course of its use human language adapts itself to the needs of its users. The more relevant something is, the more overtly, i.e. briefly, conventionally, abundantly, impressively, etc., it is expressed by a linguistic entity (type).

The lexicalized simple (non-composite) non-technical verbs constitute the most relevant categorematic linguistic entities of a particular language; these verbs are the backbone of the entire (lexicon of the) language. In fact the numerically most relevant concepts (cf. Fig. 7) can be shown to be an adequate universal basis for paraphrasing all other verbs (cf. Ballmer forthcoming). The fundamental verbs and, though less perspicuously, the basic verbs (cf. Fig. 8) correspond to the relative maxima of the numerical fine analysis.
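The notion of a "relative maximum of the numerical fine analysis" can be made concrete with a small sketch. It assumes the models are listed in order of Eingriffsgrad together with their verb counts, and it reports every model whose count exceeds that of its immediate neighbours. The function name is hypothetical, and the counts below are invented for illustration only; they are not the values plotted in Fig. 7.

```python
def relative_maxima(counts):
    """Return the labels whose count is strictly greater than both neighbours.

    `counts` is a list of (label, number_of_verbs) pairs in axis order.
    At either end of the axis only the single existing neighbour is compared.
    """
    labels = [label for label, _ in counts]
    values = [value for _, value in counts]
    peaks = []
    for i, value in enumerate(values):
        left = values[i - 1] if i > 0 else float("-inf")
        right = values[i + 1] if i < len(values) - 1 else float("-inf")
        if value > left and value > right:
            peaks.append(labels[i])
    return peaks


# Invented counts, NOT the data of Fig. 7; only the model codes come from the paper.
toy_counts = [("SV", 120), ("VO", 300), ("EX", 150), ("LE", 90),
              ("HD", 400), ("FB", 200), ("GR", 350), ("TR", 180)]
print(relative_maxima(toy_counts))  # ['VO', 'HD', 'GR']
```

With real counts, a smoothing step or a minimum prominence might be needed before peaks are read as shells; that choice is left open here.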
Thus our attempt to attain a semantic foundation of the Fregean meaning analysis of texts and discourses by means of a sense-semantical analysis of the atomic predicates (primarily the verbs) leads us naturally to a semantical rock bottom: the fundamental (or basic) verbs. These verbs - brought to prominence by our numerical fine analysis - can be shown, in great detail, to be the basis for paraphrasing (with the help of ad-forms, nouns, syntactic constructions) at least all other verbs and possibly also the nouns and ad-forms. A lexical decomposition procedure for atomic predicates is therefore at hand, which has been gained indirectly by a non-compositional and rather holistic method. Confirmed by the results of this method, we may now even risk looking out for an axiomatic base underpinning the form and structure of the entire meaning space. The small number of fundamental verbs seems to promise feasibility. However, a specific proposal for axiomatization has not yet been achieved (cf. however Ballmer 1981).

At this point we should bring to mind that our analysis of atomic predicates has been entirely structural. In fact we applied the very traditional procedure of starting with a collection of items (the verbs), establishing a corpus, arranging it in an orderly manner, and reducing the discovered structures appropriately to a basic set of items (the fundamental verbs etc.). Where then are the dynamics? Of course, verbs designate processes. And processes are dynamic. As lexical items (conforming to the principle of relevance) they designate in fact the processes which are characteristically relevant for language users: the processes of existence, of life, of action, locomotion, activity, grasping, elaborating, give and take etc., including linguistic actions such as expression, enaction, interaction and discourse. This means that with our attempt to provide a lexical foundation for a Fregean semantics of natural language by investigating the verb lexicon, we got an unexpected bonus: a parametrization of the space of linguistically relevant processes (including linguistic actions). The internal and external neighbourhood relations of the prototypical processes are made explicit in the verbal meaning space. This fact allows approximation procedures in that meaning space, which is a considerable advantage for applications. Thus we have arrived at a structural analysis which has immediate consequences for a dynamic and procedural view. Some of these consequences we shall discuss in the following section.

4. Text and Discourse Structures

First, we now have at our disposal an empirically based typology of speech acts and speech activities. This is a conditio sine qua non for any discourse theory, because this typology is the framework which the speakers of the language implicitly acknowledge as the relevant actions, non-actions (cf. Brennenstuhl 1975) and processes dealing with linguistic matters.
The typology of speech acts (speech activities)
Node labels of the diagram: Speech activities; Expression; Emotion Model; Enaction Model; Interaction (dialogical); Argumentation Model; Discourse (basically dialogical); Discourse Model; Institutional Model; Text Models; Valuation Models; Theme Models.

Fig. 9 This is the result of an empirically based typology of speech acts as put forward by Ballmer/Brennenstuhl (1981). Structuring methods for large-scale lexical material have been applied to verbs designating speech activities and their aspects.

Secondly, the details of the analysis carried through in Ballmer/Brennenstuhl 1980 provide us with the phase sequences of linguistic actions, which tell us in detail how texts and discourses typically occur and run off: we get considerable information about the chain of communication (cf. Figs. 3, 4) for various circumstances, thus for the emotive, enactive, instructive, and discursive mode of speech. All this is based upon an empirical investigation reproducible (and thereby amendable) by everyone.
An example of a more complex chain (simplified)
Making Claims - Dissent - Attack - Making a Coalition - Retreat - Defeat
Further labels of the diagram: (Challenge); Victory; Tactical Phase; Defense Attempt; Successful Rejection; Reply to Defense; Willingness to cooperate; Involvement.

Fig. 10 This is a simplified version of a Communication Chain of greater complexity as gained by the methods described in the text for categorizing verbs (of verbal interaction) and thereby the underlying processes of communication.

The phase sequences of cognitive actions thus provide us with some material on how language, as a naive theory, sees what is thought and felt by its users before, during and after communication. We get information about the interaction between cognition and the chain of communication.
The shell-structure of the verb thesaurus

1.        first major shell: being
1.1         first shell: VO          nonactive processes
1.2         second shell: ER         objects, properties, relations
1.2.1         first subshell EX      (existence), inanimate
1.2.2         second subshell LE     (life), animate
2.        second major shell: influence, active
2.1         first shell HD           action (controlled, spontaneous motion)
2.1.1         HD (action)
2.1.2         FO (dislocation)
2.1.3         BT (play and work)
2.2         second shell GR          grasp (local restriction of freedom)
2.2.1         GR (grasp)
2.2.2         TR (transport)
2.2.3         TZ (manipulation)
2.3₁        third₁ shell BA          modification (grasp and change)
2.3₁.1        BA (modification)
2.3₁.2        NG (give and take)
2.3₁.3        TA (transaction)
2.3₂        third₂ shell SPA         speech (grasp at a distance and change)
2.3₂.1        EM (emotives)
2.3₂.2        EN (enactives)
2.3₂.3        IA (interactives)
              DC (discursives)

Fig. 11 These are the major models as gained by the verb thesaurus analysis. These models determine by and large what can be expressed in the language. They restrict the expressive power quite rigorously. The content patterns of texts and discourses as well as the communication patterns are based on this structure.

Establishment and interpretation of the Verb Thesaurus-Structure

1. Categories/Models
Established as: Similarity Groups of Verb meanings
Can be interpreted as: Morpho-syntactical classes (up to 80%)

2. Aktionsart
Established as: Presupposition Hierarchy of Categories
Can be interpreted as:
  1. Aktionsart
  2. Process Phases
  3. Parameter for Defining Cases (and not Roles)
  4. Word order parameter
  5. Morphological Parameter for Composite Verbs
  6. Text generator Parameter
  7. Parameter for Paraphrastic Reduction
  8. "Ontogenetic" Parameter
  9. Blastematic Reconstruction Dimension

3. Eingriffsgrad
Established as: Presupposition Hierarchy of Models
Can be interpreted as:
  1. Syntactic Complexity of (the more central) Categories
  2. Parameter along which Metapraxes (Syntactic/Semantic Transformations) take place
  3. Increasing Complexity of Process Described by the Verbs in the Model
  4. Increasing Complexity of Subjects, syntactically
  5. Increasing Complexity of Subjects, ontically
  6. Increasing Influence, Control, Freedom of Subjects, ontically
  7. Basis for a Nominal Classification (by Subject and Object Complexity)
  8. Parameter for Paraphrastic Reduction
  9. Evolution Parameter
     9.1 for Logical Complexity
     9.2 for Bio-logical Complexity (Avantgarde evolution)
  10. Parameter of Nerve Net Complexity (Cells, Sensory Nerve Cells, Nerve Networks, Ganglia, Spinal Cord, Medulla, ..., Cortex) as Control Organs for Processual Complexity of the Subjects
  11. "Phylogenetic" (and even "Cosmogenetic") Parameter
  12. Prorhematic Reconstruction Dimension

4. Shell Structure
Established by: Numerical Fine Structure of the Verb-thesaurus
Can be interpreted:
  1. Syntactically: "that" Clause Centers, New Levels of Complexity
  2. Semantically: Introduction of New Concepts (Scalar, Vector, Tensor Concepts)
  3. Centers of Paraphrastical Reduction

Fig. 12 Establishment and interpretation of the Verb Thesaurus-Structure. This figure summarizes how the categories/models and the dimensions of the verb thesaurus structure are established and subsequently interpreted. These facts corroborate the deep entrenchment of the verb thesaurus structure in the world as well as in language.
Thirdly, the phase sequences of ordinary processes, actions, etc. provide us with the basic content patterns that there are (cf. Figs. 11, 12). In fact, the approximately 40 models comprise the entire material to which texts and discourses may refer. The 40 models are, as we say in German, der Stoff aus dem die Träume sind. Texts and discourses, being built up from smaller units, say clauses, cannot but refer to these models by means of the verbs which occur in their clauses. The meaning space spanned by the verb thesaurus exhibits this completeness property. It determines quite severely the expressive power of the language: no predicate can refer to anything outside this meaning space. Thus we see that the roughly 40 models (which are the basic frames of discourse, as one might say in the terms of artificial intelligence) constitute the units upon which a text grammar can be built (cf. Figs. 13, 14, and Appendix II). They are the building blocks for texts and discourses, in other words: the text generators.
Models and text-types

Life Model                            Biographies
Mental Effects                        Horror Stories
Motion Model                          Travel Novel
Twosomeness Model                     Romance
Freedom Model (+ Knowledge Model)     Detective Story
Guidance Model                        Political Novel

Fig. 13 This figure illustrates that there is an obvious correlation between (some of) the verb models and certain text-types.

Thus the lexicon, or rather its semantic body, is not a disorderly heap of idiosyncratic information but a well-knit structure providing templates for the contents and the form of sentences, texts and discourses. As such it also controls (some of) the dynamics of their synthesis and analysis. A closer look justifies the view that this semantic structure of the lexicon can be called the deep structure. Not only does it provide the basis for a (quasi-)generative text grammar, but it also accounts for syntactic phenomena like case assignments, transformation, word-formation etc.
Speech act types, textual modes and traditional genres

expressive basic type       Write Poetry
enactive basic type         Orders, Lesson, Lecture, Indoctrination
argumentative basic type    Dispute, Legal Proceedings, Light Fiction
discursive basic type       Conventionalised Dialogues, Portrayals, Reports, Prescriptions

Fig. 14 Here the three perspectives upon Textual Activities (!) are displayed. From the point of view of Speech Act Types we distinguish four positions, which are reflected in four Types of Textual Modes. Most interestingly, the three classical genre distinctions can be brought together with Speech Act Typology (cf. also Hempfer 1973 pp. 165 ff., Junker, Kayser 1962, p. 335, B. Snell, 1952, p. 175 Der Aufbau der Sprache, Hamburg, Breuer 197*, pp. 173-175). An additional classical genre which has obviously been "forgotten" is the genre which in a way corresponds to the fourth rhetoric type of the middle ages: "praying", cf. Ueding and the three classical rhetoric types: Lobrede, Gerichtsrede, ...

A Fregean analysis of texts and discourses is now, with the added foundation in word-semantics and context change, in much better shape. The needs and goals, the strategies to compile them, the synthesis of canonical logical forms to be expressed at the surface, the context change mechanisms, the analysis of the produced linguistic form: all these problems for dynamical analysis get a structural corset which pre-structures their running off.⁴ What we should look for now is a dynamic theory explaining the structure of this matrix frame of text and discourse analysis.⁵

Thomas T. Ballmer
Sprachwissenschaftliches Institut
Ruhr-Universität Bochum
Universitätsstr. 1
463 Bochum
W. Germany
APPENDIX I: AN ENTIRELY FORMAL TREATMENT OF THE BACH-PETERS PARADOX

The general problem, here exemplified by the Bach-Peters Paradox, is that for any reasonable and intuitively adequate interpretation, the grammar in question should be able to link surface form and logical form in an algorithmic way. This is what Language Reconstruction Systems and Context Change Logic do without exception. In the following, I take up a proposal made by Gunther Todt and Arnold Oberschelp, which they put forth as a correction of one of my earlier formalizations of the Bach-Peters case. As we shall see, the combination of Language Reconstruction Systems and Context Change Logic (with adjustable constants) easily suffices for this task (cf. Ballmer 1978b).
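(1) THE PILOT WHO SHOT AT IT HIT THE MIG THAT CHASED HIM.

Expressions assigned to the words:

\[
\begin{aligned}
&\lambda P \lambda Q\, \mathrm{The}^{s} xy\,(P(x,y),\, Q(x)) \qquad \lambda u\, \mathrm{Pilot}(u) \qquad \lambda v \lambda w\, \mathrm{Shot}(v,w) \qquad \mathrm{it} \qquad \lambda v \lambda w\, \mathrm{Hit}(v,w) \\
&\lambda P \lambda Q\, \mathrm{The}^{o} rs\,(P(r,s),\, Q(x)) \qquad \lambda u\, \mathrm{Mig}(u) \qquad \lambda v \lambda w\, \mathrm{Chased}(v,w) \qquad \mathrm{he}
\end{aligned}
\]

\[
\mathrm{The}^{s} xy\,\bigl(\mathrm{Pilot}(x) \triangleright (\mathrm{Shot}(x,y) \wedge y = \mathrm{it}),\ \mathrm{The}^{o} rs\,(\mathrm{Mig}(r) \triangleright (\mathrm{Chased}(r,s) \wedge s = \mathrm{he}),\ \mathrm{Hit}(x,r))\bigr) \tag{3}
\]

$\mathrm{The}^{s}$ : 'The' for the (animate) subject, two-variable
$\mathrm{The}^{o}$ : 'The' for the (inanimate) object, two-variable
$\triangleright$ : relative clause soft constant
$\mathrm{it}$ : soft constant (inanimate)
$\mathrm{he}$ : soft constant (animate)
$D^{s}$ : discourse individual set for (animate) subject position
$D^{o}$ : discourse individual set for (inanimate) object position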
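\[
\begin{aligned}
h(\mathrm{The}^{s} xy\,(\Pi(x,y), \Sigma(x));\, D^{s}, D^{o}) &= \bigvee\nolimits_{1} xy\,\bigl(h(\Pi(x,y) \wedge x = a;\ E(\Sigma(x);\, D^{s} \cup \{a\}, D^{o})),\ h(\Sigma(x);\ E(\Pi(x,y);\, D^{s} \cup \{a\}, D^{o}))\bigr) \\
h(\alpha \triangleright \beta;\, D^{s}, D^{o}) &= h(\alpha;\, D^{s}, D^{o}) \wedge h(\beta;\, D^{s}, D^{o}) \\
h(\mathrm{it};\, D^{s}, D^{o}) &= \varepsilon v\,(v \in D^{o}) \\
h(\mathrm{he};\, D^{s}, D^{o}) &= \varepsilon v\,(v \in D^{s})
\end{aligned}
\tag{4}
\]

$h(\mathrm{The}^{o} rs\,(\Pi \dots$

$E(\mathrm{The}^{o} rs \dots$

With these formulas translational interpretation gives the following intermediary results (we set $D^{s} = D^{o} = \emptyset$):

\[
\begin{aligned}
& h(\mathrm{The}^{s} xy\,(\mathrm{Pilot}(x) \triangleright (\mathrm{Shot}(x,y) \wedge y = \mathrm{it}),\ \mathrm{The}^{o} rs\,(\mathrm{Mig}(r) \dots);\, D^{s}, D^{o}) \\
&= \bigvee\nolimits_{1} xy\,\bigl(h(\mathrm{Pilot}(x) \wedge x = a \triangleright (\mathrm{Shot}(x,y) \wedge y = \mathrm{it});\ E(\mathrm{The}^{o} rs\,(\mathrm{Mig}(r) \triangleright (\mathrm{Chased}(r,s) \wedge s = \mathrm{he}),\ \mathrm{Hit}(x,r));\, D^{s} \cup \{a\}, D^{o})), \\
&\qquad h(\mathrm{The}^{o} rs\,(\mathrm{Mig}(r) \triangleright (\mathrm{Chased}(r,s) \wedge s = \mathrm{he}),\ \mathrm{Hit}(x,r));\ E(\mathrm{Pilot}(x) \triangleright (\mathrm{Shot}(x,y) \wedge y = \mathrm{it});\, D^{s} \cup \{a\}, D^{o}))\bigr) \\
&= \bigvee\nolimits_{1} xy\,\bigl(\mathrm{Pilot}(x) \wedge x = a \wedge \mathrm{Shot}(x,y) \wedge y = b,\ \bigvee\nolimits_{1} rs\,(\mathrm{Mig}(r) \wedge r = b \wedge \mathrm{Chased}(r,s) \wedge s = a,\ \mathrm{Hit}(x,r))\bigr)
\end{aligned}
\tag{6}
\]

We use the following shorthand notation:

\[
\bigvee\nolimits_{1} xy\,\bigl(P(x) \wedge x = a \wedge S(x,y) \wedge y = b,\ \bigvee\nolimits_{1} rs\,(M(r) \wedge r = b \wedge C(r,s) \wedge s = a,\ H(x,r))\bigr)
\]

Together with

\[
\bigvee\nolimits_{1} xy\,(\Pi(x,y), \Sigma(x)) = \bigvee xy\,\bigl[\bigwedge uv\,(\Pi(x,y) \to (x,y) = (u,v)) \wedge \Sigma(x)\bigr]
\]

we get

\[
\bigvee xy\,\Bigl[\bigwedge uv\,(P(x) \wedge x = a \wedge S(x,y) \wedge y = b \to (x,y) = (u,v)) \wedge \bigvee rs\,\bigl[\bigwedge uv\,(M(r) \wedge r = b \wedge C(r,s) \wedge s = a \to (r,s) = (u,v)) \wedge H(x,r)\bigr]\Bigr]
\]

With $r = y$ and $s = x$:

\[
\bigvee xy\,\Bigl[\bigwedge uv\,\bigl[(P(x) \wedge S(x,y) \to (x,y) = (u,v)) \wedge (M(y) \wedge C(y,x) \to (x,y) = (u,v))\bigr] \wedge H(x,y)\Bigr]
\]

With $((\varphi \to \chi) \wedge (\psi \to \chi)) \Rightarrow ((\varphi \wedge \psi) \to \chi)$:

\[
\bigvee xy\,\bigl[\bigwedge uv\,(P(x) \wedge S(x,y) \wedge M(y) \wedge C(y,x) \to (x,y) = (u,v)) \wedge H(x,y)\bigr] \tag{7}
\]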
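The elimination of the soft constants can be illustrated by a deliberately simplified sketch. It assumes that the two descriptions have already contributed their individuals to the context (a to the subject set, b to the object set), that each soft constant is resolved to the unique member of the relevant set (a stand-in for the choice operator in (4)), and that formulas are flat lists of atoms. The class and function names are hypothetical; this is not the Language Reconstruction System itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Discourse individual sets: `subjects` plays the role of D^s, `objects` of D^o."""
    subjects: frozenset = frozenset()
    objects: frozenset = frozenset()

    def with_subject(self, individual):
        return Context(self.subjects | {individual}, self.objects)

    def with_object(self, individual):
        return Context(self.subjects, self.objects | {individual})


def resolve(term, ctx):
    """Replace a soft constant by an individual from the context; keep other terms."""
    pools = {"he": ctx.subjects, "it": ctx.objects}
    if term not in pools:
        return term
    candidates = pools[term]
    if len(candidates) != 1:
        # Assumption of this sketch: resolution succeeds only if it is unambiguous.
        raise LookupError(f"cannot resolve {term!r} uniquely")
    return next(iter(candidates))


def harden(atoms, ctx):
    """Turn a 'soft' list of atoms into a 'hard' one by eliminating soft constants."""
    return [(pred, tuple(resolve(arg, ctx) for arg in args)) for pred, args in atoms]


# Both descriptions feed the context before either pronoun is resolved.
ctx = Context().with_subject("a").with_object("b")
soft = [("Pilot", ("a",)), ("Shot", ("a", "it")),
        ("Mig", ("b",)), ("Chased", ("b", "he")), ("Hit", ("a", "b"))]
hard = harden(soft, ctx)
print(hard)
```

The point of the example is the order of operations: because each pronoun is resolved against a context that already contains the individual introduced by the other description, the apparent circularity of the sentence does not block the computation.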
THOMAS BALLMER
The linguistic surface expression

The pilot who shot at it hit the Mig that chased him

is first analysed syntactically. This is done by means of a Language Reconstruction System and runs as follows. Each word is assigned a logical expression by use of a fixed lexicon for the language in question. These expressions are formulas of a calculus called the λ-κ-calculus. Each formula consists of a λ-κ-prefix and a matrix, the matrix being a formula of a given logical language. The λ-κ-prefix indicates how the words of the linguistic surface are to be related to each other, and the matrix formula represents the semantic content of these words. (The predicate labels used are clear from the context: P, M, S, C, H for pilot, Mig, shot at, chased, hit respectively.) In order to characterize the content of the single words occurring in the surface we employ a context-change logic. The λ-κ-expressions define the (syntactic-semantic) relation between the words by the way in which they operate on each other. λ-expressions reduce, as usual, by taking their arguments from the right; κ-expressions, in contrast, reduce by taking their arguments from the left.

The situation is complicated somewhat by the fact that linguistic expressions are related hierarchically to each other: words form phrases, phrases form sentences, sentences form units of a larger order. This can be reconstructed formally by taking into account the distinction between primary and secondary linguistic surface expressions, i.e. words and punctuation signs. The latter structure the articulation on a level higher than words. Accordingly the reduction process of the λ-κ-expressions runs as follows: the λ-expressions corresponding to the words are first 'swallowed' by the κ-expression of the punctuation sign and are then internally reduced along the usual lines of the λ-calculus. For convenience the punctuation signs contain y-operators, which work like λ-operators but occur only in the context of punctuation signs.

Applying the rules as indicated here, the result is a logical form, formulated in the logic in which the logical content of the single words is expressed, namely Context Change Logic (cf. (2)).

The logical constants then indicated (cf. (3)) are characteristic of the 'soft' logic applied for the special purposes of the Bach-Peters Paradox, i.e. for pronominalization. The aim of the treatment of the case under consideration is to eliminate these soft logical constants by means of the interpretation function h, applied to the logical formula in the context $(D^s, D^o)$. We then obtain a 'hard' logical form (cf. (7)), i.e. a formula of predicate logic. Such a formula is interpretable along the standard ways of model theory. The way to go from the soft logical expression to the hard logical expression is by systematically eliminating the context, more precisely the context-dependent and context-affecting soft constants (specifically soft operators) and the context with respect to which they are used. The context dependency of the soft constants is characterized by the interpretation function h itself; the context change properties are determined by the context change function g (cf. (5)). With these specifications the translational interpretation, i.e. the translation from soft to hard logic, is feasible and in fact explicit in a way alternative approaches to the Bach-Peters Paradox are not (cf. Ballmer 1978a, b). It does not merely cope with the semantics or, alternatively, with the syntax of the problem, but deals with both aspects integratively. Moreover, the approach with Language Reconstruction Systems and Context Change Logic is based on universal grounds of a theory of linguistic action and is hence stable enough to deal with the full range of linguistic problems in the very same manner, without making use of ad hoc methods and back-packs.
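The directional reduction described in this section (one kind of expression takes its argument from the right, the other from the left) can be pictured with a toy reducer. This is only an illustrative sketch under strong simplifying assumptions: the classes `Lam` and `Kap`, the function names and the three-item lexicon are hypothetical and are not the formulas of the Language Reconstruction System itself; punctuation signs and context change are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Lam:
    """Toy lambda-expression: consumes the item to its RIGHT."""
    fn: Callable


@dataclass
class Kap:
    """Toy kappa-expression: consumes the item to its LEFT."""
    fn: Callable


def is_functor(item) -> bool:
    return isinstance(item, (Lam, Kap))


def reduce_once(items: List) -> Optional[List]:
    """Perform the leftmost possible reduction step, or return None."""
    for i, item in enumerate(items):
        if isinstance(item, Lam) and i + 1 < len(items) and not is_functor(items[i + 1]):
            return items[:i] + [item.fn(items[i + 1])] + items[i + 2:]
        if isinstance(item, Kap) and i > 0 and not is_functor(items[i - 1]):
            return items[:i - 1] + [item.fn(items[i - 1])] + items[i + 1:]
    return None


def reduce_all(items: List) -> List:
    """Reduce until no further step applies."""
    while True:
        reduced = reduce_once(items)
        if reduced is None:
            return items
        items = reduced


# Hypothetical mini-lexicon: the verb first takes its subject from the left
# (kappa), then its object from the right (lambda).
hit = Kap(lambda subj: Lam(lambda obj: f"H({subj},{obj})"))
result = reduce_all(["pilot", hit, "mig"])
print(result)
```

In the sketch the verb entry alone decides in which direction it looks for its arguments, which mirrors the idea that the lexicon, rather than a set of grammatical rules, carries the combinatorics.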
JS, vol. 2, no. 3/4
APPENDIX I: AN EXAMPLE OF A TEXT ANALYSIS

"Babel-17", she said, "I haven't solved it yet, General Forester. First of all, General, Babel-17 isn't a code. It's a language." This is how Samuel R. Delany sets out the problem he wants to treat in Babel-17, one of the better-known science-fiction novels of the nineteen-sixties.

Two sides, the Alliance and the Invaders, confront one another. The hidden key to an invasion which has been going on for twenty years appears to be a language, Babel-17, a "mind-twisting, multidimensional language of the alien race who threatened to smash the peace of the Earth Alliance world." The task of clarifying the function and interpretation of this language has been entrusted to Rydra Wong, a poet and linguist. It is also she who, in the sentences just quoted from Delany's novel, explains the complexity of the problem to General Forester. As commander of the space-ship Rimbaud she will set out on the difficult path to solve the mystery of Babel-17.

The following is the (compressed) result of a text analysis of Delany's Babel-17. It is based upon the knowledge of the structure of all relevant processes as brought out by the Verb Thesaurus Structure presented above. Notice that this text structure arises from the verbs occurring in the text and the verb models in which they are classified. The relations among the verb models are based on the semantic relation of presupposition. Models higher up in the figure are presupposed by those lower down. Models concerning the content of a text presuppose those models that concern the pragmatics of texts, and those concerning the pragmatics of texts presuppose socio-cultural and anthropological models. In Babel-17 the Conflict Model is the presupposition for all other 'content' models. Narrative argumentation is presupposed for any kind of 'content' statement, and thus for all 'content' models. The conventional units, i.e. the models, are the foils of interpretation. With the aid of the models we can get a 'fingerprint' of the text, which facilitates its identification. A text structure forms a cartographic lattice which enables us to carry on the fine analysis of the text.
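The layering by presupposition described in this appendix can be pictured as a directed graph in which each model points to the models it presupposes. The sketch below is illustrative only: the model names are taken from the figure, but the particular edges in `PRESUPPOSES` and the function name `all_presuppositions` are assumptions made for the example, not a transcription of the figure.

```python
# Hypothetical presupposition edges: model -> models it directly presupposes.
PRESUPPOSES = {
    "Motion Model": ["Conflict Model"],           # 'content' model
    "Problem-Solving Model": ["Conflict Model"],  # 'content' model
    "Conflict Model": ["Argumentation Model"],    # content presupposes text pragmatics
    "Argumentation Model": ["Life Model"],        # pragmatics presupposes socio-cultural models
    "Life Model": [],
}


def all_presuppositions(model, presupposes=PRESUPPOSES):
    """Return every model presupposed by `model`, directly or indirectly."""
    found = set()
    stack = list(presupposes.get(model, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            # Presupposition is treated as transitive in this sketch.
            stack.extend(presupposes.get(current, []))
    return found


print(sorted(all_presuppositions("Motion Model")))
```

Treating the relation as transitive is what makes the Conflict Model, and through it the pragmatic and socio-cultural layers, a presupposition of every other 'content' model in the example.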
SEMANTIC STRUCTURES OF TEXTS AND DISCOURSES
ROLES / MODELS

I. Society, Culture
   Arise - Exist - Pass away

II. Author, Reader
   Life Model (Individual):
      Birth - Grow up - Go to school - Work (Live, Support) - Grow old - Die
   Transaction Model (Market):
      Author: Produce - Offer - Promote - Sell - (Be read) - Assessment of Success
      Reader: Notice - Buy - Consume

   Textpragmatic Level
   Argumentation Model (Author; Reader: argumentative persuasion):
      Rest - Challenge - Operate Tactically - Win the Point - Rest
      (Win, Be Successful, Make the Point of the Story)
   Traditional narrative text-structuring:
      Orientation - Complication - Evaluation - Resolution - Coda
   PROPP's analysis of folktales (cf. also GREIMAS):
      There once was a ... - Disruption of a State of Equilibrium - Arrival, Mission and Trial of the Hero - Task Accomplished by the Hero - Original State reestablished

III. Text-thematic Level
   Conflict Model:
      Peace - Outbreak of War (Challenge, Declare War) - War (Fight) - De-escalation (Bring peace) - Peace
   Problem-Solving Model (Thought Model):
      Problem (Babel-17) - Argue - Prepare o.s. - Request Aid - Use Aid - Search for Solution - Find Solution
      (Desire for Knowledge, Abilities); (Attempt, Solve, Think about); (Be Successful, Solve, Perceive)
   Argumentation Model / Enaction Model:
      Persuade - Be put in charge - Receive Aid
   Motion Model:
      Rest - Set out (Search for) - Move - Arrive (Find)
   Knowledge Model:
      Make a Discovery - Consider - Check - Perceive
   Text Model:
      Know - Make known; Draw Conclusion - Make known
Notes
1 If matters get complicated, this is perhaps not only my fault; it may lie in the nature of language itself. This paper was read at the International Colloquium on Discourse Representation at Cleves, September 1981. I would like to express my thanks for the many stimulating comments I received on that paper, especially to Prof. Renate Bartsch, who read the coreferate, and to Hans-Jürgen Eikmeyer and Hannes Rieser for intensive and extensive discussions.

2 We may ask whether the dynamic approach only holds for texts and discourses, or also for some or all lower levels. For sentences we are easily led to ignore that they are temporally developing entities. Representable in a conventional writing system, they are objectified and detached from the temporal phenomena of their synthesis and analysis. For discourses we are forced to take another viewpoint. Any written record of a discourse is a poor representation of it. First, transcriptions, as is well known, bear obvious traces of subjective interpretation. Secondly, the process of synthesis and (quasi-simultaneous) analysis of a discourse cannot be separated from discourse as an independent object. A discourse experienced or adequately described when it occurs is what takes our interest; and this is completely different from the discourse looked at from some point in the distant future: the circumstances have then changed and other things have become relevant. There is in fact a radical difference between the cursive and the complexive view. Nevertheless, all that has been said for discourses could, strictly speaking, be transferred to the sentences appearing in the discourse and thus to all other lower-level entities. We may therefore safely assume that a dynamic approach will serve its purpose for all linguistic levels. There are quite a number of more refined notions of dynamics. In all of them temporality is essential. Processuality, i.e. the individuation of quasi-closed events, and interactivity, i.e. the taking into account of the mutual (causal) influence of events, are usually to be included in a dynamic approach. But further refinements have to be considered: the taking into account of structural stability (Thom 1975, Wildgen 1981), of context change (Ballmer 1972, Bosch 1973, Eikmeyer/Rieser 1982), of synergetics (Haken 1977), of fractals (Mandelbrot 1977), of self-organization (Prigogine 1977, Eigen 1979, Ballmer/Weizsäcker 1974), of autopoiesis (Maturana 1980, Jantsch 1979), of textuality (Petöfi 1982, Ballmer 1972), etc.

3 In order to adequately account for the iterative dynamics as discussed here and pictorially symbolized in Fig. 4, a specific version of Context Change Logic, Cybernetic Logic, is useful (cf. Ballmer 1977). It uses the loop operator Q, which occurs in formulas like Q φ, meaning: perform φ as long as it can be made true.

4 Accordingly the Language Reconstruction Systems (LRSs), a λ-κ-categorial device of syntactic synthesis and analysis, now stand on firmer ground. Relying, except for the 'one' grammatical rule, nearly exclusively upon the lexicon, LRSs are now backed by a well-working theory of the lexicon.

5 For a proposal fitting into the framework presented here cf. Ballmer 1981.
References

Ballmer, Th., 1971: Gründe für eine Formale Pragmatik. In: K. Hyldgaard-Jensen (ed.), Linguistik 1971. Athenäum, Frankfurt, pp. 266-284.

Ballmer, Th., 1972: Einführung und Kontrolle von Diskurswelten. In: D. Wunderlich (ed.), Linguistische Pragmatik. Athenäum, Frankfurt, pp. 183-206.

Ballmer, Th., 1975: Sprachrekonstruktionssysteme. Scriptor, Kronberg/Ts.

Ballmer, Th., 1978a: Logical Grammar. With Special Consideration of Topics in Context Change. North-Holland, Amsterdam.

Ballmer, Th., 1978b: Analysis and Synthesis of Linguistic Structure. In: Groenendijk, J., Stokhof, M. (eds.), Amsterdam Papers in Formal Grammar (Vol. II). Centrale Interfaculteit, Univ. of Amsterdam, Amsterdam, pp. 1-17.

Ballmer, Th., 1981: The Interaction between Ontogeny and Phylogeny: A Theoretical Reconstruction of the Evolution of Mind and Language. In: W.A. Koch (ed.), Semiogenesis. Peter Lang, Frankfurt, pp. *8-5**.

Ballmer, Th., 1981: Linguistic Dynamics. In: B. Rieger (ed.), Empirical Semantics. Vol. I. Brockmeyer, Bochum, pp. 2-58.

Ballmer, Th. (forthcoming): Biological Foundations of Linguistic Communication. Towards a Bio-Cybernetics of Natural Language. Benjamins, Amsterdam (forthcoming 1983).

Ballmer, Th. and Brennenstuhl, W., 1981: Speech Act Classification. Springer, Berlin.

Ballmer, Th. and Weizsäcker, E.V., (no year): Biogenese und Selbstorganisation. In: E.V. Weizsäcker, Offene Systeme I. Klett, Stuttgart.

Bosch, P., 1973: Kontextuelle Voraussetzungen für das Verständnis von Text und Rede. Unpublished MA-thesis, Technical University, Berlin.

Breuer, D., 1974: Einführung in die pragmatische Texttheorie. UTB, 106, Fink, München.

Eigen, M. and Winkler, R., 1979: Das Spiel. Naturgesetze steuern den Zufall. Piper, München.

Eikmeyer, H.-J. and Rieser, H., 1982: A Formal Theory of Context Dependence and Context-Change. In: Th. Ballmer (ed.), Linguistic Dynamics. De Gruyter, Berlin.

Haken, H., 1977: Synergetics. Springer, Berlin.

Hempfer, D.W., 1973: Gattungstheorie. UTB, 133, Fink, München.

Jantsch, E., 1979: Die Selbstorganisation des Universums. Hanser, München.

Kayser, W., 1962: Das sprachliche Kunstwerk. Eine Einführung in die Literaturwissenschaft. Francke, Bern.

Lyons, J., 1968: Introduction to Theoretical Linguistics. Cambridge UP, Cambridge.

Mandelbrot, B., 1977: Fractals, Form, Chance and Dimension. Freeman, San Francisco.

Petöfi, J.S., 1982: Explikation und Evaluation in der Textproduktion und Textinterpretation. In: J.S. Petöfi (ed.), Texte und Sachverhalte. Buske, Hamburg.

Prigogine, I., 1977: Selforganization in Nonequilibrium Systems. Wiley, New York.

Prigogine, I., 1979: Vom Sein zum Werden. Piper, München.

Snell, B., 1952: Der Aufbau der Sprache. Hamburg.

Thom, R., 1975: Structural Stability and Morphogenesis. Benjamin/Cummings, Reading/Mass.

Ueding, G., 1976: Einführung in die Rhetorik. Metzler, Stuttgart.

Wildgen, W., 1981: Semantic Description in the Framework of Catastrophe Theory. In: B. Rieger (ed.), Empirical Semantics. Vol. II. Brockmeyer, Bochum, pp. 792-818.