TOPIC AND FOCUS
STUDIES IN LINGUISTICS AND PHILOSOPHY
VOLUME 82
Managing Editors GENNARO CHIERCHIA, University of Milan KAI VON FINTEL, M.I.T., Cambridge F. JEFFREY PELLETIER, Simon Fraser University
Editorial Board JOHAN VAN BENTHEM, University of Amsterdam GREGORY N. CARLSON, University of Rochester DAVID DOWTY, Ohio State University, Columbus GERALD GAZDAR, University of Sussex, Brighton IRENE HEIM, M.I.T., Cambridge EWAN KLEIN, University of Edinburgh BILL LADUSAW, University of California, Santa Cruz TERRENCE PARSONS, University of California, Irvine
The titles published in this series are listed at the end of this volume.
TOPIC AND FOCUS CROSS-LINGUISTIC PERSPECTIVES ON MEANING AND INTONATION
edited by
CHUNGMIN LEE Seoul National University Seoul, Republic of Korea
MATTHEW GORDON University of California Santa Barbara, CA, USA
and
DANIEL BÜRING University of California Los Angeles, CA, USA
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN-10 1-4020-4795-9 (HB)
ISBN-13 978-1-4020-4795-4 (HB)
ISBN-10 1-4020-4796-7 (e-book)
ISBN-13 978-1-4020-4796-7 (e-book)
Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands. www.springer.com
Printed on acid-free paper
All Rights Reserved © 2007 Springer No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
TABLE OF CONTENTS
Preface
Gorka Elordieta: Constraints on Intonational Prominence of Focalized Constituents
Ardis Eschenberg: Polish Narrow Focus Constructions
David Gil: Intonation and Thematic Roles in Riau Indonesian
Matthew Gordon: The Intonational Realization of Contrastive Focus in Chickasaw
Carlos Gussenhoven: Types of Focus in English
Nancy Hedberg and Juan M. Sosa: The Prosody of Topic and Focus in Spontaneous English Dialogue
Emiel Krahmer and Marc Swerts: Perceiving Focus
Manfred Krifka: The Semantics of Questions and the Focusation of Answers
Chungmin Lee: Contrastive (Predicate) Topic, Intonation, and Scalar Meanings
Kimiko Nakanishi: Prosody and Scope Interpretations of the Topic Marker ‘wa’ in Japanese
Ho-Hsien Pan: Focus and Taiwanese Unchecked Tones
Elisabeth Selkirk: Bengali Intonation Revisited: An Optimality Theoretic Analysis in which FOCUS Stress Prominence Drives FOCUS Phrasing
Mark Steedman: Information-Structural Semantics for English Intonation
Klaus von Heusinger: Discourse Structure and Intonational Phrasing
PREFACE
During the 2001 Linguistic Summer Institute at the University of California, Santa Barbara, a group of linguists gathered at a workshop to discuss the expression and role of topicalization and focus from a variety of perspectives: phonetic, phonological, syntactic, semantic, and pragmatic. The workshop was designed to lay the groundwork for collaborative efforts between linguists devoted to the study of meaning and linguists engaged in the quantitative study of intonation. This volume contains papers emerging from the Santa Barbara Workshop on Topic and Focus. A wide variety of methodologies and research interests related to topic and focus are represented in the papers. Some works present results of phonetic studies, either acoustic or perceptual, on the expression of topic and/or focus; others examine semantic or pragmatic features of topic and/or focus; still others are concerned with the interface between intonation and meaning. Data from several different languages are represented in the papers, including several languages with relatively little documentation, particularly in the area of topic and focus, e.g. Basque, Chickasaw, Indonesian, Polish, and Taiwanese. The broad sample of languages, coupled with the wide variety of research topics addressed by the papers, promises to enrich our typological understanding of topic and focus phenomena and to provide an impetus for further research. The following paragraphs offer brief summaries of the papers contained in this volume: Gorka Elordieta’s paper describes prosodic conditions governing focus in a dialect of Basque with pitch accents. He finds that narrow focus is only intonationally marked for words carrying a pitch accent. Unaccented words rely on contextual information rather than intonational cues to signal focus, unlike in Japanese, in which unaccented words are free to express focus prosodically. Ardis Eschenberg’s paper explores the role of word order and prosody in the expression of focus in Polish.
She shows that intonation and word order are used in different capacities depending on the focused element and the type of focus. Eschenberg discusses the implications her research has for various syntactic and semantic theories of focus. David Gil’s paper focuses on the role of intonation in signalling thematic roles in the Riau dialect of Indonesian, a language with relatively free word order and no obligatory case or agreement marking. Based on an analysis of data from a naturalistic corpus of utterances, Gil finds that intonation is not used to cue thematic roles. Drawing on this result, Gil proposes a model of Indonesian syntax and semantics lacking traditional morphosyntactic categories. Matthew Gordon’s paper is a phonetic study of the effect of focus on fundamental frequency and duration in Chickasaw, a language in which focus is morphologically marked. Gordon finds considerable variation between speakers in the use of f0 and duration as correlates of focus, with temporal disjuncture between elements playing a more important role than f0 in the expression of focus. Based on
these results, Gordon suggests that focus may be marked phonetically even in a language in which focus has an overt morphological realization. Carlos Gussenhoven’s work provides an overview of how various types of focus are expressed syntactically and prosodically. Basing his classification on data from several languages, Gussenhoven suggests that focus may differ along a number of pragmatically conditioned dimensions. He finds that different categories of focus are expressed through different intonational contours, with identificational focus seeming to occupy a special status in its reliance on morphological as opposed to prosodic cues. Nancy Hedberg and Juan Sosa investigate the evidence for a prosodic distinction between topic accents and focus accents in their paper. In an analysis of naturally occurring English speech, they do not find any differences in pitch accent type pointing to separate categories of topic and focus accent. On the other hand, they find extensive marking of information structure categories with high pitch accents. In their paper, Emiel Krahmer and Marc Swerts discuss a dialogue reconstructing experiment designed to examine the role of pitch accents in perceiving focus in two languages, Dutch and Italian, differing in the importance of pitch accents as a marker of focus. Krahmer and Swerts find that Dutch listeners rely more on pitch accent cues to reconstruct focus than Italian listeners, in keeping with the greater role of pitch in signalling focus in Dutch. Results of an audiovisual experiment employing talking heads suggest that visual cues can also play a role in the perception of focus, though primarily when pitch cues are indecisive. Manfred Krifka’s paper explores the proper semantic treatment of focus patterns in response to constituent questions. 
He finds that neither the framework of Alternative Semantics nor a theory that works with givenness rather than semantic focus as a basic concept offers an adequate analysis of focus arising in answers to questions. On the other hand, Krifka argues that the theory of Structured Meaning provides a superior account of this type of focus. In his paper, Chungmin Lee characterizes Contrastive Topic and Contrastive Predicate Topic, particularly in connection with their ‘conventional’ scalar implicatures. He distinguishes a typical kind that evokes a ‘conventional’ implicature from list contrastive topics, which lack any implicature. The Contrastive Topic marker in Korean gets a high tone responsible for focality, analogously to the fall-rise contour in English. Lee’s paper explores the scalar meaning of type-subtype scalarity and subtype, arguing for the inherent tendency of subtype scalarity even in entities. It also explores scope relations between scope bearers and Contrastive Topic and CT’s narrow-scope nature. The apparent non-narrow-scope of CT is claimed to be a topicalization effect. Predicates are claimed to be inherently subtype-scalar when CT-marked just like numerals and quantifiers. In conclusion, the uttered part is a concessive admission with the intent of conveying a forceful implicature in the unuttered part. In her paper, Kimiko Nakanishi examines the prosodic and semantic properties associated with the Japanese topic marker wa. She shows that the two pragmatic functions of wa, as a marker of theme and contrast, are distinguished prosodically. She further claims that the theme vs. contrast distinction is accounted for by an
Alternative Semantics analysis, in which the two functions of wa correspond to different scope interpretations and pragmatic functions. Ho-hsien Pan’s paper explores the influence of focus on fundamental frequency and duration in Taiwanese, a language with lexical tone. Parallel to languages in which tone is not used at the lexical level, Pan finds that increased duration and expanded pitch range are both associated with narrow focus. However, duration turns out to be a more reliable marker of focus than f0, a result which Pan suggests may be due to the high functional load of f0 height in distinguishing lexical items in Taiwanese. Elisabeth Selkirk’s paper develops an Optimality Theoretic analysis of focus constituency in Bengali, which is typologically unusual in requiring that focused elements be delimited on both sides by phonological phrase boundaries. In order to account for the Bengali focus facts, Selkirk proposes a theory of the prosody-syntax interface in which a family of focus prominence constraints requires a focused morphosyntactic structure to contain a phonological prominence within a specified prosodic constituent. Selkirk shows that a member of this focus prominence constraint family, working in conjunction with other hierarchically ranked tonal and prosodic alignment constraints, offers a principled account of the complex tonal phonology of Bengali. Mark Steedman’s paper builds on his earlier work to develop a new theory of intonation structure in which intonational tones are reduced to a small set of semantically grounded binary oppositions. Steedman’s theory assumes a distinction between the beliefs that the speaker attributes to the hearer by the literal meaning of his or her utterance, and those that the hearer is actually committed to. Steedman shows that this division is crucial in offering an adequate account of situations in which the speaker and hearer do not mutually believe a proposition that the speaker assumes is shared.
Klaus von Heusinger’s paper explores the function of intonational phrasing in discourse, finding that semantics plays an important role in determining prosodic constituency in discourse. He argues that discourse relations may hold between relatively small subclausal units, which are defined in terms of their functions as arguments in discourse. Von Heusinger argues that a version of Segmented Discourse Representation Theory is equipped to handle the mutual relations holding between discourse units.
The editors gratefully acknowledge the National Science Foundation’s support of the Santa Barbara workshop on Topic and Focus through grant BCS-0104212. In addition, we would like to thank the Linguistic Society of America’s Summer Institute and the Institute for Social, Behavioral and Economic Research for their logistical and administrative support of the workshop. Thanks are also extended to Ed Luna for his editorial assistance in preparing the manuscripts for publication.
Chungmin Lee
Matthew Gordon
Daniel Büring
January 2006
GORKA ELORDIETA
CONSTRAINTS ON INTONATIONAL PROMINENCE OF FOCALIZED CONSTITUENTS*
1. INTRODUCTION
Across languages, in narrow contrastive focus constructions one or more cues (morphological, syntactic, intonational) are used by speakers in order to express the intended meaning correctly, singling out the focalized element or constituent from the rest of the elements in the sentence. However, in this article I will provide evidence that in the pitch-accent dialects of Basque classified as Northern Bizkaian Basque (NBB, Hualde, Elordieta, Gaminde and Smiljanic 2002) narrow focus expressions may be left unexpressed through these cues. There are cases in which focalized words cannot be identified on the basis of syntax or intonation alone (morphology does not play a role as a focus cue in Basque). They may satisfy the necessary syntactic conditions, but they do not satisfy the necessary conditions imposed by the intonational grammar of these dialects. There is a constraint on intonational focalization limiting main intonational prominence to focalized words that bear a lexical or derived pitch accent, and more radically to words that constitute a separate intonational unit on their own, an Accentual Phrase (AP). A word forms an independent AP if it has a H*+L pitch accent and the word to its left ends an AP.
2. BACKGROUND
It is well known that languages differ in the overt cues they use to make the hearer identify clearly the focalized constituent. On the one hand, there are languages which signal focalized elements intonationally, without overt syntactic or morphological cues. These are languages of the so-called English type, in which focalized elements receive main prosodic prominence in-situ, with no movement from their base position.1 Other Germanic languages such as Dutch and German also have this strategy of English for signaling narrow focus. However, in some cases these languages may also resort to syntactic movement operations to cue focus.
When the verb is the focus of the sentence and a definite object is used, scrambling of the object may take place so that the verb is interpreted as narrow focus (Reinhart and Neeleman 1998). The verb receives main prosodic prominence by virtue of being in clause-final position.2
C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 1–22. © 2007 Springer.
Some languages display two kinds of strategies for narrow focus manifestation: one in which words are assigned main prominence in their base-generated syntactic position (a strategy of the English-type), and another one in which syntactic displacement operations are produced such that these words or constituents end up occupying a syntactically specified position for narrow focus, by means of scrambling or fronting, or some other means. Unlike in Dutch or German, the latter option is available for all constituents and is not subject to definiteness constraints, and perhaps most importantly, in these languages focalized words which are syntactically displaced are also assigned main prosodic prominence in the sentence (cf. Bolinger 1954, Ladd 1980, Culicover and Rochemont 1983, Vallduví 1990, Cinque 1993, Reinhart 1995, Selkirk 1995, Zubizarreta 1998, Frota 1998, among others). Spanish and Italian constitute examples of this type of language. But in Spanish focalized words can also occur in non-in-situ positions, such as sentence-end position or also a fronted position, in both cases accompanied by main sentence stress (cf. Bolinger 1972, Contreras 1978, 1980, Uriagereka 1995, Zubizarreta 1998 among others, for discussion of the different options). Then, there are languages which signal focus morphologically, by the addition of a suffix, a prefix or some other overt marker that indicates focalization. This strategy can be combined with syntactic displacement, intonational marking, or a combination of both. In Wolof, for instance, a so-called ‘emphatic marker’ inserted before the verb indicates which constituent is being focalized, whether it is the subject, a complement, or the verb (cf. Rialland and Robert 2001). Narrow focus is also cued syntactically in Wolof, as focalized constituents have to appear in sentence-initial position.3 It is important to point out that in this language no intonational prominence or phrasing effects are manifested on the focalized element.
English Creoles could be similar to Wolof in this respect, as focus is marked morphologically and also syntactically, by fronting the focalized constituent, and prosodic marking may be absent (cf. Bickerton 1993). In Japanese, morphological marking combines with prosodic marking to indicate focus: prosodic prominence and phrasing effects leave clear which constituent(s) have to be interpreted as narrow focus, and the focus particle -ga follows a focalized subject (cf. Pierrehumbert and Beckman 1988, Haraguchi 1991, Kubozono 1993, among others). In other languages, focus is both syntactically and intonationally identified. That is, narrowly focalized elements not only receive main prosodic prominence and/or are accompanied by intonational phrasing boundaries, but they also occupy a syntactic position structurally defined for focalized expressions, be it Spec-CP, Spec-FocusP, the most embedded position in the sentence, the position immediately preceding the verb, or some other position. Hungarian, Turkish, Quechua, Basque and Hausa are examples of this type of language (cf. among others Horvath 1986, Kiss 1995, 1998 for Hungarian; Vogel and Kenesei 1987, 1990 for Turkish; Ortiz de Urbina 1989, 1999, Hualde et al. 1994, Elordieta 2001, Arregi 2001, Etxepare and Ortiz de Urbina 2003 for Basque; Inkelas and Leben 1990 for Hausa). The following paradigm from Basque illustrates this pattern in which focalized elements must appear immediately preceding the verb. Thus, examples (1e-g) are ill-formed
because they contain focalized constituents which are either postverbal or not immediately preverbal. Sentence (1a) represents a neutral declarative sentence, and the rest are sentences with focalized constituents (capitalized):4
(1) a.
Jonek Mireni liburua eman dio
John-erg Miren-dat book-abs give aux
‘John has given the book to Miren’
b.
Jonek liburua MIRENI eman dio
John-erg book-abs MIREN-DAT give aux
‘John has given the book TO MIREN’
c.
Mireni liburua JONEK eman dio
Miren-dat book-abs JOHN-ERG give aux
‘JOHN has given the book to Miren’
d.
Jonek Mireni LIBURUA eman dio
John-erg Miren-dat BOOK-ABS give aux
‘John has given THE BOOK to Miren’
e.
*Jonek liburua eman dio MIRENI
John-erg book-abs give aux MIREN-DAT
f.
*JONEK Mireni liburua eman dio
JOHN-ERG Miren-dat book-abs give aux
g.
*Jonek Mireni eman dio LIBURUA
John-erg Miren-dat give aux BOOK-ABS
These examples show that, although Basque is a language with flexible word order, there is a syntactic constraint in this language on the relative word order between focus constituents and the verb, namely that they must be left-adjacent to it (cf. the references mentioned in the previous paragraph for details on syntactic analyses that could explain this constraint).5 But apart from this syntactic restriction, in Basque the focalized expression receives main prominence in the sentence, that is, focus is cued both syntactically and intonationally. Serbo-Croatian offers a particularly rich case in focus marking possibilities (Godjevac 2000, Frota 2002). As in English, prosodic phrasing and prominence with canonical word order serve to cue narrow focus. Another strategy to signal narrow focus is to produce a marked word order by scrambling operations, assigning at the same time prosodic phrasing and prominence cues to the constituent that is focalized (i.e., the Hungarian-Basque type). Finally, it is also possible to mark narrow focus by scrambling operations under a neutral intonation, leaving the focalized constituent in sentence-final position, so that it receives default sentence
stress (like in Dutch or German for verb focus). Thus, three different strategies or options are available in Serbo-Croatian to signal narrow focus. The different possibilities for signaling narrow focus discussed above might not constitute an exhaustive typology, although they might suffice for expository purposes. The table in (2) summarizes this typology of the different possibilities for signaling focus by means of syntax, morphology or prosody, or a combination of more than one of these strategies. A few representative languages are also included. Slots with a ‘?’ are those that to my knowledge do not have representatives.
(2) Strategy for focus marking, with sample languages
(a) Prosody alone
- Only strategy: English, European Portuguese
- One of the strategies: Dutch, German, Spanish, Italian
(b) Morphology alone
- ?
(c) Syntactic displacement alone
- Only strategy: ?
- One of the strategies: Serbo-Croatian, Dutch, German
(d) Prosody and morphology
- Only strategy: Japanese
- One of the strategies: ?
(e) Morphology and syntactic displacement
- Only strategy: Wolof
- One of the strategies: ?
(f) Prosody and syntactic displacement
- Only strategy: Hungarian, Basque, Turkish
- One of the strategies: Serbo-Croatian, Spanish, Italian
(g) Prosody, syntactic displacement and morphology
- ?
Despite all these possibilities of marking focus, I will show that in pitch-accent dialects of Basque (i.e., Northern Bizkaian Basque, NBB) there are cases in which words which constitute the narrow focus of the utterance are not singled out by syntactic, morphological or intonational means. In these dialects, intonational highlighting of narrow focus is restricted to words which bear a lexical or derived accent, or for some speakers, to words that constitute Accentual Phrases (APs) by themselves. That is, not just any independent word can bear intonational prominence even though it may be the pragmatic focus of the utterance. I discuss these cases in the following section.
3. SYNTACTIC AND PROSODIC CONSTRAINTS ON FOCUS IN NBB
3.1. Lexically and morphologically conditioned accentual classes in NBB
In order to understand the syntactic and prosodic constraints on focus in NBB, it is necessary to provide an overview of the prosodic features of these dialects. NBB dialects are pitch accent varieties of the Bizkaian dialect of Basque, and are spoken in the northwestern Basque-speaking area, along the coast and in a band of around 15 kilometers inland from the coast. A noteworthy feature of these dialects is the
lexical distinction between unaccented and accented roots, stems and affixes, like in Japanese (cf. Poser 1984, Pierrehumbert and Beckman 1988, Haraguchi 1991, Kubozono 1993 among others for details on Japanese tone and intonation structure). An accented root or affix is sufficient to render an accented word, which surfaces with prominence on a non-final syllable in all contexts. In most NBB varieties, the syllable preceding the leftmost accented morpheme surfaces with main prominence, as illustrated in (3) below for the Gernika variety (accented morphemes are indicated by an apostrophe). In a few varieties, it is always the penultimate syllable that is accented (as in the Lekeitio variety, cf. Hualde et al. 1994, Hualde 1997, 1999, Elordieta 1997, 1998): (3) a.
sagar -‘ata -‘tik → sa.gá.rra.ta.tik
apple-plur.loc.-abl.
‘from the apples’
b.
léku -‘ata -ra → lé.ku.e.tara
place-plur.loc.-all.
‘to the places’
A combination of unaccented roots and affixes produces unaccented words (except in compounding, where even if the members are unaccented the compound word is accented). Unaccented words will only receive prosodic prominence if they occur immediately preceding the verb or are pronounced in isolation. In these cases, in most NBB varieties they display prominence on the final syllable, and in a few dialects they show penultimate prominence (e.g., Ondarroa and Markina, cf. Hualde 1997, 2000). This kind of prominence is called derived accent by Jun and Elordieta (1997), to distinguish it from the lexical accent of accented words. In all other contexts, unaccented words do not surface with any kind of prosodic prominence on any syllable. Thus, observe the behavior of the unaccented word laguna ‘the friend’ in (4), corresponding to the Lekeitio variety (henceforth Lekeitio Basque, LB). This word is composed of the unaccented root lagun ‘friend’ and the unaccented singular determiner –a. Prosodic prominence is indicated by an acute accent mark. The different word orders in (4a-d) are due to the flexible word order of Basque, constrained by topic and focus or theme-rheme structures. That is, (4a-d) differ in information structure (rheme constituents are underlined).
(4) a.
umiágas laguná etorri da
child-com friend-abs come aux
‘The friend has come with the child’
b.
laguná etorri da umiágas
friend-abs come aux child-com
‘The friend has come with the child’
c.
laguna umiágas etorri da
friend-abs child-com come aux
‘The friend has come with the child’
d.
umiágas etorri da laguna
child-com come aux friend-abs
‘The friend has come with the child’
e.
*laguná umiágas etorri da
f.
*umiágas etorri da laguná
The unaccented/accented distinction is directly relevant for intonational phrasing in NBB. Prominence is realized as a H*+L pitch accent on the syllable that is phonologically associated with accent. As already mentioned above, accented words will always bear stress in any position in the sentence, whereas unaccented words only display a H*+L pitch accent if they are immediately left-adjacent to the verb, i.e., when they bear derived accent. The intonational pattern that arises is the following: the sentence starts with an initial low tone (%L), immediately followed by a rise phonetically associated with the second or third syllable of the first word. The pitch level is maintained until reaching a H*+L pitch accent, whether of an accented word or an unaccented word with derived accent. If after that H*+L pitch accent there is another word, the contour that is observed is one in which again there is an initial low tone on the first syllable of that word, the pitch level rising again on the second or third syllable of the following word, and the high tone level plateau being maintained on all syllables until another H*+L accent, corresponding to an accented word or an unaccented word preceding the verb, i.e., with derived accent. And if another word follows, the same pattern is observed. Thus, a cycle of low tone, rise, plateau and H*+L pitch accent is observed. The intonational units or constituents with this shape are identified by Elordieta (1997, 1998) as Accentual Phrases (APs). Jun and Elordieta (1997) and Elordieta (1998) show that APs consist of an initial %L boundary tone, a phrasal H tone (H-) on the second syllable,6 and a H*+L pitch accent. The phrasal H tone spreads phonologically onto all syllables between the second one and the one with the pitch accent. Schematically, the tonal structure of an AP is %L H- H*+L (cf. also Hualde et al. 2002).7 Figures 1-2 illustrate the general shape of APs in NBB, corresponding to (5a-b), respectively. 
Figure 1 is an example of a sentence containing three unaccented words before the verb; from an IP-initial %L there is a rise on the second syllable, reaching the peak on the third syllable, and the H tone continues until the H*+L pitch accent on the final syllable of the third word (i.e., the one immediately preceding the verb, with the derived accent). The pitch drops on the verb until the end of the utterance. In the figures in this article, the pitch accent is aligned with the right edge of the accented syllable. Fig. 2 contains two accented words, each of them with their corresponding H*+L pitch accent. Due to downstep, the second phrasal H- does not rise as much as the first one, and the second peak is smaller than the first one (cf. Elordieta 1997, 1998, Jun and Elordieta 1997 for details and more pitch tracks):
(5) a.
AP{%L H- H*+L}
alargunen nebien diruá galdu dot
widow-gen brother-gen money-abs lose aux
‘I have lost the widow’s brother’s money’
b.
AP{%L H*+L} AP{%L H- H*+L}
amúmen liburúak biar doras
grandmother-gen books-abs need aux
‘I need grandmother’s books’
Figure 1. alargunen nebien diruá galdu dot
Figure 2. amúmen liburúak biar doras
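The AP formation described in this section lends itself to a small sketch. The Python below is only an illustration under simplifying assumptions, not the formalism of the works cited: it takes the words preceding the verb in order, treats the last of them as immediately preverbal, and closes an AP at every word that carries a H*+L pitch accent, whether lexical or derived. The names `Word`, `accented` and `accentual_phrases` are hypothetical, and accent marks are omitted from the word forms.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    form: str
    accented: bool  # True if lexically accented (hypothetical flag)


def accentual_phrases(preverbal: List[Word]) -> List[List[str]]:
    """Group the words preceding the verb into Accentual Phrases.

    Assumption: `preverbal` lists the words before the verb in order,
    so its last element is immediately left-adjacent to the verb.
    """
    phrases: List[List[str]] = []
    current: List[str] = []
    last = len(preverbal) - 1
    for i, word in enumerate(preverbal):
        current.append(word.form)
        # A H*+L pitch accent closes the AP: lexical accent in any position,
        # derived accent only on the word immediately preceding the verb.
        if word.accented or i == last:
            phrases.append(current)
            current = []
    return phrases


# (5a): three unaccented words before the verb.
example_5a = [Word("alargunen", False), Word("nebien", False), Word("dirua", False)]
# (5b): two lexically accented words before the verb.
example_5b = [Word("amumen", True), Word("liburuak", True)]

print(accentual_phrases(example_5a))
print(accentual_phrases(example_5b))
```

Under these assumptions, (5a) comes out as a single AP and (5b) as two, matching the bracketing given in (5). The sketch ignores tonal detail such as downstep and the spreading of the phrasal H tone.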
3.2. Intonational restrictions on the assignment of prominence to focalized words
As explained in section 2 above and illustrated in (1), in NBB only words contained in an immediately preverbal syntactic constituent can be focalized. The focalized word does not have to immediately precede the verb, but it has to be contained in a syntactic constituent that immediately precedes the verb. Thus, in the following examples, (6b) is grammatical, as well as (6a). (6c) is ungrammatical, however, as the syntactic constituent containing the focalized word is not immediately preverbal (syntactic constituents are separated by square brackets):
(6)
a.
[maixuári] [lagúnen LIBURÚAK] emon dotzaras.
teacher-dat friends-gen BOOKS-ABS give aux
‘I have given the friends’ BOOKS to the teacher’
(responding to stimuli such as: ‘Which of the friends’ things have you given to the teacher?’)
b.
[maixuári] [LAGÚNEN liburúak] emon dotzaras.
teacher-dat FRIENDS-GEN books-abs give aux
‘I have given THE FRIENDS’ books to the teacher’
(responding to stimuli such as: ‘Whose books have you given to the teacher?’)
c.
*[MAIXUÁRI] [lagúnen liburúak] emon dotzaras.
TEACHER-DAT friends-gen books-abs give aux
‘I have given the friends’ books TO THE TEACHER’
(erroneously responding to stimuli such as: ‘Who have you given the friends’ books to?’)
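The syntactic condition illustrated in (6) can be stated as a simple containment check. The sketch below is a hedged illustration rather than an analysis from the literature cited: a clause is assumed to be a list of constituents (each a list of word forms), with the position of the verbal complex given separately, and `focus_position_ok` is a hypothetical helper name.

```python
from typing import List


def focus_position_ok(constituents: List[List[str]], verb_index: int, focus: str) -> bool:
    """Return True if `focus` is inside the constituent immediately preceding the verb.

    Assumption: `constituents[verb_index]` is the verbal complex (verb + auxiliary).
    """
    if verb_index <= 0:
        return False  # nothing precedes the verb
    return focus in constituents[verb_index - 1]


# Clause (6): [maixuári] [lagúnen liburúak] emon dotzaras
clause_6 = [["maixuári"], ["lagúnen", "liburúak"], ["emon", "dotzaras"]]

print(focus_position_ok(clause_6, 2, "liburúak"))  # focus as in (6a)
print(focus_position_ok(clause_6, 2, "lagúnen"))   # focus as in (6b)
print(focus_position_ok(clause_6, 2, "maixuári"))  # focus as in (6c)
```

The check accepts the focus placements of (6a) and (6b) and rejects that of (6c). It models only the word-order condition and says nothing about the intonational requirements on the focalized word.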
However, in cases of utterances where one of the words constitutes the narrow focus of the utterance, even if that word is contained in the immediately preverbal constituent, there is a further constraint it must obey in order to be intonationally singled out. In the variety of NBB I have investigated, LB, a focalized word can be the most prominent intonationally if it has a lexical pitch accent (i.e., if it is a lexically accented word) or if it has a derived accent (i.e., it is an unaccented word immediately preceding the verb). Let us illustrate this constraint with sentence (7) (repeated from (5b)), containing only one preverbal constituent with two accented words, amúmen ‘grandmother’s’ and liburúak ‘books’. The intonational structure corresponding to this constituent is thus the following: (7)
AP{%L H*+L} AP{%L H- H*+L}
amúmen liburúak biar doras
grandmother-gen books-abs need aux
‘I need grandmother’s books’
That is, in the immediately preverbal syntactic constituent there are two APs, each of them containing one accented word. Let us now describe the main patterns observed in contexts of narrow focus, that is, in cases in which the focalized word replaces the variable introduced by a wh-word in a previous question. The two words in (7) would become the narrow focus of an utterance if they formed part of a response to the questions in (8a,b), respectively: (8)
a.
Nóren liburúak biar dósus? whose books-abs need aux ‘Whose books do you need?’
b.
Sér biar dósu amuména? what need aux grandmother-gen ‘What do you need of grandmother’s?’
Since amúmen and liburúak have lexical H*+L pitch accents, they can be pronounced standing out as the most prominent words in the utterance. An interesting aspect worth mentioning is that in narrow focus cases in which the first word is focalized the pronunciation of such utterances is not usually distinguished from cases of broad focus. That is, the first word will not necessarily show a boosted pitch level and/or a following decreased pitch level. In the data I have analyzed from five female speakers of LB, only one speaker produced some utterances in which the
first word was pronounced with a higher pitch followed by a lower level on the following word. This might be due to the fact that in broad focus cases the difference in pitch between the first peak and the following peaks is already quite big (cf. Fig. 2). However, when the second word is focalized, there are more instances in which the word is made more prominent intonationally and perceptually distinguishable from broad focus cases. The focalized word may present a higher pitch level (although the peak is still lower than the first peak, due to downstep), followed by a decreased pitch level. Quite often there may also be a displacement of the peak of the first word to the posttonic syllable. This strategy signals old information or topic status for that word.8 For sentence (9), which would be an answer to (8b), Figure 3 illustrates a case without peak delay at the end of the preceding word, and Figure 4 illustrates a case with peak displacement, indicated in the tone tier with a ‘>’ sign: (9)
amúmen LIBURÚAK biar doras. grandmother-gen BOOKS-ABS need aux ‘I need grandmother’s BOOKS’
A similar scenario would apply for a preverbal constituent containing two words, the first one accented and the second word unaccented. The accented word has a lexical H*+L accent, and the unaccented word receives a H*+L pitch accent by virtue of preceding the verb (i.e., it has a derived accent on its final syllable). The sentence in (10) is an example:
Figure 3. amúmen LIBURÚAK biar doras.
Figure 4. amúmen LIBURÚAK biar doras.
(10)
AP{%L H*+L}      AP{%L H- H*+L}
Amáien           alabiá            topa  dot
Amaia-gen        daughter-abs      find  aux
‘I came across Amaia’s daughter’
If the first word were the narrow focus of the sentence, most commonly it would not receive more prominence than in broad focus cases. If the second word were the narrow focus, however, it would be made more prominent by presenting a higher pitch level than in broad focus cases, with or without peak delay in the first word (interestingly, when there is peak delay in the previous word a higher pitch level on the focalized word is not necessary). An example with peak delay in the first word is illustrated below in Figure 5, corresponding to (11). As described above, however, this pattern is not obligatory, and it is also quite normal to find cases which are intonationally very similar to broad focus utterances.9 (11)
Amáien ALABIÁ topa dot. ‘I came across Amaia’s DAUGHTER’.
Figure 5. Amáien ALABIÁ topa dot
However, in the case of preverbal constituents containing one or more unaccented words in nonfinal position (i.e., not immediately preceding the verb) the situation is different. An unaccented word will only get a derived accent if it is left-adjacent to the verb, and hence an unaccented word which is the narrow focus of an utterance but which is not in the position that grants a derived accent cannot be made more prominent intonationally. In a neutral sentence such as (12), the leftmost unaccented word, nebien ‘the brother’s’, would not receive main prominence even if it were the narrow focus of the sentence (as an answer to ‘Whose money have you lost?’), because it does not have a pitch accent, lexical or derived. A crucial aspect of this pattern in NBB is that focus does not insert accents that are not already there lexically or by virtue of a preverbal position. The first word is lexically unaccented, and even if it is focalized, it remains unaccented, that is, no accent is associated with it, as it is not left-adjacent to the verb and hence does not receive a derived accent. This impossibility does not depend on the accentual nature of the following word, as the same impossibility occurs with accented words following the unaccented word. Thus, in a sentence such as (13) it would not be possible to highlight the first word. The type of contour that surfaces in these instances is one in which the leftmost word has to be pronounced with the same pitch level as the following word, in the same AP. Figure 6 illustrates such a contour, corresponding to narrow focalization of the word nebien in (12): (12)
AP{%L H-         H*+L}
nebien           diruá       galdu  dot
brother-gen      money-abs   lose   aux
‘I have lost the brother’s money’
(13)
AP{%L H-         H*+L}
lagunen          liburúa     biar  dot
friend-gen       book-abs    need  aux
‘I need the friend’s book’
Figure 6. nebien diruá galdu dot
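The AP phrasing shown in (7), (10), (12) and (13) can be restated procedurally. The following is only a minimal sketch, not part of the original analysis: the names Word and phrase_into_aps are hypothetical, the syllable counts are rough estimates supplied for illustration, and the rule that H- is left out whenever the accent falls on the first or second syllable of the AP is an assumption that extends the observation in note 6.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    form: str
    syllables: int                # assumed syllable count (illustrative estimate)
    accent: Optional[int] = None  # 1-based syllable with a lexical H*+L, None if unaccented


def phrase_into_aps(words: List[Word]) -> List[str]:
    """Return one tonal string per AP for the words preceding the verb."""
    aps = []
    offset = 0  # syllables of unaccented words already inside the open AP
    for i, word in enumerate(words):
        accent = word.accent
        if accent is None and i == len(words) - 1:
            # Derived accent: final syllable of an unaccented word left-adjacent to the verb.
            accent = word.syllables
        if accent is None:
            # Unaccented and not preverbal: no accent is inserted, the AP stays open.
            offset += word.syllables
            continue
        position = offset + accent  # accented syllable counted from the AP's left edge
        tones = ["%L"]
        if position > 2:
            # Assumption: H- only surfaces when the accent comes after the second syllable.
            tones.append("H-")
        tones.append("H*+L")  # the pitch accent closes the AP
        aps.append("AP{" + " ".join(tones) + "}")
        offset = 0
    return aps


if __name__ == "__main__":
    print(phrase_into_aps([Word("amúmen", 3, 2), Word("liburúak", 4, 3)]))  # cf. (7)
    print(phrase_into_aps([Word("nebien", 3), Word("diruá", 3)]))           # cf. (12)
```

Under these assumptions the two accented words of (7) come out as two APs, whereas the unaccented first word of (12) is phrased in a single AP with the following word.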
As for the second word in sentences such as (12)-(13), we do not find a uniform pattern across speakers. However, such interspeaker variation reveals important facts about constraints on the intonational realization of main prominence in contexts of narrow focus. For two of the five speakers recorded, the second words in those cases would be able to receive main prominence if they were the narrow focus of the utterance, as in (14b), responding to a question such as (14a). An observed strategy in these cases is a continuation rise at the end of the preceding word, signaling old or known information. This rise cannot be due to an accent in the first word, so it must be due to H- (cf. Fig. 7). Another possibility is to have a sustained pitch at the end of the preceding word followed by a rise in pitch level on the focused word (other non-intonational features such as higher intensity may also be present). In both cases, a decrease in pitch level follows the focalized word. The same pattern is observed in cases in which the second word is lexically accented, as in (15): (14) a.
Ser galdu dósu nebiena? what lose aux brother-gen ‘What have you lost of the brother?’
b.
nebien DIRUÁ galdu dot brother-gen MONEY-ABS lose aux ‘I have lost the brother’s MONEY’
(15) a.
Ser biar dosu lagunena? what need aux friend-gen ‘What do you need of the friend?’
b.
lagunen LIBURÚA biar dot friend-gen BOOK-ABS need aux ‘I need the friend’s BOOK’
Figure 7. nebien DIRUÁ galdu dot
Importantly, three of our five speakers did not produce utterances like (14b), or could not pronounce the second word in (15) with main intonational prominence. That is, they cannot highlight a word intonationally if it is preceded by an unaccented word. For these speakers, neither the leftmost word nor the second word can be prosodically highlighted. Regardless of which word is the corrective focus of the utterance, the whole AP (i.e., the two words) would have to be pronounced together. The explanation for this pattern is that these speakers have a stricter constraint on the intonational highlighting of focalized words. This constraint states that only words which constitute APs by themselves can be made intonationally prominent. In cases of two accented words, such as the ones in (7)/(10), each word constitutes its own AP, and can thus be singled out intonationally. But in cases in which the first word is unaccented, the second word does not constitute an AP by itself. Rather, it continues the AP that the first word started. As the intonational schemas in (12)-(13) show, the unaccented word starts an AP, with
the initial %L H- tone sequence, but since it does not have a pitch accent, the phrasal H- tone spreads onto the next word, until the H*+L accent (lexical or derived) of the following word ends the AP (cf. Jun and Elordieta 1997; Elordieta 1998). There is thus only one AP before the verb, containing the two words. Since neither word forms an independent AP, they cannot be made intonationally prominent on their own. The two words have to be pronounced at the same pitch level, in the same AP. The contour observed in these instances is similar to the one illustrated in Figure 6, which showed the impossibility of having the leftmost word as the most prominent word in the utterance. The important issue at work here is that no pitch accent is inserted on the first unaccented word, even if it is the narrow focus of the sentence from a pragmatic or information-structure point of view, as already mentioned above. Hence no AP boundary can be inserted at the right edge of the first word. That is, the lexical association of pitch accents is respected by focus in NBB. Thus, a mismatch between semantics and intonation arises in cases where a word which does not constitute an AP by itself is the corrective focus of an utterance. No intonational cues are used within the utterance containing the contrastively focalized word alone in order to convey the intended meaning. There is no way to single out the focalized word syntactically, as the word occurs with other words in the preverbal constituent. Disambiguation can only come from the preceding linguistic context. This mismatch between semantics and prosody does not arise in languages surrounding NBB (Spanish and French) or in other Indo-European languages. Moreover, an insufficiency of syntax and/or morphology to mark focalized words is unattested in the languages for which there are descriptions of focus realization, a summary of which was provided in section 2.
Thus, this property of NBB is interesting from a typological point of view as well. The patterns of realization of intonational highlighting change slightly when corrective focus is considered. Corrective focus refers to those instances in which the speaker corrects one of the words or syntactic phrases that her interlocutor has stated incorrectly. For instance: (16) a.
Nóren alabia topa dosula? Alaznena? whose daughter-abs find aux Alazne-gen ‘Whose daughter did you come across? Alazne’s?’
b.
Es, AMÁIEN alabiá topa dot. no AMAIA-GEN daughter-abs find aux ‘No, I came across AMAIA’s daughter.’
In (16b) above, the first accented word Amáien can be made more prominent, usually by having a boosted pitch level followed by a decreased pitch level in the rest of the material in the sentence. Thus, in corrective focus the first word is distinguishable from cases of broad focus, unlike in narrow non-corrective focus. The second word in (16b) would also be made more prominent, by means of a delayed peak in the preceding word, signaling the character of topic or old information of that word. This type of contour is illustrated in Figure 8, for a
sentence such as Es, Amáien ALABIÁ topa dot ‘No, I came across Amaia’s DAUGHTER’. Another option is simply to have a higher pitch level on the focalized word, without a preceding peak displacement. Quite often, the focalized word is accompanied by higher intensity levels and longer duration.10 As already described above, the same options would be available for sentences in which the second word is lexically accented.
Figure 8. Es, Amáien ALABIÁ topa dot
But the interesting cases are those in which the first word is unaccented, forming an AP with the following word. As described above, in narrow non-corrective focus some speakers could not intonationally highlight either of the two words, due to a constraint that a word has to constitute an AP by itself in order to be the most prominent word in the utterance, rather than simply having a pitch accent. In corrective focus, however, these speakers can place main intonational prominence on a word even if it does not constitute an AP by itself. The sufficient condition is that the word has an accent, lexical or derived, as in narrow non-contrastive focus for the other speakers. Words bearing an accent and following an unaccented word may surface with main prominence, cued by a rise in pitch on the focalized word coming from a sustained pitch of the unaccented word, or by a rise at the end of the prefocal unaccented word. In both cases, usually the focalized word displays higher intensity and duration (cf. Elordieta and Hualde 2001, 2003). It is important to bear in mind, however, that this type of prosodic realization is scarce in the production of the most restrictive speakers, that is, those for whom a word has to constitute an AP by itself in order to stand out as the most prominent word.11 Figure 9 illustrates an F0 contour for a sentence such as (17b), in which the first option is realized, and Figure 10 illustrates the second possibility, with a rise at the end of the first word.
(17) a.
Ser biar dosula lagunena? Kuadernúa? what need aux friend-gen notebook-abs ‘What do you need of the friend? His notebook?’
b.
Es, lagunen LIBURÚA biar dot. ‘I need the friend’s BOOK’.
Figure 9. lagunen LIBURÚA biar dot
Figure 10. lagunen LIBURÚA biar dot
We will finish our presentation of the intonational constraints on the prosodic realization of focus in NBB by summarizing in a table the focus realizations for all logically possible two-word combinations in a preverbal phrase. The left-hand column summarizes the patterns in narrow non-corrective focus, and the right-hand column those of corrective focus. When the two types of constraints for intonational highlighting (having an accent or being an independent AP) produce different outputs, they are distinguished as (a) and (b).12

Narrow (non-corrective) focus               | Corrective focus
--------------------------------------------+---------------------------------------------
   H*L          H*L                         |    H*L          H*L
    |            |                          |     |            |
AP[Accented]–AP[Accented] – Verb            | AP[Accented]–AP[Accented] – Verb
Each word can be highlighted                | Each word can be highlighted (boosted pitch
                                            | on focalized word more frequent than in
                                            | non-corrective focus)
--------------------------------------------+---------------------------------------------
   H*L            H*L                       |    H*L            H*L
    |              |                        |     |              |
AP[Accented]–AP[Unaccented] – Verb          | AP[Accented]–AP[Unaccented] – Verb
Each word can be highlighted                | Each word can be highlighted (boosted pitch
                                            | on focalized word more frequent than in
                                            | non-corrective focus)
--------------------------------------------+---------------------------------------------
                H*L                         |                 H*L
                 |                          |                  |
AP[Unaccented–Accented] – Verb              | AP[Unaccented–Accented] – Verb
a. Neither word can be highlighted; they    | a. Neither word can be highlighted; they
   are uttered in the same AP               |    are uttered in the same AP
b. Only the word with an accent can be      | b. Only the word with an accent can be
   highlighted                              |    highlighted (more frequent than in
                                            |    non-corrective focus)
--------------------------------------------+---------------------------------------------
                 H*L                        |                  H*L
                  |                         |                   |
AP[Unaccented–Unaccented] – Verb            | AP[Unaccented–Unaccented] – Verb
a. Neither word can be highlighted; they    | a. Neither word can be highlighted; they
   are uttered in the same AP               |    are uttered in the same AP
b. Only the word with an accent can be      | b. Only the word with an accent can be
   highlighted                              |    highlighted (more frequent than in
                                            |    non-corrective focus)
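The two constraints summarized in the table can also be stated as a small predicate. This is a hedged sketch rather than a formalization proposed in the text: the function name highlightable and the labels "accent" and "independent_ap" are hypothetical, and the sketch models only which words may be highlighted, not how often each pattern occurs.

```python
from typing import List, Set


def highlightable(lexical_accents: List[bool], constraint: str) -> Set[int]:
    """Indices of preverbal words that may carry main prominence.

    lexical_accents[i] is True if word i is lexically accented.
    constraint is "accent" (pattern b) or "independent_ap" (pattern a).
    """
    if not lexical_accents:
        return set()
    last = len(lexical_accents) - 1
    # A word bears an accent if it is lexically accented or left-adjacent to the verb.
    accented = [lexical or i == last for i, lexical in enumerate(lexical_accents)]

    # Every accent closes an AP; an unaccented word shares the AP of the next accented word.
    aps, current = [], []
    for i, has_accent in enumerate(accented):
        current.append(i)
        if has_accent:
            aps.append(current)
            current = []

    if constraint == "accent":
        return {i for i, has_accent in enumerate(accented) if has_accent}
    if constraint == "independent_ap":
        return {ap[0] for ap in aps if len(ap) == 1}
    raise ValueError(f"unknown constraint: {constraint!r}")


if __name__ == "__main__":
    for row in ([True, True], [True, False], [False, True], [False, False]):
        print(row, highlightable(row, "independent_ap"), highlightable(row, "accent"))
```

Each of the four rows of the table corresponds to one input list, and the two constraints only diverge when the first word is unaccented.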
4. SUMMARY AND CONCLUSION
In this paper I have described the main constraints on the realization of prosodic prominence on focalized words in a pitch accent dialect of Basque. It has been shown that the minimum condition a word has to satisfy to receive main prosodic prominence if pragmatically focalized is that it has an accent, whether lexical or derived. However, in cases of narrow non-corrective focus some speakers reveal the existence of a more restrictive constraint, which demands that a word must constitute an AP by itself in order to surface with main prominence. In corrective focus the sufficient condition for the five speakers recorded is that a word has an accent. In either case, the interesting fact is that a lexically unaccented word which does not have a derived accent cannot receive an accent even if it is pragmatically focalized. The context seems to prevent possible ambiguities between neutral and narrow focus readings of lexically unaccented words without a derived accent. To my knowledge, these are crosslinguistically unattested constraints, and in this regard NBB is different even from a language like Tokyo Japanese, which also has a lexical distinction between accented and unaccented words, but which allows any unaccented word to be prosodically highlighted (cf. Pierrehumbert and Beckman 1988).
Dept. of Linguistics and Basque Studies, University of the Basque Country, Vitoria-Gasteiz, Spain
NOTES
*
Many thanks are due to Matthew Gordon and José Ignacio Hualde for comments on earlier versions of this article, as well as to Sónia Frota, Carlos Gussenhoven and Kiwa Ito for help with section 2. Of course, this article would not have been possible without my native informants, to whom I am immensely indebted. This work was funded by research grants from the Department of Education, Universities and Research of the Basque Government (PI-1998-127), the University of the Basque Country (UPV-HA8025/20 and 9/UPV 00033.130-13888/2001) and the Ministry of Science and Technology of Spain (BFF2002-04238-C02-01/FEDER).
1 For expository purposes, we exclude cleft and pseudo-cleft sentences from the discussion, as we will compare this type of language below in the text with another type of language that marks focus constituents syntactically without clefting, by having focalized constituents occupy a certain syntactic position. Thus, we want to distinguish languages which have a structural position for focus from languages such as English that do not, although they may make use of cleft sentences to mark focus.
2 Scrambling is disfavored or does not apply with indefinite objects. In such cases, there is simply main prosodic prominence on the verb.
3 However, when an object is focalized and there is a nonpronominal subject, the focalized object has to follow the subject, which obligatorily appears thematized (i.e., topicalized, cf. Rialland and Robert 2001:897-898).
4 The following abbreviations will be used: abl = ablative, abs = absolutive, all = allative, aux = auxiliary, dat = dative, erg = ergative, gen = genitive, ines = inessive, loc = locative, pl = plural, sg = singular.
5 It is possible for focalized constituents to appear after the verb, but they are usually uttered as separate intermediate or intonational phrases. They are usually preceded by pauses, fillers such as e ‘err…/um…’, or final lengthening of the verb ending in a rising intonation.
It appears that copulas can be followed by focalized constituents even without a pause (Hualde et al. 1994). In central and eastern dialects it is possible to have focalized elements postverbally without a pause (cf. Hidalgo 1994, Elordieta 2003), apart
from the usual preverbal position, but the speakers I have consulted cannot have postverbal focus as an answer to a wh-word. In that case preverbal focus is the only option. Perhaps only informational, noncontrastive focus (Kiss 1998) presented by the speaker in her own discourse can appear postverbally in these dialects, but more research is needed on this topic before making any generalizations.
6 Jun and Elordieta (1997) found that in APs up to four syllables long the peak of H- is reached on the second syllable, and in APs more than four syllables long it is reached on the third syllable. This H- is not phonetically realized when the second syllable is associated to a pitch accent.
7 For some speakers, in sequences of four or more unaccented words certain dips in pitch can be observed between two unaccented words. Jun and Elordieta (1997) and Elordieta (1998) take these to be AP boundaries, in the absence of H*+L pitch accents. However, the dips were difficult to perceive and were much smaller than regular drops after H*+L pitch accents (see relevant pitch tracks in the mentioned articles). Also, the factors conditioning these breaks were not very well established; desire for heaviness reduction and slower rate of speech were suggested as factors involved in the insertion of these breaks, but no systematic study was carried out to prove these claims. Moreover, these facts were subject to speaker dependence; some speakers always produce plateaus in sequences of four or more unaccented words, without breaks. This issue deserves a more systematic study, which I plan to undertake in future research.
8 The delayed peaks at the end of prefocal words were already observed for some speakers of LB by Ito et al. (2003). However, their data involved cases of corrective focus, which we also discuss below. The patterns presented in this paper show that it is possible to find such delayed peaks in non-corrective narrow focus as well.
Other strategies of main prominence that can be observed in these contexts and which are not intonational in nature are higher intensity and duration on the focalized word.
9 Indeed, the speakers of LB on which Elordieta (2003) based his findings did not produce utterances in which the second word was most prominent intonationally, and this led to positing the absence of such a possibility. That conclusion must now be corrected to capture the facts presented in this article.
10 Although the results in Elordieta and Hualde (2001, 2003) showed that lengthening applied to words in corrective focus, it must be pointed out that in those utterances speakers were instructed to put special emphasis on those words. In other recordings in which speakers were not told to put emphasis on the correction, I have observed that lengthening did not occur significantly. It seems that a specific experiment (left for future research) is needed to clarify the role of lengthening as a cue to corrective focus.
11 Thus, highlighting words following an unaccented word without a derived accent is possible, but not frequent in LB. Its frequency is speaker dependent, but as stated in note 9, the possibility of finding such patterns has to be incorporated into the intonational grammar of LB, contra what was assumed in Elordieta (2003).
12 Interestingly, the two speakers that patterned differently from the other three speakers in contexts of narrow non-corrective focus in being able to highlight a word following an unaccented word also patterned differently in other respects. For contexts in which the first unaccented word was correctively focalized, they produced contours in which this word was prosodically set apart, by having a higher pitch level followed by a fall in pitch for the following word, or by being pronounced with greater intensity and duration.
However, such cases were few in number, compared to the majority of cases in which the unaccented word did not surface with main prominence, thus patterning with the other three speakers. At this point I consider it premature to conclude that highlighting the unaccented word in these contexts is a solid possibility in LB, and leave the issue open for further research based on data from more speakers and based on more tokens of each type of context.
REFERENCES
Arregi, Karlos. “Focus and Word Order in Basque.” Manuscript, Massachusetts Institute of Technology, 2001.
Bickerton, Derek. “Subject Focus Pronouns.” In Francis Byrne and Donald Winford (eds.), Focus and Grammatical Relations in Creole Languages, pp. 189-212. Amsterdam: John Benjamins, 1993.
Bolinger, Dwight. “English Prosodic Stress and Spanish Sentence Order.” Hispania 37 (1954): 152-156.
Bolinger, Dwight. “Accent is Predictable (If You’re a Mind-reader).” Language 48 (1972): 633-644.
Cinque, Guglielmo. “A Null Theory of Phrase and Compound Stress.” Linguistic Inquiry 24 (1993): 239-297.
Contreras, Heles. El Orden de Palabras en Español. Madrid: Cátedra, 1978.
Contreras, Heles. “Sentential Stress, Word Order, and the Notion of Subject in Spanish.” In Linda Waugh and C.H. van Schooneveld (eds.), The Melody of Language, pp. 45-53. Baltimore: University Park Press, 1980.
Culicover, Peter, and Michael Rochemont. “Stress and Focus in English.” Language 59 (1983): 123-165.
Elordieta, Arantzazu. Verb Movement and Constituent Permutation in Basque. Utrecht: LOT, 2001.
Elordieta, Gorka. “Accent, Tone and Intonation in Lekeitio Basque.” In Fernando Martínez-Gil and Alfonso Morales-Front (eds.), Issues in the Phonology and Morphology of the Iberian Languages, pp. 4-78. Washington, DC: Georgetown University Press, 1997.
Elordieta, Gorka. “Intonation in a Pitch-Accent Dialect of Basque.” International Journal of Basque Linguistics and Philology 32 (1998): 511-569.
Elordieta, Gorka. “Intonation.” In José I. Hualde and Jon Ortiz de Urbina (eds.), A Grammar of Basque, pp. 72-113. Berlin: Mouton de Gruyter, 2003.
Elordieta, Gorka, and José I. Hualde. “The Role of Duration as a Correlate of Accent in Lekeitio Basque.” In Proceedings of Eurospeech 2001 - Scandinavia, 105-108, 2001.
Elordieta, Gorka, and José I. Hualde. “Tonal and Durational Correlates of Accent in Contexts of Downstep in Northern Bizkaian Basque.” Journal of the International Phonetic Association 33 (2003): 195-209.
Etxepare, Ricardo, and Jon Ortiz de Urbina. “Focalization.” In José I. Hualde and Jon Ortiz de Urbina (eds.), A Grammar of Basque, pp. 459-515. Berlin: Mouton de Gruyter, 2003.
Frota, Sónia. Prosody and Focus in European Portuguese. University of Lisbon: Doctoral dissertation, 1998 [Published by Garland in 2000].
Frota, Sónia. Review of Intonation, Word Order and Focus Projection in Serbo-Croatian (Godjevac 2000). Glot International 6 (2002): 251-256.
Godjevac, Svetlana. Intonation, Word Order and Focus Projection in Serbo-Croatian. Doctoral Dissertation, Ohio State University, 2000.
Haraguchi, Shosuke. A Theory of Stress and Accent. Dordrecht: Foris, 1991.
Hidalgo, Bittor. Hitz Ordenaren Estatistikak Euskaraz. Doctoral dissertation, University of the Basque Country, 1994.
Horvath, Julia. Focus in the Theory of Grammar and the Syntax of Hungarian. Dordrecht: Foris, 1986.
Hualde, José I. Euskararen Azentuerak. Bilbao: Servicio Editorial de la Universidad del País Vasco, 1997.
Hualde, José I. “Basque Accentuation.” In Harry van der Hulst (ed.), Word Prosodic Systems in the Languages of Europe, pp. 947-993. Berlin: Mouton de Gruyter, 1999.
Hualde, José I. “On System-Driven Sound Change: Accent Shift in Markina Basque.” Lingua 110 (2000): 99-129.
Hualde, José I., Gorka Elordieta and Arantzazu Elordieta. The Basque Dialect of Lekeitio. Bilbao and San Sebastián: Servicio Editorial de la Universidad del País Vasco, 1994.
Hualde, José I., Gorka Elordieta, Iñaki Gaminde and Rajka Smiljanic. “From Pitch-Accent to Stress-Accent in Basque.” In Carlos Gussenhoven and Natasha Warner (eds.), Papers in Laboratory Phonology VII, pp. 557-584. Berlin: Mouton de Gruyter, 2002.
Inkelas, Sharon, and William Leben. “Where Phonology and Phonetics Intersect: The case of Hausa Intonation.” In John Kingston and Mary Beckman (eds.), Papers in Laboratory Phonology I, pp. 17-34. Cambridge: Cambridge University Press, 1990.
Ito, Kiwako, Gorka Elordieta, and José I. Hualde. “Peak alignment and intonational change in Basque.” Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, Spain, pp. 2929-2932. Barcelona, 2003.
Jun, Sun-Ah, and Gorka Elordieta. “Intonational Structure of Lekeitio Basque.” In Antonis Botinis, Georgios Kouroupetroglou and George Carayiannis (eds.), Intonation: Theory, Models and Applications, pp. 193-196. Proceedings of an ESCA Workshop. Athens, Greece, 1997.
Kiss, Katalin É. “Introduction.” In Katalin É. Kiss (ed.), Discourse Configurational Languages, pp. 3-27. New York, Oxford: Oxford University Press, 1995.
Kiss, Katalin É. “Identificational Focus Versus Information Focus.” Language 74 (1998): 245-273.
Kubozono, Haruo. The Organization of Japanese Prosody. Tokyo: Kurosio, 1993.
Ladd, Robert D. The Structure of Intonational Meaning: Evidence from English. Bloomington, Indiana: Indiana University Linguistics Club, 1980.
Ortiz de Urbina, Jon. Parameters in the Grammar of Basque. Dordrecht: Foris, 1989.
Ortiz de Urbina, Jon. “Focus in Basque.” In Georges Rebuschi and Laurice Tuller (eds.), The Grammar of Focus, pp. 311-333. Amsterdam and Philadelphia: John Benjamins, 1999.
Pierrehumbert, Janet, and Mary Beckman. Japanese Tone Structure. Cambridge, Mass.: MIT Press, 1988.
Poser, William. The Phonetics and Phonology of Tone and Intonation in Japanese. Doctoral Dissertation, MIT, 1984.
Reinhart, Tanya. “Interface Strategies.” Manuscript, Utrecht University, 1995.
Reinhart, Tanya, and Ad Neeleman. “Scrambling and the PF Interface.” In W. Gueder and Myriam Butt (eds.), Projecting from the Lexicon. Stanford: CSLI Publications, 1998.
Rialland, Annie, and Stéphanie Robert. “The Intonational System of Wolof.” Linguistics 39 (2001): 893-939.
Selkirk, Elisabeth. “Sentence Prosody: Intonation, Stress, and Phrasing.” In John Goldsmith (ed.), The Handbook of Phonological Theory, pp. 550-569. Cambridge: Blackwell Publishers, 1995.
Uriagereka, Juan. “An F Position in Western Romance.” In Katalin É. Kiss (ed.), Discourse Configurational Languages, pp. 153-175. Oxford: Oxford University Press, 1995.
Vallduví, Enric. The Informational Component. University of Pennsylvania: Doctoral dissertation, 1990.
Vogel, Irene, and István Kenesei. “The Interface between Phonology and Other Components of Grammar: The Case of Hungarian.” Phonology Yearbook 4 (1987): 243-263.
Vogel, Irene, and István Kenesei. “Syntax and Semantics in Phonology.” In Sharon Inkelas and Draga Zec (eds.), The Phonology-Syntax Connection, pp. 365-378. Chicago: University of Chicago Press, 1990.
Zubizarreta, María Luisa. Prosody, Focus and Word Order. Cambridge, Mass.: MIT Press, 1998.
ARDIS ESCHENBERG
POLISH NARROW FOCUS CONSTRUCTIONS
1. INTRODUCTION1
Polish, a western Slavic language, is a so-called ‘free word order’ or ‘scrambling’ language. SVO ordering has been posited to be basic for Polish (Szober 1963), and a study by Klemensiewicz (1949) found the majority of isolated sentences to conform to this ordering. However, other constituent orders are still common. Variations in word order have often been explained in terms of information structure (Szwedek 1976; Willim 1989), as well as constituent length (Siewerska 1993). However, a single word order can occur with various types of information structure (Eschenberg 1999). In such cases, prosody may provide a way to distinguish between the differing information structure types. Analyses which rely on textual data or fail to consider prosody will be unable to account for cases where one word order is used for differing information structures. This paper explores Polish constructions involving focus on a single constituent, narrow focus constructions. Not only word order but also intonation, particularly sentence stress, is considered. First, declarative sentences are examined. Then wh-questions are examined. Word order alone cannot be used to account for narrow focus in Polish; prosody is crucial. Failure to consider prosody will be seen to cause confusion between construction types. Differences in word order will be shown to be motivated by different types of presupposition, as proposed by Dryer (1996). A more restricted definition of focus type offered by Kiss (1998) will be seen to apply in this situation.
C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 23–40. © 2007 Springer.
2. FOCUS AND SYNTACTIC CONSTITUENTS: NARROW FOCUS CONSTRUCTIONS
2.1. Theoretical Background
Analyses of Polish focus structure consistently refer to the syntactic structure of clauses (Szober 1963; Szwedek 1976; Willim 1989; Siewerska 1993; Eschenberg 1999). Lambrecht (1994) bases his theory of information structure on the syntactic notions of predicate, argument and sentence, which also have semantic underpinnings. His concepts of predicate focus, argument focus and sentence focus ‘evoke both differences in syntactic focus domains such as VP, NP, PP, S, and
24
ARDIS ESCHENBERG
differences in the focus portions of the pragmatically structured proposition, i.e. predicate, argument, and sentence (222).’ This captures the generalization that focus, a primarily pragmatic concept, tends to be associated with constituents which are syntactic in nature. In this theory, the syntactic domain which expresses the focus component of the pragmatically structured proposition is the focus domain of the proposition. Thus, for sentence focus constructions, the focus domain is the sentence, for predicate focus constructions it is the VP, and for argument focus constructions it is a NP or PP. Focus constructions must specify not only the focus domain, but also presupposition, assertion, and, most obviously, focus. The presupposition is the set of lexico-grammatically evoked propositions the speaker assumes the hearer knows, believes, or will take for granted at the time of the utterance (52). The assertion is the proposition expressed by a sentence which the hearer is expected to know or believe or take for granted as a result of hearing the sentence uttered (52). The focus of the assertion is the semantic component of a pragmatically structured proposition whereby the assertion differs from the presupposition (213). The following provides an example of these concepts used in an argument focus construction (from Lambrecht 1994: 228, 5.11’) Sentence: Presupposition: Assertion: Focus: Focus domain:
My CAR broke down. “speaker’s x broke down” “x = car” “car” NP
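The five-part description above can be read as a small record. The following is a minimal sketch, not part of Lambrecht's formalism: the class name `FocusConstruction`, the method `resolved`, and the `{x}` template notation for the open variable are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FocusConstruction:
    """Hypothetical record mirroring the five labels in the example above."""
    sentence: str
    presupposition: str  # open proposition; {x} marks the variable (our notation)
    assertion: str       # what the hearer is expected to learn
    focus: str           # the value that fills the variable
    focus_domain: str    # syntactic category that expresses the focus

    def resolved(self) -> str:
        """Fill the variable of the presupposition with the focus."""
        return self.presupposition.format(x=self.focus)


car = FocusConstruction(
    sentence="My CAR broke down.",
    presupposition="speaker's {x} broke down",
    assertion="x = car",
    focus="car",
    focus_domain="NP",
)

print(car.resolved())
```

On this view, the focus is what has to be added to the presupposition to obtain the full proposition, which is the subtraction-style definition that section 4.2 later argues must be revised.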
This paper explores argument focus, which has the communicative function of identifying a referent. Argument focus has also been called ‘narrow focus’ (Van Valin & LaPolla 1997) and occurs when one constituent is focal. The term narrow focus captures the fact that this constituent may be not only an actual argument (subject, object, indirect object), but also an oblique NP or PP or a nucleus (V). Narrow focus can be further divided into marked and unmarked narrow focus, where unmarked narrow focus occurs when the focal constituent occurs in the unmarked focus position in the sentence for the given language. For example, the final position² is the unmarked focus position for English. Thus, as English is SVO, objects, which occur finally, are unmarked for narrow focus. Similarly, Polish has a final focus position which is unmarked. The following section will examine narrow focus in Polish, beginning with marked narrow focus on the subject and continuing to unmarked narrow focus on the object.
2.2. Narrow Focus in Polish
Narrow focus can be elicited through the use of a wh-question calling for an argument filler. When replying to a wh-question asking about a subject, narrow focus is placed on the subject in the reply sentence. In such a response where the subject is focal, two possible replies are felicitous:
POLISH NARROW FOCUS CONSTRUCTIONS
(1) Q: Kto śpiewał?
       ‘Who sang?’
    A: a. PIOTR śpiewa-ł-Ø.
          Peter.NOM sing-PAST-3Msg
       b. Śpiewał PIOTR.
       ‘PETER sang.’
Both SV (1a) and VS (1b) ordered sentences are felicitous replies placing narrow focus on the subject.³ In each case, the subject is prosodically marked, receiving intonational prominence. Similarly, an answer containing (unmarked) narrow focus on an object can be ordered in two ways, where in each case the object is intonationally prominent.
(2) Q: Co kupi-ł-eś?
       ‘What did you buy?’
    A: a. SAMOCHÓD kupi-ł-em.
          car.ACC buy-PAST-1Msg
       b. Kupi-ł-em SAMOCHÓD.
       ‘I bought a car.’
In (2a) the object is sentence initial and prominent, and in (2b) the object is sentence final and prominent. Although the above example (2) did not involve an overt subject, a similar situation arises when an overt subject is present (3).
(3) Q: Kogo Jan kocha?
       ‘Who does Jan love?’
    A: a. Jan kocha MARI-Ę.
          John.NOM love.3sg.PRES Maria-ACC
       b. Jan MARI-Ę kocha.
       c. # MARI-Ę kocha Jan.
       d. ? MARI-Ę Jan kocha.
       ‘John loves MARY.’
Again, the object can occur in its canonical position, sentence finally (a). It can also occur pre-verbally after the subject (b), but is less felicitous sentence initially (c,d). Note that in each of the above, while the word order changes, the pitch accent placed upon the focal constituent is similar. This can be seen in a pitch curve, such as in Figure 1.
Figure 1. Comparison of pitch curves for (a) SVO and (b) SOV ordered sentences.
In Figure 1(a), the final focal object begins at 6.6 seconds as the pitch curve rises and continues until the end of the sentence. In (b) the medial focal object again begins on the ascent of the curve and continues through its peak and descent. The final constituent in each case is lengthened. Therefore, the curve associated with the final object is lengthened compared to the medial object's curve. However, the general shape and range in hertz associated with the focal object is similar for both the final and medial focal objects. Error correction paradigms provide another way to elicit narrow focus constructions, yielding similar results to wh-question elicitation (4).
(4) Q: Jan kocha Kasię.
       ‘Jan loves Kasha.’
    A: a. Nie, Jan kocha MARI-Ę.
          No, John.NOM love.3sg.PRES Maria-ACC
       b. Nie, Jan MARI-Ę kocha.
       c. ? Nie, MARI-Ę kocha Jan.
       d. # Nie, MARI-Ę Jan kocha.
       ‘No, John loves MARY.’
The error correction paradigm in (4) provides the same grammaticality judgments and intonational contours as the similar wh-question paradigm in (3). This can be seen in a comparison of the plots of the pitch curves as well (Figure 2).
Figure 2. Pitch curves of (a) wh-question and (b) error correction paradigm responses: Jan kocha MARIĘ.
Thus, for both error corrections and replies to wh-questions, variable word orderings exist in Polish. Subjects can occur initially or finally, and objects can occur pre-verbally or finally. In all cases the focal argument receives prosodic prominence.
3. PREVIOUS ANALYSES
Variability in Polish word order is not a newly discovered phenomenon. Indeed, as with many Slavic languages, Polish has been studied extensively by Prague School linguists, who call the principles underlying the flexibility in word order the “functional sentence perspective (FSP).” To describe how information is distributed in a sentence, that is, to give the information structure of sentences, Mathesius (1929: 127) divides the parts of an utterance into “theme” and “rheme.” The theme is what “one is talking about, the topic,” and the rheme is “what one says about it, the comment” (Daneš 1970: 134). These have also been explained as a distinction between new information, rheme, and given information, theme. Using the latter interpretation, Szwedek (1976: 51) states that it is “not true that order of sentence elements in Polish is free or is a matter of style,” but that it is “strictly determined” and “reflects the organization of the utterance according to the new/given information distribution which, of course, is dependent on the context and situation.” Thus, for the above, both the focal object and focal subject are predicted to come last due to this organization. Szwedek notes that canonically ordered, in-situ focus (SVO) is more colloquial or conversational for focal subjects than focus final placement (VOS). He does not discuss SOV constructions. Variation in word-ordering has also been studied by linguists from other schools of thought. Willim (1989: 38) notes that subjects are often introduced into discourse in final position.
She calls these VOS ordered sentences ‘presentational.’ However, she does not note which argument in these presentational sentences is prosodically prominent, and, thus, it is difficult to apply her analysis to the above. In her analysis, OSV ordering with a prosodically prominent object is a case of ‘topicalization,’ where the object is focal and non-presupposed (122-3). Like Szwedek, she also does not discuss SOV constructions. Neither of these analyses completely accounts for all of the variation seen in (1-4).
4. EFFECTS OF PRESUPPOSITION
4.1. Theoretical background
Although the alternative felicitous word orderings in the above examples could be strictly equivalent variations, this is not necessarily so. Just as the above Polish wh-questions can be replied to using two different constructions, English wh-questions can also be felicitously answered by two different constructions. Dryer (1996) notes that both simple focus (SVO) sentences and cleft sentences can serve as answers to English wh-questions (5, adapted from Dryer 1996: 486).
(5) Q: Who saw John?
    A: a. MARY saw John.
       b. It was MARY who saw John.
Both the simple focus sentence (5a) and the cleft sentence (5b) felicitously answer the wh-question. The speaker of the wh-question believes that someone saw John and is asking who that person is. Lambrecht (1994: 283) notes that the speakers of wh-questions typically presuppose that there is an answer which fulfills the question. He states that one does not normally ask questions one does not expect answers to. However, the replies to the wh-question do not necessarily contain this presupposition. Dryer (1996: 188), following Rochemont (1986: 130), claims that cleft constructions necessarily contain this pragmatic presupposition but simple focus sentences do not. In the above, (5b) necessarily presupposes that someone saw John but (5a) does not. The following provides an overview of Dryer’s arguments as relevant to this paper. The presuppositional content of the replies becomes apparent in situations where the question does not contain such a pragmatic presupposition. Example (6), adapted from Dryer (1996: 510), involves a question where the speaker does not assume that someone did in fact see John.
(6) Q: Did anyone see John?
    A: a. MARY saw John.
       b. #It was MARY that saw John.
Note that only the simple focus sentence (6a) is a felicitous reply to the question when there is no presupposition that someone saw John. The cleft cannot be used as a felicitous reply (6b) because it inherently presupposes that someone did see John, and the question does not presuppose this. Although the questions in (5) and (6) differ in presupposition, both activate the proposition ‘someone saw John.’ In (5), the first speaker believes this proposition to be true; s/he presupposes it. In (6), the speaker does not have such a belief. Therefore, the presupposition cannot be part of the common ground between the two speakers. When the presupposition of the answer is negated in the reply, the cleft cannot occur (7).
(7) Q: Who saw John?
    A: a. NOBODY saw John.
       b. #It was NOBODY that saw John.
The answer (7a) does not presuppose that someone saw John. In fact, it asserts just the opposite, that no one saw John. The cleft cannot felicitously assert this because the cleft contains a presupposition that someone saw John (Rochemont 1986: 130, Dryer 1996: 188). Thus, while clefts inherently contain pragmatic presupposition, simple focus sentence answers do not.
4.2. Effects of presupposition in Polish
Turning to Polish, all the paradigms presented so far elicit replies which may contain pragmatic presupposition. For the wh-question, one argument of the presupposed proposition is not known, but assumed to exist. For the error correction paradigm, the argument is incorrectly assumed and must be corrected. (8) provides an example in Polish of a question which does not contain such a pragmatic presupposition.
(8) Q: Czy ktoś śpiewał?
       ‘Did anyone sing?’
    A: a. PIOTR śpiewa-ł-Ø.
          Peter.NOM sing-PAST-3Msg
       b. #Śpiewał PIOTR.
       c. ŚPIEWAŁ Piotr.
       ‘Peter sang.’
Similar to (6), the question in (8) does not contain the presupposition that someone actually sang. The speaker of the question has activated the proposition ‘someone sang,’ but does not necessarily believe it to be true. The SV ordered sentence with focus on the subject is felicitous (8a). It contains but does not presuppose the proposition that someone sang. However, VS ordering is not felicitous if prosodic prominence is placed on the subject (8b). Behaving analogously to an English cleft construction, the VS construction presupposes that someone sang and cannot felicitously answer a question which does not contain such a presupposition. To use the VS construction would entail that the presupposition is part of the common ground between speakers, but the question shows that it is not. The VS ordering is felicitous if the sentential stress is perceived to be on the verb (8c). This, however, is not a case of narrow focus on just the subject; rather, the entire sentence is in focus. Indeed, in a spectrogram, the pitch curve actually shows stress on both the verb and the subject in such a construction. While (8a) places narrow focus on Piotr, (8c) places focus on the entire proposition. Both necessarily assert that someone sang as the question does not presuppose this. However, one focuses on the actor, entailing the event, and the other focuses on the entire event. Narrow focus on subjects can occur with either sentence initial or sentence final subjects (1). However, sentence final subjects contain pragmatic presupposition which their sentence initially placed counterparts do not (8). Focal objects have also been seen to occur both initially and finally (2). The following explores the effects of presupposition on object word order (9).
(9) Q: Czy Jan kocha kogoś?
       ‘Does John love anyone?’
    A: a. Jan kocha MARI-Ę.
          John.NOM love.PRES.3sg Mary-ACC
       b. #Jan MARI-Ę kocha.
       ‘Jan loves MARY.’
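The pattern in (8) and (9) can be restated as a simple check against the common ground. The following is only a sketch under stated assumptions: the function names, the set `CANONICAL_ORDERS`, and the string encoding of propositions are hypothetical, and the sketch covers only replies with prosodic prominence on a single argument (it does not model the sentence-focus reading of (8c)).

```python
from typing import FrozenSet, Optional

# Assumption: in-situ orders for a prominent argument, as discussed in the text.
CANONICAL_ORDERS = {"SV", "SVO"}


def presupposition(word_order: str, open_proposition: str) -> Optional[str]:
    """Return the proposition a reply presupposes, or None if it presupposes nothing.

    Non-canonical orders with a prominent argument are treated as carrying
    the pragmatic presupposition; canonical (in-situ) orders are not.
    """
    return None if word_order in CANONICAL_ORDERS else open_proposition


def is_felicitous(word_order: str, open_proposition: str,
                  common_ground: FrozenSet[str]) -> bool:
    """A reply is felicitous unless it presupposes something outside the common ground."""
    needed = presupposition(word_order, open_proposition)
    return needed is None or needed in common_ground


# A wh-question ('Who sang?') puts 'someone sang' in the common ground;
# a yes/no question ('Did anyone sing?') only activates it.
wh_context = frozenset({"someone sang"})
polar_context: FrozenSet[str] = frozenset()

for order in ("SV", "VS"):
    print(order, is_felicitous(order, "someone sang", wh_context),
          is_felicitous(order, "someone sang", polar_context))
```

The design choice here is to attach the presupposition to the word order itself rather than to the question, which is how the text accounts for the contrast between (1b) and (8b).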
In example (9), the reply to a question with object focus but no pragmatic presupposition felicitously occurs only with canonical ordering (SVO), as in (9a). SOV ordering, similar to a cleft construction in English, cannot felicitously answer the question. Thus, it can be seen that non-canonical word orderings with prosodic prominence on an argument entail pragmatic presupposition. Canonically ordered SVO sentences with prosodic prominence on an argument do not entail such a presupposition. Without pragmatic presupposition, focus must occur in-situ, that is, the word ordering must be SVO. In all constructions, the focal constituent receives prosodic prominence. Examples such as (8) and (9) necessarily lead to a revision of Lambrecht’s formulations of assertion, presupposition and focus. Presupposition is not simply the set of lexico-grammatically evoked propositions the speaker assumes the hearer knows, believes, or will take for granted at the time of the utterance (Lambrecht 1994: 52). Rather, it is only the set of propositions that the speaker assumes the hearer believes at the time of the utterance. His definition of assertion as the proposition expressed by a sentence which the hearer is expected to know or believe or take for granted as a result of hearing the sentence uttered still holds true (52). However, the focus can no longer be defined as the semantic component of a pragmatically structured proposition whereby the assertion differs from the presupposition (213). Both (8a) and (8c) assert that someone sang and that Piotr is the person who sang. Neither contains a presupposition about the beliefs of the hearer. However, the focus in these two constructions is not the same. In (8a), the focus is the argument ‘Piotr’ and in (8c) it is the entire sentence. Focus is determined not by subtracting presupposition from assertion but rather by prosody.
5. IDENTIFICATIONAL AND INFORMATIONAL FOCUS
5.1. The phenomenon
Somewhat similar to Dryer (1996), Kiss (1998) also distinguishes between two types of focus, ‘identificational focus’ and ‘informational focus,’ using presupposition. Identificational focus conveys the exhaustive subset of a set of contextually or situationally given elements for which the predicate phrase holds. Informational focus conveys new, non-presupposed information (245-6). Informational focus is not associated with movement, and, although all sentences contain information focus, not all contain identificational focus (246).
Using tests developed by Szabolcsi (1981) and Farkas (p.c. to Kiss 1998), Kiss demonstrates that identificational focus expresses exhaustive identification in Hungarian pre-verbal focus constructions and in English cleft sentences. One test involves a pair of sentences where the first contains two coordinated objects and the second contains only one of the two objects. If the second sentence involves exhaustive identification, it cannot be a logical entailment of the first. That is, if the second sentence expresses exhaustive identification, it contradicts the first. The following provides such a test in Polish using both canonical and non-canonical word order.
(10) A: Jan kupi-ł-Ø CHLEB i MASŁO.
        Jan.NOM buy-PAST-3Msg bread.ACC and butter.ACC
        ‘Jan bought BREAD and BUTTER.’
     B: On kupi-ł-Ø MASŁO.
        he buy-PAST-3Msg butter.ACC
        ‘He bought BUTTER.’
(11) A: Jan CHLEB i MASŁO kupi-ł-Ø.
        Jan.NOM bread.ACC and butter.ACC buy-PAST-3Msg
        ‘It was BREAD and BUTTER Jan bought.’
     B: On MASŁO kupi-ł-Ø.
        he butter.ACC buy-PAST-3Msg
        ‘It was BUTTER he bought.’
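The logic of the test in (10) and (11) can be stated in terms of sets. The following is a minimal sketch, assuming that the items bought can be modelled as sets of labels; the function name `entails` and the `exhaustive` flag are hypothetical and are not taken from Szabolcsi or Kiss.

```python
from typing import FrozenSet


def entails(first: FrozenSet[str], second: FrozenSet[str], exhaustive: bool) -> bool:
    """Does 'Jan bought <first>' entail 'He bought <second>'?

    Non-exhaustive reading: the second sentence follows whenever its items
    are among the items of the first.
    Exhaustive reading: each sentence names the complete set of items bought,
    so the second follows only if the two sets are identical.
    """
    if exhaustive:
        return first == second
    return second <= first


bread_and_butter = frozenset({"bread", "butter"})
butter_only = frozenset({"butter"})

# Pair (10), canonical order, non-exhaustive reading.
print(entails(bread_and_butter, butter_only, exhaustive=False))
# Pair (11), pre-verbal object, exhaustive reading.
print(entails(bread_and_butter, butter_only, exhaustive=True))
```

Under the exhaustive reading the shorter sentence names a different complete set, which is why it contradicts the longer one instead of following from it.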
While (10B) is a logical consequence of (10A), (11B) is not a logical consequence of (11A). (11B) contradicts (11A) as (11B) asserts an exhaustive set which is not equal to the exhaustive set of (11A). Thus, SOV sentences in Polish are instances of identificational focus and SVO sentences are instances of informational focus. Kiss’s prediction that informational focus is not associated with movement (or non-canonical word ordering) is thus upheld. Kiss also shows that the identificational focus position in Hungarian is not available for certain types of constituents, such as ‘also’ phrases, ‘even’ phrases and the existential quantifiers ‘somebody/something’ (251). This also proves true for pre-verbal focal objects in Polish (12).
(12) a. #Jan TEŻ SWETR kupi-ł-Ø.
        Jan.NOM also sweater.ACC buy-PAST-3Msg
        *‘It was ALSO A SWEATER Jan bought.’
     b. #Jan NAWET SWETR kupi-ł-Ø.
        Jan.NOM even sweater.ACC buy-PAST-3Msg
        *‘It was EVEN A SWEATER Jan bought.’
     c. #Jan COŚ kupi-ł-Ø.
        Jan.NOM something.ACC buy-PAST-3Msg
        *‘It was SOMETHING Jan bought.’
The preverbal focal object placement is not felicitous for ‘also’ phrases (12a), ‘even’ phrases (12b) and an existential quantifier (12c). All of these constructions are possible for final objects (13).
(13) a. Jan kupi-ł-Ø TEŻ SWETR.
        Jan.NOM buy-PAST-3Msg also sweater.ACC
        ‘Jan bought ALSO A SWEATER.’⁴
     b. Jan kupi-ł-Ø NAWET SWETR.
        Jan.NOM buy-PAST-3Msg even sweater.ACC
        ‘Jan bought EVEN A SWEATER.’
     c. Jan kupi-ł-Ø COŚ.
        Jan.NOM buy-PAST-3Msg something.ACC
        ‘Jan bought SOMETHING.’
Whereas focal objects placed non-canonically were not felicitous for such phrases, focal objects in-situ (clause final) are felicitous for ‘also’ phrases (13a), ‘even’ phrases (13b), and an existential quantifier (13c). Thus, the identificational focus constructions are not felicitous, but the informational focus constructions, which are not associated with movement, are felicitous in these examples. In the analyses in sections 2 and 4, both focal subjects and objects were found to behave in similar ways based on in-situ versus non-canonical word ordering and focus. Although Kiss does not explore subjects, a thorough investigation of the Polish phenomena presented thus far requires such an examination. The following presents sentences similar to (12, 13) involving focal subjects rather than focal objects.
(14) a.  MARIA TEŻ śpiewa-ł-a.
         Maria.NOM also sing-PAST-3Fsg
     a’. Śpiewała TEŻ MARIA.
         ‘MARIA ALSO sang.’
     b.  NAWET JAN śpiewa-ł-Ø.
         even Jan.NOM sing-PAST-3Msg
     b’. Śpiewał NAWET JAN.
         ‘EVEN JAN sang.’
     c.  #KTOŚ śpiewa-ł-Ø.
         someone.NOM sing-PAST-3Msg
     c’. Śpiewał KTOŚ.
         ‘SOMEONE sang.’
‘Also’ phrases (14a, 14a’) and ‘even’ phrases (14b, 14b’) are felicitous for focal subjects regardless of whether the subject is placed initially or finally. Although in such constructions focal objects could only occur in the canonical position of informational focus (13), focal subjects can occur in canonical or non-canonical positions. However, focal existential quantifiers are not felicitous in initial position (14c) but are felicitous in final position (14c’).
Although the felicity judgements of (14c, 14c’) seem very odd considering the results seen earlier, Kiss notes that existential quantifiers cannot function as either identificational or informational focus. Thus, (14c’) must be a different type of construction; it cannot be an identificationally focused final subject as in (1b). Indeed, it is a presentative with a pitch accent on the introduced element ktoś. This is an example of the VOS ordered sentences Willim (1989: 38) refers to. These constructions introduce a new element rather than providing the contrastive reading (section 3) of identificational focus due to exhaustive identification. Here, rather than exhaustive identification, a constituent is introduced. Similarly, the non-canonically ordered subjects in (14a’) and (14b’) are not examples of identificational focus, but rather presentatives. In the earlier examples (1, 2, 3, 4) informational focus and identificational focus constituents have similar pitch accents but different word orderings. This is confirmed by both native speaker judgment and spectrographic analysis (Figure 1). However, speakers do not judge the SV ordered and VS ordered sentences in (14) to have the same pitch accents. Whereas speakers state that in (14a) and (14b) the strongest pitch accent is on the adverb (and a lesser pitch accent occurs on the noun⁴), they consistently judge (14a’) and (14b’) to place the strongest pitch accent on the noun (and a lesser pitch accent on the adverb). Spectrographic analysis confirms speaker judgments of prosodic prominence (Figure 3).
Figure 3. Pitch curves of (14a) and (14a’), ‘also’ phrases with prosodically prominent subjects.
In Figure 3, the highest points in the pitch curve differ for (a) and (b). In (a), the highest point is over ‘also,’ but in (b) it is over ‘Maria.’ This confirms native speaker judgements. Identificational focus with a subject noun phrase results in prosodic prominence on the adverb in ‘also’ and ‘even’ phrases. Speakers also judge the strongest pitch accent to be on the adverb in such constructions when the object is focal (13).
That (14a’) and (14b’) are not identificational focus is further supported by the fact that their pitch curves differ from clear examples of subject identificational focus (Figure 4).
Figure 4. Comparison of pitch curves for a final focal subject (a) and presentational final subject (b).
Whereas Maria begins when the pitch curve is already mid-ascent (4.5 sec.) in the focal subject construction (a), it begins on the lowest point of the pitch curve (7.96 sec.) in the presentative construction (b). That is, a local minimum occurs in the pitch curve well before the subject in (a) but coincides with the subject in (b). The fact that the pitch curves are not identical is due to the fact that the VS sentences in (14) are not instances of identificational focus, but rather are presentational constructions. Thus, careful analysis of prosody can distinguish between sentence final identificational focus subjects and sentence final presentational subjects. Thus, in Kiss’ analysis, SV(O) and SVO sentences are examples of informational focus, while SOV and VOS sentences are instances of identificational focus. Additionally, VOS sentences can occur as sentences involving introduction of a constituent.
5.2. Identificational focus versus focus with pragmatic presupposition
Although both Dryer’s and Kiss’ analyses are able to distinguish between the variations found in section 2, they are not necessarily identical. Both concur that informational focus (simple focus in Dryer’s terms) conveys non-presupposed information. However, whereas Dryer explicitly states that clefts contain pragmatic presupposition that involves belief and not simply activation, Kiss states that identificational focus may convey contextually or environmentally given elements. Crucially for these constructions, Dryer examines the non-focus portion of the sentence, while Kiss considers the focal portion. That is, Dryer concentrates on what is presupposed by the sentence while Kiss considers what is asserted. While Dryer notes that a cleft necessarily presupposes a proposition, Kiss notes that identificational focus asserts all the variables that fulfil this proposition. For example, for the sentence ‘it is Jan that sang,’ Dryer’s analysis shows that this construction presupposes the proposition that someone sang. Kiss’ analysis shows that Jan is the only person who sang. Thus, their insights are complementary. Together, they yield a larger picture of this construction, giving both its presupposition and assertion. However, identificational focus does not always lead to an exhaustive set of variables cross-linguistically. In languages such as Finnish, Kiss notes that identificational focus may or may not be exhaustive (1998: 271). Thus, ultimately, a [+exhaustive] feature must be noted to truly account for the phenomenon of identificational focus (or focus with presupposition) in Polish.
6. RELATED PHENOMENA
6.1. Clitic pronouns
Further evidence supporting a distinction between identificational and informational focus can be found in the Polish pronoun system. Polish object (accusative case) pronouns have two forms for the second person singular. These are the long form ciebie and the short form cię. Ciebie is used to give emphasis, to point out that it is only you of all the possible people. This coincides with Kiss' identificational focus, where the one person from a group is being pointed out.
(15) a. Ewa kocha cię.
        Ewa.NOM loves you.ACC
        ‘Ewa loves you.’
     b. Ewa (TYLKO) CIEBIE kocha.
        Ewa.NOM only you.ACC loves
        ‘Ewa loves (ONLY) YOU.’
     c. #Ewa cię kocha.
     d. #Ewa kocha (TYLKO) CIEBIE.
Accordingly, use of ciebie coincides with the structure and intonation used for identificational focus. It is placed in non-canonical position, pre-verbally, and given prosodic stress (15b). It is less felicitous in the canonical (final) object position reserved for informational focus (15d). Conversely, the non-presupposed cię occurs most felicitously in canonical object position (15a) and less felicitously pre-verbally (15c). This phenomenon further supports the above analysis of identificational versus informational focus in Polish.
6.2. Wh-questions
In the literature, wh-questions are often assumed to be a type of narrow focus with properties similar to non-wh focus. For example, Kiss (1998: 249) states that for Hungarian, a wh-phrase other than ‘why’ is ‘always placed in the preverbal identificational focus position…’ However, she notes that wh-questions can be answered by identificational or informational focus. This leads to an ambiguity as to whether wh-question words are a type of identificational focus or not. Polish, however, provides clear evidence that wh-focus is not the same as identificational focus in a declarative (16).
(16) a. KTO umar-ł-Ø?
        who.NOM die-PAST-3Msg
        ‘Who died?’
     b. *Umar-ł-Ø KTO?
     c. UMAR-Ł-Ø kto?
        ‘Did anyone die?’
In (16a), the felicitous wh-question, the subject is both initial and focal. This is similar to the informational focus position of a subject (14a,b). It is unlike identificational focus subjects, which have been seen to occur finally (8). In (16b) the focal subject is final and the resulting sentence is ungrammatical. Example (16c) shows that a ‘wh’ subject can occur finally, but only when it is not prosodically prominent, or focal. In such a case, it also does not receive a wh-reading. Unlike in Hungarian, Polish focal wh-subjects are clearly not in the identificational focus position. The fact that (16c) does not have a wh-reading can be seen by looking at its felicitous answers:
(17) Q:    UMAR-Ł-Ø kto?
     A:    MARIA umar-ł-a.
           Maria.NOM die-PAST-3Fsg
     A’:   UMAR-Ł-A Maria.
     A”:   #Umar-ł-a MARIA.
           ‘Maria died.’
     A’”:  ?MARIA.
           ‘Mary.’
     A””:  Nie.
           ‘No.’
Only answers which do not presuppose that someone did indeed die are felicitous, such as (A) with canonical order and prosodic prominence on the subject (informational focus). (A”), an example of identificational focus, has the pragmatic presupposition that someone died and is not felicitous. The answer ‘no’ (A””) is a felicitous reply here but would not be for the wh-question ‘who sang?’ This paradigm proves different from an actual wh-question, such as (1), and, rather, is similar to a question involving an indefinite pronoun (8). This focal wh-question/non-focal indefinite pronoun patterning can also be seen in Siouan languages such as Omaha and Lakhota, where words which function as wh-words when focal act as indefinites when non-focal. Just as focal wh-subjects occur initially (16), focal wh-objects also occur initially (18):
(18) a. CO Jan kupi-ł-Ø?
        what.ACC Jan.NOM buy-PAST-3Msg
     b. *Jan kupi-ł-Ø CO?
        ‘What did Jan buy?’
     c. KUPI-Ł-EŚ co?
        buy-PAST-2Msg what.ACC
        ‘Did you buy anything?’
Similar to wh-subjects, wh-objects must occur initially and be prosodically accented to receive a wh-reading (18a). Wh-objects in final position, which is the canonical, informational focus position for non-wh-objects, cannot receive prosodic prominence or a wh-word reading (18b). The wh-object word can occur finally but in this case it is not prosodically prominent and functions as an indefinite and not a wh-word (18c). Again, it can be seen that the grammatical wh-word placement is not equivalent to the identificational focus position. Identificational focus objects are placed pre-verbally but after the subject, SOV, as in (11). Here, the wh-word is before the subject, WHSV, (18a). Thus, it can be seen that wh-focus differs from non-wh-focus. It requires initial position in the sentence, regardless of what type of argument the wh-word is. Prosody importantly distinguishes grammatical and ungrammatical final placement of a wh-word. Even grammatical sentence-final occurrence of a wh-word does not yield a wh-reading and does not have prosodic prominence (focus) on the wh-word.
7. SUMMARY AND CONCLUSION
Word order and prosody intertwine to create different focus constructions in Polish. An analysis based on only one or the other fails, as both are integral to Polish focus. For narrow focus constructions, when word order differs, focal constituents can have similar pitch accents (1, 2, Figure 1). In other constructions, the same word order may involve differing prosody based on focus type (14). Failing to consider prosody as well as word order results in an inability to draw relevant conclusions about the word order felicitousness (8, 11, 14). Also, it has been seen that narrow focus in Polish involves a finer distinction than provided by a theory such as Lambrecht (1994), which is based on syntactic constituenthood and semantic role. Under Lambrecht's theory, different word orderings seem interchangeable (1, 2).
However, these differing word orderings function in distinct ways (8, 9). In order to distinguish between constructions which differ in word order but not in the prosodically prominent constituent (for example, 1a and 1b), Dryer’s notion of presupposition proves valuable (8, 9). Kiss’ definition of identificational focus proves equally applicable (10, 11). In both cases, a stipulation that the construction provides exhaustive identification needs to be integrated. In addition to refining the concept of narrow focus to include presupposition, Kiss’ analysis provides that movement is not associated with informational focus. Supporting this, in Polish informational focus occurs in-situ (10, 14), while identificational focus is associated with non-canonical position (11, 13). Use of Polish clitics versus full pronouns provides additional evidence for the distinction between informational and identificational focus (15). However, Kiss’ observation that wh-words in Hungarian tend to occur in the identificational focus position does not hold for Polish. A different type of focus, wh-word focus, behaves differently than focus in declaratives. Wh-word focus in wh-questions entails placing the wh-word in initial position and giving it prosodic prominence. This is true regardless of the argument type of the wh-word. Again, accounting for prosody proved crucial in that a non-prosodically prominent wh-word can occur sentence finally. However, in this case, an indefinite and not a wh-reading is attained. Thus, Polish, as a flexible word order language, provides an ideal testing ground for theories of focus. Just examining prosodic accent on single constituents leads to evidence for identificational focus, informational focus, wh-question focus, and presentatives. Word order and/or prosody can distinguish each; there are no overlaps where two constructions are homophonous and only distinguishable through context. Positing a focus position applicable regardless of semantico-syntactic roles proves valid for wh-words, but not for other forms of narrow focus.
The position of constituents involved in presentatives, informational focus and identificational focus is best explained as in-situ versus non-canonical position, rather than as fixed positions. Table 1 provides a summary of the word orders and prosody involved for the constructions examined in this paper.
Table 1. Different focus constructions in Polish and their syntactico-prosodic realization
Focus type | Polish manifestation
Narrow focus on subject, informational focus | SV(O)
Narrow focus on subject, identificational focus | V(O)S, pitch curve minimum at the beginning of S
Presentative subject construction | V(O)S, pitch curve minimum before the beginning of S
Narrow focus on object, informational focus | SVO
Narrow focus on object, identificational focus | SOV
wh-question focus | WH(S)V(O)
Ardis Eschenberg
University at Buffalo
Nebraska Indian Community College
8. NOTES
1 I would like to thank Janina Aniszewska, Jolanta Łapat, Małgorzata Łapat, Czesław Prokopczyk, and Piotr Szewczyk for their patience, teaching and insight into the Polish language. Any mistakes here are the responsibility of the author, but all the truth obtained is due to the kindness of these consultants. I would also like to thank Daniel Büring for his insightful comments.
2 Final position in the core, not the clause, where the core consists of the predicate and its arguments.
3 Bold underline represents prosodically accented constituent. Small caps are used to indicate sentence stress in sample sentences.
4 The stronger pitch accent is indicated by bold small caps, while the lesser is in small caps.
9. REFERENCES
Daneš, František. “One instance of Prague school methodology: functional analysis of utterance and text.” In Paul L. Garvin (ed.), Method and Theory in Linguistics. Paris: Mouton & Co, 1970.
Dryer, Matthew. “Focus, pragmatic presupposition, and activated propositions.” Journal of Pragmatics 26 (1996): 475-523.
Eschenberg, Ardis. Focus in Polish. M.A. thesis. University at Buffalo, 1999.
Kiss, Katalin. “Identificational versus information focus.” Language 74.2 (June 1998): 245-273.
Klemensiewicz, Zbigniew. Lokalizacja podmiotu i orzeczenia w zdaniach izolowanych. Biuletyn PTJ 9 (1949): 8-19.
Lambrecht, Knud. Information Structure and Sentence Form: a theory of topic, focus, and the mental representations of discourse referents. New York: Cambridge University Press, 1994.
Mathesius, Vilém. Functional linguistics. In M. Mayenova, ed., O spójności tekstu, pp. 121-42. Warsaw: 1987.
Siewierska, Anna. “Syntactic weight vs. information structure and word order variation in Polish.” Journal of Linguistics 29.2 (1993): 233-266.
Szabolcsi, Anna. “The semantics of topic-focus articulation.” In Jan Groenendijk, Theo Janssen, and Martin Stokhof (eds.), Formal methods in the study of language, pp. 513-41. Amsterdam: Mathematisch Centrum, 1981.
Szober, Stanisław. Gramatyka języka polskiego. Warsaw: PWN, 1963.
Szwedek, Aleksander. Word Order, Sentence Stress and Reference in English and Polish. Edmonton: Linguistic Research, Inc, 1976.
Van Valin, Robert and Randy LaPolla. Syntax: Structure, meaning and function. New York: Cambridge University Press, 1997.
Willim, Ewa. On word order: a government binding study of English and Polish. Krakow: Uniwersytet Jagielloński, 1989.
DAVID GIL
INTONATION AND THEMATIC ROLES IN RIAU INDONESIAN*
1. INTRODUCTION What kinds of meanings may be expressed by intonation? There is general agreement that intonation may convey emotions, and, related to this, speakers’ attitudes towards the propositional content of utterances. It is also well-known that certain intonation contours may be associated with specific speech acts such as questions. Moreover, as reflected by the title of this volume, intonation may encode various pragmatic functions such as topic and focus. Another, rather more indirect way in which intonation may express meanings is via its relationship to syntactic structure. In general, intonation contours parse an utterance into intonation groups, which correspond closely, albeit not always perfectly, to syntactic constituents. However, in many cases, a given string of words may be associated with two or more different constituent structures, each of which in turn is associated with a different meaning. In such cases, the different syntactic structures and corresponding meanings may be reflected by different intonation groups. Nevertheless, the range of meanings expressible by intonation is highly constrained. For example, no language has intonation contours which, when applied to any sentence, add meanings such as past tense, ‘in the rain’, or ‘because John came to the party’. Thus, a major goal of any theory of intonation must be to determine the set of meanings potentially encodable by intonation in one or more human languages. This paper contributes to the above goal through the examination of one specific semantic domain, namely thematic roles: actor, undergoer, goal and the like. Most commonly, thematic roles are encoded with various morphosyntactic features, typically some combination of word order, case marking and verbal agreement. One might wonder whether there are any languages in which thematic roles can also be expressed by means of intonation. 
This paper addresses the question through an empirical examination of intonation and thematic roles in one particular language, namely the Riau dialect of Indonesian. The results of the study are negative: no evidence is found that might point towards any correlation between intonation and thematic roles in Riau Indonesian. This, in turn, is suggested to lend greater cogency
C. Lee et al. (eds.), Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 41–68. © 2007 Springer.
to the question whether in fact it is possible in any language for thematic roles to be encoded by intonation. 2. THE STATE OF THE ART What is known with regard to the relationship between intonation and thematic roles? At present, I am not familiar with a single study in the linguistic literature showing the existence of a language in which intonation can be used to encode thematic roles. An email query on the LINGTYP Discussion List (22 March 2001) seeking references to such studies produced no clear cases. However, the email query did reveal the presence of a common belief that languages in which intonation may distinguish between thematic roles “ought to” exist; some potential examples that were suggested include Hebrew, Persian, Russian and Italian. In Hebrew, for example, if a number of morphosyntactic variables are set right, it is possible to construct sentences exhibiting actor-undergoer ambiguities, such as the following: (1)
Kelev radaf yeled
dog:M chase:PST:3:SG:M child:M
(i) ‘A dog chased a boy’
(ii) ‘A boy chased a dog’
Speakers of Hebrew occasionally claim that the two meanings can be distinguished by intonation. But when asked how, they do not provide systematic answers. In general, the most readily available interpretation is that in which the actor precedes the undergoer, as in (1/i) above. In order to obtain the less readily available interpretation, that in (1/ii), speakers of Hebrew sometimes offer a distinctive intonation contour, involving greater pitch variation and greater duration for certain syllables. However, when questioned, they will generally concede that even with the distinctive intonation contour, the sentence can also be understood as in (1/i); and then they will often admit that even with an ordinary intonation contour, the sentence can also be understood as in (1/ii). Similar facts are reported also for Persian and other Middle-Eastern languages by Stilo (1984, personal communication). As suggested by the above, there would seem to be a rather striking mismatch between the widespread conviction that intonation can be used to differentiate between thematic roles, and the absence of any detailed empirical studies testing the veracity of such claims. To the best of my knowledge, then, this paper represents the first attempt to subject the possible relationship between intonation and thematic roles to systematic empirical investigation. 3. RIAU INDONESIAN Riau Indonesian is the variety of Indonesian spoken in informal situations by the inhabitants of Riau province in east-central Sumatra. Riau Indonesian is quite
different from Standard Indonesian, familiar to many general linguists from a substantial descriptive and theoretical literature.1 One of the most salient characteristics of Riau Indonesian is the absence of obligatory morphosyntactic coding for a wide range of categories which play a central role in the grammars of many other languages. In particular, there is no obligatory morphosyntactic device for distinguishing thematic roles: word order is flexible, and there is no case-marking or morphological agreement. Thus, in a simple clause, a given expression denoting a participant in an activity could bear any thematic role whatsoever with respect to that activity: it could be the actor or the undergoer, or it could stand in any other semantic relationship that makes sense in the given context. Indeed, it is only context that enables the hearer of such utterances to interpret them in appropriate ways. Below are some examples of Riau Indonesian sentences illustrating thematic role indeterminacy. These examples, and all the Riau Indonesian examples that follow in this paper, are from a corpus of naturalistic texts. As abstract sentences, each of the following examples is indeterminate with respect to thematic roles; however, as actual utterances, each is associated with a specific interpretation, as indicated in the translation. Since the interpretation of the utterance is heavily context-dependent, the context is also indicated, right above the translation, within square brackets.2 (2a)
Beli aku laser, ‘kan
buy 1:SG laser Q
[Contemplating a shopping trip]
‘I’ll buy a laser, right’
(b)
Beli nasi goreng aku
buy rice fry 1:SG
[Group of people decide they want to play cards; somebody tells speaker to go out and buy some; speaker objects on the grounds that it’s somebody else’s turn to go out]
‘I bought the fried rice’
(3a)
Saya pakai kaca mata, Vid
1:SG use glass eye FAM|David
[Speaker putting on a new pair of glasses]
‘I’m wearing my glasses, David’
(b)
Honda pakai abang Elly
motorcycle use elder.brother Elly
[Interlocutor tells speaker to go and buy food; speaker doesn’t budge; interlocutor asks speaker why he isn’t going; speaker explains]
‘Elly’s using the motorcycle’
(4a)
Si Pai aku usir
PERS Pai 1:SG send.away
[Complaining about his younger brother Pai, who won’t have anything to do with him]
‘Pai sent me away’
(b)
Abang dia sendiri dia usir
elder.brother 3 one-AG-stand 3 send.away
[Complaining about his younger brother Pai, who won’t have anything to do with him]
‘His very own brother he sent away’
In each of the above examples, a word denoting an activity is in boldface, and its two associated participants are in italics. In (2) the activity word occurs before its two participants, in (3) it occurs between them, and in (4) it occurs after them both. Within each of the three sentence pairs, the activity word is the same; however, the actor precedes the undergoer in the first sentence while following it in the second sentence. Thus, in (2a) actor aku ‘I’ precedes undergoer laser ‘laser’ while in (2b) actor aku ‘I’ follows undergoer nasi goreng ‘fried rice’; in (3a) actor saya ‘I’ precedes undergoer kaca mata ‘glasses’ while in (3b) actor abang Elly ‘Elly’ follows undergoer Honda ‘motorcycle’; and in (4a) actor si Pai ‘Pai’ precedes undergoer aku ‘I’ while in (4b) actor dia ‘he’ follows undergoer abang dia sendiri ‘his very own brother’. Thus, each of the three sentence pairs constitutes a near minimal pair illustrating the indeterminacy of thematic role assignment. Together, sentences (2) - (4) show that in a basic sentence consisting of activity, actor and undergoer, these three items may occur in any of the six possible orders. Similar facts obtain also with respect to other thematic roles. Examples such as the above occur frequently in the corpus; other similar examples are cited in Gil (1994:181, 1999:191-193, 2002b:246-249). Thus, sentences such as these point towards the conclusion that in Riau Indonesian, grammar does not provide any obligatory grammatical means for distinguishing between thematic roles.3 Given the kind of indeterminacy present in examples such as the above, it is only natural to wonder whether intonation might play a role in differentiating between various interpretations.
In fact, practically every time I have presented examples such as the above in lectures, somebody in the audience has asked whether it isn’t perhaps the case that different interpretations involving different assignments of thematic roles might be distinguishable by means of different intonation contours. However, the answer to this question is a simple, straightforward ‘no’: intonation does not and cannot differentiate between different assignments of thematic roles in Riau Indonesian. Thus, for example, in sentences such as those in (2) - (4), there are no systematic differences between the intonation contours of the (a) sentences, in
which the actor precedes the undergoer, and the (b) sentences, in which the actor follows the undergoer. Here the matter should rest, but unfortunately it does not always do so. Rather, many scholars continue to hold steadfast to the belief that intonation must distinguish thematic roles in Riau Indonesian, and in other varieties of Malay/Indonesian. (Some of the possible reasons behind the persistence of this belief are discussed in Gil 2003). However, not a single one of these scholars, when challenged, has been able to formulate an explicit description of exactly how intonation can be used to distinguish thematic roles, and to the best of my knowledge, no such account appears anywhere in the linguistic literature on Malay/Indonesian. The closest to an explicit proposal that I have come across is perhaps the following. (The claim is stated in my own words, and constitutes my interpretation of one or two suggestions made by colleagues in informal discussions.) In general, in Riau Indonesian, there is a significant tendency for undergoers to follow activities, as in (2) and (3a) above. Accordingly, when undergoers precede activities, as in (3b) and (4), this unusual word order is signalled by a pause occurring right after the undergoer. Within a generative framework involving movement, this generalization might be restated as follows: when an undergoer is fronted to a higher position in the clause, a pause occurs between it and the clause from which it was extracted. This “pause proposal” at least constitutes an explicit hypothesis which can be examined in face of the facts. But as shown in Section 6 below, it is clearly false.4 4. TWO HYPOTHESES So what needs to be done in order to finally put such claims to rest? Three methods suggest themselves. First, one might use elicitation, and ask native speakers for their judgements of sentences exhibiting various possible pairings of intonation contours and thematic roles.
Secondly, one might construct experiments, which would present native speakers with various tasks requiring them to make use of intonational cues in order to distinguish thematic roles. Thirdly, one might study naturalistic corpora, and search for possible correlations between intonation contours and thematic roles. While each of these three methods is in principle equally valid, this study chooses to make use of the third method, involving naturalistic corpora. The reasons for this choice are entirely practical. On the one hand, elicitation and experiments are particularly problematical in the study of Riau Indonesian. As a regional colloquial language variety, Riau Indonesian stands in a basilect-to-acrolect relationship with Standard Indonesian. Put a speaker of Riau Indonesian in what is perceived to be a learnèd setting such as an elicitation session or a controlled experiment, and he or she is likely to switch to Standard Indonesian, no matter how clearly and repeatedly the investigator has asked the speaker to use “ordinary language”, that is to say, Riau Indonesian. On the other hand, in Riau Indonesian an extensive naturalistic corpus is available, containing recordings of speech from many different speakers in a variety of settings, including narrative and
conversational. Accordingly, the present study makes use of the third method, examining a naturalistic corpus for possible correlations between intonation contours and thematic roles. Two specific hypotheses are examined: (5a)
Hypothesis A (existential): For each sentence, there exists at least one intonation contour which renders the sentence undifferentiated with respect to thematic roles.
(b)
Hypothesis B (universal): For each sentence, every available intonation contour renders the sentence undifferentiated with respect to thematic roles.
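The logical relationship between the two hypotheses can be made concrete with a small sketch. Everything in it is an assumption introduced for illustration: the sentence and contour labels are invented, the toy data make no claim about Riau Indonesian, and a contour is taken to be "undifferentiated" when it leaves every thematic-role reading open.

```python
from typing import Dict, FrozenSet

# Hypothetical inventory of readings for a two-participant sentence.
ALL_READINGS: FrozenSet[str] = frozenset({"first=actor", "first=undergoer"})

# Invented toy data: sentence -> contour -> readings the contour leaves open.
toy_corpus: Dict[str, Dict[str, FrozenSet[str]]] = {
    "invented_sentence_1": {"contour_x": ALL_READINGS, "contour_y": ALL_READINGS},
    "invented_sentence_2": {
        "contour_x": ALL_READINGS,
        "contour_y": frozenset({"first=undergoer"}),  # a differentiating contour
    },
}

def undifferentiated(readings: FrozenSet[str]) -> bool:
    """A contour is undifferentiated if no thematic-role reading is excluded."""
    return readings == ALL_READINGS

def hypothesis_a(corpus: Dict[str, Dict[str, FrozenSet[str]]]) -> bool:
    """Existential: every sentence has at least one undifferentiated contour."""
    return all(any(undifferentiated(r) for r in contours.values())
               for contours in corpus.values())

def hypothesis_b(corpus: Dict[str, Dict[str, FrozenSet[str]]]) -> bool:
    """Universal: every available contour of every sentence is undifferentiated."""
    return all(all(undifferentiated(r) for r in contours.values())
               for contours in corpus.values())

print(hypothesis_a(toy_corpus), hypothesis_b(toy_corpus))
```

In this invented data set the existential hypothesis holds while the universal one fails, because a single differentiating contour is enough to falsify the universal claim. As long as every sentence has at least one available contour, the reverse situation cannot arise.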
Both of the above hypotheses negate the claim that intonation distinguishes between thematic roles in Riau Indonesian. However, the second hypothesis is stronger than the first: one can envisage a state of affairs in which the first hypothesis holds but the second one fails, but not vice versa. As we shall see in Section 6 below, the naturalistic corpus provides overwhelming support for the weaker Hypothesis A, and substantial support for the stronger Hypothesis B. Accordingly, the results of this study lead to the conclusion that intonation does not differentiate thematic roles in Riau Indonesian. 5. BASIC SUPRASEGMENTAL PATTERNS To be in a position to examine the Riau Indonesian naturalistic corpus for possible correlations between intonation contours and thematic roles, it is first necessary to describe the basic suprasegmental patterns and establish an inventory of the major intonation contours available in the language. 5.1. Word Structure In Riau Indonesian, as in most other languages, intonation contours interact with word structure; hence, before going any further, it is first necessary to develop a clear picture of word structure in Riau Indonesian. Riau Indonesian is a strongly isolating language, with no inflectional morphology, little derivational morphology and little compounding. However, unlike the stereotypical isolating languages of mainland Southeast Asia, the typical or canonical word in Riau Indonesian is bisyllabic.
The bisyllabic nature of the Riau Indonesian word raises the issue of word stress. As observed by Tadmor (1999, 2000), word stress in Malay and Indonesian presents a thorny problem, with different scholars often providing conflicting descriptions. Thus, for example, van Ophuijsen (1915) claims that stress is on the final syllable, Amran (1984:60) maintains that it is on the penultimate, while Kähler (1956:37) asserts that it is either on the final syllable (if the penultimate is a schwa) or on the penultimate (in all other cases). One possible source for these discrepancies might be that different scholars are unwittingly describing different regional and/or social varieties of Malay / Indonesian. Thus, Tadmor (1999, 2000) shows a tendency for word stress in Malay / Indonesian to progress from final, in the western parts of the archipelago, towards penultimate, in the eastern regions, reflecting a similar progression in the local languages, which often constitute substrates for the regional varieties of Malay/ Indonesian. Another possible source for these inconsistencies could well be that Malay/ Indonesian has no word stress. In such a case, the patterns that are being described may be present in the investigator’s ear but not in the language itself, as is suggested by Goedemans and van Zanten (to appear). Alternatively, the patterns described may be phonetically real, but pertaining not to word stress but rather to intonational prominence, as is in fact suggested in the continuation of this section. Indeed, for Riau Indonesian, I am not familiar with any positive evidence supporting the existence of a privileged syllable which could be characterized as the locus of word stress. In this sense, then, Riau Indonesian may be appropriately characterized as lacking word stress. Nevertheless, while Riau Indonesian words lack a privileged syllable, there is strong evidence for the presence of a privileged bisyllabic unit, which may be referred to as the core foot. 
As represented in (6) below, the core foot (F) consists of two syllables (S), each of which consists in turn of an onset (O) plus a rhyme (R): The Core Foot: F
(6)
S
ke
di
S
O
R
O
R
m
a
k
an
m
i
p
i
t
ing
b
e
l
i
c
at
‘eat’ ‘noodles’ ‘crab’ kan
‘buy’ ‘paint’
Most words, such as makan ‘eat’, are bisyllabic and thus coextensive with most or all of the core foot. A few shorter monosyllabic words, such as mi ‘noodles’, occupy only the second syllable of the foot, while a small number of longer words, such as kepiting ‘crab’, occupy the entirety of the core foot plus additional space preceding it. Clitics, when present, invariably occur outside of the core foot, either after it, for example the end-point marker -kan in belikan ‘buy’, or before it, for example the undergoer marker di- in dicat ‘paint’. The core foot is thus what underlies the basic bisyllabic nature of Riau Indonesian words. However, the existence of the core foot is also supported by a number of additional independent phenomena. One such phenomenon involves patterns of reduction in fast connected speech. Typically, as shown in (7) below, material belonging to the core foot is retained, while preceding material may undergo partial or complete deletion: Reduction in Fast Connected Speech: F
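(7) F = S S; each S = O + R
Before foot | S1: O | S1: R | S2: O | S2: R | Realizations | Gloss
p | s | a | w | at | [psawat] ~ [sawat] | ‘airplane’
tang | k | e | r | ang | [taŋkeraŋ] ~ [ŋkeraŋ] ~ [keraŋ] | ‘[place name]’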
Whereas the above phenomenon involves the contraction of overly long words, a number of others involve the expansion of words that are too short to fill the core foot. One such phenomenon pertains to the personal marker si, which marks expressions as constituting names of people:
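The Personal Marker “si” in Non-Vocative Expressions: F
(8) F = S S; each S = O + R
Before foot | S1: O | S1: R | S2: O | S2: R | Realizations | Gloss
si | t | o | p | an | [sitopan] ~ [stopan] ~ [topan] | ‘[name]’
- | s | i | p | an | [sipan], *[span], *[pan] | ‘[name]’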
Before bisyllabic names, such as Topan, the personal marker si is optional, and, when present, it may undergo reduction of the kind exemplified in (7); this is shown in the first line of (8) above. However, names also possess a monosyllabic familiar form derived by truncation; for example Pan from Topan.5 Often, this form is used vocatively; however, it is also used in non-vocative functions, in which case the use of the personal marker si is obligatory; this is shown in the second line of (8). Thus, one of the functions of the personal marker si is to expand the monosyllabic familiar form of the name to fill the core foot. A similar phenomenon involves words with what might be characterized as a defective penultimate rhyme. For this purpose it is necessary to acknowledge the existence of two subdialects of Riau Indonesian, which may be referred to as the schwa dialect and the schwaless dialect respectively. In the former dialect, the schwa ə is part of the phonemic inventory, though even in this dialect, it never occurs in the final syllable. Of interest here however is the second, or schwaless dialect, in which there is no phonemic schwa. Consider the way in which a word containing a schwa in the schwa dialect, [bəsar] ‘big’, is realized in the schwaless dialect: Spreading and Epenthesis: F
(9) F = S S; each S = O + R
S1: O | S1: R | S2: O | S2: R | Realizations | Gloss
b | - | s | ar | [bs̩ar] ~ [bəsar] ~ [besar] | ‘big’
As shown above, realizations of the word in question involve a syllabic [s̩] (as evidenced by the ways in which native speakers parse the sequence into syllables), a
phonetic schwa [ə], or a full mid-high front vowel [e] (phonetically identical to the mid-high front vowel phoneme). This range of possibilities can be most appropriately accounted for by positing a segmental melody bsar occupying the core foot as per (9) above, with an empty penultimate rhyme position which is subsequently filled either by backward spreading of the sibilant s or by epenthesis of a schwa or full vowel. Thus, these phonological processes, spreading and epenthesis, beef up an impoverished segmental melody, thereby enabling the word to extend across the entire core foot. An analogous though somewhat less systematic phenomenon involves loan words which, in the source language, are monosyllabic: Expansion of Monosyllabic Loan Words: F
(10) F = S S; each S = O + R
S1: O | S1: R | S2: O | S2: R | Source
- | o | - | om | < Dutch oom ‘uncle’
g | o | l | op | < English golf
As suggested by the above examples, such monosyllabic words are often expanded to form bisyllabic words in Riau Indonesian, though the strategies by which such expansion is achieved are idiosyncratic and unpredictable. However, a particular subclass of such cases, in the schwaless subdialect, makes use of the same processes of spreading and epenthesis that apply, as in (9) above, to native words:
(11) Expansion of Monosyllabic Loan Words through Spreading and Epenthesis: F = S S; each S = O + R
S1: O | S1: R | S2: O | S2: R | Realizations | Source
s | (n) | tr | um | [strum] ~ [sətrum] ~ [setrum] ~ [sn̩trum] ~ [səntrum] ~ [sentrum] | < Dutch stroom ‘electric current’
s | - | m | ek | [sm̩ek] ~ [səmek] ~ [semek] | < English smack
In the first example, the borrowing of Dutch stroom involves the optional introduction of a nasal stop n, followed by various combinations of spreading and epenthesis. In the second example, the borrowing of English smack involves either the spreading of the nasal stop m or epenthesis. In general, evidence from borrowing may be open to alternative interpretations, since the path from source to target language could potentially involve any number of intermediate way stations, with the word in question actually entering Riau Indonesian from another variety of Indonesian, already in bisyllabic form. However, in at least one case, smek < smack, it may be safely surmised that the word entered Riau Indonesian directly from English. This is because the borrowing was actually observed to take place, in the late 1990’s, via television, immediately following the introduction into US professional wrestling (hugely popular throughout Indonesia) of the brand name Smack Down. Accordingly, this latter example provides clear-cut evidence for the relevance of the core foot as a factor governing the incorporation of loan words into Riau Indonesian. The final phenomenon supporting the core foot comes from the Warasa ludling, a secret language in which the sequence war- is inserted at the beginning of each word.6 In (12) below the results are shown of applying the ludling to the words represented in (6) above: Warasa Ludling: F
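(12) F = S S; each S = O + R
Word | Inserted before foot | S1: O | S1: R | S2: O | S2: R | After foot | Result | Gloss
makan | wa | r | a | k | an | - | warakan | ‘eat’
mi | wa | r | - | m | i | - | waremi | ‘noodles’
kepiting | wa | r | i | t | ing | - | wariting | ‘crab’
belikan | wa | r | e | l | i | kan | warelikan | ‘buy’
dicat | wa | r | - | c | at | - | warecat | ‘paint’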
As shown in (12) above, the sequence war- is inserted into a position that is defined structurally, with reference to the core foot: r occupies the first onset of the core foot with wa immediately preceding it. The effect of adding war- to a word thus depends crucially on the size of the original word. For most words, which are bisyllabic, adding war- involves deletion of the first consonant, if the word begins with a consonant, for example makan → warakan. However, for monosyllabic words, adding war- involves not deletion but rather the further insertion of an epenthetic vowel, for example mi → waremi. Conversely, for polysyllabic words, adding war- involves the deletion not just of the first consonant of the penultimate syllable, but of any and all preceding material, for example kepiting → wariting. For stems combined with an enclitic, the ludling ignores the enclitic and treats the stem as though it constituted the entire word, for example belikan → warelikan. In contrast, for stems combined with a proclitic, adding war- involves the deletion of the proclitic, and treats the remainder of the word as though the clitic were absent; for example dicat → warecat, with the further insertion of an epenthetic vowel. Thus, as shown in (12) above, the application of the Warasa ludling relies crucially on the core foot, thereby providing yet additional evidence for its central role in the structure of the Riau Indonesian word. Thus, a number of independent phenomena support the existence of a core foot underlying the structure of the word in Riau Indonesian. Although, as noted in the beginning of this section, Riau Indonesian has no privileged syllable which could be characterized as the locus of word stress, the core foot does constitute a privileged unit, albeit of a larger size. 
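The procedure described for the Warasa ludling can be sketched in code. This is only an illustrative sketch under stated assumptions: the `Word` class, the `warasa` function and the hand-made syllabifications are all hypothetical, the epenthetic vowel is assumed to be e as in waremi and warecat, and the stem is assumed to be supplied already parsed into onset and rhyme pairs.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Word:
    """Hypothetical representation: stem syllables as (onset, rhyme) pairs."""
    stem: List[Tuple[str, str]]
    proclitic: str = ""   # e.g. the undergoer marker di-
    enclitic: str = ""    # e.g. the end-point marker -kan

def warasa(word: Word, epenthetic: str = "e") -> str:
    """Apply the ludling: r fills the first onset of the core foot, wa precedes it."""
    if not word.stem:
        raise ValueError("stem needs at least one syllable")
    foot = word.stem[-2:]          # the core foot is the last two stem syllables
    if len(foot) == 2:
        rhyme1 = foot[0][1]        # keep the penultimate rhyme, drop its onset
    else:
        rhyme1 = epenthetic        # monosyllabic stem: fill the empty rhyme
    onset2, rhyme2 = foot[-1]
    # Material before the foot (extra syllables, proclitics) is deleted;
    # enclitics are kept after the foot.
    return "wa" + "r" + rhyme1 + onset2 + rhyme2 + word.enclitic

examples = {
    "makan": Word([("m", "a"), ("k", "an")]),
    "mi": Word([("m", "i")]),
    "kepiting": Word([("k", "e"), ("p", "i"), ("t", "ing")]),
    "belikan": Word([("b", "e"), ("l", "i")], enclitic="kan"),
    "dicat": Word([("c", "at")], proclitic="di"),
}

for source, word in examples.items():
    print(source, "->", warasa(word))
```

The one function covers all five cases because each is defined relative to the same bisyllabic template; only the amount of material before and after the template differs.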
As such, Riau Indonesian may be characterized as being endowed with a somewhat more abstract variety of word stress, whose locus is not the syllable, as in most typical instances of stress, but rather the bisyllabic core foot. As we shall see in Section 5.3 below, the characterization of the core foot as bearing word stress may account also for properties of focus intonation. 5.2. Intonation Groups and Final Prominence As in most other languages, intonation contours form intonation groups with a hierarchical tree structure, in which smaller units group together to form larger ones, which in turn group together to form even larger ones, and so forth; see, for example, Nespor and Vogel (1986). Such intonational groupings often coincide to a certain degree with syntactic groupings. Because of this, intonation can sometimes help to disambiguate between different readings associated with different syntactic constituencies underlying the same sequence of words. Consider the following example: (13)
Tengok tikus aku
look mouse 1:SG
[Speaker learning to play a game of laptop billiards in which it is rather difficult to control the position of the simulated player with the track pad, and the player often ends up under the table; the first time this happened, I jokingly asked him whether he was looking at the mice; when this happened once again, speaker joked]
‘I’m looking at the mice’

In order to facilitate the intended interpretation, the above sentence was associated with an intonation contour which effected the grouping [Tengok tikus] aku. However, in a different context, a different intonation contour could have been used to effect a different grouping, Tengok [tikus aku], which would have a quite different meaning, ‘Looking at my mice’. It should be acknowledged, however, that the above sentence may also be uttered with intonation contours that do not reflect any internal constituent structure and hence do not disambiguate between the two potentially available meanings.

Perhaps the most noticeable characteristic of intonation groups is final prominence. Within each intonation group, the final syllable is accented, thereby providing a salient marker of intonation phrase boundaries. Thus, for example, in (13) above, the grouping [Tengok tikus] aku was effected by accent on the final syllable of the intonation group, namely kus. As in many other languages, accent is realized by a combination of phonetic features including greater pitch variation, greater intensity and greater duration. However, compared to some other languages, the contribution of greater duration would appear to be relatively larger. Examples (14) and (15) illustrate the phenomenon of phrase-final lengthening, with durations indicated in milliseconds:

(14)
Banyak semut
many ant
Durations (msec): 370 (Banyak se), 760 (mut)
[Eating newly bought fruit]
‘Lots of ants’

(15)
Aku main seorang
1:SG play one-person
Durations (msec): 700 (Aku main seo), 1230 (rang)
[Speaker squabbling over who gets to play on laptop computer]
‘I’m playing by myself’
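The degree of final lengthening in (14) and (15) can be checked with a little arithmetic. The sketch below assumes, following the segmentation shown in the two examples, that the first figure in each is the combined duration of all non-final syllables and the second is the duration of the final syllable; the variable and function names are purely illustrative.

```python
# Durations in milliseconds, as reported in examples (14) and (15).
# Assumption: the first value covers all non-final syllables combined,
# the second value is the final syllable of the intonation group.
EXAMPLES = {
    "(14) Banyak semut": (370, 760),
    "(15) Aku main seorang": (700, 1230),
}


def final_lengthening_ratio(non_final_ms: float, final_ms: float) -> float:
    """Return how many times longer the final syllable is than the rest."""
    return final_ms / non_final_ms


ratios = {label: final_lengthening_ratio(*d) for label, d in EXAMPLES.items()}

for label, ratio in ratios.items():
    print(f"{label}: final syllable is {ratio:.2f} times the preceding material")
```

On these figures the final syllable comes out at roughly 1.8 to 2.1 times the duration of the material preceding it.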
Each of the above examples represents a single intonation group. As indicated by the figures, in each example the final syllable is almost twice as long as all of the preceding syllables combined. The prolongation of the final syllable of the intonation group is not always quite as dramatic as in the above examples. However, the above examples are quite typical of the way in which final lengthening may be exaggerated in order to increase the affective expressiveness of the utterance.

For the unwary investigator, one of the consequences of final prominence in intonation groups is that it gives rise to the illusion of final word stress. For example, in a situation involving elicitation, where the researcher asks what the word for such-and-such is, the speaker typically responds with a one-word utterance bearing the final-prominent intonation contour. This sounds like final word stress; however, it is important to keep in mind that the suprasegmental pattern is not a property of the word, but rather of the entire utterance, which just happens to consist of a single word. Mistaken analyses of final-prominent intonation contours as word stress are apparently responsible for the probably erroneous characterization of many related Malayic language varieties of Sumatra as possessing final lexical stress, for example Nurzuir et al (1985:32-33) for Jambi, Umar et al (1986:28) for Muko-Muko, and Suwarni et al (1989:80) for Lintang.

5.3. Focus Intonation

Final-prominent intonation groups provide the backdrop for an additional layer of intonational organization, that of focus intonation. Within each intonation group, a single word, which may occur in any position within the group (initial, medial or final), may optionally be assigned focus intonation. Focus intonation provides an expression for the semantic focus operator, though many of the details remain to be worked out. (The term “focus” is thus used here in the sense that is current in general semantic theory, not in the rather peculiar sense that has gained acceptance among Austronesianists, where it refers to what is known elsewhere as verbal voice.) Focus intonation is realized through a bundle of phonetic properties associated with the core foot of the word in focus, as shown below:

(16)
Focus Intonation:
[Figure: the core foot F dominates two syllables S, each consisting of an onset O and a rhyme R; the segments belonging to the core foot are shown in upper case.]
MAKAN ‘eat’
MI ‘noodles’
kePITING ‘crab’
diBELIkan ‘buy’
CAT ‘paint’
Example (16) above illustrates the domain of focus intonation for the words previously illustrated in (6) and (12); in this and subsequent examples, the domain of focus intonation is indicated with upper case letters. As shown in (16), the domain of focus intonation coincides precisely with the core foot, as supported by the various phenomena discussed in Section 5.1 above.

The phonetic realizations of focus intonation are distributed unevenly over the two syllables of the core foot. The most salient feature of focus intonation involves the lengthening of the rhyme of the first syllable of the core foot, and sometimes also the onset of the second syllable. (In some varieties of Riau Indonesian, the onset of the second syllable may be lengthened if and only if it is other than an oral stop, while for other varieties, more influenced by a Minangkabau substrate, the onset of the second syllable may be lengthened no matter what its contents are.) This lengthening is generally associated with a level pitch contour. At the same time, focus intonation is also reflected by pitch prominence and secondary lengthening on the rhyme of the second syllable of the core foot. Some examples of focus intonation are given in (17) and (18) below, with durations again indicated in milliseconds:

(17)
PAY…AH budak ini
bad child DEM-DEM:PROX
Durations (msec): 780, 380, 350
[Bantering with friends]
‘This kid’s really bad’

(18)
Rekam LAGI
record again
Durations (msec): 330, 270, 750, 1030
[Seeing me turn the laptop computer recorder on]
‘Recording again’
Each of the above examples consists of a single intonation group. In (17), focus intonation falls on the first word, payah. In this example, focus intonation is reflected primarily by the length of the first syllable plus second onset, pay, totalling 780 msec. The second rhyme, ah, is also relatively long, and in addition bears salient pitch prominence. The remainder of the intonation group follows the usual pattern of final prominence, with three short syllables followed by a final much longer one, ni. In (18), focus intonation falls on the second word, lagi. Here, once more, focus intonation is reflected by the length of the first syllable, la, totalling 750 msec., but in this case the second syllable gi is even longer, showing the combined effect of secondary lengthening due to focus plus the regular final prominence of the intonation group.7 This particular constellation of features, involving lengthening of a penultimate syllable followed by some kind of pitch accent on the final syllable, is not peculiar to Riau Indonesian. In the Jakarta dialect of Indonesian, focus intonation occurs more frequently than in Riau Indonesian, and its phonetic realization is more pronounced; so much so that when speakers from Riau attempt to imitate a Jakarta accent, one of the things that they do is exaggerate the frequency and the phonetic properties of focus intonation. Outside of Malay / Indonesian, penultimate lengthening coupled with some kind of final accentuation has been reported, among others, for the Formosan language Amis (Edmundson, Huang and Pahalaan 2001), for various Micronesian languages (Rehg 1993), and for the Polynesian language Marquesan (Margaret Mutu, personal communication), thereby suggesting that the feature may be of considerable antiquity within the Austronesian language family.
Just as final prominence in intonation groups sometimes creates the illusion of final word stress, so focus intonation and concomitant penultimate lengthening may occasionally give rise to an unwarranted impression of penultimate word stress, at least in those cases where penultimate lengthening is more salient to the investigator’s ear than final pitch accent. For example, such a misanalysis is what underlies some descriptions of Minangkabau, for example Zarbaliev (1987:23) and Adelaar (1992:12), as having penultimate word stress, even though in reality the suprasegmental patterns of Minangkabau are largely identical to those of Riau Indonesian. In some other dialects, such as Jakarta Indonesian, focus intonation and penultimate lengthening are often used in place of the final-prominent intonation contour in the context discussed earlier, where, in response to being asked what the word for such-and-such is, the speaker responds with a one-word utterance. This use of focus intonation thus contributes further to a characterization of Malay / Indonesian as having penultimate word stress. However, in actual fact, focus intonation and the way in which duration and pitch prominence split across the two syllables of the core foot provide additional support for the claim that in Riau Indonesian, as in many other related varieties, word stress is present not at the domain of the syllable but rather at the level of the entire foot, with respect to which it occurs in fixed position, falling invariably on the core foot.
6. INTONATION AND THEMATIC ROLES

The description of the basic intonational patterns of Riau Indonesian presented in the preceding section makes it possible for us now to address the main concern of this paper, namely, the purported correlation between intonation and thematic roles. In order to do this, we shall examine the distribution, in the naturalistic corpus, of four basic intonation contours, associated with declarative statements and imperatives.8 These four contours make reference to intonation groups, as recognizable by the feature of final prominence, and to pauses, which may separate successive intonation groups.

(19) Four Basic Intonation Contours:
(a) Intonation Contour A: Two intonation groups separated by pause, no focus
(b) Intonation Contour B: Single intonation group with no pause, no focus
(c) Intonation Contour C: Single intonation group with no pause, initial focus
(d) Intonation Contour D: Single intonation group with no pause, final focus
The above four contours span much of the variety that is in evidence in the intonational patterns of Riau Indonesian, though of course they do not exhaust it. For declarative statements and imperatives, additional intonation contours may involve more complex configurations containing two or more intonation groups, focus, and pauses; however, as complexity increases, these intonation contours become less and less frequent. Alternatively, other intonation contours of a qualitatively different nature include those associated with certain specific sentence-final particles, and also with other kinds of speech acts such as polar and information questions, and direct quotation. Nevertheless, the above four basic intonation contours suffice to give the proponents of a correlation between intonation and thematic roles a good run for their money: if such a correlation did exist, it is most likely that it would involve at least one of the above four contours.

The four basic intonation contours are examined with respect to a set of basic sentence patterns defined in terms of an activity in construction with a single associated participant. The participant in question may either precede or follow the activity, and it may be associated with the thematic roles of either actor or undergoer. Resulting from these two binary choices are the following four basic sentence patterns:

(20) Four Basic Sentence Patterns:
(a) Actor precedes activity
(b) Undergoer precedes activity
(c) Actor follows activity
(d) Undergoer follows activity
Again, the above four basic sentence patterns do not exhaust the inventory of sentence patterns in Riau Indonesian. However, it is reasonable to suppose that if intonation did distinguish thematic roles, its effect would be observable with respect to at least some of the above basic sentence patterns. The four basic intonation contours in (19) and the four basic sentence patterns in (20) may be combined to yield sixteen potentially possible pairings of intonation contours and sentence patterns. These sixteen pairings are represented in the sixteen cells of Table 1. (In Table 1, letters a, p and v stand for actor, undergoer and activity respectively, upper case letters denote focus intonation, while Ø represents a pause between intonation groups.)

Table 1: Intonation Contours and Sentence Patterns

                                                  Participant precedes activity    Participant follows activity
                                                  Actor           Undergoer        Actor           Undergoer
Intonation Contour A: Pause, no focus             aØv (21a)   ↔   pØv (21b)        vØa (22a)   ↔   vØp (22b)
Intonation Contour B: No pause, no focus          av (23a)    ↔   pv (23b)         va (24a)    ↔   vp (24b)
Intonation Contour C: No pause, initial focus     Av (25a)    ↔   Pv (25b)         Va (26a)    ↔   Vp (26b)
Intonation Contour D: No pause, final focus       aV (27a)    ↔   pV (27b)         vA (28a)    ↔   vP (28b)
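The sixteen cells of Table 1 are simply the cross product of the four contours in (19) and the four sentence patterns in (20). As a minimal sketch, the following Python snippet generates the cell labels from those two sets of choices; the data structures and the function name are illustrative assumptions and form no part of the original analysis.

```python
from itertools import product

# Contours from (19): (pause present?, focus position or None).
CONTOURS = {
    "A": (True, None),
    "B": (False, None),
    "C": (False, "initial"),
    "D": (False, "final"),
}

# Sentence patterns from (20): (participant role, participant precedes activity?).
# "a" = actor, "p" = undergoer, following the lettering of Table 1.
PATTERNS = [("a", True), ("p", True), ("a", False), ("p", False)]


def cell_label(pause, focus, role, precedes):
    """Build a Table 1 style label such as 'aØv' or 'vP'."""
    first, second = (role, "v") if precedes else ("v", role)
    if focus == "initial":
        first = first.upper()    # upper case marks focus intonation
    elif focus == "final":
        second = second.upper()
    return first + ("Ø" if pause else "") + second


table = {
    (name, role, precedes): cell_label(pause, focus, role, precedes)
    for (name, (pause, focus)), (role, precedes) in product(CONTOURS.items(), PATTERNS)
}

for name in CONTOURS:
    print(name, [table[(name, r, p)] for r, p in PATTERNS])
```

Each row printed corresponds to one contour, with the four cells in the same left-to-right order as the columns of Table 1.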
Table 1 provides a classificatory scheme for utterances in the naturalistic corpus. If intonation does distinguish thematic roles, then one would expect to find an unequal
distribution of utterances across the table, with, crucially, some empty cells, reflecting impossible pairings of intonation contours and sentence patterns. Conversely, if intonation does not differentiate thematic roles, then one would expect to find utterances exemplifying all of the potential pairings of intonation contours and thematic roles, with no empty cells in the table. The facts are quite clear. Even a cursory examination of a small subset of the naturalistic corpus turns up examples of all sixteen potential pairings of intonation contours and sentence patterns: there are no empty cells in the table. Thus, there is no correlation between the intonation contours defined in (19) and the sentence patterns represented in (20): intonation does not differentiate thematic roles in Riau Indonesian. In examples (21)-(28) below, each of the sixteen pairings of intonation contours and sentence patterns is illustrated with an utterance from the naturalistic corpus; for easy cross-referencing, the number of each example is shown in the appropriate cell in the table. As in examples (2)-(4) previously, the activity word is in boldface, while the relevant associated participant is in italics. (In some of the examples, the pairing of intonation contour and sentence pattern extends over just part of a larger utterance; in such cases, the remaining parts of the utterance are enclosed in parentheses. Breaks between intonation groups, either within the relevant part of the utterance or outside of it, are represented with commas.) (21a)
Kepala desa itu, pindah rumah papan itu (Intonation contour A)
head village DEM-DEM:DIST move house board DEM-DEM:DIST
[From narrative about peeping tom]
‘The village head moved into that wooden house’

(b)
(Vid,) hilangkan ini, lupa dah Vid
FAM|David disappear-EP DEM-DEM:PROX forget PFCT FAM|David
[Playing billiards on laptop computer; speaker asking me to help him get rid of the lines on the screen which show where the balls will go]
‘David, I’ve forgotten how to get rid of these, David’

(22a)
(Sangkut situ ‘kan, selamat dia,) tidur, dia, (anak si Yung tadi ini)
be.caught.on LOC-DEM-DEM:PROX Q safe 3 sleep 3 child PERS Yung PST:PROX DEM-DEM:PROX
[From narrative about village boy and sparrowhawk; boy has fallen off a bridge into a mangrove tree]
‘He was caught there, he was safe, he fell asleep, the boy Yung’

(b)
Jumpa, satu asap, (nampak asap dari jauh ‘kan)
meet one smoke see smoke from far Q
[From narrative about village boy and sparrowhawk; boy is wandering through forest]
‘He noticed a plume of smoke, he saw smoke from afar, right’

(23a)
Bola putih masuk (Intonation contour B)
ball white enter
[Playing billiards on laptop computer]
‘The white ball’s gone in’

(b)
Rokoknya buang
cigarette-ASSOC throw
[Cleaning a room with friends]
‘Throw the cigarette stubs away’

(24a)
Kawin dia, (David)
marry 3 David
[From narrative about boy who grows up, gets married, and learns the facts of life]
‘Then he got married, David’

(b)
Tutup pintu oy
close door EXCL
[Speaker wants to prevent other people from coming in to the room]
‘Hey, close the door’

(25a)
Bola PUTIH masuk (Intonation contour C)
ball white enter
[Playing billiards on laptop computer]
‘The white ball’s gone in’

(b)
INI pencet ha, (ini pencet)
DEM-DEM:PROX press DEIC DEM-DEM:PROX press
[Playing billiards on laptop computer; speaker showing friend which key to press]
‘Press this one, press this one’

(26a)
MASUK bola putih
enter ball white
[Playing billiards on laptop computer]
‘The white ball’s gone in’

(b)
TUKUL dio ha, (macam mano, sakit die)
hammer 3 DEIC kind what hurt 3
[From narrative about peeping tom]
‘She hammered him, it hurt’

(27a)
(E,) bola putih MASUK (Intonation contour D)
EXCL ball white enter
[Playing billiards on laptop computer]
‘The white ball’s gone in’

(b)
Catur tak PANDAI itu, (Vid)
chess NEG know.how DEM-DEM:DIST FAM|David
[Discussing what game to play next on laptop computer; someone suggests chess; speaker reacts]
‘I don’t know how to play chess, David’

(28a)
Rekam DIE
record 3
[Speaker discovers I’ve been recording]
‘He’s recording’

(b)
“Aku nak TANGAN dikau”, (katanya, dia bilang)
1:SG want hand 2 say-ASSOC 3 say
[From horror story about ungrateful son who tries to rob his mother’s tomb; at the end of the story, the mother’s ghost tries to snatch her son’s hand]
‘“I want your hand” she said’

Each of the above eight numbered examples presents a near minimal pair, as close a contrast as one is likely to find in a naturalistic corpus. Within each pair, the intonation contours are the same, the relative orders of activity and participant are the same, but the thematic role of the participant is different: whereas in the first, or (a) example, the participant is an actor, in the second, or (b) example, it is an undergoer. Thus, each of these minimal pairs shows that for a particular intonation contour and a particular sentence pattern, the intonation contour in question fails to differentiate between thematic roles, allowing a certain participant to be understood either as an actor, in the first member of the pair, or as an undergoer, in the second. For example, (21) shows that Intonation Contour A does not differentiate between actors and undergoers when these occur in a position preceding an activity. Similarly, (23) shows that Intonation Contour B does not distinguish between actors and undergoers when these come before an activity. Thus, examples (21) and (23) refute the “pause proposal”, discussed in Section 3 above, which suggests that when an undergoer precedes an activity, it must be followed by a pause. Such, indeed, is the case in (21b); however, in (23b), an undergoer also precedes an activity and here, contrary to the pause proposal, there is no pause (and there are many more examples like this in the corpus). Moreover, in (21a) there is a pause, even though here it is an actor rather than an undergoer that precedes the activity. Thus, examples such as these show that when the participant in question occurs before the activity, the presence or absence of a pause plays no role whatsoever in distinguishing actors from undergoers.
In conjunction, then, examples (21) - (28), and many others like them in the corpus, show quite clearly that intonation plays no role in the differentiation of thematic roles in Riau Indonesian. To the extent that the four basic sentence patterns in (20) are representative of the variety of sentence patterns in the language, the above examples provide overwhelming support for Hypothesis A, as formulated in (5a), suggesting that for each sentence there is at least one intonation contour which
renders that sentence undifferentiated with respect to thematic roles. Moreover, to the extent that the four basic intonation contours in (19) encompass the major patterns of intonation that are available in the language, the above examples provide substantial support for the stronger Hypothesis B, as formulated in (5b), asserting that for each sentence, every intonation contour renders that sentence undifferentiated with respect to thematic roles. In view of examples such as these, it is hard to see how anybody can continue to maintain an uncritical position to the effect that intonation can function to distinguish thematic roles in Riau Indonesian. Playing devil’s advocate for a moment, it could, admittedly, still turn out to be the case, contrary to everything suggested in this paper, that intonation somehow does differentiate thematic roles in Riau Indonesian. For example, there could exist some additional intonation contours, not considered in this paper, which do differentiate thematic roles: such intonation contours would provide a counterexample to Hypothesis B, though not contradict the weaker Hypothesis A. More far-reachingly, it could conceivably be the case that each of the four would-be basic intonation contours defined in this paper actually lumps together two or more distinct intonation contours which do differentiate thematic roles: if this were true, then counterevidence would be provided even for the weaker Hypothesis A. Thus, the claims made in this paper constitute explicit hypotheses for which it is easy to imagine hypothetical counterevidence. Nevertheless, the results of this paper suggest that such counterevidence is indeed no more than strictly hypothetical. 
Accordingly, if anybody still wishes to claim that intonation can differentiate thematic roles in Riau Indonesian, then the burden of the proof now rests solidly on their shoulders: they must produce the data, and specify exactly which intonation contours distinguish which thematic roles. (To assist in such a challenge, I would be happy to share the naturalistic corpus, including the sound files, with anybody wishing to examine them for scientific purposes.) In the meantime, in the absence of such counterarguments, the only position that can reasonably be maintained is that intonation does not and cannot differentiate thematic roles in Riau Indonesian.

7. CONCLUSION

The results of this paper underscore the need for linguistic descriptions to avoid Eurocentric assumptions with regard to the expressive power of languages. Just because thematic roles are central to the grammatical organization of many familiar languages does not mean that they are of equal importance in all of the world’s languages. Riau Indonesian shows how a language can manage just fine, fulfilling a wide range of communicative functions, without any obligatory grammatical means for distinguishing between thematic roles: word order, case marking, agreement, or intonation. More specifically, the absence of any relationship between intonation and thematic roles in Riau Indonesian provides reinforcement for previous descriptions of the language which have argued that it is lacking in many of the categories that are considered to be central to the grammatical organization of most other languages.
The reader may have noted that no mention was made at any point in this paper of parts of speech (such as noun and verb), syntactic categories (such as noun phrase and verb phrase), or grammatical relations (such as subject, direct object, indirect object, and so forth). Indeed, in Gil (1994, 1999, 2000, 2001a,b, 2002b, 2005) it is argued that such categories are absent in Riau Indonesian. As statements of non-existence, such claims can be readily refuted, by showing how a single grammatical generalization makes reference to the category in question. Conversely, such claims can be supported only in gradual incremental fashion, through the examination, one after the other, of a wider and wider range of phenomena, each of which can in turn be accounted for without reference to the categories in question. In the case at hand, the absence of any correlation between intonation and thematic roles adds further to the plausibility of the claim that Riau Indonesian does not possess any categories whose definitions make reference to thematic roles, such as grammatical relations, or whose prototypical characteristics involve thematic roles in any way, such as parts of speech and syntactic categories. How would the grammar of Riau Indonesian work in the absence of so many commonplace grammatical categories? Following are syntactic and semantic representations for a typical Riau Indonesian sentence, example (2a) above, ‘I’ll buy a laser’. (For ease of exposition, the final particle ’kan in (2a) is ignored.) (29)
syntactic representation: [S [S beli] [S aku] [S laser]]
semantic representation: A (BUY, 1:SG, LASER), where BUY, 1:SG and LASER are the meanings of beli, aku and laser respectively
As argued in Gil (1994, 2000, 2001a, b, 2005), Riau Indonesian syntax contains a single open syntactic category, S(entence). As shown above, beli, aku and laser are all Ss, as is the construction as a whole; from a formal point of view, beli aku laser is thus a coordination of three Ss. The semantics of Riau Indonesian centers around the association operator, represented above with the letter A. In its monadic, or one-place guise, the association operator provides a semantic representation for markers of association, possession, and genitive case in many languages. For example, in English, in an expression such as John’s, the possessive ’s is interpreted as the association operator A, applying to the denotation JOHN, yielding the formula A (JOHN), which can be read as ‘entity associated with John’, where the detailed nature of the association is left unspecified by the grammar and is instead determined by context. However, in a typical Riau Indonesian sentence, the
association operator applies polyadically, to a sequence of items, and without any overt morphosyntactic realization. For example, in (29) above, it applies to the three meaning components of the sentence, yielding the formula A (BUY, 1:SG, LASER), which may be read as ‘entity associated with buying, speaker and laser’, where, again, the precise nature of the association is left unspecified, to be determined by context. Accordingly, the sentence Beli aku laser is endowed with a single unitary semantic representation which is indeterminate with respect to a variety of categories such as number, definiteness, tense, aspect and thematic roles. In the right context, it could thus mean anything from ‘I’ll buy a laser’, as in fact it did in (2a), to ‘Somebody bought me the laser’, ‘Somebody’s buying something from me with some lasers’, and so forth. Thus, as suggested in (29) above, the basic sentence structure of Riau Indonesian is extremely impoverished, making no reference to thematic roles. It is thus hardly surprising that intonation, too, fails to differentiate thematic roles in Riau Indonesian. But what of most other languages, with more elaborate clause structure, in which thematic roles play a central part? Prima facie, there might perhaps be more reason to expect intonation to correlate with thematic roles in at least some such languages. Imagine, for example, a language like Hebrew, with flexible word order, and in which, for basic transitive clauses, there is evidence for hierarchical syntactic structure of the kind commonly represented in terms of a VP containing the verb and the object to the exclusion of the subject. Now imagine that in such a language, intonational grouping were to reflect the VP constituent in sentences such as Hebrew (1) in the same way that it reflects other kinds of constituency in Riau Indonesian (13) and in many other examples in most or all languages. 
In such a language, then, intonation would distinguish between thematic roles, albeit not directly, but rather through the mediation of syntactic constituency, in accordance with the alternative scenario suggested in the introduction. However, as noted in Section 2, there are no clear documented cases of languages in which intonation works in this way. Can we thus conclude that intonation is in principle incapable of encoding thematic roles in human language? At present, this is perhaps most appropriately viewed as a conjecture still in need of serious further investigation, so that it may either be refuted by means of one or more counterexamples, or else recognized as a linguistic universal, a substantive constraint on what constitutes a possible human language. Max Planck Institute for Evolutionary Anthropology 8. NOTES *
I would like to thank all my colleagues who asked whether intonation differentiates thematic roles in Riau Indonesian, and/or insisted and perhaps still insist that it does, for providing me with the impetus to write this paper. In particular, I am indebted to Peter Cole, Gabriella Hermon and Uri Tadmor for numerous discussions on the issues dealt with in this paper, and to Matt Gordon for constructive comments on an earlier draft. I am especially grateful to the many speakers of Riau Indonesian who
66
DAVID GIL
provided the naturalistic data on which this paper is based: Arief, Benny, Danzha Selpas, Desrul, Ellyanto, Dwiarpianto, Fuad, Jumbro, Junaidi, Muchlis, Pai, Per, Riki, Rudy Chandra, Septianbudiwibowo, Wira, Zainudin. Versions of this paper were presented at the Fifth International Symposium on Malay/Indonesian Linguistics, Leipzig, Germany, 17 June 2001; at Topic and Focus: A Workshop on Intonation and Meaning, University of California, Santa Barbara, CA, USA, 21 July 2001; and at the Ninth Annual Meeting of the Austronesian Formal Linguistics Association, Cornell University, Ithaca, NY, USA, 26 April 2002; I would like to thank participants at all three events for their helpful comments and suggestions. 1 In addition to Riau Indonesian, some of the data cited in this paper show evidence for interference from Siak Malay, the dialect of Malay spoken in the lower part of the Siak river basin, in Riau province. Riau Indonesian and Siak Malay share a considerable degree of mutual intelligibility; in fact, in some cases it is difficult to determine whether a given utterance is in one dialect or the other. Although this paper focuses on Riau Indonesian, all of its main points are equally germane also for Siak Malay. 2 The interlinear glosses in this paper make use of the following abbreviations: AG ‘agent’; ASSOC ‘associative’; DEIC ‘deictic’; DEM ‘demonstrative’; DIST ‘distal’; DISTR ‘distributive’; EP ‘end point’; EXCL ‘exclamation’; FAM ‘familiar’; M ‘masculine’; NEG ‘negative’; PERS ‘personal’; PFCT ‘perfect’; PROX ‘proximal’; PST ‘past’; Q ‘question’; SG ‘singular’; 1 ‘first person’; 3 ‘third person. 3 Readers familiar with Malay / Indonesian may be wondering about the well-known “voice markers” and whether they might perhaps be involved in the differentiation of thematic roles. 
In Riau Indonesian, the relevant forms di- and N- are indeed present; however, their use is optional, and, crucially, they do not help to differentiate thematic roles: sentences with di- or N- (or even both) remain indeterminate with respect to thematic roles (see Gil 1999, 2002b for examples and detailed discussion). Perhaps the most productive means for differentiating thematic roles in Riau Indonesian is provided by the form sama, which can mark participants in any thematic role except that of absolutive, thereby discriminating between roles such as, for example, actor and undergoer, by overtly marking the former. However, even this form is optional; moreover, it is only very weakly grammaticalized, and is actually more appropriately considered as an ordinary “content” word with a very broad and abstract meaning centered around the notion of togetherness (see Gil 2004, for examples and argumentation). 4 Another proposal occasionally mentioned in discussions of intonation and clause structure in Malay / Indonesian is that of Chung (1978), pertaining to a language variety that she refers to as “informal Indonesian”, but which is actually closer to Standard Indonesian than to any of the regional colloquial varieties (including those of Jakarta and Bandung, from where her speakers hailed). Chung is concerned with a particular sentence pattern of the form AVP (Agent - Activity - Patient), where the V is devoid of any morphological voice marking. For a subset of such sentences, those in which the A is a pronoun or a proper noun, she maintains that two distinct intonation contours are available, which she calls “normal declarative” and “subject shifting”. She then claims that these two intonation contours correspond to two different syntactic analyses of the sentence in question, as “active” and “passive” respectively. In the latter case, her suggestion involves the following derivation. 
First, an active sentence with AVP order undergoes passivization (of the variety known in Indonesian studies as the pasif semu, or “second passive”), resulting in a structure of the form PAV, where the P assumes some subjecthood properties, and the A is cliticized to the V. Next, the P undergoes subject shifting, a process which moves subjects to the end of the sentence, in this case restoring the original AVP order. Although it may seem as though we’re back where we started, Chung asserts that such sentences are passive, and cites as evidence the purported “subject shifting” intonation contour associated with such constructions. Whether or not the facts are as described, and whether or not the analysis provided is the most appropriate one to account for such facts, Chung’s proposal does not involve any suggestion to the effect that intonation may differentiate thematic roles, since both intonation contours are associated with the same assignment of thematic roles. Indeed, this could hardly be otherwise, since, in the variety of Indonesian described by Chung, there is no thematic role indeterminacy of the kind illustrated in (2) - (4), and in particular no sentences of the form PVA such as in (3b). 5 In general, in the derivation of such monosyllabic forms, the lighter of the two syllables is omitted, while the heavier one is retained – w here the weight of the respective syllables is defined in terms of the number of segments they contain and their position on the sonority hierarchy, greater sonority
corresponding to lesser weight. Thus, in the above example, pan is heavier than to by dint of the additional coda segment n; hence the familiar form of Topan is Pan, not To. 6 The name of the ludling, Warasa, is derived by application of the ludling in question to the Malay / Indonesian word bahasa ‘language’. This and other Riau Indonesian ludlings are described in detail in Gil (2002a). 7 Occasionally, focus intonation occurs in a variant form, which might appropriately be referred to as super-focus. Phonetically, super-focus has all the special features of ordinary focus, plus an additional one, lip rounding on the lengthened penultimate syllable. Semantically, super-focus adds emphasis and affective force; one common usage of super-focus is with scalar adjectives, where it lends itself to translation into English with an accented intensifier such as “very”. 8 As far as I can tell, there are no systematic differences between the intonation contours of declarative statements and imperatives. In fact, there would seem to be no grammatical differences whatsoever distinguishing between sentences used to perform these two particular speech acts.
9. REFERENCES Adelaar, K. Alexander. Proto-Malayic: The Reconstruction of Its Phonology and Parts of Its Lexicon and Morphology, Pacific Linguistics Series C – 119. Canberra: The Australian National University, 1992. Amran Halim. Intonasi dalam Hubungannya dengan Sintaksis Bahasa Indonesia, Seri ILDEP di bawah Redaksi W.A.L. Stokhof. Jakarta: Penerbit Djambatan, 1984. Chung, Sandra. “Stem Sentences in Indonesian.” In S.A. Wurm and L. Carrington (eds.), Second International Conference on Austronesian Linguistics: Proceedings, Fascicle 1, Western Austronesian, Pacific Linguistics Series C - No. 61, pp. 335-365. Canberra: Australian National University, 1978. Edmundson, Jerold A., Tung-Chiou Huang and Akiyo Pahalaan. “Phonological Strengthening in Hsiukuluan Amis of Taiwan”, Paper presented at the Eleventh Annual Meeting of the Southeast Asian Linguistics Society, Mahidol University, Bangkok, Thailand, 17 May 2001. Gil, David. “The Structure of Riau Indonesian.” Nordic Journal of Linguistics 17 (1994): 179-200. Gil, David. “Riau Indonesian as a Pivotless Language.” In E.V. Raxilina and Y.G. Testelec (eds.), Tipologija i Teorija Jazyka, Ot Opisanija k Objasneniju, K 60-Letiju Aleksandra Evgen’evicha Kibrika (Typology and Linguistic Theory, From Description to Explanation, For the 60th Birthday of Aleksandr E. Kibrik), pp. 187-211. Moscow: Jazyki Russkoj Kul’tury, 1999. Gil, David. “Syntactic Categories, Cross-Linguistic Variation and Universal Grammar.” In P. M. Vogel and B. Comrie (eds.), Approaches to the Typology of Word Classes, Empirical Approaches to Language Typology, pp. 173-216. New York: Mouton, 2000. Gil, David. “Creoles, Complexity and Riau Indonesian.” Linguistic Typology 5 (2001a): 325-371. Gil, David. “Escaping Eurocentrism: Fieldwork as a Process of Unlearning.” In P. Newman and M. Ratliff (eds.), Linguistic Fieldwork, pp. 102-132. Cambridge: Cambridge University Press, 2001b. Gil, David. 
“Ludlings in Malayic Languages: An Introduction.” In Bambang Kaswanti Purwo (ed.), PELBBA 15, Pertemuan Linguistik (Pusat Kajian) Bahasa dan Budaya Atma Jaya: Kelima Belas. Jakarta: Unika Atma Jaya, 2002a. Gil, David. “The Prefixes di- and N- in Malay / Indonesian Dialects.” In F. Wouk and M. Ross (eds.), The History and Typology of Western Austronesian Voice Systems, pp. 241-283. Canberra: Pacific Linguistics, 2002b. Gil, David. “Intonation Does Not Differentiate Thematic Roles in Riau Indonesian.” In A. Riehl and T. Savella (eds.), Proceedings of the Ninth Annual Meeting of the Austronesian Formal Linguistics Association (AFLA9), Cornell Working Papers in Linguistics 19 (2003): 64-78. Gil, David. “Riau Indonesian sama, Explorations in Macrofunctionality.” In M. Haspelmath (ed.), Coordinating Constructions (Typological Studies in Language 58), pp. 371-424. Amsterdam: John Benjamins, 2004. Gil, David. “Word Order Without Syntactic Categories: How Riau Indonesian Does It.” In A. Carnie, H. Harley and S.A. Dooley (eds.), Verb First: On the Syntax of Verb-Initial Languages, pp. 243-263. Amsterdam: John Benjamins, 2005.
Goedemans, Rob and Ellen van Zanten. “Stress and Accent in Indonesian.” In D. Gil (ed.), Studies in Malay and Indonesian Linguistics. London: Curzon Press, to appear. Kähler, Hans. Grammatik der Bahasa Indonesia. Wiesbaden: Otto Harrassowitz, 1956. Nespor, Marina and Irene Vogel. Prosodic Phonology. Dordrecht: Reidel, 1986. Nurzuir Husin, Zailoet, M. Atar Semi, Isma Nasrul Karim, Desmawati Radjab and Djurip. Struktur Bahasa Melayu Jambi. Jakarta: Pusat Pembinaan dan Pengembangan Bahasa, 1985. Rehg, Kenneth L. “Proto-Micronesian Prosody.” In J.A. Edmondson and K.J. Gregerson (eds.), Tonality in Austronesian Languages, pp. 25-46. Oceanic Linguistics Special Publication No. 24. Honolulu: University of Hawaii Press, 1993. Stilo, Don. “Alternative Devices for Object Marking in Middle Eastern SOV Languages”, Paper presented at the Middle East Studies Association of North America, San Francisco, CA, USA, 29 November - 1 December 1984. Suwarni Nursato, Sutari Harifin, Zainin Wahab, Nangsari Ahmad and Homsen Nanung. Fonologi dan Morfologi Bahasa Lintang. Jakarta: Pusat Pembinaan dan Pengembangan Bahasa, 1989. Tadmor, Uri. “Can Word Accent Be Reconstructed in Malay?”, Paper presented at Third International Symposium on Malay / Indonesian Linguistics, Amsterdam, The Netherlands, 24 August 1999. Tadmor, Uri. “Rekonstruksi Aksen Kata Bahasa Melayu.” In Yassir Nasanius and Bambang Kaswanti Purwo (eds.), PELBBA 13, Pertemuan Linguistik (Pusat Kajian) Bahasa dan Budaya Atma Jaya: Ketiga Belas, Pusat Kajian Bahasa dan Budaya, pp. 153-167. Jakarta: Unika Atma Jaya, 2000. Umar Manan, Zainuddin Amir, Nasroel Malano, Anas Syafei and Agustar Surin. Struktur Bahasa MukoMuko. Jakarta: Pusat Pembinaan dan Pengembangan Bahasa, 1986. Van Ophuijsen, Ch. A. Maleische Spraakkunst. Leiden: van Doesburgh, 1915. Zarbaliev, X.M. Jazyk Minangkabau. Moscow: Nauka, 1987.
MATTHEW GORDON
THE INTONATIONAL REALIZATION OF CONTRASTIVE FOCUS IN CHICKASAW
1. INTRODUCTION While the realization of focus in languages which express focus either syntactically or prosodically or through a combination of both prosody and syntax has been studied relatively extensively, e.g. English (Beckman and Pierrehumbert 1986), Korean (Cho 1990, Jun 1993), Chichewa (Kanerva 1990), Bengali (Hayes and Lahiri 1991, Lahiri and Fitzpatrick-Cole 1999), Shanghai Chinese (Selkirk and Shen 1990), Hungarian (Horvath 1986, Kiss 1998), Hausa (Inkelas and Leben 1990), there is very little work on languages which mark focus morphologically through affixes or particles attached to or adjacent to focused elements. Of particular interest is the question of whether languages with morphological marking of focus also utilize prosodic cues to signal focus, much as languages with special word orders associated with focus may redundantly use prosody to cue focus. In their study of Wolof, a language which marks focus morphologically, Rialland and Robert (2001) claim that Wolof does not use intonation to signal focus redundantly. Beyond this study of Wolof, however, there is little phonetic literature dealing with the prosodic manifestation of focus in languages with morphological expression of focus. It is thus unclear to what extent languages that mark focus morphologically tend to also employ prosodic cues to focus.1 This study attempts to broaden our understanding of the phonetics of focus by examining prosodic cues to focus in Chickasaw, a language which, like Wolof, marks focus morphologically. A number of potential pitch and duration cues to contrastive focus are examined to determine whether Chickasaw redundantly uses both prosody and morphology to mark focus. 2. BACKGROUND ON CHICKASAW Chickasaw is a Western Muskogean language spoken by no more than a few hundred predominantly elderly speakers in south-central Oklahoma. Chickasaw has been the subject of extensive work by Pamela Munro and colleagues. 
Munro (2005) provides a grammatical overview of Chickasaw and includes an analyzed text of a traditional Chickasaw story. Munro and Willmond (1994) is a dictionary that also contains a thorough description of Chickasaw grammar. Gordon et al. (2000) provides a quantitative phonetic description of Chickasaw and Gordon (1999, 2005)
C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 69–82. © 2007 Springer.
are descriptions of various aspects of the intonational system, including boundary tones, prosodic phrasing, and pitch accents. 2.1. Intonation Chickasaw utterances may be divided into a hierarchically ordered set of prosodic constituents (Gordon 1999, 2005). The largest clearly defined intonational unit is the Intonation Phrase, which is marked by a f0 excursion at its right edge, typically a f0 rise in statements and a f0 fall in questions. An Intonation Phrase consists of one or more Accentual Phrases which are canonically associated with a [LHHL] tone sequence when there is sufficient material in the phrase. The L tone is aligned with the left edge of the Accentual Phrase, and the first H tone occurs early in the Accentual Phrase, typically falling on or near the second sonorant mora, with considerable gradience in its alignment. The final two tones usually associate with the final syllable, yielding a f0 fall on the final syllable. Stressed final syllables, those containing a coda consonant or a long vowel (see Gordon 2002 on stress in Chickasaw) may not realize the final low tone, however. A short Accentual Phrase, one with fewer than three sonorant moras, may also not realize all the tones of a canonical AP, with deletion of the initial or final L being the typical strategy for truncating the AP. An AP with three sonorant moras is usually sufficient to realize all tones though a two syllable AP with three sonorant moras may not realize all its tones. Schematic examples of the realization of tones in an AP appear in (1). (1)
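a. Monomoraic first syllable: [naSo…ba…t]AP, tones L H H L (L on the first mora, H on the second, H L on the final syllable)
b. Bimoraic first syllable: [nambila…ma/]AP, tones L H H L (L H on the first syllable, H L on the final syllable)
c. Short AP: [fala]AP, tones H L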
Chickasaw strongly tends to align Accentual Phrase boundaries with word boundaries; thus, it is most common for a single word to constitute an entire Accentual Phrase. It is possible, however, for two relatively short words, i.e. words of one or two syllables, to group together into a single Accentual Phrase. 2.2. Focus Chickasaw has at least two types of focus markers (Munro and Willmond 1994) which are suffixed to focused nouns and differ according to whether the focused element is a syntactic subject or an object. The first focus suffix, -ho…t when attached to subjects and –ho when affixed to objects, is termed a “focus/inferential case ending” by Munro and Willmond (1994:liv) and will not be discussed further in this paper. The focus of this paper is the contrastive focus
suffix, which is realized as -akot with subjects and as -ako)… with objects (Munro and Willmond 1994:liii). Although the precise semantic conditions that give rise to the contrastive focus are not completely understood, one of its primary functions is to attract narrow focus to the noun which it modifies. There is no comparable suffix affixed to verbs to signal narrow focus on the verb. Sentences exemplifying the contrastive focus suffixes and their counterparts lacking contrastive focus marking appear in (2). (2)
a. hat…ak-at koni(a)) pisa.
   Man-subj skunk sees
   The man sees the skunk.
b. hat…ak-akot koni(a)) pisa.
   Man-cont.subj. skunk sees
   THE MAN sees the skunk.
c. hat…ak-at koni-ako)… pisa.
   Man-subj skunk-cont.obj. sees
   The man sees THE SKUNK.
As the sentences in (2) indicate, non-focused subjects are marked with the suffix -at, while non-focused objects may either have no overt suffix or be marked with the suffix -a)…. The unmarked word order in Chickasaw is SOV, though other orders are possible under certain as yet not well-understood semantic conditions, including focus, which may be associated with fronting of the focused element. For example, sentence (2c) could appear with a fronted object, i.e. koniako)… hat…akat pisa ‘The man sees THE SKUNK’. 3. PRESENT STUDY 3.1. Methodology The present study examines the prosodic realization of sentences involving contrastive focus on subjects and objects. Data were collected during elicitation sessions with individual speakers. Speakers were presented with English sentences containing a subject, object, and verb and instructed to give the Chickasaw equivalent. Focus was elicited by offering English translations emphasizing the focused element. Three different focus conditions were elicited: one involving broad focus, i.e. no special focus on any particular element, one with narrow focus on the subject and one with narrow focus on the object. Speakers repeated each sentence between three and five times. The corpus used in the experiment appears in Table 1.
Table 1. Corpus recorded for the focus experiment

NO FOCUS
Speakers 1-4
hat…akat naSo…bai pisa                        The man sees the wolf.
hat…akat ampaska pisa                         The man sees my bread.
hat…akat wa…ka/ pisa                          The man sees the cow.
hat…akat hopa…ji/ pisa                        The man sees the fortune teller.
Speaker 5
na…hol…a…t naSo…ba pisa…tok                   The white man saw the wolf.
na…hol…a…t ampaska pisa…tok                   The white man saw my bread.
na…hol…a…t wa…ka/ pisa…tok                    The white man saw the cow.
na…hol…a…t hopa…ji/ pisa…tok                  The white man saw the fortune teller.

SUBJECT FOCUS
na…hol…a…kot a)…nampaka)…li/ pisa…tok         THE WHITE MAN saw my flower.
na…hol…a…kot minko/ pisa(…tok)                THE WHITE MAN sees (saw) the chief.
na…hol…a…kot ofo)…lo pisa…tok                 THE WHITE MAN saw the owl.

OBJECT FOCUS
na…hol…a…t minka…ko)… pisa…tok                The white man saw THE CHIEF.
na…hol…a…t amofo)…la…ko)… pisa…tok            The white man saw MY OWL.
na…hol…a…t sat…iba…piSiako)… pisa…tok         The white man saw MY BROTHER.
Data were collected and analysed for a total of five female speakers. Four of the speakers were recorded in Oklahoma in 1996 while the remaining speaker was recorded in Los Angeles in 2002. Speakers were recorded on DAT tape while wearing a high-quality noise-cancelling microphone on their heads. Data were then transferred onto computer using Scicon MacQuirer at a sampling rate of 22.5 kHz. Two measurements that could potentially distinguish different focus conditions prosodically were made using the MacQuirer software. First, the average fundamental frequency for each of the three words comprising each sentence was calculated to determine whether focused words are produced with heightened pitch relative to postfocus elements, a common prosodic realization of focus crosslinguistically. Second, the duration of the pause between the subject and object and between the object and verb was measured to ascertain the degree of juncture
between different words under different focus conditions. Prosodic boundaries between words in postfocus position in other languages are commonly eliminated, reducing the level of temporal disjuncture between elements in postfocus position. 4. RESULTS 4.1. Fundamental Frequency A two factor (focus condition and syntactic category) analysis of variance pooling together results from all speakers failed to indicate a significant effect of either focus condition or syntactic category on f0 values: for syntactic category (subject, object, verb), F(2, 349) = 1.706, p = .1831; for focus condition (no focus, subject focus, object focus), F(2, 349) = .664, p = .5153. There was, however, a significant interaction between focus condition and syntactic category: F(4, 349) = 3.280, p = .0117. This interaction was attributed primarily to an overall raising of f0 for subjects in sentences involving any type of focus, either subject or object focus. This effect was only observed for certain speakers but not others. Another effect contributing to the interaction between focus and syntactic category was a lowering of f0 on verbs in sentences with a narrow focused noun. Again this effect was speaker dependent, however. Given the considerable interspeaker variation in the expression of focus, it is thus instructive to consider results for individual speakers. Average f0 results for individual speakers are given in Table 2. Speaker 1 displayed a significant raising of f0 for subjects in sentences with either narrow focus on the subject or object. Unpaired t-tests for this speaker revealed a significant difference between f0 values for subjects in broad focus sentences and subjects in sentences with either narrow focus on the subject, t(2,13) = 2.824, p = .0144, or narrow focus on the object, t(2,14) = 3.146, p = .0072. F0 values did not differ reliably between subjects in sentences with subject focus and those with object focus. 
Nor was there any significant difference in f0 values for objects or verbs under the three focus conditions. Results for speaker 2 were similar to those for speaker 1: f0 values for subjects were higher in sentences involving narrow focus than in those with broad focus. This difference was only a trend, however, and did not quite reach statistical significance in unpaired t-tests: broad focus vs. narrow subject focus, t(2, 20) = 1.899, p = .0721; broad focus vs. narrow object focus, t(2, 9) = 1.963, p = .0662. F0 values for objects and verbs did not differ significantly under different focus conditions. Speaker 3 also displayed an overall raising of f0 in subjects in sentences with narrow focus either on the subject or the object: broad focus vs. narrow subject focus, t(2, 25) = 4.588, p<.0001; broad focus vs. narrow object focus, t(2,22) = 5.832, p<.0001. In addition, f0 values were heightened for objects in sentences involving narrow focus on either the subject or object: broad focus vs. narrow subject focus, t(2,25) = 2.340, p = .0275; broad focus vs. narrow object focus, t(2,22) = 2.221,
p = .0369. Finally, verbs were found to have lower f0 values in sentences with narrow subject focus than in broad focus sentences: t(1,4) = 3.033, p = .0387. The data recorded from this speaker did not allow for measurement of f0 values for verbs in sentences with narrow object focus. Interestingly, a tendency to lower f0 of verbs in sentences with narrow focus was also observed in speaker 1, though this effect did not reach significance for this speaker. Speaker 4 also raised f0 values for subjects in narrow focus sentences: t(2,21) = 2.748, p = .0120. Sentences with narrow focus on the object were not recorded from this speaker. Focus did not impact f0 values for either objects or verbs for speaker 4. Speaker 5 was the only speaker for whom subject narrow focus and object narrow focus were differentiated both from each other and from broad focus along the f0 dimension. Interestingly, for this speaker, f0 values for subjects were highest in object focus sentences (184 Hz on average), and lowest in broad focus sentences (158 Hz on average), with intermediate values obtaining in subject focus sentences (165 Hz on average). Values differed significantly from each other between the three focus conditions: broad focus vs. narrow subject focus, t(2,27) = 2.056, p = .0495; broad focus vs. narrow object focus, t(2,22) = 3.919, p = .0007; narrow subject focus vs. narrow object focus, t(2,21) = 2.811, p = .0105. Speaker 5 also raised f0 for objects under focus relative to unfocused objects in both broad focus sentences, t(2,23) = 3.176, p = .0042, and sentences with narrow focus on the subject, t(2,23) = 2.456, p = .0220. Objects did not differ reliably in f0 between broad focus and narrow subject focus sentences. Differences in focus condition did not significantly affect f0 values for verbs. Figures 1-3 illustrate sentences uttered by speaker 5 with three different focus conditions. 
The sentence in Figure 1 is realized with broad focus, that in Figure 2 with narrow focus on the subject, and that in Figure 3 with narrow focus on the object. As the figures show, the sentence with object focus (Figure 3) is associated with a blanket raising of f0 for the subject and object (and to a lesser extent, the verb, though this is not a consistent property of object focus). Subject focus (Figure 2) triggers a raising of f0 in the subject relative to the subject in the broad focus sentence (Figure 1) but not relative to the subject in the object focus sentence. It may also be observed that the broad focus sentence in Figure 1 differs in prosodic constituency from the two sentences with a narrow focused element. The subject and object together form a single Accentual Phrase when neither is focused but belong to different Accentual Phrases when either one is focused.
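The per-word f0 averages reported in this section rest on a simple computation, which can be sketched as follows. This is only an illustration under stated assumptions, not the procedure implemented in the MacQuirer software: the function name, the data layout (time-stamped f0 samples plus word intervals), and all numeric values are hypothetical.

```python
from statistics import mean

def mean_f0_per_word(f0_track, word_intervals):
    """Average f0 (Hz) over the voiced samples that fall inside each word.

    f0_track: list of (time_s, f0_hz); f0_hz is None where no pitch was tracked.
    word_intervals: list of (label, start_s, end_s), one per word.
    """
    averages = {}
    for label, start, end in word_intervals:
        voiced = [f0 for t, f0 in f0_track
                  if start <= t < end and f0 is not None]
        # A word with no voiced samples gets None rather than a made-up value.
        averages[label] = mean(voiced) if voiced else None
    return averages

# Hypothetical samples, not measurements from the study.
track = [(0.00, 160.0), (0.10, 170.0), (0.20, None),
         (0.30, 150.0), (0.40, 158.0), (0.50, 166.0)]
words = [("subject", 0.00, 0.25), ("object", 0.25, 0.45), ("verb", 0.45, 0.60)]
result = mean_f0_per_word(track, words)
print(result)
```

Averaging over a whole word, as here, yields one value per syntactic category per utterance, which is the unit entered into the comparisons above.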
Figure 1. Pitch track for broad focus sentence na…hol…a…t minko/ pisa…tok ‘The white man saw the chief.’
Figure 2. Pitch track for subject focus sentence na…hol…a…kot minko/ pisa…tok ‘THE WHITE MAN saw the chief.’
Figure 3. Pitch track for object focus sentence na…hol…a…t minka…ko)… pisa…tok ‘The white man saw THE CHIEF.’

Table 2. Average f0 results for individual speakers (in Hertz, N=narrow focus)

         Subject                Object                 Verb
Speaker  Broad  N-subj  N-obj   Broad  N-subj  N-obj   Broad  N-subj  N-obj
1        191    205     204     199    195     205     216    203     ----
2        192    201     204     189    196     198     199    206     ----
3        160    183     188     166    177     180     187    149     ----
4        192    210     ----    202    203     ----    187    191     ----
5        158    165     184     159    164     181     164    166     174
In summary, both subject and object narrow focus consistently triggered raising of f0 in subjects. One speaker also displayed raising of f0 in objects in sentences with either object or subject narrow focus. In addition, f0 for verbs was lowered in sentences involving narrow focus for two speakers. Somewhat surprisingly, object narrow focus and subject narrow focus were only differentiated for one speaker in terms of average f0 values. For this speaker, object focus triggered raising of f0 for the focused object, as one might expect. However, this speaker also curiously displayed higher f0 values for subjects in sentences with object focus than for subjects under narrow focus themselves.
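The size of the subject f0 raising can be read directly off Table 2. The short sketch below does the subtraction for each speaker; the values are the subject averages from Table 2, while the variable and function names are illustrative only.

```python
# Subject f0 averages (Hz) from Table 2, per speaker:
# (broad focus, narrow subject focus, narrow object focus).
# None marks a condition that was not recorded.
SUBJECT_F0 = {
    1: (191, 205, 204),
    2: (192, 201, 204),
    3: (160, 183, 188),
    4: (192, 210, None),
    5: (158, 165, 184),
}

def raising_relative_to_broad(table):
    """Return, per speaker, (N-subj minus broad, N-obj minus broad) in Hz."""
    out = {}
    for speaker, (broad, n_subj, n_obj) in table.items():
        out[speaker] = (
            n_subj - broad if n_subj is not None else None,
            n_obj - broad if n_obj is not None else None,
        )
    return out

raising = raising_relative_to_broad(SUBJECT_F0)
for speaker, (d_subj, d_obj) in raising.items():
    print(speaker, d_subj, d_obj)
```

Every recorded difference is positive, and for speaker 5 the difference under object focus (26 Hz) exceeds that under subject focus (7 Hz), in line with the pattern described above. The differences alone say nothing about significance, which depends on the token-level variation reported in the t-tests.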
4.2. Duration A two factor (syntactic category and focus condition) ANOVA pooling results from all five speakers indicated a significant effect of both syntactic category and focus on the pause duration between words in sentences: for syntactic category, F(1,284) = 6.200, p = .0133; for focus condition, F(2,284) = 11.242, p<.0001. There was also a significant interaction between the two factors: F(2,284) = 23.029, p<.0001. Overall, the pause between subject and object was shortest in broad focus sentences and longest in sentences with narrow focus on the object. In contrast, the pause between object and verb was shortest in object focus sentences and longest in sentences with broad focus. Results averaged across speakers appear in Figure 4.
[Bar chart; y-axis: milliseconds (0-300); x-axis: post-subject, post-object; series: broad focus, subject focus, object focus]
Figure 4. Pause durations under three different focus conditions (all speakers pooled together, bars represent one standard deviation from mean)
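The pause measure plotted in Figure 4 is the silent interval between the end of one word and the start of the next. A minimal sketch of that measurement, assuming hand-labelled word boundaries in seconds, is given below; the function name and the timings are hypothetical and do not come from the recordings.

```python
def pause_durations_ms(word_intervals):
    """Silent gap, in milliseconds, between each word and the following one.

    word_intervals: list of (label, start_s, end_s) in utterance order.
    Returns a dict keyed "post-<label>" for every word except the last.
    """
    pauses = {}
    for (label, _, end), (_, next_start, _) in zip(word_intervals,
                                                   word_intervals[1:]):
        gap = max(0.0, next_start - end)  # abutting words count as 0 ms
        pauses[f"post-{label}"] = round(gap * 1000)
    return pauses

# Hypothetical boundaries for a subject-object-verb utterance.
words = [("subject", 0.00, 0.62), ("object", 0.75, 1.30), ("verb", 1.36, 1.90)]
pauses = pause_durations_ms(words)
print(pauses)
```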
A series of pairwise t-tests revealed a highly significant difference in the post-subject pause between sentences with broad focus and both sentences with narrow focus on the subject, t(1,120) = 5.404, p<.0001, and those with narrow focus on the object, t(1,98) = 6.540, p<.0001. Only one of the three pairwise comparisons involving the post-object pause, however, reached significance, the difference between the broad focus and narrow object focus conditions: t(1,98) = 2.363, p = .0201. Pause durations for individual speakers appear in Table 3. There was some variation between speakers in their duration results. Looking first at the post-subject pause, four of the five speakers displayed the shortest pause after subjects in broad focus sentences, while speaker 5 did not reliably differentiate the broad focus and narrow subject focus conditions in terms of post-subject pause duration. Only speaker 5 had a reliable difference in the post-subject pause between the two narrow focus sentence types: the pause in object focus sentences was greater than in subject focus sentences. For the other speakers, the two narrow focus conditions were not significantly different from each other in their post-subject pauses.
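For readers unfamiliar with the comparisons reported here, the statistic behind a two-sample t-test can be sketched as follows. The sketch assumes an ordinary unpaired test with pooled variance, which the text does not specify, and it uses invented pause durations rather than the study's tokens; it returns only the t value and degrees of freedom, not a p value.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance (an assumption here).

    Returns (t, degrees_of_freedom) for mean(sample_a) - mean(sample_b).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df

# Invented post-subject pause durations (ms) for two focus conditions.
broad = [0, 10, 20]
narrow = [100, 130, 160]
t_value, df = unpaired_t(narrow, broad)
print(round(t_value, 3), df)
```

A large positive t value, as in this invented case, corresponds to the situation described above in which the post-subject pause is reliably longer under narrow focus than under broad focus.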
Turning to post-object pause duration, there was greater interspeaker variation, with the dominant pattern involving decreased duration following narrowly focused objects. Speaker 2 had the shortest post-object pause in narrow object focus sentences and roughly similar post-object pause durations in sentences with narrow subject focus and those with broad focus, though none of the pairwise comparisons reached significance in t-tests. Speaker 3 also displayed the shortest post-object pause in sentences with object focus, though differences between the three focus conditions were quite small. Only the difference between object focus and subject focus conditions reached statistical significance for this speaker: t(1,13) = 3.346, p = .0053. Speaker 5 followed a similar pattern with shorter pauses following focused objects, with both pairwise comparisons involving narrow object focus sentences reaching significance: narrow object focus vs. narrow subject focus, t(1,24) = 2.652, p = .0139; narrow object focus vs. broad focus, t(1,24) = 2.518, p = .0189. The close degree of juncture between a focused object and the following verb can be observed in Figure 3 earlier in the paper. Speaker 1 displayed a very different pattern: she had the shortest post-object pause under the narrow subject focus condition, and the longest post-object pause under the object focus condition, with intermediate values in the broad focus condition. Only the comparisons involving narrow subject focus reached significance: narrow subject focus vs. narrow object focus, t(1,7) = 3.961, p = .0055; narrow subject focus vs. broad focus, t(1,14) = 3.639, p = .0194. Speaker 4, for whom sentences with narrow object focus were not recorded, displayed shorter pauses after objects in sentences with narrow focus on the subject, though this difference narrowly missed significance: t(1,21) = 2.057, p = .0523.
Table 3. Pause duration results for individual speakers (in milliseconds, N=narrow focus)
                    Speaker
                    1      2      3      4      5
Postsubj  Broad     44     13     0      8      93
          N-subj    316    313    41     132    75
          N-obj     235    244    88     ----   143
Postobj   Broad     66     144    97     125    64
          N-subj    25     123    104    104    69
          N-obj     96     61     90     ----   56
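The per-speaker pattern in the post-object pause can be checked mechanically against Table 3. The sketch below picks, for each speaker, the focus condition with the shortest recorded post-object pause; the values are those of Table 3, and the names are illustrative.

```python
# Post-object pause durations (ms) from Table 3.
# None marks a condition that was not recorded.
POST_OBJECT_PAUSE = {
    1: {"broad": 66, "narrow-subject": 25, "narrow-object": 96},
    2: {"broad": 144, "narrow-subject": 123, "narrow-object": 61},
    3: {"broad": 97, "narrow-subject": 104, "narrow-object": 90},
    4: {"broad": 125, "narrow-subject": 104, "narrow-object": None},
    5: {"broad": 64, "narrow-subject": 69, "narrow-object": 56},
}

def shortest_condition(pauses):
    """Name of the recorded condition with the smallest pause."""
    recorded = {cond: ms for cond, ms in pauses.items() if ms is not None}
    return min(recorded, key=recorded.get)

shortest = {spk: shortest_condition(p) for spk, p in POST_OBJECT_PAUSE.items()}
n_object_focus = sum(1 for cond in shortest.values() if cond == "narrow-object")
print(shortest, n_object_focus)
```

Three speakers (2, 3 and 5) have their shortest post-object pause under narrow object focus, while speaker 1 has it under narrow subject focus; speaker 4 cannot be assessed for object focus, since that condition was not recorded.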
In summary, broad focus was typically associated with a very close degree of temporal juncture between subjects and objects (with zero or nearly zero pause after the subject for speakers 2, 3, 4), while the two narrow focus conditions were not consistently differentiated in terms of their effect on the duration of pauses after the subject. The two narrow focus sentence types were, however, differentiated in their
effect on the level of juncture between object and verb. Objects carrying narrow focus were followed by very short pauses relative to unfocused objects both in sentences with broad focus and sentences with narrow focus on the subject. These patterns, though dominant, were not entirely consistent across speakers. Speaker 5 differed from the other speakers in terms of pause durations after the subject, whereas speaker 1 differed from the other speakers in her results for post-object pauses. It should also be noted that the increased temporal proximity between a focused object and verb observed for most speakers is not associated with elimination of the Accentual Phrase boundary typically separating most lexical items longer than two syllables in sentences lacking any narrow focused element. As Figures 1-3 show, the first syllable of the verb is realized with low tone, the initial tone of a Chickasaw Accentual Phrase, which characteristically has the tonal pattern [LHHL] (Gordon 1999, 2005). 4. DISCUSSION This paper has shown that Chickasaw marks contrastive focus not only morphologically but also through prosody. The strategies employed by Chickasaw to mark focus prosodically are similar in some respects to those exploited by other languages but differ in others. Both narrow object focus and narrow subject focus were characteristically associated with raised f0 values for subjects, and, for one speaker, objects as well. Only one speaker differentiated narrow object focus and narrow subject focus, however: for this speaker, f0 values were higher for focused objects than non-focused objects. 
The raising of f0 of subjects in both sentences with narrow subject focus and sentences with narrow object focus is an unusual feature of Chickasaw, as increased f0 is characteristically associated with only the focused element in most languages, including English (Beckman and Pierrehumbert 1986), Korean (Jun 1993), Hausa (Inkelas and Leben 1990). The dominant cross-linguistic pattern entailing localized raising of f0 under focus was found only for a single Chickasaw speaker. Even this speaker, however, displayed higher f0 values for subjects in sentences with narrow object focus than in sentences with narrow subject focus. It thus seems that raising of f0 is a general strategy for signalling any type of focus in Chickasaw and is not a reliable cue to picking out which element is being focused. It is also worth noting that two speakers displayed lowering of f0 in verbs in sentences with narrow focus on either the subject or object. This pattern may be viewed as similar to the deaccenting of words in the same intermediate phrase following a focused element in English (Beckman and Pierrehumbert 1986), though focus leads only to a blanket lowering of f0 in verbs in Chickasaw and does not actually lead to suppression of the nuclear pitch accent in an IP final verb. Chickasaw’s use of duration to signal focus follows, in some respects, a pattern typical of other languages. A focused object increases the temporal proximity of the object and following verb, a pattern similar to that found in Korean (Cho 1990, Jun 1993). It is important to note, however, that while a focused object triggers deletion of the Accentual Phrase boundary between an object and following verb in Korean,
the change in temporal proximity of object and verb in Chickasaw is not necessarily associated with a change in prosodic constituency. An Accentual Phrase boundary may also separate a verb from a preceding focused object, just as it typically separates a verb and a preceding unfocused object. It is conceivable, however, that examination of more data will reveal a statistically greater likelihood for focused objects to be grouped together in an Accentual Phrase with the following verb. Thus, it is as yet unknown whether the temporal effects induced by placing narrow focus on the object in Chickasaw are purely phonetic or whether the increased temporal closeness of a focused object and verb has ramifications for prosodic constituency. Another temporal phonetic effect triggered by narrow focus is increased separation between the subject and object. For all but one speaker, this enhanced level of disjuncture is associated with either narrow focus on the subject or object and often has phonological ramifications on Accentual Phrase formation: the subject and object are more likely to be grouped in the same Accentual Phrase when neither carries narrow focus than when one or both does. Although the symmetry of this effect under both narrow focus conditions, subject focus and object focus, is atypical from a cross-linguistic standpoint, it serves to set off the focused element from adjacent words, perhaps increasing its prominence. In the case of a focused object, the increased pause before the object complements the decreased pause following the object. For two speakers, the pause preceding a focused object is greater than the pause preceding an unfocused object in both sentences without narrow focus and sentences with a focused subject. The increased disjuncture before a focused element for these speakers accords with other languages in which a phonological phrase boundary is obligatory before a focused constituent, e.g. 
Korean (Jun 1993), Hausa (Inkelas and Leben 1990), Japanese (Pierrehumbert and Beckman 1988), and Greek (Condoravdi 1990).
5. SUMMARY
Results of this study suggest considerable diversity among Chickasaw speakers in their prosodic realization of focus. More generally, the examined data suggest that Chickasaw is less reliant on prosody to signal focus than other languages in which focus is not signalled through morphology. While broad focus sentences are characteristically differentiated from narrow focus sentences through their lower f0 in nouns and, for certain speakers, higher f0 in verbs, f0 does not, with the exception of one speaker, distinguish sentences with narrow focus on the subject from those with narrow focus on the object. Interword pause durations appear more reliable in cueing focus, with both narrow focus conditions triggering increased temporal disjuncture between subject and object, presumably a strategy for increasing the salience of focused elements. For three speakers, narrow object focus was associated with increased temporal proximity of the object and verb relative to the other two focus conditions, broad focus and narrow focus on the subject, a trend which parallels the dephrasing of post-focus elements in other languages. For one speaker, narrow focused objects were preceded by a longer pause than objects not under narrow focus.
The results for Chickasaw may be contrasted with the results of Rialland and Robert’s (2001) study of Wolof, another language with morphological expression of focus. Rialland and Robert do not report any use of prosodic cues to focus for Wolof, though it should be noted that their study focused on intonation, i.e. f0, the parameter which less reliably differentiated the various focus conditions in Chickasaw. It is thus conceivable that durational cues to focus are also present in Wolof. The present study of Chickasaw suggests that, although prosodic cues to focus may be less consistently exploited in Chickasaw than in languages without overt focus morphology, measurable phonetic cues to focus are potentially present even in languages in which morphology carries the primary burden in signalling focus.
6. NOTES
1 A sincere thanks to the Chickasaw speakers, who so generously provided the data discussed in this paper, and to Pam Munro for her assistance in preparing the corpus examined in this paper, and more generally, for her insights and suggestions related to Chickasaw prosody. Portions of the data discussed here were collected as part of an NSF grant awarded to Peter Ladefoged and Ian Maddieson to document the phonetic properties of endangered languages.
2 Note that the long vowels in naʃoːba ‘wolf’, hopaːjiʔ ‘fortune teller’ and pisaːtok ‘saw’ are not phonemic long vowels but are the result of a process of rhythmic lengthening targeting a non-final vowel in an open syllable immediately preceded by a short vowel in an open syllable (see Munro and Willmond 1994, Munro 2005 for discussion of rhythmic lengthening). Rhythmically lengthened vowels behave parallel to phonemic long vowels phonologically and are, depending on the speaker, either identical or nearly identical in length to phonemic long vowels (see Gordon et al. 2000 for phonetic data).
7. REFERENCES
Beckman, Mary and Janet Pierrehumbert. “Intonational structure in Japanese and English.” Phonology Yearbook 3 (1986): 255-310.
Cho, Young-Mee. “Syntax and Phrasing in Korean.” In Sharon Inkelas and Draga Zec (eds.), The Phonology-Syntax Connection, pp. 47-62. Chicago: University of Chicago Press, 1990.
Condoravdi, Cleo. “Sandhi Rules of Greek and Prosodic Theory.” In Sharon Inkelas and Draga Zec (eds.), The Phonology-Syntax Connection, pp. 63-84. Chicago: University of Chicago Press, 1990.
Gordon, Matthew. “The Intonational Structure of Chickasaw.” Proceedings of the 14th International Congress of Phonetic Sciences (1999): 1993-1996.
Gordon, Matthew. “Intonational phonology of Chickasaw.” In Sun-Ah Jun (ed.), Prosodic Models and Transcription: Towards Prosodic Typology, pp. 301-330. Oxford: Oxford University Press, 2005.
Gordon, Matthew, Pamela Munro and Peter Ladefoged. “Some Phonetic Structures of Chickasaw.” Anthropological Linguistics 42 (2000): 366-400.
Hayes, Bruce and Aditi Lahiri. “Bengali Intonational Phonology.” Natural Language and Linguistic Theory 9 (1991): 47-96.
Horvath, Julia. Focus in the Theory of Grammar and the Syntax of Hungarian. Dordrecht: Foris, 1986.
Inkelas, Sharon and William Leben. “Where Phonology and Phonetics Intersect: the Case of Hausa Intonation.” In John Kingston and Mary Beckman (eds.), Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech, pp. 17-34. New York: Cambridge University Press, 1990.
Jun, Sun-Ah. The Phonetics and Phonology of Korean Prosody. The Ohio State University: Doctoral dissertation, 1993.
Kanerva, Jonni. “Focusing on Phonological Phrases in Chichewa.” In Sharon Inkelas and Draga Zec (eds.), The Phonology-Syntax Connection, pp. 145-162. Chicago: University of Chicago Press, 1990.
Kiss, Katalin. “Identificational Focus Versus Information Focus.” Language 74 (1998): 245-273.
Lahiri, Aditi and Jennifer Fitzpatrick-Cole. “Emphatic Clitics and Focus Intonation in Bengali.” In René Kager and Wim Zonneveld (eds.), Phrasal Phonology, pp. 119-144. Nijmegen: University of Nijmegen Press, 1999.
Munro, Pamela. “Chickasaw.” In H. Hardy and J. Scancarelli (eds.), Native Languages of the Southeastern United States, pp. 114-156. Lincoln: University of Nebraska Press, 2005.
Munro, Pamela and Catherine Willmond. Chickasaw: an Analytical Dictionary. Norman: University of Oklahoma Press, 1994.
Pierrehumbert, Janet and Mary Beckman. Japanese Tone Structure. Cambridge, Mass.: MIT Press, 1988.
Rialland, Annie and Stéphane Robert. “The intonational system of Wolof.” Linguistics 39 (2001): 893-939.
Selkirk, Elisabeth and Tong Shen. “Prosodic Domains in Shanghai Chinese.” In Sharon Inkelas and Draga Zec (eds.), The Phonology-Syntax Connection, pp. 313-338. Chicago: University of Chicago Press, 1990.
CARLOS GUSSENHOVEN
TYPES OF FOCUS IN ENGLISH
C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 83–100. © 2007 Springer.
1. INTRODUCTION
This chapter has two aims. First, section 1.0 intends to show that the way pitch accents express information structure in English is subject to structural constraints. This view is contrasted with one in which the pitch accent directly signals the information status of the word it occurs on. The second aim, pursued in section 2.0, is to show that there isn’t just a single semantic contrast between ‘old’ and ‘new’ information: languages express various kinds of focus meanings, like reactivating focus, contingency focus and corrective focus.
1.1. The expression of focus by means of pitch accents in English
Intuitively, pitch accents in English indicate that the speaker means to stress the importance of the words they appear on. Recapitulating earlier research, this section endorses a research tradition in which this intuition is undermined by a demonstration that there are structural constraints on the way pitch accents signal the focus constituent of the sentence. That is, the connection between the pitch accent and pragmatic ‘importance’ is not word-based. Depending on the syntactic structure, an accented word may signal the focus of a larger constituent than that formed by the word on its own. Before exploring the role of the syntactic structure in determining the relation between accentuation and the focus constituent, the alternative, ‘direct’ position is presented as a background.
Generally, speakers conduct conversations so as to establish a common understanding about some aspect of the world. They keep track of the development of their common understanding in a ‘discourse model’, and indicate the way their information relates to the hearer’s understanding. Pitch accents express this ‘information status’. The focus constituent may, in Ladd’s (1980, p. 77) terminology, be ‘broad’ or ‘narrow’, depending on size. If a speaker takes someone to task for making a pedantic remark, the sentence Even a nineteenth-century professor of CLASSics wouldn’t have allowed himself to be so pedantic contains a relatively broad focus on a nineteenth-century professor of classics. In Ladd’s words, the addressee here ‘has nothing to do with classics, is not a professor, and is more or less contemporary’, and a nineteenth-century professor of classics just so happens to be the most pedantic type of person the speaker could think of. However, if the speaker were trying to come up with what to him is a particularly clear case of nineteenth-century pedantry, the focus would be narrowed down to professor of
classics, while the focus would be narrowed down further to just classics if the discussion was more specifically about pedantry among nineteenth-century professors. The variation between ‘broad’ and ‘narrow’ focus which this example shows was earlier discussed under the rubrics of ‘normal stress’, for the ‘broadest’ case, and ‘contrastive stress’, for the other cases (where ‘stress’ is equivalent to ‘accent’) (Newman 1946; Chomsky & Halle 1968; Bresnan 1971; Bresnan 1972). This older view held that ‘normal’ accent patterns (which were never defined, but were assumed to be the most natural pattern when reading out an isolated sentence) were determined by syntactic factors, but that ‘contrastive’ accent patterns arose from independently meaningful considerations. Thus, ‘normal’ stress was believed to yield to formal linguistic rules, while ‘contrastive’ stress was not.
This position came under attack by Bolinger (1972) and Schmerling (1974). Bolinger stressed that all accent placements are meaningful, and that it is impossible to draw a dividing line between ‘normal’ and ‘contrastive’ accents. In this view, all new information implicitly contrasts with other information: the sentence I’ll give you a BOOK does not change its structure if the implication changes from ‘I won’t give you a CD-ROM’ to ‘I won’t give you anything else’, or even to ‘I won’t behave in any other way’. These differences are gradient and non-structural, Bolinger argued (1972), and in all three cases the accent location is determined by the speaker’s informational bias towards the concept ‘book’. Bolinger’s point that ‘neutral’ and ‘contrastive’ accentuations should be explained within a single conception of information structure was welcomed by later researchers (Schmerling 1974; Ladd 1980; Gussenhoven 1983a; Selkirk 1984). His position was otherwise vulnerable on two counts, however. One is that ‘contrastive’ focus may actually be expressed differently from ‘neutral’ focus. 
In fact, when looking at languages other than English, these differences turn out to be of two kinds. First, ‘contrastive’ may refer to a type of focus, to be discussed as ‘corrective’ focus in section 2.0. Even if English does not always distinguish between ‘presentational focus’ (Zubizarreta 1998; Selkirk 2002) or ‘information focus’ (Kiss 1998) and corrective focus, some languages, like European Portuguese, consistently use different forms (Frota 1998). In such cases, the equivalents of (1a) and (1b) are not homophonous. (1)
a. (A: Has she driven any other cars besides Fords and Chevrolets?)
   B: She used to drive [a RENault CLIO]FOC
b. (A: Helen used to drive a Ford Capri)
   B: No, she used to drive [a RENault CLIO]FOC
Second, some languages appear to make a distinction between broad focus, in which there is no particular constituent which is focused (or, alternatively, the entire expression is considered the focus constituent) and narrow focus. For instance, Bengali has different pitch accents in cases equivalent to (2a) and (2b) (Hayes & Lahiri 1991). In such cases, the term ‘neutral’ is applied to the broad-focus form.
(2)
a. (A: What else can you tell us about Helen?)
   B: She [used to drive a Renault CLIO]FOC
b. (A: What kind of Renault did she drive?)
   B: She used to drive a Renault [CLIO]FOC
The other element in Bolinger’s position to be challenged is more directly relevant to English. It was the belief that a pitch accent highlights the informational value of just the word it is placed on. Bolinger (1985, 1987) insisted that the relation between accent and focus was direct, rejecting what is known as focus projection, the ability of an accented word to signal the focus for a higher constituent, like the phrase or clause, causing differently sized focus constituents to have the same form (Chomsky 1971; Jackendoff 1972; Selkirk 1984). Such ‘focus ambiguity’ is excluded under Bolinger’s direct view of the relation between accent and focus, but is almost inherently part of approaches which see the ‘focus-to-accent’ relation, a term introduced in Gussenhoven (1985), as indirect and mediated by the linguistic structure.
The structural nature of accentuation in English in fact stretches all the way from the lexicon to the sentence. The phonology determines that in the adjective MANifest the accent is on the first syllable. Morphological formations impose accentuation patterns, as in SaTANic, where the adjectival suffix causes the accent to be on the second syllable of the stem (cf. SAtan), or in LANguage consultant, where consultant is unaccented because it is the second constituent of a compound (Burzio 1994; Hayes 1995; Zonneveld, Trommelen, Jessen, Rice, Bruce, & Arnason 1999, and references therein). Post-lexically, the phonology determines the ‘shifted’ location of the pitch accent in the adjective in CHInese LANtern (cf. It’s ChiNESE). Schmerling (1974) pointed to a further regularity at the level of the syntax, which requires a predicate to be unaccented when paired with its subject or object in what she called ‘news sentences’, i.e. when the focus is broad. Thus, while Schmerling agreed with Bolinger on the untenability of the distinction between ‘contrastive’ stress and ‘normal’ stress, she disagreed on the role of structure. 
Her work formed the basis of accounts of the focus-accent relation that rely on predicate-argument structure (Gussenhoven 1978, 1983a, 1992, 1999a,b; Ladd 1983; Selkirk 1984, 1995). Ladd (1980) also emphasized the structural nature of accent distributions, but related the fact that the verb often goes without an accent to a scale of accentability applying to word classes; he later endorsed my 1983 proposal in Ladd (1983). Bolinger continued to argue for the conflation of the syntactic regularity, and indeed of the regularity expressed by Compound Stress, with deaccenting due to absence of focus (Bolinger 1972, 1978, 1985, 1989).
1.2. The Focus-to-Accent relation in English: SAAR
The central observation is that English predicates that are surface-adjacent to an accented argument need not be accented in order to be interpreted as focused (Schmerling 1974; Gussenhoven 1983a; Gussenhoven 1992; Selkirk 1984; Selkirk
1995).1 Importantly, Schmerling observed that alongside the SV ‘news sentence’ with its unaccented verb, (3), an unaccented predicate also appears after a nonsubject argument, as in (4). Accordingly, she formulated a principle stating that, in news sentences, accents go to the arguments (the subject and the object), but not to the predicate. Thus, the lack of accent on died and grow has the same explanation, as has the lack of accent on hit in (5). (All examples are from Schmerling (1974).) (3)
JOHNSON died
(4)
Great oaks from little ACORNS grow
(5)
JOHN hit BILL
As I pointed out in a review of her book, Schmerling’s principle really extends to all (presentational focus) sentences, provided the notion of ‘focus’ is brought in. To account for the accentual pattern in her (6), she introduced the ‘Topic-comment’ sentence, and formulated the principle that in such sentences, both topic Truman and comment died are accented. This is incorrect to the extent that when we reverse topic and comment, the topic Truman remains unaccented, as in (6b), from Gussenhoven (1978). (6)
a. TRUMAN DIED
b. The disease KILLED Truman
If we assume that ‘topic’ means ‘outside the focus’, things fall into place. Comments are accented, topics are not; the topic in (6a) is accented because of its position before the focus, where accents are optional. Not only do we now account for (6a) and (6b), we can also generalize the instruction ‘accent the comment’. Subjects and objects are ‘arguments’, as noted by Schmerling. That is, they represent necessary elements in the semantics of the predicate, and as such contrast with constituents that express circumstantial conditions on the predicate, like time, space and manner adjuncts, henceforth ‘modifiers’. Schmerling’s principle amounts to the generalization that any focused argument, predicate or modifier forms its own comment, except for the special case of the single comment formed by a predicate that is adjacent to one of its arguments. The argument-predicate connection seems especially clear from cases like (7a,b). Since direction adjuncts are arguments of verbs of motion, as in (7a), no accent appears on the verb taken, but in (7b), where the verb bury appears in combination with a place adjunct, there are two ‘comments’, and two accents appear. The fact that Independence occurs in a Prepositional Phrase in both cases is immaterial. (Truman, again, is a topic.) Following Schmerling’s strategy of employing German to demonstrate the same regularities in a language with a different word order, I give Dutch translations to bring out the difference more clearly (Gussenhoven 1978). (7)
a. Truman was taken to INDEPENDENCE
   Truman werd naar INDEPENDENCE gebracht
b. Truman was BURIED in INDEPENDENCE
   Truman werd in INDEPENDENCE BEGRAVEN
The ‘comments’ were relabelled ‘focus domains’ in Gussenhoven (1983a), to mean constituents that can be placed in focus by the accentuation of only one word, and the generalization was given the status of a Sentence Accent Assignment Rule (SAAR). It is comparable to the Compound Rule, which deletes accents on the second constituent of compounds, but operates at a higher level of structure: a predicate loses its accent when an adjacent accent appears on one of its arguments. While the next section discusses the way SAAR applies in complex sentences, the structural nature of the relation can be shown in a number of ways even within the clause. Interruption of the adjacency of predicate and argument by an accented modifier breaks up the focus domain, causing the predicate to be accented, as in (8c), which is to be compared with the uninterrupted focus domain in (8a) and with the intervening unaccented time modifier just in (8b). (8)
a. [My TYRES have been cut]FOC
b. [My TYRES have just been cut]FOC ‘only a little while ago’
c. [My TYRES have JUST been CUT]FOC ‘without further ado’
Second, the semantically similar (9a) and (9b), from Gussenhoven (1985), have accentuations that follow SAAR, not the semantics. (9)
a. [Your TROUSers are torn]FOC
b. [There’s a TEAR in your trousers]FOC
Third, the argument must have its head in focus. The focus constituent in (10a) is black, and since the noun bird is outside it, the predicate cannot be deaccented. By contrast, in (10b) the noun blackbird is included in the focus constituent, and the pattern goes through. Both examples could be answers to a question about the wellbeing of a group of birds, but only (10b) can count as a straightforward answer. Example (10a) carries an implication of some awkward downplaying of the fact that one of the birds has escaped (Gussenhoven 1985). (10)
a. ??The [BLACK]FOC bird [has escaped]FOC (Cf. The BLACK bird has ESCAPED)
b. [The BLACKBIRD has escaped]FOC
Fourth, complex predicates behave like predicates. These include ‘natural predicates’ like take advantage of, pay attention to, which Di Sciullo & Williams (1987) argue are syntactically atomic. This explains why we can have (11a), but not (11b). They support the syntactic difference between these structures by pointing out that (11b) is in fact ill-formed, quite regardless of how it is accented. That is, take great advantage of is not a multi-word verb, like take advantage of, but a syntactic phrase.
(11)
a. [BILL’s been taken advantage of]FOC
b. ?[BILL has been taken great advantage of]FOC
In Selkirk (1984, 1995), focus projection continues to higher levels of structure, like the VP and the S. In my view of focus projection, only predicate focus can be licensed by an accent on an argument. In fact, there is a further restriction to be stipulated, which is that subjects can only license focus on the predicate if no further lexical constituents follow. That is, Her HUSband beats the poor soul is not a well-formed reply to Why has SHE come to this family refuge centre?, while Her HUSband beats her is. In effect, the legitimate argument-predicate focus domains are as in (12), where the constituents are lexical (i.e. function words are ignored). (12)
Possible predicate-argument focus domains: [SUBJ-pred]]S, [pred-OBJ] ... ]S
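The core of SAAR as described in this section can be illustrated with a minimal Python sketch. It is an illustration under simplifying assumptions, not an implementation taken from the literature cited here. The class Constituent, the function assign_accents and the flat-list encoding of a clause are all hypothetical. Only lexical constituents are listed, adjacency is computed over focused constituents (so unfocused material is skipped), and neither the restriction on subjects stated for (12) nor the optional prefocal accent of (6a) is modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Constituent:
    """A lexical constituent of a single clause (hypothetical encoding)."""
    text: str
    role: str            # "arg", "pred" or "mod"
    focused: bool = True


def assign_accents(clause: List[Constituent]) -> List[str]:
    """Return the constituents that receive a pitch accent.

    Every focused argument and modifier is accented. A focused predicate
    is accented too, unless the nearest focused constituent on its left
    or right is an argument, in which case the two form one focus domain
    and the predicate stays unaccented.
    """
    focused = [c for c in clause if c.focused]
    accented = []
    for i, c in enumerate(focused):
        if c.role == "pred":
            # Nearest focused neighbours on either side (may be empty).
            neighbours = focused[max(i - 1, 0):i] + focused[i + 1:i + 2]
            if any(n.role == "arg" for n in neighbours):
                continue  # predicate is licensed by the adjacent argument
        accented.append(c.text)
    return accented


# (8a) My TYRES have been cut
ex_8a = [Constituent("tyres", "arg"), Constituent("cut", "pred")]
# (8c) My TYRES have JUST been CUT: the accented modifier intervenes
ex_8c = [Constituent("tyres", "arg"), Constituent("just", "mod"),
         Constituent("cut", "pred")]
# (7a) Truman was taken to INDEPENDENCE: Truman is a topic (unfocused)
ex_7a = [Constituent("Truman", "arg", focused=False),
         Constituent("taken", "pred"), Constituent("Independence", "arg")]
# (7b) Truman was BURIED in INDEPENDENCE: the place adjunct is a modifier
ex_7b = [Constituent("Truman", "arg", focused=False),
         Constituent("buried", "pred"), Constituent("Independence", "mod")]

for example in (ex_8a, ex_8c, ex_7a, ex_7b):
    print(assign_accents(example))
```

On these assumptions, the sketch accents only tyres for (8a), all three of tyres, just and cut for (8c), only Independence for (7a), and both buried and Independence for (7b).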
1.3. SAAR in complex sentences
In complex sentences, SAAR applies as often as there are clause nodes in the expression (Gussenhoven 1992). Consider the constructions in (13). (13)
a. Embedded nonfinite object clause (I heard a clock tick)
b. Embedded nonfinite object clause plus indirect object (I forced a clock to tick)
c. Resultative (I’ve painted the door green)
d. Depictive (I drank the coffee cold)
In (13a), a nonfinite clause a clock tick functions as the argument of heard in the main clause. SAAR requires that within the nonfinite clause, the argument a clock is accented if both it and its predicate tick are included in the focus constituent. This is shown in (14). In the main clause, the requirement is that the argument a clock tick is accented and its predicate heard unaccented, if both constituents are included in the focus constituent. Since the argument is accented, on clock, the condition has been met. The accent on clock thus functions at two levels of structure, once at the level of a clock (tick) and once at the level of (heard) a clock tick. In the same way, a lion and a lamb are arguments of the predicate devour in (15). At the higher level, the requirement that a lion devour a lamb, the argument of saw, be accented is met here through the presence of two accents. (14)
I [heard a clock tick]FOC
(I) [heard]Pred [[a CLOCK]Arg [tick]Pred]Arg
(15)
I [saw a lion devour a lamb]FOC
(I) [saw]Pred [[a LION]Arg [devour]Pred [a LAMB]Arg]Arg
The structure of (13b) differs from (13a) in that the predicate (e.g. force, promise, teach, tell) takes three arguments rather than two. In (13b), there is an object to tick and an indirect object a clock, in addition to the subject. The indirect object
licenses the unaccented predicate forced, while to tick forms its own focus domain. Therefore, two accents appear in (16). When the direct object is a clause with an argument of its own, as in (17), SAAR applies within it: the argument to devour a lamb is a clause, which has an accent on the argument a lamb and leaves the predicate devour unaccented. (16)
I [forced a clock to tick]FOC
(I) [forced]Pred [a CLOCK]Arg [[[to TICK]Pred]S]Arg
(17)
I [taught a lion to devour a lamb]FOC
I [taught]Pred [a LION]Arg [[[to devour]Pred [a LAMB]Arg]S]Arg
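The clause-by-clause application of SAAR in (14) to (17) can be sketched recursively. The following Python fragment is a minimal illustration under stated assumptions, not an implementation from the works cited: the names Node and accents are hypothetical, the whole expression is taken to be in broad focus, the pronoun subject I is left out, predicates are word-level leaves, and empty elements are not represented. An argument that is itself a clause counts as accented as soon as the rule has placed an accent somewhere inside it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A constituent: a word (text set) or a clause (children set)."""
    role: str                       # "arg" or "pred"
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def accents(node: Node) -> List[str]:
    """Return the accented words of a fully focused constituent."""
    if not node.children:
        return [node.text]          # a word on its own carries the accent
    kids = node.children
    found: List[List[str]] = [[] for _ in kids]
    # Arguments first: a clausal argument receives its accents from the
    # application of the rule inside that clause.
    for i, kid in enumerate(kids):
        if kid.role == "arg":
            found[i] = accents(kid)
    # A predicate stays unaccented when an adjacent argument of the same
    # clause carries an accent; otherwise it is accented itself.
    for i, kid in enumerate(kids):
        if kid.role == "pred":
            licensed = any(
                found[j]
                for j in (i - 1, i + 1)
                if 0 <= j < len(kids) and kids[j].role == "arg"
            )
            if not licensed:
                found[i] = accents(kid)
    return [word for words in found for word in words]


def clause(*kids: Node) -> Node:
    """Build a clause; as a root, its role is never consulted."""
    return Node("arg", children=list(kids))


# (14) (I) heard [a CLOCK tick]
ex14 = clause(Node("pred", "heard"),
              clause(Node("arg", "clock"), Node("pred", "tick")))
# (15) (I) saw [a LION devour a LAMB]
ex15 = clause(Node("pred", "saw"),
              clause(Node("arg", "lion"), Node("pred", "devour"),
                     Node("arg", "lamb")))
# (16) (I) forced a CLOCK [to TICK]
ex16 = clause(Node("pred", "forced"), Node("arg", "clock"),
              clause(Node("pred", "tick")))
# (17) (I) taught a LION [to devour a LAMB]
ex17 = clause(Node("pred", "taught"), Node("arg", "lion"),
              clause(Node("pred", "devour"), Node("arg", "lamb")))

for example in (ex14, ex15, ex16, ex17):
    print(accents(example))
```

Under these assumptions the single accent on clock in (14) satisfies the rule at both clause levels, whereas in (16) tick has no adjacent argument within its own clause and is therefore accented.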
Selkirk (1995) offers an alternative explanation for the difference between (13a) and (13b), which relies on the presence of a subject trace for the verb to tick in (13a), as shown in (18a). Assuming that a pitch accent in any event licenses focus on the word it occurs on, her syntactic theory of focus projection, which also builds on Rochemont (1984), postulates three projection relations that license focus for higher constituents. First, heads license focus on phrases; second, objects (i.e. internal arguments) license focus of the head; and third, moved constituents license their trace (Selkirk 1995). Because subjects are assumed to be raised from their clause, they leave a trace which is focus if the subject is focus, and the trace then projects focus to the VP and ultimately to the whole clause. In effect, because a subject trace is now treated as an internal argument, this procedure equates subjects with objects for the purposes of the second projection relation. It has the additional effect of explaining why to tick in (18a) can be unaccented, and yet be focus, since (18a) has a trace, but (13b) has not, as shown in (18b), after Selkirk (1995). Her theory is considerably less restrictive than the one defended here. The restriction to internal arguments in the second projection clause would appear to be rendered vacuous by the addition of the third clause. While in the original two-clause version subjects were incapable of projecting focus at all, in the three-clause version subjects can project focus to the entire clause, even in a sentence like JOHNson died of natural causes. This seems incorrect; for discussion see Gussenhoven (1999a). (18)
a. [I heard [a CLOCK [[t] tick]]]
b. [I forced [the CLOCK [to [PRO TICK]]]
Moving to (13c,d), a summary of the syntactic analyses proposed for these constructions is provided in Winkler (1996, ch. 2). Two analyses would at first sight be compatible with the accentuation facts. First, Di Sciullo & Williams (1987) analyse sentences like (13c) as containing only one clause. The special feature is the complex predicate: paint green is a single constituent, which can form a focus domain with an argument like the door, thus remaining unaccented itself. (Within the predicate, the accent goes to the adjective, just as it goes to the particle in phrasal verbs like to look up.) This is shown in (19). As pointed out by Rod Walters from the audience after a presentation of these data at the 2000 LAGB meeting in London, this formation of complex predicates is probably subject to a size constraint, since To paint the DOOR a bright green seems ill-formed.
(19)
I [have painted the door green]FOC
(I) [have painted green]Pred [the DOOR]Arg
(20)
(I) [have painted]Pred [[the DOOR]Arg [green]Pred]Arg
A second possible analysis of (19) is (20), where door would be accented because it is a theme in the small clause the door (be) green. In this analysis, (be) green is the predicate, while the small clause itself is a theme of paint. However, it can be shown that resultatives don’t behave like argument-predicate structures. If the door green is a clause, it should be able to be a focus constituent and have just an accent on door. In (21), we see that this can be done with birds in birds sing, an undoubted clause. By contrast, (22) cannot have the same accentuation. This is explained if we assume that arguments like doors and walls do not have green, black as predicates, but rather paint green, paint black. Under that interpretation, green, black are headless fragments of predicates, and understandably arguments cannot form focus domains with them (cf. also (10)). (21)
(A: So what have you seen in the nature reserve?)
I’ve seen BIRDS fight, I’ve seen TURTLES mate, I’ve seen ELEPHANTS feed ...
(22)
(A: So what have you ever painted?)
?? I’ve painted DOORS green, I’ve painted WALLS black, I’ve painted FURNITURE red ...
I’ve painted DOORS GREEN, I’ve painted WALLS BLACK, I’ve painted FURNITURE RED ...
Resultatives contrast with depictives of the type illustrated by (13d), which is pronounced I drank the COFFEE COLD (Winkler 1996, p. 277 ff.). With respect to the predicate drank, cold functions as a modifier requiring the proposition ‘I drank the coffee’ to be valid under the condition ‘X be cold’. The possible interpretations for ‘X’ are ‘the coffee’ and ‘I’. Since cold functions as a modifier in the clause I drank the coffee, it has no argument at that level of structure with which it can form a focus domain, even though the modifier itself may well be analyzed as a small clause containing just cold as a predicate.
This concludes the section on the structural relation between pitch accents and information structure in English. The next section discusses different types of focus.
2. TYPES OF FOCUS
As Dik (1980, 1997) makes clear, languages not only express information packaging in different ways, they also express different focus meanings, or ‘focus types’. Unlike Culicover & Rochemont (1983), I take formal characteristics rather than contextual differences to be the criterion for recognizing a focus type. This section lists a number of focus types that have been distinguished in English. In each case, the form is described, and the meaning informally characterized.
2.1. Presentational focus
The term ‘focus’ is usually equivalent to ‘presentational focus’. A commonly used diagnostic is questioning: the focus constituent is the part of the sentence that corresponds to the answer to a question, either overt or implied (Kanerva 1989). In the preceding section, many examples were given.
2.2. Corrective focus
When the focus marks a constituent that is a direct rejection of an alternative, either spoken by the speaker himself (‘Not A, but B’) or by the hearer, the focus is ‘corrective’ (or ‘counterassertive’; cf. Dik 1980; Gussenhoven 1983a). As explained in section 1.0, this type is often referred to as ‘contrastive’, as in Chafe (1974), a term which must not be confused with ‘narrow focus’. English bans pitch accents to the right of the presentational focus within the intonational phrase, but to the left of the focus, pitch accents are commonly used, as in (23a). However, with corrective focus, deaccentuation would equally seem appropriate before the focus constituent, as illustrated in (23b). (23)
a. (A: What’s the capital of Finland?)
   B: The CAPital of FINland is [HELsinki]FOC
b. (A: The capital of Finland is OSlo)
   B: (NO.) The capital of Finland is [HELsinki]CORRECTIVE
Languages that make a formal distinction between presentational focus and corrective focus include Efik, where a focused answer to a WH-question is not expressed in the same way as a focused correction, which requires a corrective focus particle (de Jong 1980; Gussenhoven 1983a). Lekeitio Basque, too, expresses corrective focus and presentational focus differently (Elordieta, this volume). Navajo has a neutral negative, doo ... da, shown in (24a), and one that expresses corrective focus, hanii, as shown in (24b,c) (Schauber 1978). The acute indicates high tone. (24)
a. Jáan doo chidí yiyííłchǫ’-da
   John NEG car 3RD-PAST-wreck-NEG
   ‘John didn’t wreck the car’
b. Jáan hanii chidí yiyííłchǫ’
   John NEG car 3RD-PAST-wreck
   ‘JOHN didn’t wreck the car (someone else did)’
c. Jáan chidí hanii yiyííłchǫ’
   ‘John didn’t wreck the CAR (he wrecked something else)’
2.3. Counterpresupposition focus
A third type, which may be rare, is ‘counterpresuppositional’ focus, which involves a correction of information which the speaker detects in the hearer’s discourse model. English has a special form for counterpresuppositional focus if the focus constituent is the polarity of the sentence. In (25a), originally from Ladd (1980, p. 87), the focus is on the negation, while John reads books is the background. If the focus had been ‘corrective’, i.e. been a correction of new information brought in by a preceding utterance, the expression would have been (25b), which has corrective focus on the negation.2 (25)
a. (A: Has John read Slaughterhouse Five?)
   B: John does [n’t]COUNTERPRESUP READ books
b. (A: I’m telling you: John reads books!)
   B: I’m sorry, John does [NOT]CORRECTIVE / DOES[n’t]CORRECTIVE read books
English counterpresuppositional polarity focus may be expressed by means of a pitch accent on a preposition in a non-focused constituent. Such accentuation of prepositions should be distinguished from (presentational or corrective) focus for the preposition itself. Example (26a) contrasts with (26b) in this way (Gussenhoven 1983a). Importantly, they would have different translations in German or Dutch. (26)
a. (A: What other artistes have been in your car?) B: Patty Grey was [never]COUNTERPRESUP IN my car b. Patty Grey was never [IN]CORRECTIVE my car (implying ‘but she may have been underneath it’)
2.4. Definitional focus
By means of ‘definitional’ focus, the speaker indicates that the information does not refer to a change in the world, but informs the hearer of attendant circumstances. While presentational focus in English subject-predicate sentences requires the predicate to be unaccented, as in (27a), a definitional focus requires accents on both constituents, as in (27b) (originally from Kraak (1970)). I termed the pattern in (27a) ‘eventive’ in Gussenhoven (1983a). Semantically, definitional focus seems special, but phonologically it is the accentuation pattern used for the eventive meaning which seems marked: the predicate is left unaccented, even though it is in the focus constituent.3 Below, I will label the semantically default type with EVENTIVE rather than plain FOC.
(27)
a. [Your EYES are red]EVENTIVE
b. [Your EYES are BLUE]DEFINITIONAL
TYPES OF FOCUS IN ENGLISH
The eventive vs. definitional distinction is akin, but not identical, to the distinction between ‘individual level’ and ‘stage level’ predicates (Kratzer 1996). Stage level predicates involve transient qualities, as in (27a), where the redness is due to swollen eyelids, and individual level predicates to permanent qualities, as in (27b), where blue refers to the colour of the iris. However, (28) shows that the eventive interpretation may combine with the inherent colour interpretation. Genericity of the subject, as suggested by Diesing (1992), does not explain the pattern either. In (29a), an existential subject licenses focus on the verb, but the generic subject in (29b) does not. However, generic subjects may occur in eventive sentences, as shown by (30), which the keeper of the last dodo might have used to announce its demise, leaving his listeners to infer the death of the last dodo from his communication that none in fact survive (Gussenhoven 1983c).
(28)
(A: Why have you chosen me?)
B: Your EYES are blue (eventive, but permanent property)
(29)
a. FIREMEN are available
b. FIREMEN are ALTRUISTIC
(30)
The DOdo is extinct (eventive; but generic subject)
Having ruled out equations between ‘eventive’ and ‘stage level’ and between ‘eventive’ and ‘non-generic’, we would of course like to have a semantic definition of ‘eventive’ that will cover all instances of this pattern. An eventive sentence reports a change in the world. However, there are two caveats. First, the pattern would appear to carry some additional semantic feature of ‘non-agentive’ or ‘non-volitional’ (Faber 1987). Thus, The BAby’s crying is the expected accentuation in a reply to ‘Why are you getting up?’, but so is GRANDmother’s CRYing, where a volitional involvement of the subject is somehow conveyed by the accent on the verb. Second, in a case like (28), there is no change in the world to report, as observed by Daniel Büring (personal communication, 2003). Here, the change would appear to lie in the announcement of the relevance of blue eyes for mate selection, or for this particular case of mate selection. Neither of these aspects seems at all easy to incorporate in a definition of ‘eventive’.
Under definitional focus, objects retain their power to license definitional focus on the predicate. Definitional focus thus differs from eventive focus in disallowing subject-predicate focus domains. The accentuation of a broad-focus SVO sentence like JOURnalists report the NEWS therefore corresponds to that of an eventive A JOURnalist was reporting the NEWS. The next focus type, contingency focus, bans not only Subject-Predicate focus domains but also Predicate-Object ones, i.e. it also requires focused predicates with adjacent accented objects to be accented. The situation can be summarized as in (31).
(31)
Possible argument-predicate focus domains:
Eventive: [SUBJ-pred]]S, [pred-OBJ] ... ]S
Definitional: [pred-OBJ] ... ]S
Contingency: -
2.5. Contingency focus
As with definitional focus, with ‘contingency focus’ the speaker indicates that the information is not about a change in the world but defines attendant circumstances; the difference is that the information is presented as potentially relevant. Examples (32a,b) are from Halliday (1967, p. 38), who incorrectly explained (32a) as being due to the status of ‘dogs’ as ‘old information’. In Gussenhoven (1983a) I pointed out that dogs is in fact accented and ‘new’ in both interpretations, but that the meaning of (32a) is ‘contingency’ (‘If there are dogs, they must be carried’), while that of (32b) is ‘eventive’, and implies that the speaker might be ‘worried because he had no dog’, to quote Halliday (cf. also the discussion in Ladd (1996, p. 199)). Similarly, eventive The King of FRANCE is bald carries an implication that there is a King of France which is absent from the contingency sentence The King of FRANCE is BALD (Gussenhoven 1983c).
(32)
a. [DOGS must be CARRIED]CONTINGENCY
b. [DOGS must be carried]EVENTIVE
Unlike definitional focus, contingency focus is evident in SVO structures, as illustrated in (33), where the proverbial interpretation (33a) is an example of contingency focus, and contrasts with eventive (33b). The phonetic difference with the eventive reading is less salient than in (32a,b), because of the non-final position of the word carrying the accent in the contingency version. The contingency of the proposition need not always be due to the conditional status of the subject. In (34a), from Gussenhoven (1983a), the object is conditional (‘If there are thieves’). The phonetic salience of the contrast is again low in English, due to the non-final position of the predicate. However, in Dutch, which has the accentable part (aan) of the phrasal verb aangeven ‘report’ in final position, the difference is as salient as that in (32a,b).
(33)
a. [TOO many COOKS SPOIL the BROTH]CONTINGENCY
b. [TOO many COOKS spoil the BROTH]EVENTIVE (implying ‘we need to take soup off the menu’)
(34)
a. [The MANagement rePORTS THIEVES]CONTINGENCY
   Dutch: De DIRECTIE geeft DIEVEN AAN
b. [The MANagement reports THIEVES]EVENTIVE (e.g. a caption for a cartoon)
   Dutch: De DIRECTIE geeft DIEVEN aan
In broad-focus SVO structures, therefore, it is definitional and eventive that contrast with contingency, as shown in (35), where the verb is accented only in (35c).
(35)
a. [The HUNters were shooting ANimals]EVENTIVE
b. [HUNters shoot ANimals]DEFINITIONAL
c. [These HUNters SHOOT ANimals]CONTINGENCY (‘So don’t let your pets get near them!’)
In addition to the obligatory accent on the predicate, contingency sentences obligatorily accent the negator, if there is one. A three-way contrast therefore arises in negative SV structures, as shown in (36). In eventive (36a), the entire predicate is unaccented, in definitional (36b), an accent is added to the verb, and in contingency (36c), both the verb and the negation are accented. The presence of the accent on the negation is more salient in German, where it appears post-verbally, as in (37): (37a) is either eventive, which expression could be used to complain that someone doesn’t blink, in spite of an earlier agreement that he would at that point in time, or definitional, in which case it describes a state of affairs whereby someone just never blinks. Contrast these with the contingency version (37b), which is a warning just in case.
(36)
a. (A: What seems to be the problem?)
   B: [Our CUStomers aren’t admitted]EVENTIVE
b. [Our CUStomers aren’t adMITted]DEFINITIONAL (‘That’s the way it is’)
c. [Our CUStomers AREN’T adMITted]CONTINGENCY (‘In case you had forgotten’)
(37)
a. You [don’t BLINK]EVENTIVE/DEFINITIONAL
   German: Du ZWINKERST nicht!
b. You [DON’T BLINK]CONTINGENCY
   German: Und du ZWINKerst NICHT!
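The accent patterns in (27), (35) and (36) can be restated procedurally. The following is only a minimal sketch in Python and not part of the analysis itself: the function name `accented_constituents` and the constituent labels are hypothetical, and the sketch assumes a broad-focus clause whose subject and object are accentable lexical phrases, so it does not cover pronominal subjects as in (37).

```python
# Illustrative sketch; names and labels are assumptions made for this example.

def accented_constituents(focus_type, has_object=False, has_negator=False):
    """Return the set of accented constituents in a broad-focus clause."""
    accents = {"subject"}              # lexical subjects are accented in (27), (35), (36)
    if has_object:
        accents.add("object")          # objects are accented in (35a-c)

    if focus_type == "eventive":
        pass                           # predicate stays unaccented next to an accented argument
    elif focus_type == "definitional":
        if not has_object:
            accents.add("predicate")   # no subject-predicate domain, cf. (27b), (36b)
    elif focus_type == "contingency":
        accents.add("predicate")       # no argument-predicate domains at all, cf. (35c)
        if has_negator:
            accents.add("negator")     # cf. (36c)
    else:
        raise ValueError(f"unknown focus type: {focus_type!r}")
    return accents


if __name__ == "__main__":
    for kind in ("eventive", "definitional", "contingency"):
        print(kind, sorted(accented_constituents(kind, has_negator=True)))
```

Run on a negative subject-predicate clause, the sketch reproduces the three-way contrast in (36): only the subject is accented in the eventive reading, subject and predicate in the definitional reading, and subject, predicate and negator in the contingency reading.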
2.6. Reactivating focus
Instead of, or in addition to, new information, languages may also mark old information, an option referred to as ‘reactivating focus’. The term is somewhat paradoxical, as it is the background information that is now marked for information status. (That is, ‘focus’ is here used in the general sense of ‘structural marking of information status’.) In (38), the constituent John has ‘reactivating focus’. Speaker B considers the fact that she is not just acquainted with John, but actually dislikes him, significant enough to single out the ‘given’ John by means of the syntactic device of TOPICALIZATION. In English, constituents can be topicalized, giving the meaning ‘as for this constituent’.
(38)
(A: Does she know JOHN?)
B: JOHN she DISLIKES
2.7. Identificational focus
English has the syntactic device ‘clefting’, the ‘It is X [who/that VP]’ construction, where X is the subject of VP. It would appear that clefting causes the non-clefted constituent to be reactivated information if it is accented, as in (39). Here, the implication in B’s response is that Helen’s dislike of someone had been discussed relatively recently. The clefted constituent is optionally accented. If the non-clefted constituent is unaccented, it is old information, as in (40). The clefted constituent is now obligatorily accented, and constitutes new information. It is impossible to have both clefted and non-clefted constituents contain new information. That is, in It is the POSTMAN who CAME either the postman or the notion of arriving must be in the context.
(39)
(A: Does Helen know JOHN?)
B: It is John/JOHN she DISLIKES
(40)
(A: I wonder who she dislikes)
B: It is JOHN she dislikes
Clefting, therefore, presents a somewhat complex picture when viewed from the perspective of information status. Since no ready generalization arises, its meaning may not really be concerned with legitimacy or recency of information in the background. Rather, the meaning is to exhaustively identify a constituent (Szabolcsi 1981; Kiss 1998). In the Hungarian example (41a), the focus constituent is egy kalapot ‘a hat+ACC’. The sentence differs from that in (41b), which also has egy kalapot in focus, in that (41a) entails that Mary bought nothing but a hat. By contrast, in (41b) the hat may be one of a number of items that were bought by Mary. In other words, clefting expresses identificational focus (Kiss 1998).
(41)
a. Mari egy kalapot nézett ki magának
   Mary a hat+ACC picked out herself+ACC
   ‘It was a HAT that Mary picked for herself’
b. Mari ki nézett magának egy kalapot
   ‘Mary picked a HAT for herself’
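The contrast between (41a) and (41b) can be pictured in set terms. The sketch below is an illustration under simplifying assumptions rather than a semantic analysis proposed by Szabolcsi or Kiss: the function names are hypothetical, and a situation is reduced to the set of items that Mary actually bought.

```python
# Illustrative sketch; function names and the set-based model are assumptions.

def information_focus_true(bought, mentioned):
    """(41b)-type sentence: the mentioned items are among the items bought."""
    return set(mentioned).issubset(set(bought))


def identificational_focus_true(bought, mentioned):
    """(41a)-type sentence: the mentioned items are exactly the items bought."""
    return set(mentioned) == set(bought)


if __name__ == "__main__":
    bought = {"hat", "coat"}  # a situation in which Mary bought a hat and a coat
    print(information_focus_true(bought, {"hat"}))       # True
    print(identificational_focus_true(bought, {"hat"}))  # False
```

In a situation where Mary bought a hat and a coat, the non-exhaustive reading of ‘a hat’ comes out true and the exhaustive reading false, which is the entailment difference described for (41a) and (41b).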
The difference between (41a) and (41b) is brought out by a test attributed by Kiss to Szabolcsi (1981). Compare (42) with (43): (42b) is semantically incompatible with (42a), since it claims that the hat in question is the only item bought by Mary, thus denying (42a). By contrast, no such conflict arises in the case of (43a,b), even though the speaker of (43b) may be accused of being parsimonious with the truth. This is true regardless of the information status of the clefted constituent. All examples could be answers to ‘What did Mary buy?’, so that the non-clefted constituents (that Mary bought) are unaccented, but they can also be placed in a context in which Mary has presentational focus and the clefted constituents are old information (in which case the examples could be answers to I wonder why no one bought a hat or a coat or a similar item of clothing).
(42)
a. It was a hat and a coat that Mary bought
b. It was a hat that Mary bought
(43)
a. Mary/MARY bought a HAT and a COAT
b. Mary/MARY bought a HAT
3. CONCLUSION
One dimension of meaning expressed by sentence-level pitch accents in languages like English concerns the size of the focus constituent, which is expressed through deaccentuation of constituents after the focus. Beginning with Schmerling (1974), researchers have found that the relation between the pitch accent and the focus is mediated through the predicate-argument structure of the sentence, which is evident from the fact that predicates remain unaccented when they abut a focused argument. In many cases, therefore, the accent on the argument is properly to be seen as an accent on the predicate-argument combination, a regularity which obtains as many times as there are clauses in the sentence. A second dimension of meaning concerns the meaning of ‘information packaging’ itself. The semantics would appear to involve a number of distinctions.
• Background vs. New information. This is the basic distinction which has been referred to as ‘topic’ vs. ‘focus’, ‘old/given’ vs. ‘new’, etc. Information that serves to further develop the discourse model was discussed as ‘presentational focus’, while ‘reactivating focus’ was used for information retrieved from the background.
• Development vs. Correction. If ‘development’ is the default situation whereby speakers add information to the discourse model, correction involves the removal of information. When applied to new information, it is ‘corrective focus’ and when applied to the background, it is ‘counterpresuppositional focus’.
• Eventive vs. Non-eventive. The development of the discourse may involve reports of changes in the world, or may further define the existing world. In the former case, we have ‘eventive focus’; in the latter the focus is non-eventive. Non-eventive focus subdivides into ‘definitional’ and ‘contingency’.
• Definitional vs. Contingency. In both cases, the information serves to define the world, but for ‘contingency focus’ the speaker indicates that the information is only potentially relevant to the discourse model.
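The four distinctions just listed amount to a small cross-classification, which can be written out as a lookup. This is only a sketch of the summary above: the Python names and the string labels are hypothetical, and identificational focus is deliberately left out of the table.

```python
# Illustrative sketch; keys and function names are assumptions for this example.

FOCUS_BY_STATUS_AND_OPERATION = {
    ("new", "development"): "presentational focus",
    ("background", "development"): "reactivating focus",
    ("new", "correction"): "corrective focus",
    ("background", "correction"): "counterpresuppositional focus",
}


def focus_type(status, operation):
    """Look up the focus type for an information status and a discourse operation."""
    return FOCUS_BY_STATUS_AND_OPERATION[(status, operation)]


def development_subtype(reports_change, only_potentially_relevant=False):
    """Subdivide information that develops the discourse model."""
    if reports_change:
        return "eventive focus"
    if only_potentially_relevant:
        return "contingency focus"
    return "definitional focus"


if __name__ == "__main__":
    print(focus_type("background", "correction"))
    print(development_subtype(reports_change=False, only_potentially_relevant=True))
```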
The above summary suggests that the speaker indicates how the information in his expression is to be related to the hearer’s information about the mini-world about which they are together trying to reach a state of mutual understanding. The meanings of the melodic aspects of the pitch accent in English proposed in Brazil (1975) and Gussenhoven (1983b) as well as those proposed by Pierrehumbert & Hirschberg (1990) fit this type of meaning well. The former include ‘Addition’ (Brazil’s ‘Proclaiming’), used for the commitment of information to the discourse model and signalled by falling contours, and ‘Selection’ (Brazil’s ‘Referring’), used for reference to information in the background and signalled by falling-rising contours. A third meaning, ‘Testing’, signals the speaker’s inability or refusal to commit information to the discourse model, signalled by rising contours (Gussenhoven 1983b). ‘Identificational’ focus somehow doesn’t quite match these other meanings. The information that John is the only person who caught a fish, as conveyed by It’s John who caught a fish, concerns information content rather than information status. Possibly, therefore, intonation can only be used for the expression of information structure, implying that identificational focus can only be expressed through the morphology or syntax.4
Centre for Language Studies
University of Nijmegen
The Netherlands
4. NOTES
1 One difference between my account and Selkirk’s theory is that the latter contains two indirectness relations rather than one. First, there is a relation between ‘focus interpretation’ and ‘F-marked constituent’ (the focus constituent), and there is a second relation between the F-marked constituent and accent distribution. While in my account the first relation is trivial in the sense that the interpretation of each focus constituent is that it is focused, and thus ‘new’, in Selkirk’s theory, focus interpretation principles are applied to the focus constituent so as to establish which parts in it are interpreted as ‘new’ and which as ‘given’. See also Gussenhoven (1999a).
2 I incorrectly analyzed (25) as having focus on the verb in Gussenhoven (1983a, note 5). The latter would indeed have the same form, but is only appropriate in some context like ‘What doesn’t John do with books?’. Ladd (1980) himself analyzed his example as having ‘default accentuation’ on the verb, his point being that the accentuation signals that books is outside the focus, rather than that read is included in it.
3 A class of ‘event’ sentences was independently identified by Cruttenden (1984) in connection with the accentuation pattern SUBJECT-predicate. My definition referred to a focus type regardless of syntax.
4 Recent treatments which have not been covered in this survey include Lambrecht (1994), Vallduví & Engdahl (1996), Erteschik-Shir (1997), and Zubizarreta (1998).
5. REFERENCES
Bolinger, D. Intonation: Selected Readings. Harmondsworth: Penguin, 1972.
Bolinger, D. Review of Schmerling (1974). American Journal of Computational Linguistics (1978), 1-23. Microfiche.
Bolinger, D. “Two views of accent.” Journal of Linguistics 21 (1985): 79-123.
Bolinger, D. More views on ‘Two views on Accent’. In On Accent, pp. 124-146. Bloomington, IN: Reproduced by the Indiana University Linguistics Club, 1987.
Bolinger, D. Intonation and its Uses. Stanford, CA: Stanford University Press, 1989.
Brazil, D. Discourse Intonation I. Birmingham UK: English Language Research, Birmingham University, 1975.
Bresnan, J. “Sentence Stress and Syntactic Transformations.” Language 47 (1971): 257-281.
Bresnan, J. “Stress and Syntax: a Reply.” Language 48 (1972): 326-342.
Burzio, L. Principles of English Stress. Cambridge: Cambridge University Press, 1994.
Chafe, W. L. “Language and Consciousness.” Language 50 (1974): 111-113.
Chomsky, N. “Deep Structure, Surface Structure and Semantic Interpretation.” In D. D. Steinberg and L. A. Jakobovits (eds.), Semantics: an Interdisciplinary Reader in Philosophy, Linguistics and Psychology, pp. 183-216. Cambridge, UK: Cambridge University Press, 1971.
Chomsky, N. and M. Halle. The Sound Pattern of English. New York: Harper and Row, 1986.
Cruttenden, A. “The Relevance of Intonational Misfits.” In D. Gibbon and H. Richter (eds.), Intonation, Accent and Rhythm. Studies in Discourse Phonology, pp. 67-76. Berlin: de Gruyter, 1984.
Culicover, P.W. and M. Rochemont. “Stress and Focus in English.” Language 59 (1983): 123-165.
De Jong, J. “On the Treatment of Focus in Functional Grammar.” GLOT, Leids Taalkundig Bulletin 3 (1980): 89-115.
Di Sciullo, A. and E. Williams. On the Definition of Word. Cambridge, MA: MIT Press, 1987.
Diesing, M. Indefinites. Cambridge University: Doctoral dissertation, 1992.
Dik, S. C. The Theory of Functional Grammar. Part 1: The Structure of the Clause. New York: Mouton de Gruyter. Edited by Kees Hengeveld, 1997.
Dik, S. C. “On the Typology of Focus Phenomena.” GLOT, Leids taalkundig bulletin 3 (1980): 41-74.
Erteschik-Shir, N. The Dynamics of Focus Structure. Cambridge: Cambridge University Press, 1997.
Faber, D. “The Accentuation of Intransitive Sentences in English.” Journal of Linguistics 23 (1987): 341-358.
Frota, S. Prosody and Focus in European Portuguese. University of Lisbon: Doctoral dissertation, 1998. [Published by Garland, New York, 2000.]
Gussenhoven, C. Review of Schmerling (1974). Dutch Quarterly Review of Anglo-American Letters (DQR) 8 (1978): 233-240.
Gussenhoven, C. “Focus, Mode and the Nucleus.” Journal of Linguistics 19 (1983a): 377-417.
Gussenhoven, C. A Semantic Analysis of the Nuclear Tones of English. Distributed by Indiana University Linguistics Club (IULC). Bloomington, Indiana, 1983b.
Gussenhoven, C. “Van focus naar zinsaccent: Een regel voor de plaats van het zinsaccent in het Nederlands.” GLOT 6 (1983c): 131-155.
Gussenhoven, C. “Two views of accent: A reply.” Journal of Linguistics 21 (1985): 125-138.
Gussenhoven, C. “Sentence Accents and Argument Structure.” In I. M. Roca (ed.), Thematic Structure: Its Role in Grammar, pp. 79-106. Berlin/New York: Foris, 1992.
Gussenhoven, C. “Discreteness and Gradience in Intonational Contrasts.” Language and Speech 42 (1999a): 281-305.
Gussenhoven, C. “On the Limits of Focus Projection in English.” In P. Bosch and R. van der Sandt (eds.), Focus: Linguistic, Cognitive, and Computational Perspectives, pp. 43-55. Cambridge, UK: Cambridge University Press, 1999b.
Halliday, M. A. Intonation and Grammar in British English. The Hague: Mouton, 1967.
Hayes, B. Metrical Theory: Principles and Case Studies. Chicago: Chicago University Press, 1995.
Hayes, B. and A. Lahiri. “Bengali Intonational Phonology.” Natural Language and Linguistic Theory 9 (1991): 47-96.
Jackendoff, R. S. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.
Kanerva, J. M. Focus and Phrasing in Chichewa Phonology. Stanford University: Doctoral dissertation, 1989.
Kiss, K. E. “Identificational Focus and Information Focus.” Language 74 (1998): 245-273.
Kraak, R. “Zinsaccent en syntaxis.” Studia Neerlandica 4 (1970): 41-62.
Kratzer, A. “Stage-level and Individual-level Predicates.” In G. N. Carlson and F. J. Pelletier (eds.), The Generic Book, pp. 125-175. Chicago: Chicago University Press, 1996.
Ladd, D. R. The Structure of Intonational Meaning: Evidence from English. Bloomington: Indiana University Press, 1980.
Ladd, D. R. “Phonological Features of Intonational Peaks.” Language 59 (1983): 721-759.
Ladd, D. R. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Lambrecht, K. Information Structure and Sentence Form. Topic, Focus, and the Mental Representation of Discourse Referents. Cambridge: Cambridge University Press, 1994.
Newman, S. “On the Stress System of English.” Word 2 (1946): 171-187.
Pierrehumbert, J. B. and J. Hirschberg. “The Meaning of Intonational Contours in the Interpretation of Discourse.” In P. Cohen, J. Morgan, and M. Pollack (eds.), Intentions in Communication, pp. 271-311. Cambridge MA: MIT Press, 1990.
Rochemont, M. Focus in Generative Grammar. Amsterdam: John Benjamins, 1984.
Schauber, E. “A Comparison of English Intonation and Navajo Particle Placement.” In D. J. Napoli (ed.), Elements of Tone, Stress, and Intonation, pp. 144-173. Washington, DC: Georgetown University Press, 1978.
Schmerling, S. F. Aspects of English Sentence Stress. Austin: Texas University Press, 1974.
Selkirk, E. Phonology and Syntax: The Relation between Sound and Structure. Cambridge, Mass.: MIT Press, 1984.
Selkirk, E. “Sentence Prosody: Intonation, Stress and Phrasing.” In J. Goldsmith (ed.), The Handbook of Phonological Theory, pp. 550-569. Oxford: Blackwell, 1995.
Selkirk, E. “Contrastive FOCUS vs. Presentational Focus: Prosodic Evidence from English.” In B. Bel and I. Marlien (eds.), Speech Prosody 2002. An International Conference, Aix-en-Provence. Laboratoire Parole et Langage, CNRS and Université de Provence, 2002.
Szabolcsi, A. “The Semantics of Topic-Focus Articulation.” In J. Groenendijk, T. Janssen, and M. Stokhof (eds.), Formal Methods in the Study of Language, pp. 513-514. Amsterdam: University of Amsterdam, Mathematisch Centrum, 1981.
Vallduví, E. and E. Engdahl. “The Linguistic Realization of Information Packaging.” Linguistics 34 (1996): 459-519.
Winkler, S. “Focus and Secondary Predication.” Berlin: Mouton de Gruyter, 1996.
Zonneveld, W., M. Trommelen, M. Jessen, C. Rice, G. Bruce, and K. Árnason. “Wordstress in West-Germanic and North-Germanic languages.” In H. van der Hulst (ed.), Word Prosodic Systems in the Languages of Europe, pp. 477-603. Berlin: Mouton de Gruyter, 1999.
Zubizarreta, M. L. Prosody, Focus, and Word Order. Cambridge, MA: MIT Press, 1998.
NANCY HEDBERG AND JUAN M. SOSA
THE PROSODY OF TOPIC AND FOCUS IN SPONTANEOUS ENGLISH DIALOGUE*
C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 101–120. © 2007 Springer.
1. INTRODUCTION
Our research addresses the interface between meaning and prosody. In particular, it concerns the way intonation plays a part in the interpretation of an utterance. For example, we are concerned with the extent to which a falling versus a falling-rising intonation at the end of an utterance or an extra tonal height on a specific word or phrase affects the way the utterance is interpreted. Information structure categories such as topic and focus have been correlated with specific types of contours. Many authors have stated that there is a peak associated with focus, while others have stated that there is also a peak associated with topic. Claims have been made as to the specific sequence of underlying tones associated with these categories, at least for constructed examples; for instance, that focus will be marked with H* and topic will be marked with L+H*. Here, we test these claims by analyzing the intonation and information structure of a sample of spontaneous dialogue in English.
2. DATA
The data were taken from six half-hour episodes of the PBS political discussion television show, The McLaughlin Group, videotaped in April and May 2001. The host, John McLaughlin, discusses current issues of the day with four journalist guests. The journalists have widely differing political beliefs and therefore the discussions get heated and the speakers produce speech that we believe to be quite spontaneous. The guests vary somewhat from week to week. Each half-hour episode covers four issues. For the first five episodes, we selected the first issue because it was the longest. For the sixth episode, we analyzed a combination of issues two and three. Each issue is introduced by John McLaughlin in a monologue. We didn’t analyze these portions of the videotapes. All participants are native speakers of American English. An advantage to analyzing the McLaughlin Group as a source of data is that transcripts of the sessions are available on the World-Wide Web. In the few cases where we found discrepancies between the transcript and the videotape in the portions of the transcript we were analyzing, we corrected the transcript.
3. INFORMATION STRUCTURE CODINGS
One of us, Hedberg, coded the transcripts for five information-structure categories and then listened to the videotape to confirm these codings. The five information-structure categories are contrastive focus, plain focus, contrastive topic, unratified topic and ratified topic. We follow Gundel (1988) in defining topic, comment, and focus.
Topic: An entity, E, is the topic of a sentence, S, iff, in using S, the speaker intends to increase the addressee’s knowledge about, request information about or otherwise get the addressee to act with respect to E.
Comment: A predication, P, is the comment of a sentence, S, iff, in using S the speaker intends P to be assessed relative to the topic of S.
Focus: That part of the linguistic expression that realizes the comment.
The focus is very long in the majority of cases, and consists of multiple pitch accents and sometimes multiple intonational phrases. For that reason, Hedberg picked the final pitch-accented phrase to annotate, except in the case of it-clefts where she picked the clefted constituent since all three it-clefts in the data were either topic-clause it-clefts or all-comment it-clefts (Hedberg, 2000). To explain the five categories, we’ll illustrate with examples from the passage shown in (1). Topics are italicized and foci are bold-faced. Contrastive elements receive double underlines, and unratified topics receive a single underline.
(1)
Ms. Clift: Look, John McCain would be the first one to say this doesn’t improve the system to perfection; it makes it marginally better. And there’s still a possibility that Tom DeLay, who is an enemy of the bill, will forge an unholy alliance with Democrats in the House. Because Democrats have figured out, they do worse under this bill than the Republicans do. But the big thing that comes out of this, to me, is that it’s John McCain who gets the big legislative triumph so far in this first 100-day period, while President Bush is looking rather passive on a number of issues across the board, especially foreign policy. (3/31/01)
Key: Ratified Topic, Contrastive Topic, Unratified Topic, Contrastive Focus, Plain Focus
The topic of the entire issue is the McCain-Feingold bill on campaign finance reform. John McCain has just gotten it passed through the Senate and the question is how it will do in the House. John McCain is an unratified topic because Eleanor Clift is re-establishing him as the topic here and thus he is not already established as a topic. The bill itself is already established as the topic and thus references to it with ‘this’ and ‘it’ are coded as ratified topics. The terms ‘ratified’ and ‘unratified’ topic come from Lambrecht and Michaelis (1998). Both ‘John McCain’ and ‘this’ are marked as topics here because John McCain is the topic of the matrix clause and the referent of ‘this’ is the topic of the embedded clause. The focus of both the matrix clause and the embedded clause falls on ‘perfection.’ Plain foci are marked in boldface. Tom DeLay is a Republican representative and is the topic of the next sentence. Here ‘Democrats in the House’ is marked as a contrastive focus because there is an implicit contrast with ‘Republicans in the House’. Likewise ‘John McCain’ is a contrastive focus because it explicitly contrasts with ‘President Bush.’ The whole it-cleft expresses a comment here, and thus there is no topic indicated for this sentence. In the next sentence, President Bush contrasts with John McCain and is a topic, and hence the phrase denoting him is marked as a contrastive topic.
To help identify the topic, Hedberg used Gundel’s (1974) ‘as for’ test and Reinhart’s (1981) ‘said about’ test. For example, in (2), ‘you’ is identified as the topic because the sentence can be paraphrased, ‘As for you, what do you think?’.
(2)
Mr. McLaughlin:
What do you think? (6.16)
A total of 1,669 phrases were coded for information structure category, distributed as shown in Table 1. As can be seen from the table, the distribution of the five information structure types was roughly equivalent across the six transcripts. This rough equality serves as a broad check on the reliability of the information-structure coding. Ideally we would have two information-structure coders, so that we could compare their coding and come up with an inter-coder reliability statistic. We plan to adopt this methodology in future work on this project.
Table 1. Distribution of Information Structure Types across the Six Transcripts
Transcript   Ratified Topic   Contrastive Topic   Unratified Topic   Contrastive Focus   Plain Focus    Total
1            109 (33.4%)      16 (4.9%)           45 (13.8%)         14 (4.3%)           142 (43.6%)    326
2            61 (22.2%)       7 (2.5%)            45 (16.4%)         24 (8.7%)           138 (50.2%)    275
3            36 (21.4%)       7 (4.2%)            39 (23.2%)         15 (8.9%)           71 (42.3%)     168
4            79 (28.5%)       17 (6.1%)           36 (13.0%)         31 (11.2%)          114 (41.2%)    277
5            84 (25.7%)       15 (4.6%)           57 (17.4%)         20 (6.1%)           151 (46.2%)    327
6            89 (30.1%)       10 (3.4%)           44 (14.9%)         23 (7.8%)           130 (43.9%)    296
Total        458 (27.4%)      72 (4.3%)           266 (15.9%)        127 (7.6%)          746 (44.7%)    1669
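The counts in Table 1 lend themselves to a quick arithmetic check, and they also determine the interval used when seven examples per category are drawn from each transcript for prosodic coding. The following Python sketch is offered only as an illustration: the function names are hypothetical, the counts are copied from Table 1, and the choice to start sampling at the n-th item (rather than the first) is an assumption, since the text only says that every n-th example was taken.

```python
# Illustrative sketch; names are assumptions, counts are copied from Table 1.

CATEGORIES = ["ratified topic", "contrastive topic", "unratified topic",
              "contrastive focus", "plain focus"]

COUNTS = {
    1: [109, 16, 45, 14, 142],
    2: [61, 7, 45, 24, 138],
    3: [36, 7, 39, 15, 71],
    4: [79, 17, 36, 31, 114],
    5: [84, 15, 57, 20, 151],
    6: [89, 10, 44, 23, 130],
}


def percentages(row):
    """Share of each category within one transcript, to one decimal place."""
    total = sum(row)
    return [round(100 * n / total, 1) for n in row]


def column_totals(counts):
    """Sum each category over all transcripts."""
    return [sum(column) for column in zip(*counts.values())]


def sampling_step(n_items, n_wanted=7):
    """Interval for every-nth selection, e.g. 142 / 7 = 20.3, so every 20th."""
    return max(1, n_items // n_wanted)


def systematic_sample(items, n_wanted=7):
    """Take every step-th item, starting at the step-th (an assumption)."""
    step = sampling_step(len(items), n_wanted)
    return items[step - 1::step][:n_wanted]


if __name__ == "__main__":
    for transcript, row in COUNTS.items():
        print(transcript, sum(row), percentages(row))
    print("totals", column_totals(COUNTS))
    print(systematic_sample(list(range(1, 143))))  # positions among 142 plain foci
```

For the 142 plain foci of transcript 1 the sketch selects positions 20, 40, and so on up to 140, which gives seven examples spread evenly across the transcript.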
We decided to select seven examples of each of the five categories from each transcript for prosodic coding. For each transcript, Hedberg counted the total number of each category and divided by seven. For example, there were 142 plain foci in transcript 1. Division by 7 yields 20.3, so she selected every 20th example for prosodic analysis. In this way, we acquired seven examples of each category spread evenly across the transcript. She then printed a new copy of the transcript and identified the 35 phrases to be analyzed with a highlighting pen, with no indication of information structure category. This transcript was given to Sosa, along with the videotape, for prosodic coding. Because there were 6 transcripts, we subjected 210 phrases to prosodic coding. There were a total of 42 examples of each of the five information-structure categories. Sosa then listened to the videotapes and digitized each of the 210 phrases along with some of their surrounding context. Using the Kay Computerized Speech Lab (CSL 4300), he then analyzed the target phrases prosodically and assigned an autosegmental sequence of tones to each phrase. He used annotations for pitch accents (H*, L*, L+H*, H*+!H, H*+L, L*+H and H+L*), boundary tones (L%, H%), intermediate phrase tones (L, H), downstep (!H), upstep (¡H), and increased range (↑H). Again, in future work on this project, we plan to have two prosodic coders, so that we can calculate an intercoder reliability statistic, to be surer that the prosodic coding is accurate.
4. INTONATIONAL CODINGS
The intonational analysis and annotation of all digitized utterances was performed following closely the Guidelines for ToBI Labelling (Beckman and Ayers Elam, 1997) and taking into consideration other published materials on the intonational structure of English, notably Pierrehumbert and Hirschberg (1990) as well as other autosegmental-metrical approaches to the phonology of intonation. The ToBI conventions and assumptions were followed, although we introduced two additional pitch accents that we felt were necessary in order to account for certain distinct patterns. For example, we rescued the H*+L pitch accent (which was originally designed to trigger downstep) to generate a dip between two H* pitch accents, which is not captured by the notation H* ... H* alone.
Figure 1. [Even] Dan Goldin (unratified topic, 5.18) H* H* LL%
Our independent feature downstep !H allowed us to free the H*+L notation and use it for this effect. An example of this distinction is shown in figure 1 versus figure 2.¹
Figure 2. Thirty years [of serious anthropological consideration] (plain focus, 3.19) H*+L H* LL%
We noted that the sequence H* ...H* (equivalent to the high head in the British tradition) is quite scarce in the data since the great majority of the utterances show some kind of downdrifting pattern. The very few instances of straight H* sequences may show a contrast with British English, which is said to typically have this recurring high-pitched pre-nuclear pattern. As already mentioned, the rest of the pitch accents used in this paper were H*, L*, L+H*, L*+H, H+L*, and H*+!H, all of them with the value assigned to them in the ToBI notation and previous work on English intonation. Given the emphasis on the L+H* pitch accent in this paper, we present two instances of it in figures 3 and 4.
Figure 3. Our voyeurism (plain focus, 6.3) L+H* LL%
Figure 4. In Britain, in fact... (contrastive topic, 3.8) L+H* LL%
The feature ‘increased range’ as well as the ‘upstep’ pitch accent ¡H* were added to the tonal analysis to specify high pitch excursions. Range is characterized by higher peaks and lower valleys, as shown in figure 5.
Figure 5. Made in China (plain focus, 2.34) ↑H* ↑H* LL%
On the other hand, upstep is mostly an H* that is higher than any previous H*, reversing any downdrift or declination effect, as shown in figure 6.
Figure 6. Not a PBS documentary (contrastive focus, 3.32) H* ¡H* LL%
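The tone strings printed with the figures (for instance H* ¡H* LL% in figure 6) combine the pitch accents, diacritics and edge tones introduced above. As a minimal sketch of how such a string decomposes, the following Python function splits a whitespace-separated label into its parts. The function name parse_label, the dictionary layout and the parsing rules are illustrative assumptions, not part of the coding procedure reported here; labels written without spaces, such as L+H*LL%, would need to be separated first.

```python
import re

# Inventory from the annotation scheme described in the text. The parsing
# rules are an illustrative assumption, not the coders' actual procedure.
PITCH_ACCENTS = {"H*", "L*", "L+H*", "L*+H", "H+L*", "H*+L", "H*+!H"}
MODIFIERS = {"!": "downstep", "¡": "upstep", "↑": "increased range"}
# Optional phrase accent (L or H) followed by an optional boundary tone (L% or H%).
EDGE_TONES = re.compile(r"^([LH])?([LH]%)?$")


def parse_label(label):
    """Split a whitespace-separated label such as 'H* ¡H* LL%' into parts."""
    accents, phrase_accent, boundary = [], None, None
    for token in label.split():
        edge = EDGE_TONES.match(token)
        if edge:
            # Edge tones: keep whichever of the two parts is present.
            phrase_accent = edge.group(1) or phrase_accent
            boundary = edge.group(2) or boundary
            continue
        modifier = MODIFIERS.get(token[0])
        base = token[1:] if modifier else token
        if base not in PITCH_ACCENTS:
            raise ValueError(f"unrecognized label: {token!r}")
        accents.append((base, modifier))
    return {"accents": accents, "phrase_accent": phrase_accent, "boundary": boundary}


print(parse_label("H* ¡H* LL%"))
```

Keeping the diacritic separate from the base accent mirrors the practice, adopted later in the paper, of abstracting away from range, upstep and downstep when counting H* accents.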
The overwhelming majority of our utterances showed a downdrift, most of the time realized as one or more downstepped !H* in the tonal tier. We noticed that many long utterances that were semantically coherent also had overall prosodic patterns or designs that were larger than the intonational phrase. For lack of a better term we tentatively called them ‘intonational macro-units.’ Two examples are shown in figures 7 and 8.
Figure 7. Mr. McLaughlin: Can you handle that last question? Where do you think the international community is? (2.14)
Figure 8. Ms. Clift: It requires a leap of faith, however, to believe that the historical Jesus was, in fact, the son of God. (3.27)
This macro-unit doesn’t necessarily coincide with Nespor and Vogel’s (1986) phonological utterance, and is certainly perceptible in oral discourse and visible as such in pitch tracks. After the intonational coding was completed, it was entered on the data spreadsheet and we proceeded with correlating the intonational coding with the information-structure coding.

5. TOPIC ACCENT VERSUS FOCUS ACCENT HYPOTHESES

One important issue is whether there is a special ‘topic accent.’ Jackendoff (1972) was the first to propose a distinction between ‘topic accents’ and ‘focus accents’. He proposed that topics receive a fall-rise (‘B’) accent and that foci receive a fall (‘A’) accent. Gundel (1978) follows Jackendoff in distinguishing between comment (focus) accents and topic accents, but points out that topic accents only fall on unactivated or contrastive topics. Pierrehumbert (1980) follows up with the observation that Jackendoff’s B (‘background’) accents receive an H*LH% tune and that Jackendoff’s A (‘answer’) accents receive a H*LL% tune. See Table 2 for Pierrehumbert’s (1980) hypothesis and also hypotheses of researchers after her.

Table 2. Hypotheses of Researchers Concerning Topic Accent and Focus Accent
Researchers                                  Topic accent   Focus accent
Pierrehumbert 1980                           H* LH%         H* LL%
Steedman 1991                                L+H* LH%       H* LL%
Vallduvi & Engdahl 1996, Gundel 1999,
  Steedman 2000a, Steedman 2000b,
  Gundel & Fretheim (in press)               L+H*           H*
Lambrecht & Michaelis 1998                   H%             L%
Steedman (1991) states that foci (‘rhemes’) receive the H*LL% accent and tune and that topics (‘themes’) receive a L+H*LH% (the so-called ‘scooped fall-rise’)
accent and tune. Vallduvi and Engdahl (1996) state that noncontrastive links (Gundel’s (1978) ‘unactivated topics’ or Lambrecht and Michaelis’s (1998) ‘unratified topics’) receive an L+H* pitch accent, that contrastive links are obligatorily so marked, and that foci are marked with the pitch accent H*. Gundel (1999) claims that topics, both new and contrastive, are marked with L+H*, and that her category of ‘semantic focus’ (contrastive or noncontrastive) is marked with H*. Steedman (2000a, 2000b) and Gundel and Fretheim (2004) also claim that topics are marked with L+H* and foci with H* pitch accents. Lambrecht and Michaelis (1998) distinguish topic accents from focus accents but don’t claim that there is any prosodic difference between them; however, they mention in a footnote that H% may mark topics and L% mark foci. Pierrehumbert and Hirschberg (1990) suggest that L+H* is used to mark contrast, or in their terms, to mark the selection of an item on a contextually-evoked salient scale. They don’t specify whether this contrastiveness is associated with the information structures of topic and focus. Presumably either a topic or a focus can be marked by L+H*, according to them, just so long as the category is contrastive in their sense. We speculate that Gussenhoven’s (1983) fall-rise tone, which he says is used to ‘select’ an entity from the background, corresponds to a topic accent, and that his fall tone, which he says is used to introduce an entity into the ‘background’, corresponds to a focus accent. The major goal of our research was to put these hypotheses to the test.

6. PITCH ACCENTS

6.1. Does L+H* mark contrast, or topic?

With regard to L+H* marking information structure and/or contrast, we came up with the results in Table 3:

Table 3. Distribution of L+H* Relative to Information-Structure Type
               Ratified   Contrastive   Unratified   Contrastive   Plain
               Topic      Topic         Topic        Focus         Focus
L+H*           1          10            13           11            6
% out of 42    2%         24%           31%          26%           14%
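The percentages in Table 3 follow directly from the counts, since each category contributes 42 coded phrases. The short Python sketch below recomputes them and also sums the counts over the topic and the focus categories; the variable and function names are illustrative choices.

```python
# Counts of L+H* copied from Table 3; each category has 42 coded phrases.
L_PLUS_H = {
    "Ratified Topic": 1,
    "Contrastive Topic": 10,
    "Unratified Topic": 13,
    "Contrastive Focus": 11,
    "Plain Focus": 6,
}
PER_CATEGORY = 42


def percent(count, total):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


shares = {cat: percent(n, PER_CATEGORY) for cat, n in L_PLUS_H.items()}
topics = sum(n for cat, n in L_PLUS_H.items() if cat.endswith("Topic"))
foci = sum(n for cat, n in L_PLUS_H.items() if cat.endswith("Focus"))

print(shares)
print("topics:", topics, "foci:", foci)
```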
As can be seen from the table, we did find a significant number of L+H* pitch accents marking contrastive topics or contrastive foci, e.g. the examples shown in (3) and (4):
(3) Mr. Kudlow: And we need to drill oil and gas in the Rockies. And Jeb Bush is wrong and George Bush is right; we need to drill in the Gulf of Mexico. (contrastive topic, 6.27, 28)
    L+H* !H* L+H* !H*

(4) Mr. McLaughlin: This exit question may be superfluous, but I’m going to hit you with it anyway. Tito cracked the space barrier between civilians and professionals. For the most part, was his way the right way, or for the most part was his way the wrong way, as Goldin would lead you to believe, Michael Barone? (contrastive focus, 5.32)
    L+H* LH%
However, Pierrehumbert and Hirschberg’s (1990) proposal that L+H* is associated particularly with contrast does not seem to be borne out by the number of noncontrastive topics (14) and noncontrastive foci (6) marked by this tone. Examples of noncontrastive topics are shown in (5) and (6): (5)
Ms. Clift: A good working-class guy may well be what Jesus was. And in fact, this is discussed in a documentary that was produced in England. And there they can talk about these kinds of things. I think in this country we’re still a little nervous about suggesting that Jesus may not fit the Westernized, romanticized ideal. In Britain, in fact, the archbishop of Canterbury there has called Britain a nation of atheists. In a country of 60 million people, only a million people go to church. (unratified topic, 3.9)
    L+H* L* HH% H*

(6) Mr. Barone: I used to be an editorial writer, and I’ll tell you something, there’s a temptation to harumph when you’re an editorial writer – (laughter) – and I’m afraid that that was the New York Times harumphing.
    Mr. McLaughlin: Well, they could have pointed out that $20 million given to Russia probably wound up with Russian scientists, and that might keep them from making Iranian nuclear bombs. (unratified topic, 5.26)
    L+H*
Similarly, examples of noncontrastive, plain foci marked by L+H* are shown in (7) and (8):

(7) Mr. McLaughlin: Well, what is – do you think that NASA has egg on its face? (plain focus, 5.29)
    L+H* !H* HL%

(8) Mr. Kudlow: I have a different view, with all respect. I think it turns this guy into a celebrity, and I think that actually encourages more of these heinous actions. (plain focus, 6.5)
    L+H*LL%
Example (7) is about NASA’s unwillingness to allow Mr. Tito to pay $20 million to go up in the space station. Example (8) is about the pending execution of Timothy McVeigh. It is clear from Table 3 that L+H* is not significantly correlated with topic as opposed to focus, since there are 11 contrastive foci marked by this pitch accent and 6 examples of plain foci, although the raw number of 17 foci versus 24 topics represents a trend in this direction. Example (4) shows an L+H*-marked contrastive focus, and (7) and (8) show L+H*-marked plain foci. We present in figure 9 a pitch track for example (8):
Figure 9. I think it turns this guy into a celebrity. (plain focus 6.5) L+H*LL%
The Information Structure category from the literature that seems to best fit the data concerning L+H* is Gundel’s (1999) category of ‘Contrastive Focus’. Her category of ‘Contrastive Focus’ encompasses our ‘Contrastive Topic’, ‘Unratified Topic’ and ‘Contrastive Focus’. This composite category accounts for 83% of our L+H* marked phrases (34 out of 41).

6.2. Which Pitch Accents Mark Information Structure Categories?

It is important to determine which pitch accents mark the information-structure categories when they are not marked with L+H*. Table 4 shows the distribution of primary pitch accent relative to information structure type.
Table 4. Distribution of Pitch Accents or their Absence Relative to Information Structure Type
                     H*    H*+L   H*+!H   L+H*   L*    L*+H   H+L*   ø
Ratified Topic       10    1      0       1      4     0      0      26
Contrastive Topic    23    1      0       10     1     2      0      5
Unratified Topic     19    4      0       13     0     3      1      2
Contrastive Focus    22    1      0       11     7     0      0      1
Plain Focus          26    1      1       6      8     0      0      0
TOTAL                100   8      1       41     20    5      1      34
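As a consistency check on Table 4, the following Python sketch recomputes the row and column totals from the cell counts. It also groups the accents into those built on H* and those built on L*; that grouping is an assumption made here for illustration (it treats L+H* as an H* accent and L*+H and H+L* as L* accents) and is not a classification stated in the table itself.

```python
# Cell counts copied from Table 4; "none" stands for unaccented phrases.
ACCENTS = ["H*", "H*+L", "H*+!H", "L+H*", "L*", "L*+H", "H+L*", "none"]
TABLE_4 = {
    "Ratified Topic":    [10, 1, 0, 1, 4, 0, 0, 26],
    "Contrastive Topic": [23, 1, 0, 10, 1, 2, 0, 5],
    "Unratified Topic":  [19, 4, 0, 13, 0, 3, 1, 2],
    "Contrastive Focus": [22, 1, 0, 11, 7, 0, 0, 1],
    "Plain Focus":       [26, 1, 1, 6, 8, 0, 0, 0],
}

# Each row should sum to the 42 phrases coded per category.
row_totals = {cat: sum(row) for cat, row in TABLE_4.items()}

# Column totals, obtained by transposing the rows.
column_totals = dict(zip(ACCENTS, (sum(col) for col in zip(*TABLE_4.values()))))

# Assumed grouping of accents by their starred tone (illustrative only).
h_star_family = sum(column_totals[a] for a in ("H*", "H*+L", "H*+!H", "L+H*"))
l_star_family = sum(column_totals[a] for a in ("L*", "L*+H", "H+L*"))

print(row_totals)
print(column_totals)
print("H* family:", h_star_family, "L* family:", l_star_family)
```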
Except for ratified topics, which tend to be unaccented, most phrases in each information structure category are marked by H*. Except for H*+!H, we abstracted away here from high tones further marked with increased range, upstep or downstep. It is interesting that L+H* is the second most frequent pitch accent in the data, after H*. This shows that the attention to this pitch accent exhibited in the literature has not been misplaced.

Ratified topics, unsurprisingly, tend to be unaccented. 34 out of 42 ratified topics were encoded as personal pronouns. Four ratified topics were coded as L*. In the case of two of these, we were unsure as to whether they really received an L* pitch accent, or simply exhibited an unaccented rhythmic beat. Except for the four cases of ratified topics, L* tends to mark focus, either contrastive or plain. The five cases of L*+H all mark topics. The other pitch accents, except for L+H*, do not exhibit any particular pattern.

We were especially curious about the phrases coded as contrastive focus, contrastive topic or unratified topic that did not receive the L+H* pitch accent. Is this an error of our information structure coding, or does it represent the actual prosodic marking system of English? One interesting class of examples to check in this regard is cleft sentences, of which there were three in our data. We coded the clefted constituent in each case as a contrastive focus since the meaning of the cleft sentence involves an exhaustiveness condition on the clefted constituent. For example, in (9), it is asserted that nobody other than the Communist Chinese are behaving as a Cold War power right now; in particular not the United States. The proposition that the United States has been behaving as a Cold War power has been previously evoked. (9)
Mr. Buchanan: What the United States should do, John, is pull the ambassador home right now. The president of the United States should say, ‘I understand why Americans are boycotting Chinese goods, and I believe that if this thing is not resolved satisfactorily, it will be time to suspend PNTR for exactly one year.’ It is the Communist Chinese who are behaving as a Cold War power right now. (contrastive focus, 2.23)
    ↑H* !H* HL%
Like the other two it-clefts, the clefted constituent here is marked by some variant of the H* pitch accent, but it is contrastive. It is interesting that the three it-clefts are the only examples in the data of a subject receiving narrow focus. All three are subject clefts. Some narrow foci were coded as contrastive, but perhaps were not treated as contrastive by the prosodic system. For example, at the end of transcript 4, participants were asked to grade President Bush on style and substance during his first 100 days. Because there was a limited set of possible answers (the grades A, B, C, D, and F), we coded the resulting narrow focus answer as a contrastive focus. Perhaps a more refined definition of contrastive focus, one that requires the explicit ruling out of alternatives, would exclude these cases. An example is shown in (10): (10)
Mr. McLaughlin: Yeah, what about substance?
Ms. Clift: Substance, C-minus. (contrastive focus, 4.25)
    H* !H* LL%
There nevertheless are several cases of focus phrases coded as contrastive which do rule out alternatives but are not marked L+H*. The examples shown in (11) and (12) are explicitly contrastive in this way: (11)
Mr. Page: Thank you, I want to concur with my colleagues in saying that I think – well, actually, Tito will be remembered as a pioneer; the first space tourist. And this is the wave of the future, and NASA, like most bureaucracies, has a difficult time ‘turning around in the water.’ It’s a big ship, not a speedboat. (contrastive focus, 5.17)
    H* !H*

(12) Mr. McLaughlin: I think we’ve reached the end of our seminar here today. Exit question: Will the Richard Neave Jesus endure, Michael Barone?
     Mr. Barone: No. This is just a guess.
     Mr. McLaughlin: Eleanor?
     Ms. Clift: I don’t think so. This is a BBC documentary, not a PBS documentary. Republicans on Capitol Hill would go nuts if this ever showed on PBS. (contrastive focus, 3.32)
     ¡H* LL%
6.3. Can Topics be Marked H*?

It can be seen from Table 4 that topics are frequently marked with H*, contrary to predictions made in the literature that topics or at least contrastive topics should be marked L+H*. Examples of H*-marked contrastive topics are shown in (13) and (14): (13)
Ms. Clift: And the stakes in this confrontation are huge for China. They have 54,000 students in this country. They want to get the Olympics. They want to keep trade going. And the stakes for this country are also huge. We don’t want to create an enemy where where there is none. (contrastive topic, 2.8)
    H% H* !H* L*

(14) Mr. Page: What you call small, but which Democratic contributors call $1,000 a lot of money. The Republicans have a lot more of those kind of hard-money contributors and now you’re going to raise that limit while killing soft money. (contrastive topic, 1.18)
     H* L
In general, it seems best to conclude that contrastive topics are only sometimes marked L+H*. The same goes for non-contrastive topics, as examples (15) and (16) show: (15)
Mr. McLaughlin: Can you handle that last question? Where do you think the international community is, especially the Third World?
Mr. O’Donnell: The international community is very sympathetic to the Chinese. They’re wondering what are we doing with the reflexive old Cold War mentality of flying these missions in the first place. (unratified topic, 2.15)
    H* !H*

(16) Mr. McLaughlin: Tony, what was his best move?
     Mr. Blankley: I think there were two. One, coming off the Florida event, establishing his legitimacy as president…. On a policy basis, his biggest success is taxes….
     Mr. McLaughlin: Do you see his best move as the tax cut’s tenacity?
     Mr. O’Donnell: Yes, I do. I agree with Eleanor it’s not a good tax cut, it’s not a good policy; but it is an amazing accomplishment to come from where it’s come from….
     Mr. McLaughlin: Actually, his best move was the handling of the China spy plane. He kept his cool; he kept the country cool, he was measured and moderate. And it worked. (unratified topic, 4.7)
     H* !H* !H*
In (15), ‘the international community’ expresses the topic, as it is repeated from the question; similarly in (16), ‘his best move’ clearly expresses the topic. Indeed these two phrases are so topical in their contexts that perhaps they should be considered ratified topics. However, both are marked with H* (or !H*) instead of L+H*. We present in figure 10 a pitch track for example (16):
Figure 10. Actually his best move was the handling of the China spy plane. (4.7) H* !H* !H*
In our future work on this project, we will explicitly distinguish topic-comment utterances from all-comment utterances (Gundel, 1988). Some of our unratified topics and contrastive topics could have alternative codings. For example the subject in (17) was coded as a contrastive topic, but the utterance could probably have been coded as an all-comment one, so that ‘Eisenhower’ would be coded as part of the focus, and thus the H* which marks it would not constitute a counterexample to theories that associate H* only with foci.² (17)
Mr. Buchanan: I’ll just remind you of one thing. Eisenhower refused to apologize for the U-2, and even blew up a summit, and we were a lot more at fault then. (contrastive topic, 2.25)
    H* !H* HL%
Here the entire event of Eisenhower’s refusal is being put forth as the ‘new information’ in the discourse. The entire clause answers the question ‘What happened?’ Nevertheless, we believe that the bold-faced constituents in (13)-(16) do express topics, and are marked H*, contrary to predictions in the literature.

7. INCREASED PITCH RANGE, UPSTEP, AND DOWNSTEP

We believe that the L+H* pitch accent is a mechanism for emphatically highlighting an element relative to its context. Two other prosodic devices for emphatic highlighting are pronouncing a high-pitch tone with increased pitch range or pronouncing it with upstep. Another variation on a high pitch tone is pronouncing it with downstep relative to a previous high pitch tone. The distribution of these three alternatives to a plain high tone across information structure categories is shown in Table 5.
Table 5. Distribution of Increased Range, Upstep and Downstep Relative to Information Structure Type
                     range ↑H   upstep ¡H   downstep !H
Ratified Topic       0          0           3
Contrastive Topic    4          0           15
Unratified Topic     5          0           16
Contrastive Focus    5          4           12
Plain Focus          3          5           12
TOTAL                17         9           58
It is clear from the table that downstep is distributed across the four substantive information structure categories approximately equally, as is increased range. Upstep, however, seems to mark focus, either contrastive or plain, although the data are few. It might be worth following up on this latter tentative conclusion in a more detailed study.

8. BOUNDARY TONES

Some of the claims and suggestions in the literature concerning topic and focus accents have involved boundary tones. For example, Lambrecht and Michaelis (1998) suggest in a footnote that H% might mark topic and L% mark focus. Table 6 shows the distribution in our data of intermediate phrase + boundary tone relative to information structure type.

Table 6. Distribution of Phrase Accents and Boundary Tones Relative to Information Structure Types
                     Fall LL%   Level HL%   Rise HH%   Rise from Bottom LH%   TOTAL
Ratified Topic       2          0           0          0                      2
Contrastive Topic    7          4           1          1                      13
Unratified Topic     12         2           6          0                      20
Contrastive Focus    29         1           4          5                      39
Plain Focus          26         4           4          5                      39
TOTAL                76         11          15         11                     113
It can be seen from Table 6 that LL% is associated primarily with foci, whether contrastive or plain, and foci are most likely to be marked by this boundary tone. It is not surprising that foci as opposed to topics are marked by LL% since this sequence tends to come at the end of the sentence, and topics tend to precede foci in the sentences of the data. Some non-final topics are, nonetheless, marked by LL%, as shown in example (18):
(18) Mr. Barone: … we’re going to reconsider this decision that Clinton made that would apply in six years from now, or 2006. So nobody’s putting any extra arsenic in the water, but Bush has given the Democrats a good talking point. (unratified topic, 4.8)
     H*LL%
There were three wh-questions and four yes-no questions that ended in phrases we examined. Interestingly, none of them received H% boundary tones. Two wh-questions and two yes-no questions ended in LL%, and one wh-question and two yes-no questions ended in HL%. The one alternative question in our data did end in LH%, see example (4).

8.1. Does H% mark topic?

Lambrecht and Michaelis’s (1998) hypothesis, in particular, is not borne out by the data. Table 7 shows that about three quarters of both topics and foci are marked by L%, so there is no difference between them in this regard.

Table 7. Boundary Tones Relative to Topic and Focus
          L%          H%          TOTAL
Topic     27 (77%)    8 (23%)     35
Focus     60 (77%)    18 (23%)    78
TOTAL     87          26          113
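Table 7 can be derived from Table 6 by collapsing LL% and HL% into L%, and HH% and LH% into H%, and by pooling the three topic categories and the two focus categories. The Python sketch below carries out this collapse; the dictionary layout and the function name are illustrative. Computed this way, the share of L% comes out at roughly 77% for topics (27 of 35) as well as for foci (60 of 78).

```python
# Counts copied from Table 6 (phrase accent + boundary tone per category).
TABLE_6 = {
    "Ratified Topic":    {"LL%": 2,  "HL%": 0, "HH%": 0, "LH%": 0},
    "Contrastive Topic": {"LL%": 7,  "HL%": 4, "HH%": 1, "LH%": 1},
    "Unratified Topic":  {"LL%": 12, "HL%": 2, "HH%": 6, "LH%": 0},
    "Contrastive Focus": {"LL%": 29, "HL%": 1, "HH%": 4, "LH%": 5},
    "Plain Focus":       {"LL%": 26, "HL%": 4, "HH%": 4, "LH%": 5},
}


def collapse(categories):
    """Pool the given categories and return (L% count, H% count)."""
    low = sum(TABLE_6[c]["LL%"] + TABLE_6[c]["HL%"] for c in categories)
    high = sum(TABLE_6[c]["HH%"] + TABLE_6[c]["LH%"] for c in categories)
    return low, high


topics = collapse(["Ratified Topic", "Contrastive Topic", "Unratified Topic"])
foci = collapse(["Contrastive Focus", "Plain Focus"])

for name, (low, high) in (("Topic", topics), ("Focus", foci)):
    total = low + high
    print(f"{name}: L% {low} ({low / total:.0%}), H% {high} ({high / total:.0%})")
```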
9. ENTIRE TUNES

Finally, Pierrehumbert (1980) and Steedman (1991) proposed that topics are associated with entire tunes, H*LH% and L+H*LH%, respectively. Let us first look at H*LH%.

9.1. Does H* LH% mark topic?

As Table 8 shows, there are perhaps surprisingly only four examples of H*LH% in our data.

Table 8. H*LH% Tune Relative to Information Structure Type
          Plain Focus
H*LH%     4
All four of these mark plain focus, and all seem to mark continuation. For example, (19) is a rejection of a previous participant’s contribution. It is continued with a correction:

(19) Mr. McLaughlin: Lawrence and ah two other members are correct. His style rating is probably a B, but your analysis of how much he should be doing in the first 100 days is absurd. He’s taking one piece at a time and he’s being very successful. He gets an A on substance. (plain focus, 4.35)
     H* LH%

9.2. Does L+H* LH% mark topic?

Steedman’s (1991) hypothesis that the L+H*LH% tune is associated with topics in particular is also not borne out by the data. Although the data are few, Table 9 shows that the distribution of L+H*LH% primarily targets contrastive foci, instead of topics.

Table 9. L+H*LH% Tune Relative to Information Structure Type
            Contrastive Topic   Contrastive Focus   Plain Focus
L+H*LH%     1                   5                   1
It is interesting that four of the five contrastive-focus examples of this tune function as contradictions. See, for instance, examples (20) and (21): (20)
Ms. Clift: Well, I think definitions of beauty or handsomeness change over the years, and I, frankly, think this guy is pretty attractive. I don’t find him unattractive. (contrastive focus, 3.5)
    L+H* LH%

(21) Mr. McLaughlin: Well, he’s been a successful politician, and he’s been a successful statesman, has he not?
     Mr. O’Donnell: He’s done – the only thing – he was in a box with China. He did the only thing you could do. He hasn’t done anything extraordinary. (contrastive focus, 4.20)
     L+H* LH%
The speaker in (20) is contradicting the proposition expressed by other participants that the likeness of Jesus being discussed is unattractive. The speaker in (21) is contradicting the proposition evoked by other participants that Bush’s 63% approval rating after his first 100 days was due to his behaving in an extraordinary fashion, in particular with regard to his handling of the Chinese fighter plane crisis.³ We present in figure 11 a pitch track for example (21):
Figure 11. He hasn’t done anything extraordinary. (contrastive focus, 4.20) L+H*LH%
In future work on this project, we intend to correlate information structure with entire tunes, i.e. full intonational phrases with specific combinations of heads and nuclei, according to the sentence type.

10. CONCLUSION

We conclude that while there are systematic correlations between intonation and information structure categories, these correlations are not as straightforward as is suggested in the literature. In particular we deny that there is any prosodic category as distinctive as a ‘topic accent’ as opposed to a ‘focus accent.’ With regard to L+H*, we found that it falls on contrastive topics, unratified topics and contrastive foci 24-31% of the time and on plain foci 14% of the time. It doesn’t just fall on topics. L+H* occurred in 41 of our analyzed phrases, or approximately 20%, which is a significant number. This shows that this accent deserves the reputation it has received in the literature.

Minor conclusions, given the relative lack of data, are that L* tends to mark focus and that L*+H tends to mark topic. Upstep also seems to mark focus, although again the data are few. Except for ratified topics, which tend to be unaccented, all information structure categories were extensively marked with H*, including unratified and contrastive topics. The fact that pitch accents with some kind of H* occur six times more often than L* (150 versus 26) shows that American English is an H* language, as opposed to other languages such as Spanish in which L* predominates, at least in prenuclear positions (Sosa, 1999).

Finally, given the fact that our results qualify the conclusions assumed in the literature, it is clear that investigations into intonation should be carried out on naturally-occurring spontaneous dialogue as well as on constructed examples and experimentally induced speech.
11. NOTES
* Part of this research was funded by a SSHRC Small Grant from Simon Fraser University, 2001.

1 For the contour in figure 2 the ToBI Guidelines would prescribe a notation H* L+H*. The reason for which we decided to use the H*+L is that the salient fall is completely realized during the word ‘thirty’. The point here is that there is an important descent during this word, not that there is a rise for H* on the word ‘years’.

2 We thank Jeanette Gundel for pointing out this general problem to us.

3 In (20) and (21), it has been suggested to us by Mark Steedman and Chungmin Lee that an alternative information structure analysis would treat the marked phrase as topic. Note that this alternative analysis can be justified by the ‘as for’ test as follows: ‘As for whether he is unattractive, I don’t find him so’ and ‘As for whether he has done anything extraordinary, he hasn’t.’ The point here is that the questions of whether or not the Christ image is attractive and whether or not Bush has done something extraordinary are relevant in their contexts and to some extent are already under discussion. Büring (2003) would also analyze the accents in (20) and (21) as contrastive topic accents since (20) and (21) can be seen as answers to implied subquestions in the discourse, e.g. (21) in the context of the explicit question ‘Has Bush been a successful politician?’ negatively answers the subquestion ‘Has he done anything extraordinary?’.
12. REFERENCES

Beckman, Mary E. and Gayle Ayers Elam. Guidelines for ToBI Labelling. Version 3. Columbus: Ohio State University, Department of Linguistics, 1997.
Büring, Daniel. “On D-Trees, Beans, and B-accents.” Linguistics and Philosophy 26.5 (2003): 511-545.
Gundel, Jeanette. The Role of Topic and Comment in Linguistic Theory. Ph.D. Dissertation, University of Texas, Austin, 1974.
Gundel, Jeanette. “Stress, Pronominalization and the Given-New Distinction.” University of Hawaii Working Papers in Linguistics 10.2 (1978): 1-13.
Gundel, Jeanette. “Universals of Topic-Comment Structure.” In Michael Hammond, Edith A. Moravcsik and Jessica R. Wirth (eds.), Syntactic Universals and Typology, pp. 209-242. Amsterdam and Philadelphia: John Benjamins, 1988.
Gundel, Jeanette K. “On Different Kinds of Focus.” In Peter Bosch and Rob van der Sandt (eds.), Focus: Linguistic, Cognitive, and Computational Perspectives, pp. 293-305. Cambridge: Cambridge University Press, 1999.
Gundel, Jeanette K. and Thorstein Fretheim. “Topic and Focus.” In Laurence Horn and Gregory Ward (eds.), The Handbook of Contemporary Pragmatic Theory, pp. 175-196. Oxford: Blackwell, 2004.
Gussenhoven, Carlos. “Focus, Mode and the Nucleus.” Journal of Linguistics 19 (1983): 377-417.
Hedberg, Nancy. “The Referential Status of Clefts.” Language 76 (2000): 891-920.
Jackendoff, Ray. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.
Lambrecht, Knud and Laura Michaelis. “Sentence Accent in Information Questions: Default and Projection.” Linguistics and Philosophy 21 (1998): 477-544.
Nespor, Marina and Irene Vogel. Prosodic Phonology. Dordrecht: Foris, 1986.
Pierrehumbert, Janet. The Phonology and Phonetics of English Intonation. MIT: Ph.D. Dissertation, 1980. [Bloomington, IN: Indiana University Linguistics Club, 1987].
Pierrehumbert, Janet and Julia Hirschberg. “The Meaning of Intonational Contours in the Interpretation of Discourse.” In Philip R. Cohen, Jerry Morgan, and Martha E. Pollack (eds.), Intentions in Communication, pp. 271-311. Cambridge: MIT Press, 1990.
Reinhart, Tanya. “Pragmatics and Linguistics: an Analysis of Sentence Topics.” Philosophica 27 (1981): 53-94.
Sosa, Juan M. La Entonación de Español. Madrid: Cátedra, 1999.
Steedman, Mark. “Structure and Intonation.” Language 67 (1991): 260-296.
Steedman, Mark. The Syntactic Process. Cambridge, Mass.: MIT Press, 2000a.
Steedman, Mark. “Information Structure and the Syntax-Phonology Interface.” Linguistic Inquiry 31 (2000b): 649-689.
Vallduvi, Enric and Elisabet Engdahl. “The Linguistic Realization of Information Packaging.” Linguistics 34 (1996): 459-510.
EMIEL KRAHMER (1) AND MARC SWERTS (1,2)
PERCEIVING FOCUS
1. INTRODUCTION

Many linguists approach intonational matters from a purely speaker-oriented perspective.¹ For instance, in different studies, in as far as these are empirical in nature, evidence for particular tonal distinctions is often solely based on acoustic analyses of fundamental frequency (F0) traces. However, if one wants to gain full insight into how intonation ‘functions’, such an approach is arguably incomplete. That is, a prosodic feature, as any other linguistic feature, can only be said to be communicatively relevant if it is not only encoded in the speech signal by a speaker, but if it also has an impact on how an utterance is processed by a listener. In other words, claims about important intonational categories and their respective meanings are somewhat premature if they are not backed up with results that show that these are also relevant at the receiving end of the communication chain. Ideally, such an analysis should be more than an individual linguist’s interpretation of a prosodic phenomenon.

Unfortunately, one cannot simply take it for granted that all prosodic detail really matters to a listener. One obvious, but sometimes neglected, condition is that tonal variation clearly needs to be above a perceptual threshold to be functionally relevant. In that respect, it is striking to see that many researchers attach functional load to particular tonal distinctions, which, from a purely phonetic point of view, are only minimally separable or even highly overlapping in “tonal space”. For instance, the difference between H* and L+H*, as defined in the ToBI framework, has been claimed to indicate semantically distinct categories such as rheme and theme (Steedman 2000) or new and contrastive information (Pierrehumbert & Hirschberg 1990). Yet these two intonational categories are often confused by labellers who are instructed to transcribe intonation (e.g., Pitrelli et al.
1994), even to the extent that some investigators simply give up on the distinction. In comparison, many vowel systems of the world obey a contrast principle, which states that any two vowels need to be optimally distinct in order to be appropriately applicable in speech communication (the idea of vowel dispersion, see e.g., ten Bosch 1991). Also, linguistic systems are highly redundant in that speakers have various strategies at their disposal to signal particular meanings. Since tonal markers of semantic events often covary with morpho-syntactic, lexical or other prosodic cues, it is theoretically possible that their communicative function is ‘overruled’ by that of other resources, or by the situational or linguistic context in which they occur. In this chapter, we argue that controlled perceptual studies allow us to investigate the communicative importance of intonational features. Rather than concentrating on
121 C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 121–137. © 2007 Springer.
subtle differences between intonational categories, we will illustrate this viewpoint with a series of studies on the cue value of pitch accents. In languages such as Dutch and English, the distribution of accents has been claimed to be exploited as a means to distinguish important bits of information in an utterance from unimportant ones. That is, in such languages, pitch accents serve as a linguistic strategy to put ‘new’ or ‘contrastive’ information in focus, whereas speakers take care not to highlight ‘given’ information, i.e., information which is present (explicitly or implicitly) in the preceding context. However, it is unclear whether such observations generalize to other languages as well; in addition, most studies on pitch accent distribution have not looked at the relation of such accents, which are encoded in the speech signal itself, to visual cues speakers may send to a communication partner. Therefore, in order to gain insight into the relative cue value of pitch accents for signaling focus, we will tackle this question in different perception tests, and approach it from (1) a multilingual and (2) a multimodal perspective. In the studies presented below we shall ignore newness accents, for reasons that will become clear later, and hence in the present study being contrastive is equivalent to being in focus.

First, we will present a cross-linguistic approach, given that the communicative importance of pitch accents is likely to be different for different languages. Consider the following utterances (after Cruttenden 1993), i.e., readings of the scores in English and Italian football reports (capitalized words indicate accents). In both examples, the football score is a tie so that a particular number (the words ‘one’ and ‘uno’) is repeated:

English:   TOTTENHAM ONE - LIVERPOOL one
Italian:   UDINESE UNO - ROMA UNO
In a typical English realization of such scores, the second instance of ‘one’ is deaccented, since it has just been mentioned in the preceding phrase. In the Italian scores, the second instance of ‘uno’ is typically accented again, even though it is literally given in the preceding context. This second accent in the Italian case is due to the fact that Italian strongly disfavours deaccentuation within NPs or other syntactic constituents (Ladd 1996:177-178 & p.c.). More generally, Italian, like other Romance languages such as Catalan or French, is an example of what Vallduví (1992) would call a non-plastic language, i.e., a language which has a rather constrained intonation structure to mark information status, but which more heavily uses word order variation for that purpose. These languages are different from the plastic ones, such as Dutch and English, whose prosodic pattern is “moulded” to fit the information structure, so that intonation is used to mark information status. These claims do not entail that deaccentuation is impossible in Italian, as Ladd acknowledges that deaccentuation at the sentence level in Italian is entirely possible, e.g., repeated full NPs may be deaccented (see also Avesani et al. 1995; Hirschberg & Avesani 1997; D’Imperio 1997), but they do mean that under certain conditions deaccentuation is infelicitous. The current study aims to test whether differences
between accent structures in Italian and Dutch, as two cases of a non-plastic and a plastic language, respectively, have implications for the way listeners perceive focus.

Second, apart from variation between languages, the importance of pitch accents may also depend on the communicative setting in which they are used, in particular if we compare communicative settings in which dialogue participants can or cannot see each other during a spoken interaction. Different studies have suggested that there exist specific visual cues to focus structure as well. In particular, like pitch accents, rapid eyebrow movements have been claimed to play an accentuation role (e.g., Birdwhistell 1970, Condon 1976). It has even been argued that there is a one-to-one connection between the two; see, for instance, the so-called Metaphor of Up and Down (Morgan 1953, Bolinger 1985:202ff): when the pitch rises or falls, the eyebrows follow the same pattern. In fact, to see that there is indeed a close connection between pitch and eyebrows, one may try to utter a two-word phrase, say “blue square”, with a pitch accent (but no corresponding eyebrow movement) on the word “blue” and a rapid eyebrow movement (but no corresponding pitch accent) on the word “square”. Most people find this a difficult exercise. One of the few empirical studies devoted to the relation between pitch accents and eyebrow movements is Cavé et al. (1996), who report a significant correlation between the two (in particular, and surprisingly, for the left eyebrow). It appears that rapid eyebrow movements often co-occur with pitch accents. The opposite is not the case: people do more with their pitch than with their eyebrows. Cavé and co-workers suggest that eyebrow movements and pitch do not link automatically (e.g., due to muscular synergy), but coincide for communicative reasons. Naturally, this raises the question of what these communicative reasons might be.
In the literature on Talking Heads (i.e., combinations of computer animations with speech), there is no consensus on the timing and placement of eyebrow movements. Pelachaud et al. (1996) note that the decision to raise the eyebrows is affect dependent, but in the examples they discuss, pitch accents and eyebrows coincide. Thus to the question “I know that Harry prefers POTATO chips, but what does JULIA prefer?” the Talking Head of Pelachaud et al. (1996:19) would respond with the following utterance, in which capitalized words again indicate an accent, whereas overlined words are accompanied by a rapid eyebrow movement:

(JULIA prefers)theme (POPCORN)rheme
Cassell et al. (2001) use eyebrow raising (or “flashes” as they call them) more sparingly. The eyebrows are raised when an object in the “rheme” is described. So in reply to the question above, the algorithm of Cassell et al. would not produce a ‘flash’ on “Julia”. It should be noted that neither Pelachaud et al. (1996) nor Cassell et al. (2001) report on evaluation: it is not known whether the animations are effective in the way human listeners process the information. We thus gain no insight into the contribution of the eyebrow movement: its function remains unclear. Again, to learn more about the relative importance of pitch accents and eyebrow movements, this issue is tackled in the current study from a perceptual point of view, testing how listeners detect focus in audiovisual stimuli.
To facilitate comparisons across languages and across modalities regarding the cue value of pitch accents to signal focus, we have set up a particular experimental paradigm which can be applied to different languages and to both speech-only and audiovisual stimuli. The experiment consists of a perceptual task in which listeners essentially have to detect the main focus in an utterance. More specifically, subjects are instructed to decide, solely on the basis of a particular utterance, what the information would have been in the preceding utterance, i.e., subjects have to ‘reconstruct the dialogue history’. Rather than using manipulated speech materials (read-aloud or synthetic) with controlled prosodic properties, the stimuli for the perceptual task discussed here consist of semi-spontaneous data whose intonational features are untouched when used in the test. By using naturally elicited speech materials, one avoids the risk of testing the effect of intonation contours that are not representative of real data. For that purpose, we developed a specific dialogue game that triggers speakers’ productions of different focus distributions in particular target sentences. The paradigm works for different languages so that it becomes easier to make cross-linguistic prosodic comparisons. In addition, the resulting utterances can be combined with visual cues, which makes it possible to study the relative cue value of accents and visual information.

In the next section, we first describe the experimental design to elicit accent patterns in both Dutch and Italian utterances, and the method to create audiovisual stimuli to be used in a series of perception tests. The following sections then describe the procedure and the results of the actual experiments on the perception of focus in speech-only stimuli in Dutch (study 1) and in Italian (study 2), and in multimodal stimuli in Dutch (study 3). We end with a general discussion and a conclusion.

2. MATERIALS

2.1. Speech

For all three studies, utterances were used which were obtained in a semi-spontaneous way via a simple dialogue game. The game was played each time by two subjects, call them A and B, separated from each other by a screen. Figures 1 and 2 visualize the experimental set-up with a bird’s-eye perspective on the starting situation of the game and the situation after the first turn in the game. In each game, both players have an identical set of eight cards at their disposal, each card showing a geometrical figure in a particular colour. Four of these cards are put on a stack in front of them, the four other cards are in a row before them. The four cards in the stack of A are the same as the four cards in the row of B, and vice versa. The game consists of a series of turns in which one participant gives instructions to select a card with a particular geometrical figure and the other follows these instructions. In each consecutive turn, the participants switch roles so that the original instruction-giver becomes the instruction-follower, and the other way around. In turn 1, the instruction giver, say A, begins with describing the figure on the top of his stack (“a blue square”). After he has described this figure, he removes it from his stack and
puts it behind number 1 on his list. The instruction follower, B, listens to the description of A and removes that figure from his row of figures, and also puts it behind number 1 on his list. Now, the participants switch roles, so that B describes the figure that is on top of his stack (“a black triangle”), and A follows the instructions of B which will prompt both A and B to place the card with this object on the second place in the row with figures, and so on. The game is over when both players have no cards left. Each pair of subjects played a sequence of eight games, each time separated by a break of at least two minutes. Note that the players are given the instruction to describe the figure on top of their stack in terms of its colour and figure property. Speakers generally found it a very easy game to play, and as a consequence there are no faulty descriptions in the respective data sets.
Figure 1. Visualization of the initial set-up of the experiment to elicit different referring expressions. A and B represent the two participants in the dialogue game. In the actual experiment, the different figures were given different colours. Further explanations in the text
The speech data thus obtained allow for an unambiguous operationalization of the relevant contexts. A property is defined to be new (N) to the conversation if it is
mentioned in the first turn of the current dialogue game, it is given (G) if it was mentioned in the previous turn and finally a property is contrastive (C) if the object described in the previous turn had a different value for the relevant property. We define a property to be in focus, if it is not given. (In the three studies described below, we will ignore newness and hence in these studies a property will be in focus if, and only if, it is contrastive.) By systematically varying the order of the cards in the stack, target descriptions (Dutch: “blauw vierkant” (blue square); Italian: “triangolo nero” (black triangle)) could be collected in all contexts of interest: no contrast (all new, NN), contrast in the prefinal word (CG), contrast in the final word (GC), all contrast (CC). Notice that in the 2-letter abbreviations, the first letter corresponds with the contextual status of the first word, and the second letter with the contextual status of the second word. Table 1 summarizes the situation. It is worth noting that in the Dutch elicited utterances the adjective always precedes the noun, whereas in the Italian data it follows the noun. In other words, if we refer to the first word in the elicited NPs, we mean the adjective in case of the Dutch data, and we mean the noun in the case of the Italian data.

Table 1. Examples of the four contexts

NN        (beginning of game)
     B:   “blue square”

CC   A:   “red circle”
     B:   “blue square”

CG   A:   “yellow square”
     B:   “blue square”

GC   A:   “blue triangle”
     B:   “blue square”
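The operationalization of new, given and contrastive properties can be written down compactly. The following Python sketch is only an illustration: the function names (`property_status`, `context_label`, `in_focus`) and the representation of a description as a pair of words are assumptions made here, not part of the original experimental materials. It labels each property relative to the previous turn and reproduces the four contexts of Table 1.

```python
from typing import Optional, Tuple

# A description is a pair (first word, second word), e.g. ("blue", "square").
Description = Tuple[str, str]


def property_status(previous: Optional[str], current: str) -> str:
    """Return N, G or C for a single property, following the definitions in the text."""
    if previous is None:       # first turn of the game: nothing has been mentioned yet
        return "N"
    if previous == current:    # same value as in the previous turn
        return "G"
    return "C"                 # the previous turn had a different value


def context_label(previous: Optional[Description], current: Description) -> str:
    """Two-letter label: status of the first word, then status of the second word."""
    if previous is None:
        return "NN"
    return "".join(property_status(p, c) for p, c in zip(previous, current))


def in_focus(status: str) -> bool:
    """A property is in focus if it is not given."""
    return status != "G"


target = ("blue", "square")
print(context_label(None, target))                  # NN
print(context_label(("red", "circle"), target))     # CC
print(context_label(("yellow", "square"), target))  # CG
print(context_label(("blue", "triangle"), target))  # GC
```

Because the label is computed word by word, the same function applies to the Italian noun-adjective order as long as the pairs are given in spoken order.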
Figure 2. Visualization of the set-up of the experiment after A’s first move (“blue square”)
Eight Dutch speakers were recruited from students and colleagues from IPO, speaking the variant of standard Dutch as spoken in the Netherlands; the eight Italian speakers we recorded were all living in Italy, and were native speakers of the Tuscan variety of Italian. The Dutch speech materials are used in studies 1 and 3, the Italian ones in study 2.
2.2. Animations

For study 3 we combined the Dutch speech materials with an animated talking head. Since this was a male head, we only used the four male voices collected for Dutch. In addition, two synthetic male voices were used, copying the intonation contours of two of the human voices. We use both synthetic and natural voices in order to see to what extent the naturalness of the voice influences the perception of focus. A human voice has more natural and better sounding prosody, but a synthetic voice might be better suited to accompany the visual counterpart of a synthetic character. A Dutch diphone speech synthesizer was used for the generation of the two synthetic versions. The animations were produced with the CharToon environment (Ruttkay et al. 1999). A 2D head of a male person formed the basis of the animations. Visual speech is generated on the basis of a set of 48 visemes (elementary mouth positions). Phonemes from the input are matched to corresponding visemes with a sampling rate of 100 ms, while intermediate stages are computed using linear interpolation. Rapid eyebrow movements coincide with the stressed syllable of either the first (“blauw”) or the second word (“vierkant”). Notice that these are the eyebrow counterparts of focus on the adjective and focus on the noun respectively. We did not include an eyebrow counterpart to “all focus”, since this would involve either a raised eyebrow for a longer stretch of time or two rapid eyebrow movements in succession. For Dutch subjects both of these primarily have a non-focus signalling interpretation. It is worth stressing that in certain stimuli eyebrow movements are associated with words which are not accented. Eyebrow movements always had the following pattern: first, a 100 ms dynamic raising part, then a static raised part of 100 ms, and finally a 100 ms dynamic lowering part. The overall length of the movement is comparable to the average duration of rapid eyebrow movements of human speakers (± 375 ms, Cavé et al. 1996). We opted for slightly shorter movements due to the overall short duration of the stimuli. Figure 3 shows two stills from a typical animation used in the experiment.
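The three-part eyebrow pattern described above (100 ms raising, 100 ms raised, 100 ms lowering) can be sketched as a simple piecewise function. This is a minimal illustration, not the CharToon implementation: the function names, the normalised height scale (0 = rest, 1 = fully raised) and the assumption that the dynamic parts are linear ramps are all introduced here for the example; the text only specifies the durations.

```python
RAISE_MS = 100   # dynamic raising part
HOLD_MS = 100    # static raised part
LOWER_MS = 100   # dynamic lowering part
TOTAL_MS = RAISE_MS + HOLD_MS + LOWER_MS


def eyebrow_height(t_ms: float, onset_ms: float = 0.0) -> float:
    """Normalised eyebrow height at time t_ms for a movement starting at onset_ms.

    Assumption: raising and lowering are linear ramps.
    """
    t = t_ms - onset_ms
    if t <= 0 or t >= TOTAL_MS:
        return 0.0                      # before onset or after the movement: at rest
    if t < RAISE_MS:
        return t / RAISE_MS             # raising
    if t <= RAISE_MS + HOLD_MS:
        return 1.0                      # held in raised position
    return (TOTAL_MS - t) / LOWER_MS    # lowering


def sample_trajectory(onset_ms: float, step_ms: float = 25.0):
    """Sample (time, height) pairs from onset to end of the movement."""
    n_steps = int(TOTAL_MS / step_ms)
    return [(onset_ms + i * step_ms, eyebrow_height(onset_ms + i * step_ms, onset_ms))
            for i in range(n_steps + 1)]


# Hypothetical onset of a stressed syllable at 400 ms into the utterance.
for time_ms, height in sample_trajectory(onset_ms=400.0, step_ms=50.0):
    print(f"{time_ms:6.0f} ms  height {height:.2f}")
```

In such a sketch, `onset_ms` would be aligned with the stressed syllable of either the first or the second word; the actual alignment procedure is not described in the text.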
Figure 3. Two stills from the Talking Head uttering “blauw vierkant” (blue square) with a raised eyebrow on the first word (left) and no eyebrow action on the second word (right)
3. STUDY 1: FOCUS IN DUTCH

3.1. Preliminaries

The first study tests to what extent Dutch listeners are able to determine the main focus in an utterance by means of pitch accent distribution. For this purpose, we used data collected via the game described above. Before performing the dialogue reconstruction experiment, a distributive analysis of the target utterance “blauw vierkant” (blue square) was carried out. A consensus labelling was done by three independent intonation experts. The results of the labelling can be summarized as follows: in most cases, a property which is in focus receives a pitch accent. Interestingly the only exceptions to this general rule can be attributed to speaker differences among the eight speakers. One group of four speakers always end their utterance on a low boundary tone and always associate focused properties with a pitch accent. The four remaining speakers uniformly employ high boundary tones, and they associate the CC utterances with a single accent on the noun.

3.2. Procedure

Dialogue reconstruction data were obtained from 25 native speakers of Dutch (different from the eight speakers). The experiment was performed on an individual basis and was self-paced. All three versions (CG, GC and CC) of the target utterance (“blauw vierkant”) produced by the eight speakers were used, making a total of 24 stimuli. In studies 1 and 3, Dutch subjects are presented with speech realizations of “blauw vierkant” taken from their original context, and the task is to determine by forced choice whether the preceding utterance would be: (1) “rood vierkant” (red square), (2) “blauwe driehoek” (blue triangle) or (3) “rode driehoek” (red triangle). The corresponding contexts are (1) CG (focus on the first word), (2) GC (focus on the second word) and (3) CC (all focus), respectively. The stimuli were presented in two random orders, to compensate for potential learning effects. Before the actual experiment started subjects entered a brief training session (consisting of three stimuli) to make them acquainted with the materials and the setting of the experiment. No feedback was given on the correctness of their answers, and there was no communication with the experimenters. Notice that the all new situation (NN) is not incorporated in the experiment, because there are no utterances
preceding the NN so that subjects cannot reconstruct the preceding utterance. The NN utterances have been studied extensively in Krahmer & Swerts (2001), to find out whether there are prosodic differences between newness and contrastive accents in this setting.²

3.3. Results

Table 2. Summary of the results of Study 1: classification of all 24 stimuli, for all 25 listeners (n=600). The vertical axis indicates the actual CONTEXT of the target utterance “blauw vierkant” (blue square). The horizontal axis indicates how many subjects CLASSIFIED the utterance in each of the three contexts

                 CLASSIFIED as
CONTEXT        CC      GC      CG     Total
CC             95      83      22      200
GC             60     119      21      200
CG             10       6     184      200

Table 2 contains the results for all eight speakers taken together. The overall distribution is significantly different from chance (χ² = 395.3, df = 4, p < 0.001). The first thing to note is that for each line the highest numbers are on the diagonal. This means that each context is most likely to be classified correctly. However, these chances are much higher in the case of single focus on contrastive items (CG and GC) than in the all focus case (CC). Subjects are particularly good in reconstructing the dialogue history when the adjective is the single focused item (note that these are the classic cases of narrow scope), which stands out prosodically due to the occurrence of a nuclear accent in non-default position. However, also when it is the noun that is the single item in focus, subjects are generally capable of reconstructing the context. Interestingly, the number of confusions with the all focus (double contrast) context increases. This seems to imply that there is at least some amount of broad focus / narrow focus ambiguity (but see below), although the narrow focus interpretation is still prevalent. This result is compatible with earlier findings from Gussenhoven (1983) and Rump & Collier (1996) that these ambiguous cases are more confusable than the CG case, which only allows a narrow focus interpretation. In the case of double contrast there appears to be a very substantial broad vs. narrow focus confusion. However, looking at the results for each speaker separately (all significantly different from chance as well), reveals an interesting difference between high and low boundary speakers. The main difference between speakers is found for the
double contrast (CC) case. For low boundary speakers, utterances made in a CC context are predominantly classified as CC. Strikingly, this is not the case for high ending speakers, whose CC utterances are very frequently classified as GC utterances, which matches the earlier observation that these speakers tend to produce all-contrast utterances with a single accent on the noun. Thus, the fact that in Table 2 CC utterances are often misclassified as GC utterances is essentially due to the difference between low and high ending speakers rather than broad vs. narrow focus interpretations.

4. STUDY 2: FOCUS IN ITALIAN

4.1. Preliminaries

The second study tests to what extent Italian listeners are able to determine the main focus and reconstruct the dialogue history of an utterance using prosodic cues. Before performing the dialogue reconstruction experiment, a distributive analysis of the target utterances “triangolo nero” (black triangle) was performed. Three independent intonation experts listened to all realizations of “triangolo nero” produced by the eight speakers in the various contexts of interest, and decided on which words they perceived an accent. The three judges were in full agreement: every word is always accented, irrespective of context. All speakers produce the same contour, namely a flat hat shape with the second accent downstepped with respect to the first. Of course, it might be that different kinds of accents are realized in different contexts. However, an analysis of the fundamental frequency did not reveal any differences between contexts (see Swerts et al. 2002). In addition, we found no evidence for a clear correlation between information status and the perceived prominence of accents for the Italian data. Therefore, it seems a reasonable hypothesis that, contrary to the Dutch subjects, Italian subjects will not be able to reconstruct the dialogue history on the basis of prosodic cues.

4.2. Procedure

Subjects of the second dialogue reconstruction experiment were 25 native speakers of Italian (different from the eight speakers), mostly from Tuscany. The experiment was performed on an individual basis and was self-paced. All three versions (CG, GC and CC) of the target utterance (“triangolo nero”) produced by the eight speakers were used, making a total of 24 stimuli. In this study, Italian subjects hear versions of “triangolo nero” (black triangle), and have to guess whether the preceding utterance was (1) “rettangolo nero” (black rectangle), (2) “triangolo viola” (violet triangle) or (3) “rettangolo viola” (violet rectangle), again representing the following contexts: (1) CG (focus on the first word), (2) GC (focus on the second word) and (3) CC (all focus) respectively. The stimuli were again presented in two random orders, to compensate for potential learning effects. Before the actual experiment started subjects entered a brief training session (consisting of three stimuli) to make them acquainted with the materials and the setting of the
experiment. No feedback was given on the correctness of their answers, and there was no communication with the experimenters.

4.3. Results

The results of the Italian reconstruction experiment on the basis of all eight speakers are displayed in Table 3. A χ² analysis reveals that the distribution is not significantly different from chance. Looking at the results of the eight individual speakers, we see that the results for seven of them are not significant.³ The picture is significantly different from the one obtained for the Dutch data (Pearson χ² = 223.8, df = 8, p < 0.001). Thus, as expected, Italian listeners are not able to reconstruct the prior dialogue context on the basis of prosodic properties of the current utterance, in contrast to Dutch listeners.

Table 3. Summary of the results of Study 2: classification of all 24 stimuli, for all 25 listeners (n=600). The vertical axis indicates the actual CONTEXT of the target utterance “triangolo nero” (black triangle). The horizontal axis indicates how many subjects CLASSIFIED the utterance in each of the three contexts

                 CLASSIFIED as
CONTEXT        CC      GC      CG     Total
CC             52      70      78      200
GC             53      82      65      200
CG             61      73      66      200
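The contrast between the Dutch and the Italian classification tables can be checked with a few lines of code. The sketch below assumes that the reported statistics are Pearson χ² tests of independence on the context-by-response tables; the chapter does not spell out the exact procedure, so this is an approximation for illustration. The function names are hypothetical, and the p-value uses a closed form that holds only for an even number of degrees of freedom.

```python
import math

# Rows: actual context (CC, GC, CG); columns: classified as CC, GC, CG.
DUTCH_TABLE_2 = [[95, 83, 22], [60, 119, 21], [10, 6, 184]]
ITALIAN_TABLE_3 = [[52, 70, 78], [53, 82, 65], [61, 73, 66]]


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_p_value(stat, df):
    """Upper-tail probability of the chi-square distribution (even df only)."""
    if df % 2 != 0:
        raise ValueError("closed form implemented for even df only")
    half = stat / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


for name, table in [("Dutch (Table 2)", DUTCH_TABLE_2),
                    ("Italian (Table 3)", ITALIAN_TABLE_3)]:
    stat, df = chi_square_independence(table)
    p = chi_square_p_value(stat, df)
    print(f"{name}: chi2 = {stat:.1f}, df = {df}, p = {p:.3g}")
```

Under this assumption the Dutch table gives a statistic close to the reported 395.3 with df = 4, whereas the Italian table stays well below the 5% critical value for df = 4 (about 9.49), in line with the conclusion that Italian responses do not differ from chance.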
5. STUDY 3: FOCUS IN AUDIO-VISUAL SPEECH

5.1. Preliminaries

In the third study we investigate the relative contributions of pitch accents and eyebrow movements to the perception of focus in Dutch. For this purpose, we use an animated male Talking Head and six different male voices. Four of these voices are human, and have also been used in study 1. The two remaining voices are synthetic, with the respective intonation contours copied from two of the human speakers. This makes it possible to compare the results of study 3 with those of study 1. The rapid eyebrow movements have been shown to be clearly perceivable. A further test indicated that the eyebrow movements boost the perceived prominence of words that also receive a pitch accent, and downscale the prominence of unaccented words in the direct context of the accented word (see Krahmer et al. 2002b). The question of interest to us here is whether this also has functional ramifications.

5.2. Procedure

A total of 25 native speakers of Dutch participated in the audio-visual dialogue history reconstruction experiment (different from the eight speakers, and also
different from the 25 listeners from study 1). The experiment was individually performed and self-paced. Subjects watched and listened to the Talking Head uttering the two-word phrase “blauw vierkant” (blue square), with a particular intonation contour (taken from its original context; CG, GC or CC) and a rapid eyebrow movement on either the first or the second word. Eyebrow movements are indicated with a hat on the relevant item; the resulting six contexts are ĈG, CĜ, ĜC, GĈ, ĈC and CĈ. Since six voices are used the total number of stimuli is 36. The stimuli were displayed on a high-resolution color PC screen, sound came over the loudspeakers to the left and the right of the screen. Dutch subjects had to perform the same task as those of study 1, except that they were now presented with audiovisual stimuli. The stimuli were presented in two different random orders, to compensate for possible learning effects. Before the experiment started, subjects entered a brief training session (consisting of three stimuli) to make them acquainted with the material and the setting of the experiment. No feedback was given on the ‘correctness’ of their answers and there was no communication with the conductor of the experiment.

Table 4. Summary of the results of Study 3: classification of all 36 stimuli, for all 25 listeners (n=900). The vertical axis indicates the actual CONTEXT of the target utterance “blauw vierkant” (blue square) plus the word which is associated with a rapid eyebrow movement. The horizontal axis indicates how many subjects CLASSIFIED the utterance in each of the three contexts

                 CLASSIFIED as
CONTEXT        CC      GC      CG     Total
ĈC             64      41      45      150
CĈ             59      70      21      150
ĜC             34      91      25      150
GĈ             33      90      27      150
ĈG             16      22     112      150
CĜ             16      30     104      150
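The rows of Table 4 can be regrouped by prosodic context and by the word that carries the eyebrow movement, which makes it easy to see how much a response share changes when the eyebrow movement moves from the first to the second word. The following sketch is purely descriptive and is not the significance test reported in the chapter; the dictionary layout and the function names `proportion` and `eyebrow_shift` are assumptions made for this illustration.

```python
# Keys: (prosodic context, word carrying the eyebrow movement: 1 or 2).
# Values: number of responses classified as CC, GC and CG (Table 4).
TABLE_4 = {
    ("CC", 1): {"CC": 64, "GC": 41, "CG": 45},
    ("CC", 2): {"CC": 59, "GC": 70, "CG": 21},
    ("GC", 1): {"CC": 34, "GC": 91, "CG": 25},
    ("GC", 2): {"CC": 33, "GC": 90, "CG": 27},
    ("CG", 1): {"CC": 16, "GC": 22, "CG": 112},
    ("CG", 2): {"CC": 16, "GC": 30, "CG": 104},
}


def proportion(context: str, eyebrow_word: int, response: str) -> float:
    """Share of responses of a given type for one context/eyebrow combination."""
    counts = TABLE_4[(context, eyebrow_word)]
    return counts[response] / sum(counts.values())


def eyebrow_shift(context: str, response: str) -> float:
    """Response share with the eyebrow on word 1 minus the share with it on word 2."""
    return proportion(context, 1, response) - proportion(context, 2, response)


for context in ("CC", "GC", "CG"):
    first = eyebrow_shift(context, "CG")   # responses placing focus on the first word
    second = eyebrow_shift(context, "GC")  # responses placing focus on the second word
    print(f"{context}: shift in CG responses {first:+.3f}, "
          f"shift in GC responses {second:+.3f}")
```

A positive shift in CG responses means that an eyebrow movement on the first word goes together with more "focus on the first word" answers; comparing the three prosodic contexts shows where the eyebrow position changes the response pattern most.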
5.3. Results

Table 4 summarizes the results. The total distribution is significantly different from chance: χ² = 292.2, df = 10, p < 0.001. First consider the cases with single pitch accents, i.e., the cases with a single prosodic focus on either the adjective or the noun. Notice that in these cases the majority of subjects indeed perceived the focus on the adjective or the noun respectively, no matter which of the words is accompanied by an eyebrow movement. Subjects are somewhat more likely to classify the cases with the prosodic focus on the adjective correctly than those with prosodic focus on the noun. Certainly for these single prosodic focus cases, the distribution of pitch accents is more important for the perception of focus than the placement of eyebrow movements. This is also reflected by the fact that in the post-experiment interview, all subjects indicated that they paid most (if not all) attention
to information in the auditory channel. Nevertheless, there is an overall effect of eyebrow movements: the distribution obtained with an eyebrow movement on the first word is significantly different from the distribution with a movement on the second word (χ² = 19, df = 8, p < 0.025). Closer inspection of Table 4 reveals that this is primarily due to cases with a double pitch accent. If we compare the cases in which the first word (the adjective “blauw”) is associated with a rapid eyebrow movement with the cases in which the first word is not associated with such a movement, we see that in the former case the focus is perceived on the first word in 45 instances, as opposed to 21 in the latter situation. And, conversely, when we compare the cases in which the second word (the noun “vierkant”) is associated with a rapid eyebrow movement with the cases in which it is not, we see that in the former case a subject classified the noun as being in focus 70 times, as opposed to only 41 times in the latter case. In other words, when the intonation contour provides fewer cues about the focus (since it contains two pitch accents), eyebrow movements have relatively more impact. Overall, the results for the four human voices are similar to the results for the two synthetic voices, albeit that the effect of eyebrow movements is a bit (but not significantly) more pronounced for the synthetic ones. One subject explicitly indicated that she “trusted” the human voices more than the synthetic ones, and thus paid special attention to pitch accents in the former situation.

6. DISCUSSION AND FUTURE WORK

The perceptual approach to intonational phenomena has most strongly been promoted in the so-called IPO school of intonation (’t Hart et al. 1990). The original goal of this approach was to develop a formal metalanguage to describe the intonational properties of Dutch and a few other languages.
Starting from the observation that perception acts as a filter that can stylize the acoustic signal, this enterprise has led to a phonetically explicit specification of a few basic intonational categories, i.e., a limited set of pitch rises and falls, that serve as building blocks out of which larger intonation contours can be constructed. In the current chapter, we have shown that such a perceptual approach is also useful to gain insight into functional aspects of intonation. In particular, we have shown that it helps to comprehend how useful pitch accents are as signals of focus. This research question was tackled from a multilingual and multimodal perspective, applying a particular experimental approach, which consists of a dialogue game to elicit target utterances in different discourse contexts, and a series of perception tests to evaluate the functions of accents in different languages and in different communicative settings.

As to the results of the current study, we have found that the two languages investigated, Dutch and Italian, are markedly different regarding accent patterns inside NPs. In Dutch, it appears that accent patterns are indeed used to mark information status: accent distribution is the main discriminative factor with new and contrastive information generally accented, while given information is deaccented. Study 1 shows that our Dutch listeners are able, in the majority of cases, to reconstruct the prior dialogue utterance on the basis of properties of the current utterance. Italian differs from Dutch in terms of accent structure:
PERCEIVING FOCUS
135
distribution is not a significant factor in this language, since within the elicited NPs both adjective and noun are always accented, regardless of the information status. As a result, it is not surprising that the Italian listeners fail completely to interpret the target utterances in terms of the dialogue history (study 2). As noted in the introduction, Italian, being a non-plastic language, has other means besides prosody of marking information status. For instance, it has a freer word-order than plastic languages such as Dutch, and it is known to exploit this freedom to mark information status. However, the constraints of the experimental paradigm did not offer any room for Italian speakers to use word-order as an indicator of information status. Therefore it would be interesting to look for an experimental set-up in which speakers have more freedom to describe a particular state of affairs. This might also shed a different light on the deaccentuation debate, given that Ladd claims that deaccentuation of complete NPs within a sentence is quite possible in languages like Italian, which is supported by data from previous studies (Avesani, Hirschberg & Prieto 1995, D Imperio 1997, Hirschberg & Avesani 1997). Regarding the outcome of the audiovisual test (study 3), we have found that both auditory (accent distribution) and visual (eyebrow movement) cues can have a significant effect on the perception of focus. However, the effect clearly differs in magnitude; the impact of pitch accents is large, that of rapid eyebrow movements comparatively small. The visual cues contribute more when the auditory cues are inconclusive. Thus, for the condition which caused most confusion in study 1, eyebrows contribute the most in study 3. 
One consequence of the overall dominance of speech is that inconsistent cues go largely unnoticed (although a recent experiment indicates that subjects have a preference for animations in which eyebrow movements coincide with pitch accents, Krahmer et al. 2002b). That the auditory cues appear to be more important for focus perception may, with hindsight, be explained as follows: since human speakers do more with their pitch than with their eyebrows, it is not unnatural that human listeners have learned to pay more attention to changes in pitch than to eyebrow movements. It is interesting to compare the result of study 3 with those of study 1. Since the auditory cues dominate the visual ones, it is no surprise that the results basically confirm the speech-only results of study 1. Nevertheless, there is clearly more confusion in the audio-visual case. In part, the increase in confusion can be ascribed to the presence of the eyebrow movements. Certainly, they account for much of the “confusion” in the cases with a double pitch accent. However, eyebrows cannot account for the slight increase in confusion for the cases with a single pitch accent. It might be that the mere addition of a visual channel leads to more confusion (compare Doherty-Sneddon et al. 2001). As possible follow-up studies, it is useful to investigate real speaker behaviour in natural interactions to gain more insight into possible visual cues. For study 3, use was made of an analysis-by-synthesis technique, creating stimuli whose visual properties were systematically varied to learn more about the relative effect of this parameter on focus perception. While the manipulations were inspired by claims in the literature, it would be valuable to supplement the current results with observations of real speakers to see whether they indeed use eyebrow movements for signaling focus as suggested here, or whether these mainly signal other types of
information, if any. It would also be highly interesting to see what happens with Talking Heads for non-Germanic languages such as Italian. As shown above, the results of study 2 reveal that Italian listeners systematically fail to correctly classify the Italian utterances in terms of dialogue history when confronted with speech-only stimuli. We are currently planning to do the dialogue reconstruction experiment with an Italian Talking Head lifting its eyebrows on either the first (“triangolo”) or the second word (“nero”). We would expect that rapid eyebrow movements have more impact for the Italian head than for the Dutch one, since the auditory cues are less informative for Italian than for Dutch. This would be in line with one of the findings of study 3, that eyebrow movements become more important when pitch cues are less clear.4
(1) Tilburg University, Communication & Cognition
(2) Antwerp University, Center for Dutch Language and Speech
7. NOTES
1 This chapter presents an overview of our work on the perception of focus, a research topic that we have been involved with since 1998. The studies focusing on the dialogue reconstruction for Dutch and Italian are presented in more detail in Swerts et al. (2002). A preliminary version of the third, audiovisual study is described in Krahmer et al. (2002a). Thanks are due to our colleagues Cinzia Avesani, Zsófia Ruttkay and Wieger Wesselink for their help in carrying out these studies.
2 Superficially, newness accents and contrastive accents appear to differ in our data, but a closer look reveals that this is not the case. In particular, at first sight it seems that (1) single contrastive items on the adjective (CG) have a different shape from newness accents in the same position and (2) contrastive items are judged to be more prominent than newness accents. However, (1) the difference in accent type is not so much associated with a contrast-specific prosodic shape but with the occurrence of a nuclear accent in a non-default position. And (2) the perceived prominence is not so much the result of inherent melodic properties of contrastive accents but seems due to the fact that the prosodic context does not contain other intonationally comparable pitch peaks. When the words are presented in isolation, contrastive accents are not perceived as more prominent than newness accents.
3 The results for the eighth speaker were just above the significance threshold. This was due to the fact that his CC utterance was often classified as CG. There is no obvious reason for this. In any case, it is hard to see how this can be related to information status.
4 POSTSCRIPT (2004) Since the first version of this chapter was written (2002), both follow-up studies mentioned in the discussion have been carried out. Swerts and Krahmer (2004) report on a production experiment in which subjects were asked to pronounce short utterances with one syllable marked for focus. When the audio-visual recordings were analysed, it was indeed found that subjects may use eyebrow movements to signal focus, but various other cues were found, of which head movement and visual articulatory emphasis were the strongest. Krahmer and Swerts (2004) describe a series of experiments with an Italian Talking Head. Contrary to our expectations, Italian subjects made less functional use of eyebrow movements than Dutch subjects. In general, we found a number of interesting differences between subjects’ evaluation of Dutch and Italian Talking Heads, but all of these could be reduced to prosodic differences between the two languages.
8. REFERENCES
Avesani, C. “I Toni della RAI. Un Esercizio di Lettura Intonativa.” In Gli Italiani Trasmessi: la Radio, pp. 659-727. Firenze: Accademia della Crusca, 1997.
Avesani, C., J. Hirschberg, and P. Prieto. “The Intonational Disambiguation of Potentially Ambiguous Utterances in English, Italian and Spanish.” Proceedings of the 13th International Congress of Phonetic Sciences, pp. 174-177, 1995.
Birdwhistell, R. Kinesics and Context. Philadelphia: University of Pennsylvania Press, 1970.
Bolinger, D. Intonation and its Parts. London: Edward Arnold, 1986.
ten Bosch, L. On the Structure of Vowel Systems. Aspects of an Extended Vowel Model Using Effort and Contrast. University of Amsterdam: Doctoral dissertation, 1991.
Cassell, J., H. Vilhjálmsson, and T. Bickmore. “BEAT: the Behavior Expression Animation Toolkit.” Proceedings of SIGGRAPH'01, Los Angeles, CA, pp. 477-486, 2001.
Cavé, C., I. Guaïtella, R. Bertrand, S. Santi, F. Harlay, and R. Espesser. “About the Relationship between Eyebrow Movements and F0 Variations.” Proceedings of the International Conference on Spoken Language Processing (ICSLP), Philadelphia, pp. 2175-2179, 1996.
Condon, W. “An Analysis of Behavioral Organization.” Sign Language Studies 13 (1976): 285-318.
Cruttenden, A. “The De-accenting and Re-accenting of Repeated Lexical Items.” Proceedings of the ESCA Workshop on Prosody, Lund, pp. 16-19, 1993.
Doherty-Sneddon, G., L. Bonner, and V. Bruce. “Cognitive Demands of Face Monitoring: Evidence for Visuospatial Overload.” Memory and Cognition 29.7 (2001): 909-919.
Gussenhoven, C. “Testing the Reality of Focus Domains.” Language and Speech 26 (1983): 61-80.
't Hart, H., R. Collier, and A. Cohen. A Perceptual Study of Intonation: An Experimental-Phonetic Approach to Speech Melody. Cambridge: Cambridge University Press, 1990.
Hirschberg, J. and C. Avesani. “The Role of Prosody in Disambiguating Potentially Ambiguous Utterances in English and Italian.” Proceedings of the ESCA Workshop on Intonation, Athens, pp. 189-192, 1997.
D'Imperio, M. “Narrow Focus and Focal Accent in the Neapolitan Variety of Italian.” Proceedings of the ESCA Workshop on Intonation, Athens, pp. 87-90, 1997.
Krahmer, E. and M. Swerts. “On the Alleged Existence of Contrastive Accents.” Speech Communication 34 (2001): 391-405.
Krahmer, E. and M. Swerts. “More about Brows.” In Zs. Ruttkay and C. Pelachaud (eds.), Evaluating ECAs. Dordrecht: Kluwer Academic Publishers, 2004.
Krahmer, E., Zs. Ruttkay, M. Swerts, and W. Wesselink. “Pitch, Eyebrows, and the Perception of Focus.” Proceedings of Speech Prosody, Aix-en-Provence, pp. 443-446, 2002a.
Krahmer, E., Zs. Ruttkay, M. Swerts, and W. Wesselink. “Perceptual Evaluation of Audiovisual Cues for Prominence.” Proceedings of the International Conference on Spoken Language Processing (ICSLP), Denver, CO, pp. 1933-1936, 2002b.
Ladd, D. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Morgan, B. “Question Melodies in American English.” American Speech 2 (1953): 181-191.
Pelachaud, C., N. Badler, and M. Steedman. “Generating Facial Expressions for Speech.” Cognitive Science 20 (1996): 1-46.
Pierrehumbert, J. and J. Hirschberg. “The Meaning of Intonational Contours in the Interpretation of Discourse.” In P. Cohen, J. Morgan, and M. Pollack (eds.), Intentions in Communication, pp. 342-365. Cambridge MA: MIT Press, 1990.
Pitrelli, J.F., M. Beckman, and J. Hirschberg. “Evaluation of Prosodic Transcription Labeling Reliability in the ToBI Framework.” Proceedings of the International Conference on Spoken Language Processing (ICSLP), Yokohama, Japan, pp. 123-126, 1994.
Rump, H.H. and R. Collier. “Focus Conditions and the Prominence of Pitch-Accented Syllables.” Language and Speech 39 (1996): 1-15.
Ruttkay, Zs., P. ten Hagen, and H. Noot. “CharToon; a System to Animate 2D Cartoon Faces.” Proceedings Eurographics, 1999.
Steedman, M. “Information Structure and the Syntax-Phonology Interface.” Linguistic Inquiry 31.4 (2000): 649-689.
Swerts, M. and E. Krahmer. “Congruent and Incongruent Audiovisual Cues to Prominence.” Proceedings of Speech Prosody, Nara, Japan, 2004.
Swerts, M., E. Krahmer, and C. Avesani. “Prosodic Marking of Information Status in Dutch and Italian: A Comparative Analysis.” Journal of Phonetics 30.4 (2002): 629-654.
Vallduví, E. The Informational Component. University of Pennsylvania: Doctoral dissertation, 1990.
MANFRED KRIFKA
THE SEMANTICS OF QUESTIONS AND THE FOCUSATION OF ANSWERS*
1. INTRODUCTION
In Krifka (2001) I argued that three distinct phenomena of question semantics – alternative questions like Did it rain or not?, multiple constituent questions with pair-list readings like Who bought what?, and the focus patterns of answers to constituent questions – cannot be dealt with adequately within the framework of Alternative Semantics. In Krifka (to appear) I argue that Alternative Semantics is also problematic as a framework for focus semantics in general; in particular, it makes wrong predictions in case focus occurs in syntactic islands. In this paper I will take up an issue of Krifka (2001) again, concentrating specifically on focus patterns in answers to constituent questions. Büring (2002) argued that the discussion of phenomena in Krifka (2001) was inconclusive, and that Alternative Semantics actually does not have problems with the data put forward there. I agree with the first point, but I will also show that on closer inspection, Alternative Semantics does not predict the correct patterns of answer focus. I will also show that the same holds for the theory of Schwarzschild (1999), which works with Givenness instead of a semantic notion of Focus. The Structured Meaning theory, on the other hand, does not have these problems.
2. ALTERNATIVE SEMANTICS FOR QUESTIONS AND ANSWERS
I will start by summarizing the essentials of the Alternative Semantics approach to the meaning of questions and the corresponding focus of answers. The crucial idea is that the meaning of a question is the set of propositions that answer the question. It goes back to Hamblin (1958, 1973); Karttunen (1977) proposed a variant of it, and Groenendijk & Stokhof (1984) developed a version that is quite different with respect to what questions mean and how question meanings are derived compositionally. The original version of Hamblin, which is also the one assumed by Rooth (1992), can be illustrated with the following examples:
139 C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 139–150. © 2007 Springer.
(1) [[Which student read Ulysses?]]
= { p | ∃x[STUDENT(x) ∧ p = λi[READ(i)(ULYSSES)(x)]] }
equivalently, { λi[READ(i)(ULYSSES)(x)] | STUDENT(x) }
(2) [[Which novel did John read?]]
= { λi[READ(i)(y)(JOHN)] | NOVEL(y) }
(3) [[Which student read which novel?]]
= { λi[READ(i)(y)(x)] | STUDENT(x), NOVEL(y) }
I represent propositions as functions from indices (possible worlds or times) i to truth values. Predicates are dependent on indices; I make the simplifying assumption that arguments are independent of indices. I also assume for simplification that noun meanings are independent of indices. The meaning of the question Which student read Ulysses? then is the set of propositions that can be described as ‘x read Ulysses’, where x ranges over the set of students, cf. (1). This representation of question meanings predicts that certain assertions are possible answers, whereas others are not. This is the criterion for congruent question-answer pairs (to be extended later):
(4) A question-answer pair Q – A is congruent iff [[A]] ∈ [[Q]].
As an example, consider the following assertions as answers to (1). (5.a) is a felicitous answer; its meaning is an element of the meaning of (1). (5.b,c) are infelicitous answers, as their meanings are not elements of the meaning of (1).
(5) a. [[John read Ulysses.]] = λi[READ(i)(ULYSSES)(JOHN)], where STUDENT(JOHN).
b. [[John read Moby-Dick.]] = λi[READ(i)(MOBY-DICK)(JOHN)], where STUDENT(JOHN).
c. [[Jill read Ulysses.]] = λi[READ(i)(ULYSSES)(JILL)], where ¬STUDENT(JILL).
The criterion that the meaning of the answer must be an element of the meaning of the question is too crude to exclude answers that may express the right proposition but whose prosody does not fit the question. The generalization is that the position of the main accent must correspond to the wh-element of the question (see Paul 1891 [1880]). With Jackendoff (1972) and many others, I assume that the main accent is determined by a focus feature F in syntax. In modern terminology, we can rephrase Paul’s observation as: The F-feature of the answer must correspond to the wh-constituent of the question. This is illustrated by the following question-answer pairs.
(6) a. Which student read Ulysses?
b. JóhnF read Ulysses. / *John read UlýssesF.
(7) a. Which novel did John read?
b. John read UlýssesF. / *JóhnF read Ulysses.
(8) a. Which student read which novel?
b. JóhnF read UlýssesF (and MáryF read Moby-DíckF).
In the theory of Alternative Semantics, such correspondences have been captured as follows (cf. von Stechow (1990), Rooth (1992)). Focus introduces alternatives to regular meanings; if [[α]] is the regular meaning of an expression α, then [[α]]A is the set of its alternative meanings. The two answers in (6.b) and (7.b) do not differ in their regular meanings, but in their alternatives. In the following example, De is the domain of entities (individuals).
(9) a. [[John read UlýssesF.]] = [[JóhnF read Ulysses.]] = λi[READ(i)(ULYSSES)(JOHN)]
b. [[John read UlýssesF.]]A = { λi[READ(i)(x)(JOHN)] | x ∈ De }
c. [[JóhnF read Ulysses.]]A = { λi[READ(i)(ULYSSES)(x)] | x ∈ De }
Felicitous question-answer pairs must satisfy the extended congruence criterion, which says that the meaning of the question must be a subset of the alternatives of the answer:
(10) A question-answer pair Q – A is congruent iff
i. [[A]] ∈ [[Q]]
ii. [[Q]] ⊆ [[A]]A.
The second clause of this congruence criterion, (10.ii), excludes answers with focus in the wrong place, like the infelicitous answers of (6) and (7). To see this, consider example (6): (11)
Which student read Ulysses? – JóhnF read Ulysses. Well-formed, as { λi[READ(i)(ULYSSES)(x)] | STUDENT(x)} ⊆ { λi[READ(i)(ULYSSES)(x)] | x ∈ De}
(12) Which student read Ulysses? – *John read UlýssesF. Not well-formed, as { λi[READ(i)(ULYSSES)(x)] | STUDENT(x)} ⊄ { λi[READ(i)(y)(JOHN)] | y ∈ De}
The question meaning and the alternative set of the answer in (12) share one proposition, namely the proposition λi[READ(i)(ULYSSES)(JOHN)], but the question meaning is not a subset of the alternative set of the answer. The congruence criterion also predicts that answers must be focus marked, as otherwise the alternative meaning is reduced to a singleton set, and the subset requirement cannot be satisfied: (13)
Which student read Ulysses? – *John read Ulysses. Not well-formed, as { λi[READ(i)(ULYSSES)(x)] | STUDENT(x)} ⊄ { λi[READ(i)(ULYSSES)(JOHN)] }
In general, the congruence criterion (10) ensures that there is enough focus marking. For example, it rules out question-answer pairs like (14) but allows for question-answer pairs like (15): (14)
Which student read which novel? – *JóhnF read Ulysses. Not well-formed, as { λi[READ(i)(y)(x)] | STUDENT(x), NOVEL(y)} ⊄ { λi[READ(i)(ULYSSES)(x)] | x ∈ De}
(15) Which student read which novel? – JóhnF read UlýssesF. Well-formed, as { λi[READ(i)(y)(x)] | STUDENT(x), NOVEL(y)} ⊆ { λi[READ(i)(y)(x)] | x, y ∈ De}
But it is evident that the congruence criterion, as it stands, does not rule out too much focus marking. For example, it allows for infelicitous question-answer relations as in (16):
(16) Which student read Ulysses? – *JóhnF read UlýssesF. But we have: { λi[READ(i)(ULYSSES)(x)] | STUDENT(x)} ⊆ { λi[READ(i)(y)(x)] | x, y ∈ De }
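The behaviour of criterion (10) on examples (11), (12), (13) and (16) can be checked mechanically in a small finite model. The following is only a sketch under stated assumptions: propositions are represented as symbolic tuples rather than functions from indices, De is replaced by a five-element toy domain, and all names (`ENTITIES`, `STUDENTS`, `prop`, `alternatives`, `congruent`) are illustrative and not part of the original analysis.

```python
from itertools import product

# Toy stand-in for the entity domain De (an assumption for illustration).
ENTITIES = {"john", "mary", "jill", "ulysses", "moby_dick"}
STUDENTS = {"john", "mary"}


def prop(subj, obj):
    """Symbolic stand-in for the proposition λi[READ(i)(obj)(subj)]."""
    return ("READ", obj, subj)


def alternatives(subj, obj, focus):
    """Alternative set of 'subj read obj'; F-marked positions range over ENTITIES."""
    subjects = ENTITIES if "subj" in focus else {subj}
    objects = ENTITIES if "obj" in focus else {obj}
    return {prop(s, o) for s, o in product(subjects, objects)}


def congruent(question, subj, obj, focus):
    """Criterion (10): (i) [[A]] ∈ [[Q]] and (ii) [[Q]] ⊆ [[A]]A."""
    answer = prop(subj, obj)
    return answer in question and question <= alternatives(subj, obj, focus)


# [[Which student read Ulysses?]] as in (1)
QUESTION = {prop(x, "ulysses") for x in STUDENTS}

results = {
    "(11) subject focus": congruent(QUESTION, "john", "ulysses", {"subj"}),
    "(12) object focus": congruent(QUESTION, "john", "ulysses", {"obj"}),
    "(13) no focus": congruent(QUESTION, "john", "ulysses", set()),
    "(16) double focus": congruent(QUESTION, "john", "ulysses", {"subj", "obj"}),
}
for label, ok in results.items():
    print(label, ok)
```

In this toy model the overfocused answer (16) passes both clauses, just like the felicitous answer (11), which is the problem discussed in the text.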
This was the major point of criticism in Krifka (2001). In that paper, I also considered other possible congruence criteria within Alternative Semantics that assume additional restrictions of the alternatives introduced by focus and by wh-elements, but I concluded that they could not systematically exclude overfocused or underfocused answers. 3. A PREFERENCE FOR MINIMAL FOCUS? Büring (2002) proposes that we can exclude overfocused answers by requiring in addition that focus be minimized. This option I dismissed in Krifka (2001) without appropriate discussion, and I will turn to it here. The proposal is incorporated in the following revised extended congruence criterion: (17)
A question-answer pair Q – A is congruent iff
i. [[A]] ∈ [[Q]]
ii. [[Q]] ⊆ [[A]]A
iii. There is no A′ that is like A except that A′ has less focus marking than A and still satisfies (i) and (ii).
With this congruence criterion, the problematic example (16) can be ruled out. To see this, consider the following three potential answers to the question Which student read Ulysses? and their alternative sets. (18)
Which student read Ulysses?
{ λi[READ(i)(ULYSSES)(x)] | STUDENT(x) }
a. *JóhnF read UlýssesF.
{ λi[READ(i)(y)(x)] | x, y ∈ De }
b. JóhnF read Ulysses.
{ λi[READ(i)(ULYSSES)(x)] | x ∈ De }
c. *John read Ulysses.
{ λi[READ(i)(ULYSSES)(JOHN)] }
All answers satisfy clause (i) of the congruence criterion. Answers (a) and (b) also satisfy clause (ii), as [[(18)]] ⊆ [[(18.a)]]A and [[(18)]] ⊆ [[(18.b)]]A, but answer (c) is ruled out by it, as [[(18)]] ⊄ [[(18.c)]]A. Clause (iii) rules out answer (a), as it has more focus marking than (b): Where (a) has two F markings, (b) only has one. The underlying idea is that focus marking has to be used sparingly, to achieve the required purpose of ensuring that the question meaning is a subset of the alternative meanings of the answer. This could plausibly be modelled within optimality theory by two constraints: A higher ranked one that requires the focus marking to capture the meaning of the question (that is, [[Q]] ⊆ [[A]]A), and a lower ranked one that prefers minimal focus marking.
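The effect of clause (iii) on the three answers in (18) can be sketched in the same kind of toy model. The code below is an illustration under explicit assumptions: propositions are symbolic tuples, De is a small finite set, and "less focus marking" is interpreted as "a proper subset of the F-marked positions", which is one possible reading for cases like (18) only. All identifiers are hypothetical.

```python
from itertools import combinations, product

# Toy stand-in for the entity domain De (an assumption for illustration).
ENTITIES = {"john", "mary", "jill", "ulysses", "moby_dick"}
STUDENTS = {"john", "mary"}


def prop(subj, obj):
    """Symbolic stand-in for the proposition λi[READ(i)(obj)(subj)]."""
    return ("READ", obj, subj)


def alternatives(subj, obj, focus):
    """Alternative set of 'subj read obj'; F-marked positions range over ENTITIES."""
    subjects = ENTITIES if "subj" in focus else {subj}
    objects = ENTITIES if "obj" in focus else {obj}
    return {prop(s, o) for s, o in product(subjects, objects)}


def satisfies_i_and_ii(question, subj, obj, focus):
    """Clauses (i) and (ii) of criterion (17)."""
    return prop(subj, obj) in question and question <= alternatives(subj, obj, focus)


def congruent_17(question, subj, obj, focus):
    """Criterion (17), reading 'less focus marking' as a proper subset of F-marks."""
    if not satisfies_i_and_ii(question, subj, obj, focus):
        return False
    marks = sorted(focus)
    # Clause (iii): no answer with strictly fewer F-marks may pass (i) and (ii).
    for size in range(len(marks)):
        for fewer in combinations(marks, size):
            if satisfies_i_and_ii(question, subj, obj, set(fewer)):
                return False
    return True


QUESTION = {prop(x, "ulysses") for x in STUDENTS}

verdicts = {
    "(18.a)": congruent_17(QUESTION, "john", "ulysses", {"subj", "obj"}),
    "(18.b)": congruent_17(QUESTION, "john", "ulysses", {"subj"}),
    "(18.c)": congruent_17(QUESTION, "john", "ulysses", set()),
}
for label, ok in verdicts.items():
    print(label, ok)
```

Under this reading only (18.b) survives. The subset interpretation says nothing about how a broad focus compares with several narrow foci, which is the gap taken up next.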
Extending the congruence criterion by clause (iii) is a promising move, but notice that (iii) contains a notion that is undefined so far, namely “less focus marking”. It is clear what less focus marking means when comparing sentences like (18.a) and (b): In (a), there is an additional focus feature that (b) lacks, and in this sense (b) shows less focus marking. But there are cases in which it is not so clear what less focus marking should mean. In particular, we should consider cases of broad and narrow focus, and compare them with cases of more or less focus. Let us start with the following case, in which the question asks for an activity, indicated by the verb do. I again specify the meaning of the question and the alternative meanings of potential answers. (19)
What did John do?
{ λi[P(i)(JOHN)] | P ∈ Dset, P: activity }
a. John [read Ulýsses]F.
{ λi[P(i)(JOHN)] | P ∈ Dset }
b. *John [read UlýssesF].
{ λi[READ(i)(y)(JOHN)] | y ∈ De }
c. *John réadF Ulysses.
{ λi[R(i)(ULYSSES)(JOHN)] | R ∈ Dseet }
d. *[John read Ulýsses]F.
{ λi[p(i)] | p ∈ Dst }
The VP question (19) asks for any property of John that is an activity. Here, Dset is the domain of functions from indices to functions from entities to truth values, that is, the domain of properties, type set (or, in another notation, ⟨s, ⟨e, t⟩⟩), and Dseet is the domain of relations-in-intension, type seet. If the answer is formed with a transitive verb, as in (19.a), the accent on the object NP marks focus on the whole VP, a case of so-called focus projection or accent percolation. The answer (b) with object NP focus, which happens to be homophonous with (a), is infelicitous. The same holds for answers like (c), with focus on the transitive verb. Answer (d) is also infelicitous; it would be felicitous in the context of a question like What happened? Again, the marking is similar to (a) by focus projection, with the main accent on the direct object. Obviously, all answers satisfy clause (i) of the congruence criterion (17). Answers (b) and (c) are ruled out by clause (ii), as we have [[(19)]] ⊄ [[(19.b)]]A and [[(19)]] ⊄ [[(19.c)]]A. The question asks for activities of John in general; the alternatives of the answer are restricted to reading activities by John and to relations of John to Ulysses, respectively. Answers (a) and (d) satisfy clause (ii), as we have [[(19)]] ⊆ [[(19.a)]]A and [[(19)]] ⊆ [[(19.d)]]A. Answer (d) should then be excluded by clause (iii) if we interpret “less” focus marking as meaning “more narrow” focus marking when two expressions are compared that differ only insofar as one has a broader focus marking than the other.
Consider now the following multiple constituent question and two potential answers. (20)
What did John do with which novel?
{ λi[R(i)(y)(JOHN)] | R ∈ Dseet, R: activity, NOVEL(y) }
a. John réadF UlýssesF (... and críticizedF [Finnegan’s Wáke]F)
{ λi[R(i)(y)(JOHN)] | R ∈ Dseet, y ∈ De }
b. *John [read Ulýsses]F
{ λi[P(i)(JOHN)] | P ∈ Dset }
Multiple wh-questions are often supposed to be answered by a list answer, a fact that I will disregard here. In the appropriate answer, each wh-element of the question corresponds to a focus of the answer, cf. (20.a). This satisfies clause (ii); we have [[(20)]] ⊆ [[(20.a)]]A. Answer (20.b) is not felicitous, even though we have [[(20)]] ⊆ [[(20.b)]]A. Can (20.b) be ruled out by clause (iii) of the congruence criterion? We have to decide what counts as less focusation: While (20.a) has more focus features, (20.b) has a broader focus. If we want to keep up our general hypothesis, then we must assume that broad focus is worse than having more foci: (21)
When two answers A and A′ compete, where both expressions are equal except that A has more but smaller foci, and A′ has fewer but broader foci, A is to be preferred over A′.
Consider now again question (19), repeated here, and two potential answers: (22)
What did John do?
{ λi[P(i)(JOHN)] | P ∈ Dset, P: activity }
a. John [read Ulýsses]F.
{ λi[P(i)(JOHN)] | P ∈ Dset }
b. *John réadF UlýssesF.
{ λi[R(i)(y)(JOHN)] | R ∈ Dseet, y ∈ De }
Notice that (22.a) is a good answer but (22.b) is infelicitous. Both answers satisfy clause (ii) of the congruence criterion. In particular, answer (22.b) does, as we have [[(22)]] ⊆ [[(22.b)]]A. To see this, we have to prove that each element of [[(22)]] is also an element of [[(22.b)]]A. Take p to be an arbitrary element of [[(22)]]. This means that p can be expressed as λi[P1(i)(JOHN)], where P1 is some constant of type set. Now we can take an arbitrary constant y2 of type e and define a constant R2 of type seet as follows: R2 := λiλyλx[P1(i)(x)]. Then R2(i)(y2)(JOHN) = P1(i)(JOHN) for every index i, so we can express p as λi[R2(i)(y2)(JOHN)], and hence we have p ∈ [[(22.b)]]A. As the choice of p was arbitrary, we have [[(22)]] ⊆ [[(22.b)]]A, q.e.d.1
The proof goes through if the choice of R2 is totally unrestricted, that is, R2 is an arbitrary element of Dseet. This might be criticized; we might only allow “natural” relations. But, first, it is difficult to determine what “natural” relations are. And secondly, restricting the domain of focus alternatives easily leads to situations in which it is no longer guaranteed that the question meaning is a subset of the alternatives of the answer; it might be just the other way round. See Krifka (2001) for a discussion of alternative congruence criteria and their problems with excluding over- and underfocused answers. Can clause (iii) of the congruence criterion (17) decide between the two answers? Yes, it can, but if we follow the preference rule (21) then it selects, incorrectly, (22.b) over (22.a). And if we change the preference rule so that more foci are dispreferred over broader foci, then clause (iii) would select, incorrectly, (20.b) over (20.a). This means that the preference rule for less focusation cannot be spelled out in a general way so that it always identifies the felicitous answer.
4. GIVENNESS AS AN ALTERNATIVE?
Büring (2002) also suggested switching to the theory of Schwarzschild (1999) as a generally more adequate theory of the distribution of sentence accents. In particular, Schwarzschild assumes a rule of focus avoidance that is, in essence, the same as the preference rule for minimal focusation expressed by (17.iii). Schwarzschild (1999) follows Selkirk (1984) in assuming that focus on the larger constituent is licensed by focus projection. The general rule is that focus on an argument licenses focus on the head, and focus on the head licenses focus on the whole constituent. This is how VP focus is generated, step by step: (23)
a. John [read UlýssesF]. (focus licensed by accent)
b. John [readF UlýssesF]. (focus of head licensed by focus on argument)
c. John [readF UlýssesF]F. (focus on VP licensed by focus on head)
According to this theory, VP focus in John read Ulýsses contains three focus features. In contrast, multiple focus on the transitive verb and the object NP only contains two focus features:
(24) John réadF UlýssesF.
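The feature counts for (23.c) and (24) can be tallied on a minimal tree representation. This is only an illustrative sketch: the `Node` class and `count_f` function are hypothetical helpers, and the trees encode nothing beyond the F-marks shown in the two examples.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A constituent with an optional F-feature (hypothetical helper type)."""
    label: str
    f_marked: bool = False
    children: List["Node"] = field(default_factory=list)


def count_f(node: Node) -> int:
    """Count the F-features in the constituent rooted at `node`."""
    return int(node.f_marked) + sum(count_f(child) for child in node.children)


# (23.c) John [readF UlýssesF]F: VP focus derived by projection
vp_focus = Node("VP", True, [Node("read", True), Node("Ulysses", True)])

# (24) John réadF UlýssesF: separate foci on verb and object NP
v_and_np_focus = Node("VP", False, [Node("read", True), Node("Ulysses", True)])

print(count_f(vp_focus), count_f(v_and_np_focus))
```

Counted this way, the projected VP focus carries one F-feature more than the double focus, so any rule that simply minimizes F-features favours (24).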
Hence this theory makes a clear prediction for cases in which VP focus and V focus + NP focus are to be compared. VP focus as in (23.c) contains more focus marking than multiple focus on the verb and on the object NP as in (24). Consequently, everything else being equal, (24) should be preferred over (23.c), and in general having
more foci should be preferred over having broader foci. This gives us the correct prediction for (20) but the false one for (22). Schwarzschild’s theory adds to Selkirk’s rule of recursive F-marking the following assumptions: (25)
If a constituent α is not F-marked, then α is Given.
(26)
Avoid F-marking.
The notion of Givenness is defined as follows: (27)
An utterance α is Given iff it has a salient antecedent β, and
i. if α denotes an entity, α and β corefer,
ii. or, modulo existential type shifting, β entails the existential F-closure of α.
To see how this is supposed to work, consider the following example: (28)
Q: Who did John’s mother praise?
A: She praised HIMF.
F-marking on him is allowed, even though the pronoun has a salient antecedent, John. Why is this so? Existential type shifting of the question Q gives us the proposition ∃x[PERSON(x) ∧ PRAISE(x)(MOTHER(JOHN))], for which I will write ∃Q, for short. The existential F-closure of the answer A is what we get when we replace the focus, if there is any, by a variable which is bound by an existential quantifier with wide scope. In the case at hand, this is ∃x[PRAISE(x)(MOTHER(JOHN))]. Note that this is entailed by ∃Q. This means that the sentence She praised HIMF is Given. Similarly, the VP praised HIMF is Given, as its existential F-closure, ∃y∃x[PRAISE(x)(y)], is also entailed by ∃Q. The object noun phrase HIMF is also Given, as it has an antecedent, John’s. Now, (25) allows for F-marked constituents that are Given, and so it allows that HIMF is F-marked. But (26) says that F-marking should be avoided if possible. Can F-marking on HIMF be avoided? No, because then the existential F-closure of the sentence without focus marking, She praised him, is PRAISE(JOHN)(MOTHER(JOHN)) (notice that there is no existential closure because there is no F-marking), and this is not entailed by ∃Q. But the projection of F-marking as in She [praisedF HIMF]F can be avoided, as it is not necessary to ensure that the resulting existential F-closure ∃P[P(MOTHER(JOHN))] is entailed by ∃Q. This is already achieved by less focus
marking, on HIMF. For similar reasons, additional focus marking as in SHEF praised HIMF, is not necessary, and hence avoided. Schwarzschild’s account generally prefers narrow foci over broad foci, and few foci over many. But as we have already seen, this makes wrong predictions. Consider the following case again: (29)
What did John do?
a. He [readF UlýssesF]F.
b. *He [réadF UlýssesF].
The existential closure of the question is ∃P[P(JOHN)].2 This entails all the possible focus closures of (29.a), which are ∃P[P(JOHN)], ∃R[R(ULYSSES)(JOHN)], ∃x[READ(x)(JOHN)] and ∃R∃x[R(x)(JOHN)]. But it also entails the focus closure of (29.b), which is ∃R∃x[R(x)(JOHN)]. As (29.b) has less focus marking according to Selkirk, it should be preferred, but contrary to the theory, it is not.
5. THE STRUCTURED MEANING ACCOUNT
In concluding, let me point out that the Structured Meaning account of questions and answers has no problems with over- or underfocused answers. The central idea is that questions have functional interpretations:
(30) a. [[Which student read Ulysses?]] = λx∈STUDENT λi[READ(i)(ULYSSES)(x)]
b. [[Which novel did John read?]] = λy∈NOVEL λi[READ(i)(y)(JOHN)]
c. [[Which student read which novel?]] = λ⟨x,y⟩∈STUDENT × NOVEL λi[READ(i)(y)(x)]
Focus in answers leads to a background-focus structure that can be presented as a pair:
(31) a. [[JóhnF read Ulysses.]] = ⟨λxλi[READ(i)(ULYSSES)(x)], JOHN⟩
b. [[John read UlýssesF.]] = ⟨λyλi[READ(i)(y)(JOHN)], ULYSSES⟩
c. [[JóhnF read UlýssesF]] = ⟨λ⟨x,y⟩ λi[READ(i)(y)(x)], ⟨JOHN, ULYSSES⟩⟩
The obvious congruence criterion in this representation is that the question meaning should correspond to the background of the answer, in the sense that the question meaning differs from the background of the answer only insofar as it might have more restricted domains. I will write f ⊆ g if the function f is like g except that the domain(s) of the argument(s) of g may be larger. In addition, the focus must be an element of the domain of the question.
(32) A question-answer pair Q – A with meanings [[Q]] and [[A]] = ⟨B, F⟩ is congruent iff:
i. [[Q]] ⊆ B
ii. F ∈ DOM([[Q]])
Clearly, the question-answer pairs (30.a) – (31.a), (30.b) – (31.b) and (30.c) – (31.c) are congruent. We also find that the problematic cases considered above are treated in the expected way. First, consider the following two questions:
(33) [[What did John do?]] = λP ∈ [Dset ∩ Activities] λi[P(i)(JOHN)]
(34) [[What did John do with which novel?]] = λR ∈ [Dseet ∩ Activities] λy ∈ NOVEL λi[R(i)(y)(JOHN)]
Now, consider the following two answers. For VP focus in (35) I do not assume focus projection in the style of Selkirk; rather, I assume that focus is assigned directly to the VP and expressed by accent on the object NP.
(35) [[John [read Ulýsses]F]] = ⟨λPλi[P(i)(JOHN)], λi[READ(i)(ULYSSES)]⟩
(36) [[John réadF UlýssesF]] = ⟨λRλy λi[R(i)(y)(JOHN)], READ, ULYSSES⟩
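The congruence criterion (32) can be checked mechanically on the pairs (33) – (36). The following Python sketch is only an illustration under simplifying assumptions: the domains are small invented sets, the open formula of each meaning is stored as a string so that "f is like g" reduces to string equality, and the names `Question`, `Answer` and `congruent` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    body: str       # open formula, e.g. "P(i)(JOHN)"
    domains: tuple  # one frozenset per abstracted argument


@dataclass(frozen=True)
class Answer:
    body: str       # open formula of the background B
    domains: tuple  # domains of the background's arguments
    focus: tuple    # focus value(s) F, one per argument


def congruent(q: Question, a: Answer) -> bool:
    # Clause (i): [[Q]] is like B, except that B's domains may be larger.
    if q.body != a.body or len(q.domains) != len(a.domains):
        return False
    if not all(dq <= db for dq, db in zip(q.domains, a.domains)):
        return False
    # Clause (ii): the focus is an element of the question's domain.
    if len(a.focus) != len(q.domains):
        return False
    return all(f in d for f, d in zip(a.focus, q.domains))


# Toy domains (assumed for illustration only).
VP_MEANINGS = frozenset({"READ_ULYSSES", "SLEEP", "KNOW_FRENCH"})
ACTIVITY_VPS = frozenset({"READ_ULYSSES", "SLEEP"})
TV_MEANINGS = frozenset({"READ", "BURN", "OWN"})
ACTIVITY_TVS = frozenset({"READ", "BURN"})
ENTITIES = frozenset({"JOHN", "ULYSSES", "MOBY_DICK"})
NOVELS = frozenset({"ULYSSES", "MOBY_DICK"})

q33 = Question("P(i)(JOHN)", (ACTIVITY_VPS,))
q34 = Question("R(i)(y)(JOHN)", (ACTIVITY_TVS, NOVELS))
a35 = Answer("P(i)(JOHN)", (VP_MEANINGS,), ("READ_ULYSSES",))
a36 = Answer("R(i)(y)(JOHN)", (TV_MEANINGS, ENTITIES), ("READ", "ULYSSES"))

results = {(q, a): congruent(qv, av)
           for q, qv in (("33", q33), ("34", q34))
           for a, av in (("35", a35), ("36", a36))}
print(results)
```

On these toy assumptions the mismatched pairs fail at clause (i), because the backgrounds abstract over different arguments; no separate minimization step is involved.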
Clearly, only the combinations (33) – (35) and (34) – (36) satisfy the congruence criterion (32); no other combinations do. No rule of minimization of focus is called for; wrong focusation leads to a direct violation of clause (i) of the congruence criterion. In conclusion, it appears that the careful consideration of focus in answers to constituent questions argues against the alternative semantics account, and for the structured meaning account, of questions and answers.
Zentrum für Allgemeine Sprachwissenschaft, Typologie, und Universalienforschung Berlin and Humboldt-Universität zu Berlin
6. NOTES
* Thanks to Regine Eckardt, Andreas Haida and Kerstin Schwabe for discussion of the points of this paper, and to Daniel Büring for pointing out problems in the argumentation in Krifka (2001).
1 As a matter of fact, we can also prove that [[(22)]] ⊇ [[(22.b)]]A, that is, the two sets are equal.
2 Or rather, ∃P[P(JOHN) ∧ P: activity], as the question asks for an activity. Then it is actually unclear whether the existential closure of the question entails the existential F-closure of the answer because this does not have to be restricted to activities.
7. REFERENCES
Büring, Daniel. Question-Answer Congruence - Unstructured Comments on Krifka (2001). Berlin: ZAS, 2002.
Groenendijk, Jeroen and Martin Stokhof. Studies on the semantics of questions and the pragmatics of answers, Department of Philosophy, University of Amsterdam: Doctoral Dissertation, 1984.
Hamblin, C. L. “Questions.” The Australasian Journal of Philosophy 36 (1958): 159-168.
Hamblin, C. L. “Questions in Montague English.” Foundations of Language 10 (1973): 41-53.
Jackendoff, Ray. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.
Karttunen, Lauri. “Syntax and Semantics of Questions.” Linguistics and Philosophy 1 (1977): 3-44.
Krifka, Manfred. “For a Structured Account of Questions and Answers.” In Audiatur Vox Sapientiae. A Festschrift for Achim von Stechow, eds. Caroline Féry and Wolfgang Sternefeld, 287-319. Berlin: Akademie-Verlag, 2001.
Krifka, Manfred. “Association with Focus Phrases.” In Valerie Molnar and Susanne Winkler, eds., The architecture of focus. Berlin: Mouton de Gruyter, to appear.
Paul, Hermann. Principles of the History of Language [Prinzipien der Sprachgeschichte]. Translated from the second edition of the original by H. A. Strong. London: Longmans, Green, and Co., 1891 Leipzig, [1880].
Rooth, Mats. “A Theory of Focus Interpretation.” Natural Language Semantics 1 (1992): 75-116.
Schwarzschild, Roger. “GIVENness, AvoidF and other Constraints on the Placement of Accent.” Natural Language Semantics 7 (1999): 141-177.
Selkirk, Elisabeth O. Phonology and Syntax: The Relation between Sound and Structure: Current studies in Linguistics. Cambridge, Mass.: MIT Press, 1984.
von Stechow, Arnim. “Focusing and Backgrounding Operators.” In Werner Abraham, ed., Discourse Particles, 37-84. Amsterdam: John Benjamins, 1990.
CHUNGMIN LEE
CONTRASTIVE (PREDICATE) TOPIC, INTONATION, AND SCALAR MEANINGS
1. INTRODUCTION
In this chapter I will consider Contrastive Topic (CT), Contrastive Predicate Topic (CPT) and Focus in information structure and their relations to intonation and meaning, as I have attempted to account for in a series of papers on related topics1. In particular, I will examine the conventional scalar implicature meanings triggered by CPT and CT in connection with their intonation. In dealing with those phenomena, I will use data extensively from Korean, where CT is surprisingly clearly marked morphologically and intonationally, in comparison with data from English. Information structure, claimed to constitute a separate component from phonological, syntactic and semantic components (Vallduvi 1992), consists basically of Topic – Comment or Background – Focus information. Whether or not it constitutes a separate component in grammar, no one can deny that it is closely interwoven with morphological structure (particularly in Korean and Japanese), syntactic linear and hierarchical structure, semantic structure, and prosodic phonological structure. That is why we came to organize the present workshop and create a volume on Topic and Focus in connection with their meaning and intonation. Recently the phenomenon of CT in particular has been well characterised. Through this kind of common effort we believe we can deepen our understanding of the underlying principles governing related issues cross-linguistically. The organization of the chapter is as follows: In Section 2, Contrastive Topic is distinguished from non-contrastive Topic and from list contrastive topics, which do not leave an implicature; CT is examined in a dialogue model and the notion of sum is considered; Korean CT is shown on pitch tracks. In Section 3, scalar meanings are analyzed; type-subtype scalarity and subtype scalarity are distinguished, and CT’s inherent tendency toward subtype scalarity even in entities is advocated. In Section 4, scope relations between scope bearers and CT and CT’s narrow-scope nature are discussed, together with the non-narrow-scope topicalization effect. In Section 5, Contrastive Predicate Topic and the scope relation between CT and REASON clauses are explored. Section 6 concludes the chapter.
151 C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 151–175. © 2007 Springer.
2. ASPECTS OF CONTRASTIVE TOPIC
2.1. Topic
We can view an utterance from a Topic perspective and get a Topic – Comment structure, as follows (Topic here being a non-contrastive Topic):
(1) [Water]Topic [consists of oxygen and hydrogen]Comment.
(2) [kumsok hwalca]Top -nun [hankwukin-i palmyenghay-ss-ta]Comment
    metal type -TOP Koreans-NOM invent-PAST-DEC
    ‘As for the metallic type, Koreans invented it.’
(3) Inswu -nun sosel chayk -ul sa-ss-e-yo
          -TOP novel book-ACC buy-PAST-DEC(POLITE)
    ‘Inswu bought a novel.’ (to the question “What did Inswu buy?”)
Typically, a non-contrastive Topic is given, presupposed, or anchored in the speech situation. It is something that is talked about by the Comment (or often predicate) and lacks contrastiveness and is located at the initial, prominent position of a sentence, with -nun (Korean) or -wa (Japanese) marking, though a null Topic or bare nominal Topic without a Topic marker is possible, unaccented. The natural kind in (1) and the artifact kind in (2) from an underlying object, as nominals in common ground, both quantificational and proper name-like (though not placed in Prince’s 1989 or Gundel et al.’s familiarity or givenness hierarchies), as well as the previously mentioned proper name in (3), function as Topics, being talked about by the following Comment. The notion of unmarked, non-contrastive Topic is psychologically and theoretically real, basically based on categorical or double (as opposed to thetic) judgment (Kuroda 1972, Brentano 1973, Marty 1918, Ladusaw 2000). The structure of Topic – Comment is most natural in information and discourse structure. Thus, Roberts’ (1997) pessimism about the theoretical status of Topic in information structure, and Büring’s (2003) exclusion of non-contrastive Topic as a category in information structure, largely based on English, are not tenable. Jackendoff (1972) failed to provide any intonational status for a non-contrastive Topic, although Steedman (2000) assigned L to it. But Topic is a basic category just like Focus.
Null Topics in various languages have no phonetic (or prosodic) manifestation but are conceptually real for propositional semantic interpretations. CT is marked in meaning and intonation, constituting a complex category, and therefore came to draw wide attention rather recently. First, the intonation pattern of (3), a Topic sentence, is distinct in pitch and energy concentration, as in Fig. 1. This is a typical sentential intonation (IntP=IP) in Korean, with a Topic and a preverbal Focus. The Focus constituent, answering a previous wh-word, is informative (via intercategorial entailment (Zuber 2002) and existential closure (Schwarzschild 1999 and Karttunen 1977)). The non-constituent ‘Inswu bought’ is given and relatively low in pitch compared to the Focus constituent in the middle, and Inswu-nun in the given part is a Topic phrase. The 200 Hz
peak comes on a novel at the end of the corresponding SVO English sentence Sam bought a novel. Observe the intonation pattern of a Topic sentence in Korean in Fig. 1:
Figure 1. Non-contrastive Topic
We will shortly see how the above Topic intonation is sharply distinct from the CT intonation shown in Figure 2.
2.2. The Nature of Contrastive Topic
Contrastive Topic, on the other hand, is also given, presupposed, or anchored in the speech situation to a certain degree like a non-contrastive Topic. It is controversial whether it is also something that is talked about; Hetland (2003), for instance, does not agree that those CT instances derived from predicate positions meet the aboutness condition of Topic and, like some other linguists, calls them simply “Contrasts.” CT necessarily shows contrastiveness and is located typically in the middle or sometimes at the initial position of a sentence, with morphological markers –nun (Korean), -wa (Japanese), thi (Vietnamese) or nan (Thai), together with a high tone, or with a contrastive contour alone such as B accent (L+H*LH%) (English). CT is distinct from unmarked, non-contrastive Topic but some linguists (Jackendoff partly, Büring’s earlier works (though his 2003 adopts the term “Contrastive Topic” in general for the first time) and Steedman (2000), etc.) confusingly label it as Topic (or variously as S-Topic) or Theme (though Steedman (in this volume) began to incorporate kontrast). On the other hand, some syntacticians call it contrastive focus (CF). I will address the distinction between CT and CF briefly later. “CT” is basically used to mark Contrastive Topic in logical form but here it will be used as an abbreviation of Contrastive Topic as well for convenience.
People often tend to forget that Jackendoff’s (1972) dialogue examples of A accents and B accents are situated in a context of a given number of people eating a given number of different foods. Sums (pluralities and mass-partitions with join semilattices) are involved and they or their parts function as potential Topics or CTs in the relevant question for a CT answer. Therefore, when the speaker asks about FRED in (4), HE in the second sentence cannot be assigned a pure Focus as done by Kadmon (2001: 392) (with her ‘LarryFF’).
(4) A: Well, what about FRED? What did HE eat?
B: FREDB ate the BEANSA. (Jackendoff 1972)
Here HE must be marked CT (or Topic), not F, however its intonation may be modified in the English question sentence (the fall-rise accent remains in an echo question (O’Connor et al 1973, Hetland 2003); in Hungarian a CT in a question is reported in Molnar 1998). It is one of those people in the context and was mentioned or accommodated in the previous question sentence, thus being in the background as given. If Focus is assigned, because of the preceding focal wh-word, the sentence becomes a reclamatory question such as (5):
(5) What did you say HEF ate?
Similarly, MARY in (6), with alternative individuals in the speaker’s mind, i.e. CT-alternatives, not Focus alternatives, must be marked CT, not F, contra Krifka (2003).
(6) What did John give MARYCT as a birthday present?
A multi-wh question (such as Who ate what? or Who kissed who?), appearing on the top of discourse tree structures (Carlson 1983, Roberts 1995, Büring 2003), typically requires a multi-narrow focus answer such as ‘FREDA ate the BEANSA’ or ‘LarryA kissed NinaA’ (often a reciprocal alternative question), as an exhaustive answer, a pair-list answer, etc. (cf. Krifka 2002). This will get the following dual focal value, which Büring himself employed to criticize Roberts’ (1995) characterization of CT as a set of propositions:
(7) {x ate y | x, y ∈ De}
In other words, immediate daughters of the top multi-wh question are not warranted to get a person or food in them. CT utterances cannot felicitously occur at the beginning of a discourse and they cannot be felicitously preceded by a multi-wh question abruptly. There must be an appropriate way of introducing a topical element in the question (Kadmon 2001 also criticized this point; see Krifka in this volume for a structural account) and at least a D-linked wh-question may have to be given, such as Which person ate what for a subject CT question-answer (What did Fred (and Sam) eat? - FredCT ate the beans) and Who ate which food for an object CT question-answer, as daughters for real congruence in the tree. Otherwise, the derivation is arbitrary and unpredictable, ignoring which element is previously given. Thus, a CT
is ‘about’ a given part in the previous discourse and locally ‘about’ the rest of the CT sentence. Hence it is topical. A CT is a selection of one or part of the potential sum Topic denotations and is focal in this local sense within the given potential Topic. In the multi-foci case in Korean, the Nominative marker –ka and the Accusative marker –rul but not the Topic marker –nun are employed (Lee 1999). The given or accommodated part as a potential Topic of the previous discourse context must be present to represent an appropriate CT (below in the tree), as something like FRED/HE in (4A). In Korean, a CT occurring in a question sentence has a tone lower than a CT in a declarative S. The most natural and relevant question that precedes a CT answer should include a potential Topic of a sum of individuals of <e> type or properties of <e, t> type. Büring’s claim, on the other hand, that his proposed CT-value is rather a set of sets of propositions, against Roberts’ (1995) ‘a set of propositions’ (Kadmon 2001 also criticizes this), is surely an improvement. The CT-value of (4B), then, should be:
(8) {{x ate y | y ∈ De} | x ∈ De} = {{Fred ate the beans, Fred ate the peanuts, Fred ate the eggplant}, {Sam ate the beans, Sam ate the peanuts, Sam ate the eggplant}, {Mary ate the beans, Mary ate the peanuts, Mary ate the eggplant}} (The variables can be equivalently bound by the λ operator.)
In each subset above, the subject happens to be fixed and functions as Topic for alternative objects – foods. The choice of one of the alternative foods, i.e. the beans here, is marked Focus at the outset because it is not relativized any further, being exhaustive. The choice of one Topic from the alternative Topics – persons, i.e. Fred here, is focal. The would-be Topic is relativized to become a CT, involving a focal process. In this sense, CT is both topical and focal, but because of its Topic base, the head of the term Contrastive Topic is Topic, not Focus, as in Contrastive Focus.
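The difference between the flat focal value in (7) and the nested CT-value in (8) can be spelled out in a few lines of Python. This is only an illustrative sketch: propositions are represented as plain strings, the three people and three foods are those listed in (8), and the function names `focus_value` and `ct_value` are hypothetical.

```python
people = ["Fred", "Sam", "Mary"]
foods = ["the beans", "the peanuts", "the eggplant"]


def focus_value(subjects, objects):
    """Flat set of propositions, as in (7): both positions vary freely."""
    return frozenset(f"{x} ate {y}" for x in subjects for y in objects)


def ct_value(subjects, objects):
    """Set of sets of propositions, as in (8): one subset per fixed subject."""
    return frozenset(
        frozenset(f"{x} ate {y}" for y in objects)
        for x in subjects
    )


fv = focus_value(people, foods)
cv = ct_value(people, foods)

# The subset in which the subject is held constant at Fred.
fred_cell = next(cell for cell in cv if "Fred ate the beans" in cell)
print(len(fv), len(cv), sorted(fred_cell))
```

Flattening the CT-value gives back the focal value, so the two contain the same propositions; what the nesting adds is the grouping by an invariant element in each subset.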
Focus does not have a Topic base. Furthermore, Contrastive Topic is more marked than Topic in its term and content. Kadmon (2001) rightly criticized this CT-value approach for relying too much on the Focus-value approach. The invariance of an element in one subset, however, suggests its topic-hood. If it did not have a superset, it would be a non-contrastive Topic. There would not be a choice involved.
2.2. List contrastive topics
A serious problem with the above and its corresponding D-tree approach by Büring (2003) is that it is partly good only for the phenomenon of “list contrastive topics” (Lee 2000), when the exhaustive list of all the contrastive topics that constitute a big Topic is uttered. But, then, the intonations for these listed contrastive topics are not proper CT contours (L+H*LH%, roughly B accent or fall-rise) except in the topicalized, initial position. Note that people do not accept (9) and (10) but accept (11) and (12).
(9) *Fred ate the BEANS but Sam ate the PEANUTS.
                  L+H*LH%               L+H*LH%
(10) *Fred ate the BEANS but he did not eat the PEANUTS.
                   L+H*LH%                      L+H*LH%
(11) FRED ate the beans but MARY ate the peanuts.
     L+H*LH%                L+H*LH%
(12) The BEANS, he doesn’t like; the EGGPLANT, he doesn’t like; and the PEANUTS, he doesn’t like, either.
         L+H*LH%                     L+H*LH%
In (12), many people do not like the last item having a CT contour of L+H*LH% because they are aware that it exhausts the list of items of the identical predicates. Brown (1980) noted that a high boundary signals that there is more to come on the current topic. If we consider topicalized CTs as special cases of CT requiring a special syntactic position, the most natural and typical situation in which CT occurs is a single sentence utterance with a CT in-situ like (4B), which unmistakably involves a conventional implicature of ‘but Sam did not eat the beans’ (or ‘but I don’t know about the rest of the people’). The implicature is conventional because it is evoked by the contrastive contour in English or a morpheme plus a high tone in Korean, although even without these linguistic devices the same implicature can be evoked purely from context, conversationally. Steedman (in this volume) largely came to take this position, but Büring (2003) views it as conversational. This denial is the first evoked implicature even when ‘Sam ate the peanuts’ follows, but it is somewhat redundant and trivial because the alternative that entails the denial is rather explicitly asserted. This listing effect (with no implicature) occurs in a discourse even across speakers or sentence boundaries. Consider Kadmon’s interesting observation in (13). The only potential relevant kissers are Larry and Bill.
(13) A: Who kissed who?
B: (Let’s see) LarryTF kissed NinaFF.
C: (Right, and) BillTF kissed Sue.
Therefore, the notion of “Contrastive” may better be understood as showing a contrast between the said part and the polarity-reversed, implicated unuttered part of the partly realized, contrastively conjunctive complex sentence. The conjunction, of course, includes more directly contrasted elements, one in the first conjunct and the other in the implicated second conjunct. List contrastive topics do not have the implicature part of this nature because the said sentence is complete as a whole. Thus explored, the CT contour (L+)H*LH% in English (and similarly L*H(H%) in German (Fery 1993)), with the required implicated proposition, is used in rather limited discourse contexts. Only syntactically topicalized contrastive topics, as list contrastive topics, share the same CT contour with no argumentatively assertive implicature, as can be seen in a typical CT utterance.
2.3. Contrastive Topic in Korean: Intonation
CT in Korean remarkably shares a great many features witnessed in English. First, a typical CT with implicature requires the topic marker –nun and a high tone ((L)H*). The topic marker –nun is shared by a non-contrastive Topic, as we have seen. Second, list contrastive topics do not show a high tone required for a typical CT, although they are marked by the same topic marker –nun. Let us first observe how sharply a CT contour in Fig. 2 is distinguished from the non-contrastive Topic pitch in Fig. 1.
(14) (After hearing that Inho didn’t come, regarding his friend Yengswu)
Yengswu-nun w-ass-e
       -CT come-PAST-DEC
‘YengswuCT came.’
Figure 2. Contrastive Topic
There is a sharp difference in pitch height between the Topic –nun (Fig. 1) (150 Hz) and the CT –nun (Fig. 2) (over 200 Hz). This is why I described the CT -nun phrase as (L)H*(%). There occurs a direct rise from L on the final syllable of the nominal or other lexical constituent (CT target) to the CT marker –nun, a non-lexical function element, unlike in Indo-European languages (C. Lee 2000). This implies that contrastive accent and contour in Korean and English are different from other focus accents. In Japanese, according to Nakanishi (in this volume), a CT marker wa from Subject in initial position does not seem to be high, but mid-sentential CT wa is high in tone according to my fieldwork. The marker -nun shows phrasal boundaries, those of Intonational Phrase (IntP) or Accentual Phrase (AP)2. In
naturally occurring speech, non-contrastive Topic and list Topic are so low in pitch that marking H indiscriminately on their S-initial –nun in Jun’s (1998) K-ToBI may have to be reconsidered, despite the tendency of LHLH AP in Korean. Because of the phrase-final rise, CT has nothing to do with the dephrasing effect witnessed in (non-phrase-final) Focus elements (Jun 1993). Therefore, Focus may follow it. Dephrasing is analogous to de-accenting in English (Pierrehumbert 1980), e.g. Q: Who did Anna marry? A: (Anna married) MANNYH*LL%. Because of the following Focus, backward deaccenting occurs and no pitch accent or boundary is marked on the string of the non-contrastive Topic and the verb in the background (a non-contrastive Topic given in Korean is similar, as in Fig. 1). Typologically, in Italian and Romanian given information is not de-accented, contrastively focused elements already lacking accent (Ladd 1996). CT –nun is also the longest in duration among different phrase final elements. In contrast to the high pitch of the above typical CT, observe the low pitches of the list contrastive topics in Fig. 3.
(15) A: ai-tul-un myet haknyen –i-ci-yo
        children-TOP what grade –be-POLITE
        ‘What grades are your children in?’
B: kun ay nun sa-haknyen-i-ko cakun ay nun i-haknyen-i-ey-yo
   older one –CT 4th grade-be-and younger one –CT 2nd grade-be-POLITE
   ‘The older one is in 4th grade and the younger one is in 2nd grade.’
Figure 3. List contrastive topics
2.4 Contrastive Topic to be Preceded by Potential Topic of Sum
The crucial requirement of CT is that a potential Topic of sum must precede or be assumed to precede it. If a sum is impossible, an entailing stronger element cannot be marked CT. Consider:
(16) A: Did she give birth to a baby?
B: Yes, she got a daughterF.
B’: #She got a daughterCT.
In a join semilattice, a (local) top type is entailed by its lower types in the ontological type/sort hierarchy, and thus ‘given’ (Schwarzschild 1999) by the latter if a lower type element occurs first, e.g. male/female→gendered, gorilla/monkey→animal. Likewise, daughter/son→offspring (baby), but we cannot get the idea of sum in the situation of ‘giving birth to a baby’ in (16A). Therefore, the stronger daughter is informative and cannot be CT-marked; it can only be F-marked or CF-marked (to be discussed shortly), because an assumed intervening direct question is an alternative disjunctive question, ‘If yes, is it a daughter or son?’ If the question is (17A), we can get the notion of sum in children (or babies) and hence B.
(17) A: Do you have children?
B: I have sonsCT.
If B’s answer is ‘Yes, I have sonsF,’ then it is exhaustive (but still can have the conversational implicature of ‘but I don’t have daughters’ from the context). Once (17B) is uttered, it by default evokes a scalar implicature and I say it is conventional because it has a special fall-rise pitch contour and is not readily cancellable without epistemic contradiction. Even an explicitly asserted proposition may at times be cancelled in a very roundabout way, with hedges and corrections. A conventional implicature may not be an exception to this kind of roundabout situation. The implicature of (17B) may initially be scalar with something like ‘But I don’t have daughters and I am not totally satisfied with this,’ tending to give more weight to ‘daughters’ on a pragmatically evoked scale. In a boy preference society, B’s answer ‘I have daughtersCT’ may evoke a reversed scale of {daughter < son}. Often a question is used indirectly to induce the hearer’s response on his/her possible involvement in the event in question.
For instance, ‘Who hit Mary?’ Then, ‘someone hit Mary’ is derived as presupposition via existential closure of the interrogative (Karttunen 1977) such that λp∃x[p & p=hit(x, m)]. Next, a question, “Did you and other people hit Mary?” is accommodated and ‘ICT didn’t hit her’ is naturally interpreted; here, I has more weight than other people (Lee 2003).
3. SCALAR MEANINGS
3.1. Subtype Scalarity
A ‘coin/bill→money’ situation (Lee 1999) evokes clearer scales. Although ‘money’ is a mass term, it can be partitioned into two equivalence classes: coins and bills. When asked ‘Do you have money?’, a sum idea can be evoked because having both coins and bills at the same time is all right, unlike in the ‘baby birth’ situation, and a
typical answer can be (17a) on a contextual scale of (bills with greater weight) (in this situation (17b) is infelicitous), but in a very special context, e.g. getting on a bus, (17b) is possible, in an opposite scale (coins with greater weight).
(17) a. I have coinsCT.
b. I have billsCT.
My claim, then, is stronger than previous accounts in that scales are dually evoked in my account, first by the semantic relations of atom – sum, member – set, subset – superset, and subtype – type, and secondly by pragmatic ordering relations between alternative parts, i.e. atoms, members, subsets, and subtype elements, of larger units or wholes in the query, when individuals are discussed, as exemplified above ({coins < bills}, {daughter < son}). In other words, it is not a simple ordering of money – coin, baby – daughter as values in a basic scale ordered by a relation between type in the query and subtype in the reply. When the query is by sum and the reply is by subset or atom, the reply is not enough and generates the implicature of ‘not sum’, but the reply has affirmed the subset or atom already and it leads to ‘not the rest or its relevant part’ even conversationally without fall-rise. This kind of relation has been well explored by Ward and Hirschberg (1985), although they characterised fall-rise as implicating “uncertainty,” which is general and somewhat vague but was called “conventional implicature.” They defined scale by poset (partially ordered set) and included in it hierarchical and linear orderings such as spatial or temporal orderings, stages of a process, and relationships of type/subtype, or part-whole, in addition to Ladd’s (1980) hierarchical sets ordered from root to leaf. They give an ‘is a part of’ relation by dissertation - first chapter - first half. They also provide a symmetric relation ‘cousin of’ creating oddness in fall-rise. One conjunct cannot be denied, with the other being affirmed, in ‘I am John’s cousin and he is mine’ in my account.
Consider their example:
(18) A: Are you John’s cousin?
B: #He’s \mine/.
The same kind of relation, which may be termed an abstract LARGER THAN relation, holds in Topic formation: the Topic denotation must be LARGER THAN its parts, and the parts again are ordered in the same way, LARGER first, in the multiple nominative/accusative case construction, and only the largest can be Topic (Lee 1989, 1994). In (19), ‘elephants’ are larger than their parts ‘noses’ and come first, forming a Topic, as in (a); if the part nominal ‘noses’ takes a topic marker it comes to function as a CT, as in (b), implicating ‘but not other parts’ or ‘but they do not smell well.’ If the Topic marker in the initial position is replaced by the nominative marker, the nominal is focused, as in (c).
(19) a. khokkiri-nun kho-ka kil-ta
        elephant-TOP nose-NOM long-DEC
        ‘(As for) Elephants, their noses are long.’
b. khokkiri-nun kho-nun kil-ta
   elephant-TOP nose-CT long-DEC
   ‘(As for) Elephants, their noses are long but ---.’
c. khokkiri-ka kho-ka kil-ta
   elephant-NOM(FOC) nose-NOM long-DEC
   ‘It is elephants whose noses are long.’
My further claim is that the lower line sister alternatives in hierarchies may typically form scales in CT. A typical CT with an appropriate contour evokes a scalar implicature conventionally by default, but a list alternatives reading may be forced by certain nominals in certain contexts. Consider further examples by them:
(20) A: Is she taking any medication?
B: \Vi/tamins.
(21) A: Are you a doctor?
B: I have a Ph.\D/.
In (20B) a stronger kind of medication is denied and in (21) ‘a medical doctor,’ which has more weight on that particular pragmatic scale, may be denied. Note that Ladd’s (1980) following example shows that there is a whole-part (poset) relation between the locations in (A) and (B), unlike in (A) and (C). B does not entirely agree, denying the wider range, whereas C agrees with A’s claim strongly, leaving no room for skepticism but metalinguistically negating A’s expression covertly.
(22) A: Harry’s the biggest fool in the state of New York.
B: In ITHACACT, maybe.
C: In THE WHOLE WORLDF, maybe.
Consider van Rooy’s (2002) example of scalar interpretation of nominals. He does not introduce fall-rise here.
(23) Q: Which Beatle’s autograph do you have?
A: George Harrison’s. ~> ¬John Lennon’s, though ◊Ringo Starr’s
“Standard” partition: 4 Beatles ~> 16 cells.
Autographic prestige: Starr < Harrison < {Lennon, McCartney}
Van Rooy does not distinguish between a semantic scale arising from the hierarchy of the sum of Beatles’ autographs (this must be posited in the assumed query preceding (23Q)) and the individual Beatles’ autographs, and a pragmatic scale arising from different weights among different alternative Beatles. He addresses the latter type of scale.
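The figure of 16 cells in (23) follows if each cell of the “standard” partition corresponds to one possible set of Beatles whose autographs the answerer has, so that four Beatles give 2^4 cells. The short Python sketch below makes the count explicit; this reading of the partition and the helper name `cells` are assumptions made for illustration, not van Rooy's own formulation.

```python
from itertools import combinations

beatles = ["Starr", "Harrison", "Lennon", "McCartney"]


def cells(items):
    """One cell per subset of items (the autographs the answerer has)."""
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


partition = cells(beatles)

# Exhaustive reading of "George Harrison's": exactly one cell survives.
exhaustive = frozenset({"Harrison"})

# A weaker, non-exhaustive reading: every cell that contains Harrison.
at_least_harrison = [c for c in partition if "Harrison" in c]

print(len(partition), exhaustive in partition, len(at_least_harrison))
```

On this reading the exhaustive answer picks out a single cell, whereas an answer that only commits the speaker to Harrison's autograph leaves eight cells open, and it is among these that a scale can discriminate.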
Without any CT contour on (23A), it may have an exhaustive interpretation with “standard” partition and list reading, evoking no particular scale among alternative Beatles. Herburger (2000) also indicates that “When a fall contour
on free focus is changed to fall-rise, a resulting “at least” interpretation undermines the exhaustivity of focus.” Alternatively, it can have a conversational scalar implicature shown above, based on the given prestige scale in the context. If we use the Contrastive (fall-rise) Contour on the answer “George Harrison’s,” preferably with the question ‘Do you have John Lennon’s autograph?’ the scalar implicature is unmistakable and because of the linguistic device used (a contrastive pitch contour in English or a morpheme + a high tone in Korean) it is a conventional implicature. Even without this contour or morpheme, the answer can have a conversational implicature, depending on contexts, or can be free of it, exhaustively interpreted. Evolutionarily, those particular prosodic or morphological devices seem to have come to regularly license fairly predictable Contrastive Topic meanings associated with them from relevant contexts. The unuttered meanings of Contrastive Topic developed from conversational implicatures arising without such special devices and still co-exist with them. In a nutshell, Contrastive Topic is employed to convey this kind of implicature, concessively admitting the uttered proposition. What happens when an answer is uttered negatively with a CT? Let us consider the following dialogue situation: The potential Topic of sum is given in the query (Q) and the answer (A) is negatively uttered with a CT John Lennon’s, which may be located highest in a scale of prestige. This pragmatic scale may be the speaker’s presupposition or accommodated by the hearer’s scalar reply.
(24) Q. Do you have Beatles’ autographs?
A. I don’t have John Lennon’s CT.
Then, its conventional implicature is polarity reversed, i.e. affirmative, but the value of weight is not higher than the given value but lower than it.
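The pattern in (23) and (24) can be sketched procedurally in Python. The sketch is an illustration only: the numeric ranks encode the prestige scale given in (23), with Lennon and McCartney tied, and the function name `ct_implicature` and the tuple format of its output are hypothetical.

```python
# Prestige scale from (23): Starr < Harrison < {Lennon, McCartney}.
RANK = {"Starr": 0, "Harrison": 1, "Lennon": 2, "McCartney": 2}


def ct_implicature(ct, positive, rank=RANK):
    """Polarity-reversed propositions about alternatives to the CT value.

    A positive CT reply implicates the denial of higher-ranked alternatives;
    a negative CT reply implicates the affirmation of lower-ranked ones.
    """
    if positive:
        alternatives = sorted(a for a in rank if rank[a] > rank[ct])
        return [("not have", a) for a in alternatives]
    alternatives = sorted(a for a in rank if rank[a] < rank[ct])
    return [("have", a) for a in alternatives]


# (23A) with a CT contour: "I have George Harrison's_CT."
print(ct_implicature("Harrison", positive=True))

# (24A): "I don't have John Lennon's_CT."
print(ct_implicature("Lennon", positive=False))
```

The first call returns denials for Lennon and McCartney and the second returns affirmations for Harrison and Starr; which of these alternatives is actually relevant in a given exchange depends on the context, which the sketch does not model.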
Therefore, the implicature in the given context turns out to be “But I have other Beatles’ (weaker than John Lennon in the scale of prestige) autographs.” Often the context is more limited than this, e.g. the speaker knows whether the hearer has Lennon’s and McCartney’s, knows that the hearer is aware of this knowledge, and asks, “Do you have Harrison’s autograph?” The reply is “I don’t have Harrison’sCT.” Then the relevant value element is the lower one, Harrison’s, generating the implicature of “I don’t have Starr’s.” This is the opposite of what happened in (23), where an affirmative CT reply is uttered. Now a generalization follows: if a sentence with a CT is uttered (as a reply), a polarity-reversed proposition is contrastively (“but”) implicated, with an alternative value that is greater than the CT denotation on the pragmatic scale if the reply is positive, and less if the reply is negative.

Next, let us turn to what kinds of categories can be marked CT. In Korean (and presumably crosslinguistically), most categories, including adverbs, may be marked CT. In Korean, however, prenominal quantifying Determiners such as motun ‘all’ cannot be marked CT, unlike in English. Instead, their adverbial forms (motu, ta ‘all’) can. In (25), the adverb cal ‘well’ has been marked CT, and a very high tone far over 200 Hz is observed in Fig. 5. (25) is negative and an affirmative
proposition with a weaker value than ‘well’ in the scale is implicated, such as ‘but I know a little bit.’ This is sharply distinguished from an utterance without CT-marking: cal molla ‘I don’t know it well,’ ‘I am not quite sure,’ which can be used when the speaker knows (almost) nothing about it. Chierchia (2002) discusses a similar, interesting point but does not invoke the notion of CT where it is required. Observe:
(25)
cal-un moll-a
well-CT no-know-DEC
‘(I) don’t know (it) wellCT.’
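The generalization stated above (an affirmative CT reply contrastively denies higher alternatives, while a negative CT reply contrastively affirms lower ones) can be sketched in a few lines of Python. This is only an illustrative encoding, not part of the original analysis: the function name `ct_implicature`, the representation of a pragmatic scale as a list ordered from weakest to strongest, and the relative ranking of the Beatles other than Lennon are all assumptions made for the example.

```python
from typing import List, Tuple


def ct_implicature(scale: List[str], ct_value: str, affirmative: bool) -> Tuple[str, List[str]]:
    """Return the polarity of the implicated proposition and the alternatives it covers.

    `scale` is assumed to be ordered from the weakest to the strongest value.
    """
    i = scale.index(ct_value)
    if affirmative:
        # Affirmative CT reply: stronger alternatives are contrastively denied.
        return ("denied", scale[i + 1:])
    # Negative CT reply: weaker alternatives are contrastively affirmed.
    return ("affirmed", scale[:i])


# Hypothetical prestige scale; only Lennon's top position is taken from the text.
prestige = ["Starr", "Harrison", "McCartney", "Lennon"]

# Affirmative reply with CT on Harrison's: higher values are denied.
print(ct_implicature(prestige, "Harrison", affirmative=True))
# Negative reply with CT on Lennon's, as in (24): lower values are affirmed.
print(ct_implicature(prestige, "Lennon", affirmative=False))
# Negative reply with CT on 'well', as in (25), on an assumed degree scale.
print(ct_implicature(["a little", "well"], "well", affirmative=False))
```

The sketch treats the scale as given; in the text the scale is supplied by context, either presupposed by the speaker or accommodated by the hearer.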
Figure 5. Adverb CT

Nominals in all grammatical relations or positions take CT in Korean, including object CT, as in (26) and Fig. 6. An object CT fronted to the initial position of a sentence tends to be more topical passively with wide scope than one in situ.

(26) sakwa-nun mek-ess-eyo
apple-CT eat-PAST-DEC
‘(I) ate apples.’ (with a null Topic)
Figure 6. Object CT

Nominals followed by the Possessive marker –uy can take the CT marker neither after the nominal nor after –uy. Only predicatively used categories can take CT (introducing the Nominalizer –ki in the prenominal modifier position, e.g. yeyppu-ki-nun ha-n sonye ‘a prettyCT girl’). A postpositional phrase of DP + P takes the CT marker after P but not after DP: Ku ai-nun [cip ‘house’-eyse ‘at’-nun] nul wun-ta ‘That child always cries at home.’ Contrastive Predicate Topic will be discussed shortly. Hedberg’s (2003) example He hasn’t (H*) done anything (L+H*) extraordinary (L+H* LH%) [4/27/01] shows a modifier CT in a negative sentence and evokes an affirmative implicature with a lower value, such as that he may have done something ordinary. Its Korean counterpart gets CT-marking with –nun on the nominal kes ‘thing,’ but the CT-marking is associated with the modifier thekpyeha-n and triggers its alternatives. This is a CT, and it seems that she departed from assigning a “Contrastive Focus” to this fall-rise case (Hedberg et al. in this volume).

Let us further consider what types of sentences license CT in general. A simple declarative sentence is a typical type and an interrogative sentence in Korean is another. I demonstrated elsewhere (Lee 2002, etc.) that in most languages CT is licensed in relative and subordinate clauses, though restrictively crosslinguistically, but that occurrence of non-contrastive Topic is impossible in Korean because the relative clause head nominal comes through Topic in the relative clause during relativization (Lee 1973) (and in Japanese as well). Complement clauses license CT in them easily crosslinguistically, as in (27b).

(27) a. John knows a song that MARYCT sings well. (from Subject)
b. John knows that MARYCT sings the song well.

In Korean, a whole complement clause can take CT before a main clause attitude or communication verb and it can be focally associated with either the predicate
(preferred) or the subject of the complement clause. Because (28) is negative, an affirmative proposition with a weaker predicate in the scale than the complement predicate ‘right’ is conventionally implicated. Observe:

(28) Yumi-nun ku-ka olh-ta-ko-nun po-ci anh-nun-ta
-TOP he-NOM right-DEC-COMP-CT think not
‘Yumi does not think [that he is right]CT.’

The contrastively implicated proposition may be ‘But Yumi thinks that he’s got a point.’ Crosslinguistically, in English, German, and Korean, the pitch accent for (information) Focus, H*(L), is distinct from the one for CT, roughly (L(+))H*(-), whereas in Finnish and Norwegian, Focus and CT are not so distinct prosodically (Vallduví and Vilkuna (1998:89), Fretheim (1992), Gundel (2002)).

4. CONTRASTIVE TOPIC AS A NARROW-SCOPE-BEARER?

In Korean, CT-marked universal quantifiers, universally quantifying time, degree and frequency adverbials, as well as positively quantifying adverbials such as ‘often’ (cacu-nun) and ‘much/many’ (manhi-nun), always take narrow scope with respect to negation. Observe:

(29) ta-nun an mek-ess-e
all-CT not eat-PAST-DEC
‘(I) didn’t eat all.’
(30) ta an mek-ess-e
all-CT not eat-PAST-DEC
‘(I) didn’t eat all.’

In (29), the CT marker is attached to the universal quantifier (originally the adverb ‘completely’) and we can see the high pitch of the CT marker –nun in Fig. 4. In (30) the CT marker has been deleted but its tone has been preserved, and there is a rising tone from ta ‘all’ to an ‘not’ because of the compensatory high tone coming from the deleted CT marker, as in Fig. 5. Thus the CT marker is deletable, just as the non-contrastive Topic marker is, whereas the CT high tone, which is largely responsible for the focality in CT, is not. (29) and (30) are therefore identical in interpretation, with narrow-scope CT or wide-scope negation. Compare this with the pitch track of a negative sentence with no CT marker or compensatory tone, ta an wasse ‘All didn’t come,’ in Fig. 6, with wide-scope universal.
Figure 4. Universal Quantifier with CT marker in Negation
Figure 5. Universal Quantifier with Compensatory Tone in Negation
Figure 6. Universal Quantifier with no CT or Compensatory Tone in Negation
Ladd (1980) and Jackendoff (1972) claim that fall-rise forces a narrow-scope reading in (31) and (32) in English as well.

(31) \All/ the men didn’t go.
(32) I didn’t see \all/ of the men.

Suppose (31) is interpreted as ∀¬; then all is exhaustive and ¬ go, and there is no continuation to a contrasted proposition with weaker affirmation (see (30) above), ‘but some men went,’ etc. The same applies to (32). Therefore, there is no scope ambiguity in (31) and (32). Consider, however, the ‘ambiguity’ between the narrow-scope CT and wide-scope CT reading in (32) in English advocated by Büring (1999) and Kadmon (2001).

(32) Two thirdsCT of the politicians are not corrupt.
a. ¬ 2/3
b. 2/3 ¬ (not easy with fall-rise)

In (a), a typical CT reading of scalar, nonspecific, non-partition cardinality is given. Roughly, (32), on this reading, is ‘it is not the case that up to two thirds of the politicians are corrupt but a little less than that may be corrupt.’ This reading is a denial of the other party’s high-value assertion, implicating a low-value affirmation on the scale. In (b), on the other hand, a topicalized partition reading is given and this reading of (32) is roughly ‘two thirds of the politicians are non-corrupt (and one third may be corrupt).’ The latter reading is similar to a Topic reading, in which no fall-rise is required. I claim that there occurs a topicalization effect for wide-scope CT. This also occurs in Korean in the Topic position. Consider Korean. (33) is ambiguous but a CT in the object position in (34) is not:

(33) cengchika-euy sam-pwun-euy i-nun pwuphay-ha-ci anh-ass-ta.
politician-of 3rd-of 2-CT corrupt was-not-DEC
‘Two thirdsCT of the politicians are not corrupt.’
a. ¬ 2/3 (non-partition, less than 2/3 corrupt, by polarity-reversal affirmative weaker-value implicature)
b. 2/3 ¬ (partition, the rest = 1/3 corrupt by implicature)
(34) euysa-euy sam-pwun-euy i-nun hayko-ha-ci anh-ass-ta.
doctor-of 3rd-of 2-CT fire was-not-DEC
‘(The Government) did not fire two thirds of the doctors.’
a. ¬ 2/3 (non-partition, with an assumed null or realized Topic in the initial position)
b. (i) ¬ 2/3 (non-partition, with a subject ‘the Government’ inserted after the CT phrase and the CT high tone contour)
(ii) 2/3 ¬Focal subject; 2/3 ¬Focal verb; 2/3 ¬ (partition, with a subject, say, ‘the Government’ inserted after the CT phrase and a CT high
tone which tends to be low) (with constituent negation on focused subject or predicate, evoked by Contrastive Predicate Topic)
c. 2/3 ¬ (Topic reading with TOP marking and no high tone, partition, specific, the rest = 1/3 may be fired) (this reading is also possible with the Topic phrase with a low tone in the original object position) (constituent negation readings evoked by Contrastive Predicate Topic as in (bii) are also possible)
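The difference between the ¬ 2/3 (scalar, non-partition) reading and the 2/3 ¬ (partition) reading of ‘Two thirds of the politicians are not corrupt’ can be made concrete with a small truth-conditional sketch. This is an illustration under stated assumptions, not the author's formalization: the function names are hypothetical, ‘two thirds’ is read as ‘at least two thirds’, and the implicatures discussed above are not modelled.

```python
from fractions import Fraction
from typing import List

TWO_THIRDS = Fraction(2, 3)


def neg_over_two_thirds(corrupt: List[bool]) -> bool:
    """¬ > 2/3: it is not the case that (at least) two thirds are corrupt."""
    share_corrupt = Fraction(sum(corrupt), len(corrupt))
    return not (share_corrupt >= TWO_THIRDS)


def two_thirds_over_neg(corrupt: List[bool]) -> bool:
    """2/3 > ¬: (at least) two thirds are such that they are not corrupt."""
    share_clean = Fraction(sum(1 for c in corrupt if not c), len(corrupt))
    return share_clean >= TWO_THIRDS


# Hypothetical situation: six politicians, three of them corrupt.
half_corrupt = [True, True, True, False, False, False]
print(neg_over_two_thirds(half_corrupt))  # reading (a)
print(two_thirds_over_neg(half_corrupt))  # reading (b)
```

In this situation the two readings come apart: half of the politicians being corrupt verifies reading (a) but falsifies reading (b), which is one way to see that the two scope construals are truth-conditionally distinct.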
Exactly parallel readings arise in English; the 2/3 ¬ reading in (32) is a topicalization effect and a non-scalar partition is denoted. Consider an object CT. In (35), ¬ 2/3 seems natural: the Government did not fire up to 2/3. So ‘... fired less than two thirds’ is implicated.

(35) The Government did not fire two thirdsCT of the doctors. (With contrastive fall-rise contour on ‘two thirds’)

How about the same object CT in the topicalized position?

(36) Two thirdsCT of the doctors the Government did not fire. (With contrastive fall-rise contour on ‘two thirds’)

In this position, both a partition reading with topicalization effect (with constituent negation possibilities as in Korean) and a scalar non-partition reading seem to be available. We can now see that fall-rise (in CT) in fact forces a narrow scope reading, which is scalar, both in Korean and in English. A non-scalar partition reading is a consequence of the topicalization effect. When CT follows a scope-bearing element such as a quantified, focal expression, it takes narrow scope with respect to that element. Observe:

(37) motu-ka/nwukwuna-ka sakwa sey kay-nun mek-ess-ta
all-NOM/everyone-NOM apple three CL-CT ate
‘Everyone ate three applesCT.’ ∀ > ∃3 (CL = Classifier)

The CT expression has narrow scope with respect to the preceding universal quantifier in (37), with the meaning of ‘at least three but not more than three apples.’ It has the same effect as having the distributive marker –ssik ‘each’ attached to the numeral classifier (sey kay-ssik-un). When the CT phrase is scrambled to the initial position of the sentence, it still predominantly keeps narrow scope but marginally opens the possibility of wide scope. Even when it comes to have a wide scope reading, ‘three apples as a whole’ is contrasted with other alternatives. Consider:

(38) sakwa sey kay-nun motu-ka/nwukwuna-ka mek-ess-ta
apple three CL-CT all-NOM/everyone-NOM ate
‘Everyone ate three applesCT.’ ∃3 < ∀ (∃3 > ∀)
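The two scope construals at issue in (37) and (38), ∀ > ∃3 versus ∃3 > ∀, can likewise be checked against a toy situation. The sketch below is only an illustration: the function names, the dictionary encoding of who ate which apples, and the ‘at least three’ reading of the numeral are assumptions, and the contrastive meaning contributed by CT is not modelled.

```python
from typing import Dict, Set


def every_over_three(ate: Dict[str, Set[str]]) -> bool:
    """∀ > ∃3: each person ate (at least) three apples, possibly different ones."""
    if not ate:
        raise ValueError("the domain of eaters must not be empty")
    return all(len(apples) >= 3 for apples in ate.values())


def three_over_every(ate: Dict[str, Set[str]]) -> bool:
    """∃3 > ∀: there are (at least) three apples that every person ate."""
    if not ate:
        raise ValueError("the domain of eaters must not be empty")
    shared = set.intersection(*ate.values())
    return len(shared) >= 3


# Hypothetical situation: each person ate three apples, but not the same ones.
distributive = {
    "Yumi": {"a1", "a2", "a3"},
    "Mina": {"b1", "b2", "b3"},
}
print(every_over_three(distributive))
print(three_over_every(distributive))
```

The distributive situation verifies only the narrow-scope construal of the numeral, which is the reading the text reports as predominant for the CT-marked phrase.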
A Focus phrase Yumi-man-i ‘Yumi-only-NOM’ can replace the universal quantifier phrases in (37) and (38), seemingly preserving the same scope relations. In particular, if the CT phrase in (38) is replaced by [sakwa-rul sey kay-nun] ‘apple-ACC 3-CL-CT,’ then the narrow scope of CT is unmistakable, although the acceptability of the sentence slightly degrades; this case-marker-intervening construction lacks specificity. Also, if the predicate in (38) has a modal expression such as ‘can’ or ‘will,’ the CT narrow scope is unmistakable. If the ACC marker –rul replaces the CT marker –nun in those sentences, both sentences get an ambiguous scope relation. This tendency of CT narrow scope is also reported for the CT initial position in Hungarian (Gyuris 2004).

5. CONTRASTIVE PREDICATE TOPIC
5.1. Scalarity of Contrastive Predicate Topic

So far we have treated mainly entity type CTs. However, there are ample cases in which properties (or predicates) become Contrastive Topic, which I call Contrastive Predicate Topic (Lee 1999, 2000, 2002). Contrastive Predicate Topic is also a sort of topic (topical) in the sense that it has been a potential Topic, discussed or assumed in the previous discourse. In this sense, it is not Hetland’s (2003) “main news,” although it is a predicate, typically used for Comment information. It is more discoursal than sentential. Therefore, it may not fit the narrow definition of Topic by means of ‘aboutness,’ in which the rest of the sentence talks about it. Steedman (2000) strikingly coincides with my view, though he does not so clearly distinguish between Contrastive Topic and his “unmarked theme” until this volume. Secondly, it is scalar in a stronger sense than entity type CT. Consider (39) and (40), in which pragmatic scales are evoked:

(39) She ARRIVEDCT. ~> ¬She went on the stage.
(40) She PASSEDCT. ~> ¬She aced the exam.

(39) evokes a scale of {arrive < go on the stage} in context and (40) readily evokes {pass < ace the exam}. Interestingly, the former scale is not semantic but pragmatic; in other words, the larger value ‘go on the stage’ does not entail the lower one. But if we consider a specific context in which ‘go on the stage’ requires ‘arrive’ as a precondition, the former entails the latter in that context and we can call it a pragmatic entailment. The latter scale is semantic; ‘ace the exam’ entails ‘pass the exam.’ (Conventional) scalar implicatures are evoked by both pragmatic and semantic entailments. On the predicate part we can have a CT such as “All the abstracts DID get accepted.” ~> ‘but there may be withdrawals.’ Rooth’s (1996) simple alternatives by F-marking cannot explain why fall-rise requires the relevant type of
scalar implicatures. See Lee (2000) for further examples of scalar Contrastive Predicate Topic. Then, a big question arises: Is a single CT sentence without Focus [Topic + CT] possible, as in (39) and (40)? On the surface at least, it is a fact (Steedman 2000 agrees on this, while some others claim there must be a Focus on the surface). If we consider, however, why we talk without giving new information by focusing something, we may want to ponder possible explanations: (1) There is a silent Focus in the scalar implicature part. This phenomenon is not independent; identification focus is silent with a rising Topic marker (-nun (Korean), wa (Japanese), shi (Chinese)) in a question such as ney irum-un? or “Your name?”; (2) The yes/no (or verum) question demands an answer with respect to whether or not, i.e. arrived or not; passed or not. So, it may include a (Contrastive) Focus (Lee 2003). A partial affirmative answer to this yes/no question is the concessively admitted CT sentence; (3) CT itself is partially focal and we may assume that the implicature part is also partially focal. Thus, the totality may be fully focal; (4) There is nothing beyond the surface form [Topic + CT]. (1) and (3) above consider the implicature part and are preferable to (2) and (4).

Focus is even neurologically real: some ERP experiment results (Yuki 2004) show striking brain responses to the lack of expected intonational prominence (A2) in Figure 7 for focused words in Japanese. For the Subject wh-Q “Who lost the key?” (Da’re-ga kagi’-o nakushita’-no?), A1 is Match: MA’SAYA-ga kagi’-o nakushita’-N-da-yo and A2 is Mismatch: Ma’saya-ga KAGI’-o nakushita’-N-da-yo. The Subject that lacks the expected intonational prominence (A2) is more positive in the waveform than the properly prominent subject (A1). Observe:
Figure 7. ERP waveforms for Subject-focus WH-Q-answer pairs (A1 vs. A2)
5.3. REASON Adjunct Clause and Negation

A reason adjunct clause and negation interact scopally in various languages and Korean is no exception. But observe (39) first, which has a Contrastive Predicate Topic. It has wide-scope negation and the CT is focally associated with the reason clause. If the CT marker is deleted but its compensatory tone is retained, its interpretation is the same as (39). But if the same sentence has no CT marker and no high tone, then its interpretation is the same as (40). In written text without any intonation marking, the sentence is ambiguous between the two opposite scopal interpretations. Because the Contrastive Predicate Topic is associated with the reason clause both in (39) and in its corresponding sentence with a null CT marker but with a high tone, the reason clause comes to have the direct CT effect, and the interpretation is: [It is not because she is richCT that he married her]. Then, its implicature may be: [He married her because she is nice], ‘nice’ being weaker than ‘rich’ in the pragmatic scale. In the narrow-scope reason clause sentences with the CT marker or its compensatory high tone, the reason clause is rather high and is immediately followed by the matrix clause intonationally, whereas in the wide-scope reason clause sentences with no CT marker or tone the reason clause falls and there arises a big pause before the main clause. There is an exact correlation between intonation and interpretation.

(39) pwuca-yese kyelhon-ha-ci-nun anh-ass-e
rich-be-because marry-CT not
‘(He) didn’t marry (her) because she is rich.’ REASON < NEG
Figure 8. REASON Clause < Negation (CT-marked)
(40) pwuca-yese kyelhon an hay-ss-e
rich-be-because marry not do-DEC
‘(He) didn’t marry (her) because she is rich.’ REASON > NEG
Figure 9. REASON Clause > Negation

All the scope relations involving quantifier–negation and REASON–negation depend on whether the sentences in question inherently have Contrastive Predicate Topic (with a pitch accent or marker), related to the previous discourse context. If that is the case, the sentences must take the wide-scope negation, with the Contrastive Predicate Topic focally associated with the relevant quantifiers or REASON clause. Thus viewed, scope ambiguity is not present. Constituent negation also involves Contrastive Predicate Topic, with the latter being focally associated with the relevant constituent (Lee 2006).

7. CONCLUDING REMARKS

Contrastive Topic is preceded by a question that includes a sum as a potential Topic or a conjunctive question (or even if it is a disjunctive question, inclusive reading must be possible). On the other hand, Contrastive Focus, which has not been treated here, is preceded by an alternative disjunctive question which expects a choice of a single answer (see Lee 2003). A typical CT, which necessarily evokes a conventional implicature, must be distinguished from a type of list contrastive topics. Not only type-subtype scalarity (based on poset) but also subtype scalarity must be incorporated in any model of Contrastive Topic, although some entities in some contexts are allowed to receive list reading. Contrastive Topic basically behaves as a narrow-scope-bearer in interaction with other scope bearers including a REASON clause. A Contrastive Predicate Topic
analysis is proposed for the wide-scope negation reading of the scope ambiguous sentences. Predicates are necessarily subtype-scalar when CT-marked, and numerals and quantifiers, which are semantically ordered, have the same nature when CT-marked. We cannot miss the real intent of using a CT: it is to convey a conventionally implicated proposition. If ‘CT(p)’ is given, then contrastively (‘but’) ‘not q’ (q: a contextually higher, stronger predicate) is conveyed, and if ‘CT(not-q)’ is given, then contrastively ‘p’ (contextually a lower, weaker predicate) is conveyed (Lee 2002). The rhetorical force of CT is placing more weight on the unuttered implicature proposition. The CT utterance is concessive admission and its concessivity can be shown by the near-paraphrase relation of (39) to (41):

(41) Even though/Even if/Although she ARRIVED, she didn’t go on the stage.

Although ‘even if’ is possible, it is not like a normal conditional, not licensing contraposition. The truth of the consequent is urged, whatever the antecedent may turn out to be in truth. The implicature of (39), i.e. the consequent of (41), is so forceful in rhetorical structure. Steedman (2000) incorporates a CT tone (L+H*) in the specification of ‘married’ in the lexicon (from Anna MARRIED (L+H* LH%) MANNY (H*LL%)) but claims that its implicature is “conversational” (this volume). But he emphasizes that “kontrast, thematicity, and hearer responsibility are all elements of literal meaning, and hence in your terms conventional implicature” (p.c.). Scalar implicatures generated by CT marking, though their higher values are determined by context, are not cancelable and are conventional. The intonational device may better be closer to its meaning as conventional. Information structure must be able to show the relation between intonation and meaning more closely by our further scrutiny.

Seoul National University
8. NOTES

1 I would like to express my gratitude to Klaus von Heusinger, Mark Steedman and Julia Hirschberg and other audiences of the Workshop on Topic and Focus: Meaning and Intonation at the 2001 LSA Linguistic Institute (UCSB) for their questions and encouragement. I am also grateful to my co-editors Matt Gordon and Daniel Büring for their patience in organizing the workshop and leading it to this volume eventually. For part of this research Sun-Ah Jun’s comments on intonation, Hyunkyung Hwang’s assistance on pitch tracks from subjects, KRF grants and the SNU leave of absence for my staying at UCLA were all helpful.
2 Mira Oh, in her recent experiments (in preparation), ‘Phonetic Realizations of Focus and Topic in Korean’, observes that the Cheonnam dialect shows an IntPBoundary in contrast with the Seoul dialect.
3 Steedman’s (2000) example (1) can be given a similar scalar interpretation. A theatrical musical performance is assumed in the previous query and under it a pragmatic scale <musical, opera> can be set up.
(1) Q: Does Marcel love opera?
A: Marcel likes MUSICALS. (L+H* LH%)
Therefore, if opera and musicals are substituted for each other, the answer Marcel likes OPERACT would not be appropriate on the scalar reading. On a non-scalar reading, the implicature may be open to a list alternatives reading and even roundabout affirmation.
9. REFERENCES
Bach, K. “The Myth of Conventional Implicature.” Linguistics and Philosophy 22.4 (1999): 327-366.
Brentano, Franz. Psychology from an Empirical Point of View, trans. A. C. Rancurello, et al. London: Routledge and Kegan Paul, 1973.
Büring, Daniel. “Topic.” In P. Bosch and R. van der Sandt (eds.), Focus and Natural Language Processing 2, pp. 271-280. Cambridge: MIT Press, 1994.
Büring, Daniel. “On D-trees, Beans and B-accents.” Linguistics and Philosophy 26 (2003): 511-545.
Carlson, Lauri. Dialogue Games: An Approach to Discourse Analysis. Dordrecht: Reidel, 1983.
Chierchia, Gennaro. “Scalar Phenomena and Polarity.” Manuscript, 2002.
Choi, Hye-won. Optimizing Structure in Context: Scrambling and Information Structure. Stanford: CSLI, 1999.
Diesing, Molly. Indefinites [Linguistic Inquiry Monograph 20]. Cambridge: MIT Press, 1992.
von Fintel, Kai. Restrictions on Quantifier Domains. University of Massachusetts, Amherst: Doctoral dissertation, 1994.
Féry, Caroline. German Intonational Patterns. Tübingen: Niemeyer, 1993.
Groenendijk, Jeroen and Martin Stokhof. Studies on the Semantics of Questions and the Pragmatics of Answers. University of Amsterdam: Doctoral dissertation, 1984.
Hamblin, C. L. Fallacies. Bungay, Suffolk: Methuen, 1970.
Hedberg, Nancy and J. M. Sosa. “The Prosodic Structure of Topic and Focus in Spontaneous English Dialogue.” This volume.
Hedberg, Nancy. “The Prosody of Contrastive Topic and Focus in Spoken English.” Talk presented at the Workshop on Information Structure in Context, University of Stuttgart, 2002.
Hetland, Jorunn. “Contrast, the fall-rise accent, and Information Focus.” In Structures of Focus and Grammatical Relations, pp. 1-39. Tübingen: Niemeyer Linguistische Arbeiten, 2003.
Horn, L. A Natural History of Negation. Chicago: Chicago University Press, 1989.
Ito, Kiwako and Susan M. Garnsey. “Brain Responses to Focus-Related Prosodic Mismatch in Japanese.” Presented at SP2004, Tokyo.
Jackendoff, R. Semantic Interpretation in Generative Grammar. Cambridge: MIT Press, 1972.
Jun, Sun-Ah. The Phonetics and Phonology of Korean Prosody. The Ohio State University: Doctoral dissertation, 1993. [Published by Garland, 1996].
Kennedy, Chris. Projecting the Adjective: The Syntax and Semantics of Gradability and Comparison. UC Santa Cruz: Doctoral dissertation, 1997.
Krifka, Manfred. “At Least Some Determiners aren’t Determiners.” In K. Turner (ed.), The Semantics/Pragmatics Interface from Different Points of View 1, pp. 257-91. London: Elsevier, 1999.
Krifka, Manfred. “The Semantics of Questions and Focusation of Answers.” This volume.
Ladd, D. R. The Structure of Intonational Meaning. Indiana University Press, 1980.
Ladusaw, William. “Thetic and categorical, stage and individual, weak and strong.” In L. Horn and Yasuhiro Kato (eds.), Negation and Polarity. Oxford: Oxford University Press, 2000.
Lee, Chungmin. Abstract Syntax and Korean with Reference to English. Seoul: Thaehaksa, 1973.
Lee, Chungmin. “(In-)definites, Case Markers, Classifiers and Quantifiers in Korean.” In S. Kuno et al. (eds.), Harvard Studies in Korean Linguistics. Department of Linguistics, Harvard University, 1989.
Lee, Chungmin. “Definite/Specific and Case Marking in Korean.” In Y.-R. Kim (ed.), Theoretical Issues in Korean Linguistics. CSLI, Stanford University, 1994.
Lee, Chungmin. “Generic Sentences are Topic Constructions.” In T. Fretheim and G. Gundel (eds.), Reference and Referent Accessibility. Amsterdam/Philadelphia: John Benjamins, 1996.
Lee, Chungmin. “Contrastive topic: A locus of the interface.” In K. Turner (ed.), The Semantics/Pragmatics Interface from Different Points of View 1, pp. 317-41. London: Elsevier, 1999.
Lee, Chungmin. “Types of NPIs and nonveridicality in Korean and other languages.” In G. Storto (ed.), UCLA Working Papers in Linguistics 3: Syntax at Sunset 2, pp. 96-132. Department of Linguistics, UCLA, 1999.
Lee, Chungmin. “Contrastive predicates and scales.” CLS 36 (2000): 243-257.
Lee, Chungmin. “Contrastive Topic and/or Contrastive Focus.” Japanese/Korean Linguistics, 2003.
Lee, Chungmin. “Contrastive Topic/Focus and Polarity in Discourse.” In K. von Heusinger and K. Turner (eds.), Where Semantics Meets Pragmatics CRiSPI 16, pp. 381-420. London: Elsevier.
Marty, Anton. Gesammelte Schriften, II. Halle: Max Niemeyer, 1918.
Molnár, Valéria. “Topic in Focus: the Syntax, Phonology, Semantics, and Pragmatics of the So-called ‘Contrastive Topic’ in Hungarian and German.” Acta Linguistica Hungarica 45 (1998): 389-466.
Nakanishi, Kimiko. “Prosody and Scope Interpretations of the Topic Marker WA in Japanese.” This volume.
Neale, S. “Coloring and composition.” In K. Murasugi and R. Stainton (eds.), Philosophy and Linguistics. Westview Press, 1999.
O’Connor, J. D. and G. F. Arnold (eds.). Intonation of Colloquial English, 2nd edition. London: Longmans, 1973.
Pierrehumbert, J. and J. Hirschberg. “The meaning of intonational contours in the interpretation of discourse.” In P. Cohen, J. Morgan, and M. Pollack (eds.), Intentions in Communication, pp. 271-311. Cambridge: MIT Press, 1990.
Rooth, M. “Focus.” In S. Lappin (ed.), The Handbook of Contemporary Semantic Theory. London: Blackwell, 1996.
Roberts, C. “Information Structure in Discourse: Towards an Integrated Formal Theory of Pragmatics.” Manuscript. The Ohio State University, 1996.
Van Rooy, Robert. “Questions and Relevance.” NASSLLI 4 handout, 2004.
Steedman, Mark. The Syntactic Process. Cambridge: MIT Press, 2000.
Steedman, Mark. “Information Structural Semantics of English Intonation.” This volume.
Ward, Gregory and Julia Hirschberg. “Implicating Uncertainty: The Pragmatics of Fall-Rise Intonation.” Language 61 (1985): 747-776.
Wee, Hae-Kyung. “Semantics and pragmatics of Contrastive Topic in Korean and English.” Manuscript. Indiana University, 1997.
KIMIKO NAKANISHI
PROSODY AND SCOPE INTERPRETATIONS OF THE TOPIC MARKER WA IN JAPANESE*
1. INTRODUCTION: THE TOPIC MARKER WA

It is well known that intonational patterns influence pragmatic interpretations in various languages (Bolinger 1965, Halliday 1967, Jackendoff 1972, Lambrecht 1994, Hirst and Di Cristo 1998, Ladd 1998, and Steedman 2000; McCawley 1968, Poser 1984, Pierrehumbert and Beckman 1988, for Japanese in particular). Another well-known fact is that intonation can have an effect on semantic interpretation. For example, in German, different intonational patterns yield different scope readings (Féry 1993, Büring 1997, and Krifka 1998, among others; see section 3 below). This paper discusses how pragmatic information, prosody, and semantic interpretation are related. The empirical domain on which I focus concerns the pragmatics, prosody, and semantics of the topic marker wa in Japanese.

It has been claimed that the Japanese topic marker wa is used to convey pragmatic information. In particular, it is said to have two functions, namely, to mark a theme or to mark a contrasted element of a sentence, as shown in (1) (Kuno 1973, among others). In the following, I call occurrences of wa with the first function thematic and examples of wa with the second function contrastive.1

(1)
a. Thematic wa: “Speaking of ..., talking about ...”
John-wa gakusei desu.
John-TOP student is
‘Speaking of John, he is a student.’
b. Contrastive wa: “X ... but ... , as for X ...”
Ame-wa futte imasu ga, yuki-wa futte imas-en.
rain-TOP falling is but snow-TOP falling is-NEG
‘It is raining, but it is not snowing.’ (cf. Kuno 1973:38)
In section 2, I address the question of whether prosody can express pragmatic information in Japanese. In particular, I examine whether intonational patterns influence the information structure created by the topic marker in a significant way. I conducted an experiment to examine the pitch contours of sentences with the thematic or the contrastive wa and I show that the two functions are realized by
177 C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 177–193. © 2007 Springer.
different contours. Section 3 shows that the two functions of wa, which are realized by different intonational patterns, are tied to different scope interpretations. This correlation between the pragmatic functions of wa and their scope interpretations can be formalized by applying Büring’s (1997) Alternative Semantics Approach. In the last section, accepting a widely held view that scope is expressed in syntax at LF, I claim that the thematic wa and the contrastive wa must be syntactically different at least at LF.

2. THE PROSODY-PRAGMATICS INTERFACE2

In the introduction, we saw that the topic marker wa has two interpretations: the thematic wa and the contrastive wa. In this section, based on an experiment, I claim that this difference in interpretation can be conveyed by different prosodic patterns.

2.1. Basics of Phonetics/Phonology in Japanese

Japanese has a pitch-accent system, where some words can be distinguished only by accent. The location of the accent corresponds to the mora before the pitch drop, i.e., the accent is on the H immediately before L.3 Accent is a lexical property of some morphemes; underlying accents are modified by rules of word-level phonology (McCawley 1968, Haraguchi 1977, Poser 1984).4

(2)
ʼn a. kaki-ga oyster-NOM
ňʼn b. kaki-ga fence-NOM
ň c. kaki-ga persimmon-NOM
Furthermore, as first claimed in Poser (1984), Japanese has Downstep (cf. Pierrehumbert and Beckman 1988 and Kubozono 1993 for further discussion). Downstep is the reduction in pitch range following an accented syllable, as schematised in (3).
(3)
Downstep applies within the phonological domain of so-called ‘major phrases’.5 Poser (1984) claims that “the topic phrase (marked by the particle wa) is generally set off from the rest of the sentence by a major phrase boundary, as indicated by the fact that it seems to have no effect on the following material” (1984:101). The dotted line in (4) expresses an expected Downstep, and the solid line expresses an actual pitch contour.
PROSODY AND SCOPE OF THE TOPIC MARKER WA IN JAPANESE
(4) wa
However, Poser did not distinguish between thematic wa and contrastive wa.6 The question is whether the prosodic patterns of thematic wa and of contrastive wa are the same, which I explore in the next subsection.
2.2. Prosodic Patterns of Wa
An experiment was conducted to answer the question of whether the thematic and the contrastive wa are prosodically distinct. First, I constructed examples with the thematic wa and also with the contrastive wa. For simplicity, the structure of the examples was ‘Subject-wa Predicate’. The subject and the verb were each three or more moras long and accented, with voiced segments in the accented syllables.7 The following are the examples used for this experiment:
(5)
a. Thematic Wa
ʼn ň ʼn
Naoya-wa nonbiri-si-teiru.8
Naoya-TOP relax-do-PROG
‘Naoya is relaxing.’
b. Contrastive Wa
ʼn ň ʼn ň ʼn ň ʼn
Naoya-wa nonbiri-si-teiru ga Maria-wa nonbiri-si-tei-nai.
Naoya-TOP relax-do-PROG but Maria-TOP relax-do-PROG-NEG
‘Naoya is relaxing, but Maria is not relaxing.’
Five native speakers of Japanese participated in this experiment (3 males, 2 females). The participants were provided with two cards, one with the sentence in (5a) and one with the sentence in (5b). They were asked to read the sentence on each card aloud five times, with an interval of a few seconds between repetitions. They were also asked to read at a natural speed, without pausing within a sentence. To see the prosodic patterns, I measured the fundamental frequency (F0), the acoustic correlate of the perceived pitch of the voice. Specifically, I measured two values: P1, the F0 peak immediately before wa, and P2, the F0 peak immediately after wa. In the examples in (5), P1 is at ‘na’ in Naoya and P2 is at ‘n’ in nonbiri.
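The measurement of P1 and P2 can be illustrated with a short sketch. It is not the procedure used in the experiment: it assumes that an F0 track is already available as (time, F0) pairs and that the onset and offset of wa are known from a hand segmentation. The function names and the toy contour are hypothetical.

```python
from typing import List, Optional, Tuple

# One frame of an F0 track: (time in seconds, F0 in Hz, or None if unvoiced).
Track = List[Tuple[float, Optional[float]]]


def local_peaks(track: Track) -> List[Tuple[float, float]]:
    """Return the voiced frames that are higher than their voiced neighbours.

    Simplification: frames on either side of an unvoiced gap are treated
    as neighbours.
    """
    voiced = [(t, f) for t, f in track if f is not None]
    peaks = []
    for i in range(1, len(voiced) - 1):
        prev_f, cur_f, next_f = voiced[i - 1][1], voiced[i][1], voiced[i + 1][1]
        if cur_f > prev_f and cur_f >= next_f:
            peaks.append(voiced[i])
    return peaks


def peaks_around_wa(track: Track, wa_start: float, wa_end: float) -> Tuple[float, float]:
    """P1 = last local F0 peak before wa; P2 = first local F0 peak after wa."""
    peaks = local_peaks(track)
    before = [f for t, f in peaks if t < wa_start]
    after = [f for t, f in peaks if t > wa_end]
    if not before or not after:
        raise ValueError("no F0 peak found on one side of wa")
    return before[-1], after[0]


# Toy contour with invented values (not data from the experiment).
toy_track: Track = [
    (0.00, 110.0), (0.05, 150.0), (0.10, 160.0), (0.15, 140.0),
    (0.20, 120.0), (0.25, None),                  # wa lies between 0.18 s and 0.28 s
    (0.30, 85.0), (0.35, 92.0), (0.40, 88.0), (0.45, 80.0),
]
p1, p2 = peaks_around_wa(toy_track, wa_start=0.18, wa_end=0.28)
print(f"P1 = {p1} Hz, P2 = {p2} Hz")
```

In this toy contour P2 lies well below P1, which is the shape reported for contrastive wa; a real analysis would take the track from a pitch tracker and would need to handle octave errors and segmental perturbations.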
(6)
The following patterns are found: When wa is thematic, P2 is either slightly higher than P1, or slightly lower than P1. Overall, P1 and P2 are about the same value. When wa is contrastive, on the other hand, P2 is much lower than P1.9 The contours in Figure 1 indicate typical patterns for thematic and contrastive wa.
Figure 1. F0 contours of thematic wa and contrastive wa10
Above: Thematic wa [P1: 127.7 Hz, P2: 129.9 Hz]
Below: Contrastive wa [P1: 159.7 Hz, P2: 91.0 Hz]
(Participant KO: male)
The distribution of the thematic and the contrastive cases for the five participants is given in Figure 2. The X-axis indicates the value of P1 (Hz), and the Y-axis indicates the value of P2 (Hz). As can be seen in the figure, the thematic cases are distributed around or above the P1 = P2 line, which means that P1 and P2 are roughly equal or P1 is lower than P2. The contrastive cases, on the other hand, are distributed mostly below the P1 = P2 line, indicating that P1 is higher than P2.
Figure 2. Distribution of thematic wa and contrastive wa
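The position of a data point relative to the P1 = P2 line can be expressed as the size of the step from P1 to P2. The sketch below applies this to the two (P1, P2) pairs reported in the caption of Figure 1. The conversion to semitones and the one-semitone tolerance for "about the same value" are assumptions made for illustration; the text states no numerical threshold.

```python
import math


def semitone_change(p1_hz: float, p2_hz: float) -> float:
    """Size of the step from P1 to P2 in semitones; negative if P2 is lower."""
    return 12.0 * math.log2(p2_hz / p1_hz)


def position_relative_to_diagonal(p1_hz: float, p2_hz: float,
                                  tolerance_st: float = 1.0) -> str:
    """Place a (P1, P2) point relative to the P1 = P2 line.

    tolerance_st is a hypothetical band, in semitones, within which P1 and
    P2 count as roughly equal.
    """
    if semitone_change(p1_hz, p2_hz) < -tolerance_st:
        return "below"
    return "around or above"


# (P1, P2) in Hz, as reported in the caption of Figure 1.
reported = {
    "Figure 1, thematic": (127.7, 129.9),
    "Figure 1, contrastive": (159.7, 91.0),
}

summary = {
    label: (round(semitone_change(p1, p2), 2), position_relative_to_diagonal(p1, p2))
    for label, (p1, p2) in reported.items()
}

for label, (change, position) in summary.items():
    print(f"{label}: {change:+.2f} semitones, {position} the P1 = P2 line")
```

On these two tokens the thematic case moves by a fraction of a semitone, while the contrastive case drops by more than nine semitones.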
2.3. Summary
In sum, the difference between thematic wa and contrastive wa is reflected in different F0 patterns. That is, intonation can distinguish the two pragmatic functions of wa. Thus, intonational patterns are used in a significant way to convey different pragmatic information.
3. THE PROSODY-SEMANTICS INTERFACE
In this section, I first show that prosodic patterns have some effects on quantifier scope interpretations. In particular, the two prosodic patterns above distinguish two different scope readings of wa with respect to negation. I argue that the two pragmatic functions of wa, which are expressed by different prosodic patterns, yield different scope interpretations. The correlation between pragmatics and scope interpretations can be captured by Büring’s (1997) Alternative Semantics Approach. It follows that pragmatic information influences semantic interpretations by way of prosody.
3.1. The Topic Marker Wa and Quantifier Scope
There are two sets of data that I intend to show here: First, a universal quantifier with wa exhibits scope interactions with negation, yielding scope ambiguity. Second, the thematic wa and the contrastive wa correspond to different scope readings.
3.1.1. Interaction with Negation
In Japanese, it is claimed that a sentence with a universal quantifier followed by wa shows scope ambiguity (Kato 1988, among others). For example, in (7), the universal quantifier with wa in the subject position can take either wide or narrow scope with respect to the negation, which appears in a verbal inflection. (7)
Minna-wa ne-nakat-ta.11
everyone-TOP sleep-NEG-PAST
‘Everyone didn’t sleep.’
√Total negation: ∀>¬ (No one slept.)
√Partial negation: ¬>∀ (It is not the case that everyone slept, i.e., there is someone who didn’t sleep.)
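The two readings of (7) have different truth conditions, which a small model makes explicit. The sketch below is only an illustration: the three-member domain and the scenario labels are hypothetical, and the functions simply spell out the paraphrases given in (7).

```python
from typing import Dict


def universal_over_negation(slept: Dict[str, bool]) -> bool:
    """Total negation (∀>¬): every individual is such that they did not sleep."""
    return all(not did_sleep for did_sleep in slept.values())


def negation_over_universal(slept: Dict[str, bool]) -> bool:
    """Partial negation (¬>∀): it is not the case that every individual slept."""
    return not all(slept.values())


# Hypothetical scenarios over a three-member domain.
scenarios = {
    "nobody slept": {"Naoya": False, "Maria": False, "John": False},
    "only Naoya slept": {"Naoya": True, "Maria": False, "John": False},
    "everybody slept": {"Naoya": True, "Maria": True, "John": True},
}

# For each scenario: (truth value of ∀>¬, truth value of ¬>∀).
readings = {
    name: (universal_over_negation(facts), negation_over_universal(facts))
    for name, facts in scenarios.items()
}

for name, (total, partial) in readings.items():
    print(f"{name}: total negation = {total}, partial negation = {partial}")
```

The readings come apart only in the mixed scenario, where partial negation is true and total negation is false. In a non-empty domain, total negation entails partial negation.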
Given this scope ambiguity, a question to be addressed is the following: Is there a mapping between these two scope readings and the two prosodic patterns of wa that we saw in the previous section? I show in the next subsection that this is the case.
3.1.2. Prosodic Patterns and Quantifier Scope
We saw in the previous section that the two functions of wa are realized in different intonational patterns: in sentences with the thematic wa, P1 is almost as high as or can be lower than P2, whereas, in contrastive cases, P1 is always higher than P2. These results are schematised in (8).
(8)
a. Thematic wa: P1 wa P2 (P1 and P2 at about the same height)
b. Contrastive wa: P1 wa P2 (P2 lower than P1)
First, I read aloud the sentence in (7) using the two prosodic patterns in (8) and tape-recorded it. Actual F0 contours are shown in Figure 3.
[Two F0 contour plots: F0 (Hz) from 0 to 300 on the vertical axis, time (s) on the horizontal axis (upper panel up to 1.37113 s, lower panel up to 1.32972 s). The utterance is segmented as Minna ‘everyone’ (P1), wa ‘TOP’, ne-nakat-ta ‘sleep-NEG-PAST’ (P2).]
Figure 3. F0 contours of thematic wa and contrastive wa12
Above: Thematic wa [P1: 251.13 Hz, P2: 251.79 Hz]
Below: Contrastive wa [P1: 247.47 Hz, P2: 174.52 Hz]
Second, four Japanese informants (2 males, 2 females) were asked to listen to the recordings and to judge whether each recording corresponded to either of the two scope interpretations. They all agreed that the prosodic pattern of thematic wa corresponds to the ∀>¬ reading, whereas the pattern of contrastive wa corresponds to the ¬>∀ reading.
(9)
a. Prosodic pattern of thematic wa --- ∀>¬ reading
b. Prosodic pattern of contrastive wa --- ¬>∀ reading
Thus, I conclude that two prosodic patterns correspond to different scope interpretations. The next question to be addressed is why there is such a correlation, which is discussed below.
3.2. Alternative Semantics Approach to Wa and Quantifier Scope
In this section, I show that the correlation between prosody and scope interpretations can be captured by Büring’s (1997) Alternative Semantics Approach.
3.2.1. Büring’s (1997) Alternative Semantics Approach to German Scope Inversion
Let us first introduce Büring’s (1997) Alternative Semantics Approach to German scope inversion. In German, it is claimed that a rise-fall accent contour has a disambiguating effect with respect to scope interpretations (Féry 1993, Büring 1997, and Krifka 1998, among others). For example, in (10a), a sentence with a universal quantifier as a subject and a negation is scopally ambiguous. However, as in (10b), when the subject is prosodically marked with a rising pitch accent (/) and the negation is marked with a falling accent (\), only the ¬>∀ reading is available.
(10)
a. Alle Politiker sind nicht korrupt.
all politicians are not corrupt
√∀>¬ (For all politicians, it is not the case that they are corrupt.)
√¬>∀ (It is not the case that all politicians are corrupt.)
b. /ALLE Politiker sind NICHT\ korrupt.
all politicians are not corrupt
*∀>¬, √¬>∀
Büring (1997) assumes that each sentence S is associated with three different semantic objects, that is, the ordinary semantic value [[S]]o, the Focus value [[S]]f, and the Topic value [[S]]t. The first two values are defined by Rooth (1985): according to Rooth, the ordinary value is a proposition and the Focus value is a set of propositions. What is new in Büring’s approach is that a Topic as well as a Focus evokes alternatives. In particular, the Topic value is a set of sets of propositions. He claims that the Topic accent marks a deviation from the original Discourse Topic: an element marked with the Topic accent is interpreted as a sentence-internal topic such as a contrastive topic. Let us examine an actual example in (11), which includes a contrastive topic. In (11), with the Topic accent, a topic is interpreted as contrastive, and thus evokes alternatives. Following Büring, I represent Topic and Focus marking by using subscripted brackets, [ ]T and [ ]F, respectively.
(11)
Q: Which book would John buy?
A: [I]T would buy [The Hotel New HAMPshire]F.
a. [[(11Q)]]o = which book would John buy
b. [[(11A)]]f = {I would buy War and Peace, I would buy The Hotel New Hampshire, I would buy Harry Potter, …}
c. [[(11A)]]t = {{I would buy War and Peace, I would buy The Hotel New Hampshire, I would buy Harry Potter, …}, {John would buy War and Peace, John would buy The Hotel New Hampshire, John would buy Harry Potter, …}, {Tom would buy War and Peace, Tom would buy The Hotel New Hampshire, Tom would buy Harry Potter, …}, … }
Büring further introduces the notion of Residual Topic, which is a set of disputable propositions induced by the Topic. The definitions are given in (12) and (13).
(12)
a. Given a question-answer sequence QA, [[Q]]o must be an element of [[A]]t.
b. Given a sentence A containing Topic, there must be at least one disputable element in [[A]]t after uttering A.
(13)
Disputability: A set of propositions P is disputable wrt a set of worlds CG (the Common Ground) if there is at least one element p in P such that both p and ¬p could informatively and coherently be added to CG (cf. Stalnaker 1978:325). (Büring 1997:178)
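The objects in (11) and the condition in (12a) can be made concrete with a small sketch. It assumes that the question in (11Q) denotes the set of its possible answers, so that it can be an element of the Topic value. The three buyers and three books are the alternatives listed in (11), with the elided alternatives left out, and the representation of propositions as (buyer, book) pairs is purely illustrative.

```python
from typing import FrozenSet, Tuple

# A proposition "buyer would buy book" is represented as a (buyer, book) pair.
Proposition = Tuple[str, str]

BUYERS = ["I", "John", "Tom"]                      # alternatives to the Topic [I]T
BOOKS = ["War and Peace", "The Hotel New Hampshire", "Harry Potter"]  # alternatives to the Focus


def focus_value(buyer: str) -> FrozenSet[Proposition]:
    """Keep the Topic fixed and vary the Focus: a set of propositions."""
    return frozenset((buyer, book) for book in BOOKS)


def topic_value() -> FrozenSet[FrozenSet[Proposition]]:
    """Vary the Topic as well: a set of sets of propositions."""
    return frozenset(focus_value(buyer) for buyer in BUYERS)


answer_ordinary: Proposition = ("I", "The Hotel New Hampshire")   # [[(11A)]]o
answer_focus = focus_value("I")                                   # [[(11A)]]f
answer_topic = topic_value()                                      # [[(11A)]]t

# Assumption: "Which book would John buy?" denotes the set of its possible answers.
question_ordinary = focus_value("John")                           # [[(11Q)]]o

condition_12a = question_ordinary in answer_topic
print("Condition (12a) satisfied:", condition_12a)
```

Under these assumptions the question about John is one of the sets in the Topic value of the answer, which is what (12a) requires.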
The example in (11) satisfies the requirements in (12): [[(11Q)]]o is an element of [[(11A)]]t and there is at least one Residual Topic, e.g., which book would John buy? With these semantic tools, Büring (1997) accounts for the lack of ambiguity in German sentences with a rise-fall pitch. The example is cited again in (14).
(14)
[Alle]T Politiker sind [nicht]F korrupt.
all politicians are not corrupt
*∀>¬, √¬>∀
Büring assumes that the sentences are structurally ambiguous by LF at the latest. The different intonational contour leads to certain implicatures that differ between the two LF representations. The unavailable reading is ruled out because its LF representation does not yield reasonable implicatures. Thus, the LF for the ∀>¬ reading in (14) does not have reasonable implicatures, whereas the LF for the ¬>∀ reading does. His analysis of these two readings is summarized below.
First, let us discuss the ¬>∀ reading. As shown in (15), there are Residual Topics: if not all politicians are corrupt, are there corrupt politicians at all? If so, how many? Thus, this reading is available. In the following, non-disputable propositions are crossed out.
(15)
a. [[(14)]]o = it is not that all politicians are corrupt
b. [[(14)]]f = {all politicians are corrupt, it is not that all politicians are corrupt}
c. [[(14)]]t = {{all politicians are corrupt, it is not that all politicians are corrupt}, {most politicians are corrupt, it is not that most politicians are corrupt}, {some politicians are corrupt, it is not that some politicians are corrupt}, {no politicians are corrupt, it is not that no politicians are corrupt}}
The ∀>¬ reading is, on the other hand, unavailable because there is no Residual Topic: if all politicians are such that they are not corrupt, then it is true that most politicians are such that they are not corrupt, and it is also true that some politicians are such that they are not corrupt. The other elements of the sets express a contradiction.
(16)
a. [[(14)]]o = all politicians are such that they are not corrupt
b. [[(14)]]f = {all politicians are such that they are not corrupt, all politicians are such that they are corrupt}
c. [[(14)]]t = {{all politicians are such that they are not corrupt, all politicians are such that they are corrupt}, {most politicians are such that they are not corrupt, most politicians are such that they are corrupt}, {some politicians are such that they are not corrupt, some politicians are such that they are corrupt}, {no politicians are such that they are not corrupt, no politicians are such that they are corrupt}}
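The contrast between (15) and (16) can be checked on a toy model using the definition of disputability in (13). The sketch below rests on several assumptions that are not in the text: a domain of three politicians, worlds identified with the set of corrupt politicians, ‘most’ read as ‘more than half’, and a Common Ground that contains exactly the worlds compatible with the asserted reading.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

POLITICIANS = ("a", "b", "c")     # hypothetical three-member domain
N = len(POLITICIANS)
World = FrozenSet[str]            # a world = the set of corrupt politicians
Prop = FrozenSet[World]           # a proposition = the set of worlds where it is true

WORLDS: Prop = frozenset(
    frozenset(group)
    for size in range(N + 1)
    for group in combinations(POLITICIANS, size)
)

# Quantifiers as conditions on the number k of individuals with a property.
QUANTIFIERS = {
    "all": lambda k: k == N,
    "most": lambda k: 2 * k > N,
    "some": lambda k: k >= 1,
    "no": lambda k: k == 0,
}


def quantified(q: str, corrupt: bool) -> Prop:
    """'q politicians are corrupt', or 'q politicians are not corrupt' if corrupt is False."""
    holds = QUANTIFIERS[q]
    return frozenset(w for w in WORLDS if holds(len(w) if corrupt else N - len(w)))


def disputable(p: Prop, cg: Prop) -> bool:
    """Both p and not-p can be added to cg informatively and coherently."""
    return bool(cg & p) and bool(cg - p)


def residual_topics(topic_value: Dict[str, List[Prop]], cg: Prop) -> List[str]:
    """Labels of the sets in the Topic value that contain a disputable proposition."""
    return [q for q, props in topic_value.items()
            if any(disputable(p, cg) for p in props)]


# (15), reading ¬>∀: each set is {Q are corrupt, it is not that Q are corrupt}.
topic_15 = {q: [quantified(q, True), WORLDS - quantified(q, True)] for q in QUANTIFIERS}
cg_15 = WORLDS - quantified("all", True)      # after asserting 'not all are corrupt'

# (16), reading ∀>¬: each set is {Q are not corrupt, Q are corrupt}.
topic_16 = {q: [quantified(q, False), quantified(q, True)] for q in QUANTIFIERS}
cg_16 = quantified("all", False)              # after asserting 'all are not corrupt'

residual_15 = residual_topics(topic_15, cg_15)
residual_16 = residual_topics(topic_16, cg_16)
print("Residual Topics for (15):", residual_15)
print("Residual Topics for (16):", residual_16)
```

In this model the ¬>∀ reading leaves the ‘most’, ‘some’, and ‘no’ alternatives open, whereas the ∀>¬ reading settles every alternative. This matches the reasoning given for (15) and (16).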
3.2.2. Alternative Semantics Approach to the Japanese Data
In this section, I account for the above Japanese data, which indicate the correspondence between prosodic patterns and scope readings. We saw that Büring (1997) captures German scope data by using its pragmatic information, which is realized by a special prosodic pattern. I interpret this approach in the following way: A sentence that conveys certain pragmatic information corresponds to a certain scope reading, i.e., there is a one-to-one correspondence between pragmatic information and scope interpretation. Pragmatic information can be expressed by a certain prosodic pattern. For example, German rise-fall pitch marks topic and focus. Büring’s approach captures a direct relation between pragmatics, which is expressed by a certain prosody, and semantics. I apply this approach to the Japanese data: Rather than examining the relation between prosody and semantics, I examine the relation between pragmatics and semantics. That is, in the relevant Japanese data, the thematic wa corresponds to the ∀>¬ reading, whereas the contrastive wa corresponds to the ¬>∀ reading. Indeed, this approach can account for the Japanese data. First, I consider the correspondence between the thematic wa and the ∀>¬ reading, which is shown in (17).
(17)
Thematic wa
Minna-wa ne-nakat-ta.
everyone-TOP sleep-NEG-PAST
‘Everyone didn’t sleep.’
√∀>¬, *¬>∀
Kuno (1973) claims that, when the topic marker wa is interpreted as a theme, the element to which wa attaches must be either anaphoric or generic. If an element is ‘anaphoric’, it should have an antecedent in a previous context. In this sense, an anaphoric element is definite. A ‘generic’ element does not have an antecedent, and it denotes something that holds regardless of time or place of the utterance. Minna ‘everyone’ in (17) cannot be generic, because it is a subject of an eventive predicate, which does not hold for a general time or place. Thus, minna ‘everyone’ in (17) must be anaphoric. It is independently known that anaphoric definite elements do not enter into a scopal relation with other scope-bearing elements in a sentence (Fodor and Sag 1982, for example). In other words, anaphoric definite elements are said to take the widest scope reading not because they take scope over other elements by syntactic mechanisms such as quantifier raising (May 1985), but because they are scopeless. For this reason, in (17), the universal quantifier with the thematic wa has a wide scope interpretation only. In this way, the scope interpretation of the sentence can be accounted for by its pragmatic information.
Let us move on to the contrastive wa. The correspondence between the contrastive wa and the ¬>∀ reading can be straightforwardly captured by applying Büring’s (1997) framework. Following Büring, I assume that the contrastive wa always evokes alternatives. The question is where a Focus falls in sentences with contrastive wa. Consider a possible context for a sentence with contrastive wa given in (18). Note that (18b) is uttered using the intonational pattern for contrastive wa, where P1 is much higher than P2.
(18)
a. John-wa ne-ta?
John-TOP sleep-PAST
‘Did John sleep?’
b. Iie, John-wa ne-nakat-ta.
no John-TOP sleep-NEG-PAST
‘No, John didn’t sleep (but someone else slept).’
In the above context, the sentence ‘John didn’t sleep’ with contrastive wa implicates that there is someone else who slept. Following Büring, the contrastive topic evokes alternatives in that ‘John’ is contrasted with ‘someone else’, say, Mary and Bill. In addition, alternatives are evoked with respect to the polarity of the predicate, i.e., ‘didn’t sleep’ and ‘slept’.13 As stated earlier, the general function of Focus is to evoke alternatives (Rooth 1985). Since the polarity of the predicate here evokes alternatives, we can consider it as a Focus.14 Thus, a Topic and a Focus in (18b) are assigned in the way shown in (19).
(19)
[John-wa]T ne-[nakat]F-ta.
John-TOP sleep-NEG-PAST
‘John didn’t sleep.’
a. [[(19)]]o = John didn’t sleep
[[(19)]]f = {John didn’t sleep, John slept}
[[(19)]]t = {{John didn’t sleep, John slept}, {Mary didn’t sleep, Mary slept}, {Tom didn’t sleep, Tom slept}, … }
b. Residual Topic: Did Mary sleep?
With the above Topic-Focus assignment, Büring’s approach should straightforwardly apply to the scope of contrastive wa with respect to negation. (20)
Contrastive wa
[Minna-wa]T ne-[nakat]F-ta.
everyone-TOP sleep-NEG-PAST
‘Everyone didn’t sleep.’
*∀>¬, √¬>∀
We can see that the example in (20) and the German rise-fall example discussed in (14) above have the same Topic-Focus assignments. It follows that Büring’s analysis for German should apply to the Japanese example. The ¬>∀ reading in (20) is available because there are Residual Topics: if not everyone slept, is there anyone who slept at all? If so, how many? (21)
a. [[(20)]]o = it is not that all people slept
b. [[(20)]]f = {all people slept, it is not that all people slept}
c. [[(20)]]t = {{all people slept, it is not that all people slept}, {most people slept, it is not that most people slept}, {some people slept, it is not that some people slept}, {no one slept, it is not that no one slept}}
The ∀>¬ reading in (20) is, on the other hand, unavailable because there is no Residual Topic: if all people are such that they didn’t sleep, then it is true that most people are such that they didn’t sleep, and it is also true that some people are such that they didn’t sleep. The other elements of the sets express a contradiction.
(22)
a. [[(20)]]o = all people are such that they didn’t sleep
b. [[(20)]]f = {all people are such that they didn’t sleep, all people are such that they slept}
c. [[(20)]]t = {{all people are such that they didn’t sleep, all people are such that they slept}, {most people are such that they didn’t sleep, most people are such that they slept}, {some people are such that they didn’t sleep, some people are such that they slept}, {no people are such that they didn’t sleep, no people are such that they slept}}
Thus, the scope interpretation of the sentence with contrastive wa as well as thematic wa can be accounted for by its pragmatic information.
3.3. Summary
In Japanese, sentences with negation and the topic marker wa are subject to scope ambiguity. I first showed that the two different prosodic patterns correspond to different scope readings. In other words, the two pragmatic functions of wa expressed by different prosodic patterns correspond to different scope interpretations. I examined a correspondence between pragmatic functions of wa and scope interpretations based on Büring (1997). The thematic wa corresponds to one reading and the contrastive wa to the other. In other words, the scope ambiguity arises because wa has two pragmatic functions.
4. DISCUSSION
In this paper, I presented two sets of empirical data: First, the two pragmatic functions of the topic marker, that is, theme and contrast, are realized by different F0 patterns. Second, these two prosodic patterns correspond to two different scope interpretations. The relevant findings are summarized in Table 1 below.
Table 1. Prosody, pragmatics, and semantics of the topic marker
Pragmatics        Prosody                  Semantics
Thematic wa       P1 is as high as P2      √∀>¬, *¬>∀
Contrastive wa    P1 is higher than P2     *∀>¬, √¬>∀
Different prosodic patterns are used to make pragmatic distinctions between theme and contrast. Those pragmatic distinctions, which are realized by distinct prosodic patterns, are correlated with different scope readings. This correlation between pragmatics and semantics is not arbitrary. As formalized in section 3, the correlation between pragmatic functions of wa and scope readings can be captured by Büring’s Alternative Semantics Approach, which uses a direct relation between pragmatics and semantics. In this way, three properties of the topic marker, i.e., prosodic patterns, pragmatic functions, and scope readings, are coherently related to each other.
Finally, I would like to briefly address the question that many previous studies in Japanese linguistics have discussed: Should the thematic wa and the contrastive wa be distinguished in syntax? Some previous studies claim that they need not be distinguished in syntax (Mihara 1996, for example). For these studies, theme and contrast might be merely different in pragmatic interpretation, not syntactically different. Others claim that they should (Hoji 1985, Saito 1985, Tateishi 1994, for example). Their claim is based on the argument that two kinds of wa are base-generated in different positions in a syntactic structure. For example, Tateishi (1994) shows that the thematic wa violates Subjacency, whereas the contrastive wa obeys it. This is because the thematic and the contrastive wa are base-generated in different positions. The current study does not say anything about where the two kinds of wa are base-generated. However, it shows that they have different syntax at least at LF, since they correspond to different scope readings, which are expressed by syntactic structures at LF. I interpret this fact as a piece of evidence that the thematic and the contrastive wa should be distinguished in the syntax.
University of Pennsylvania
5. NOTES
*
I would like to thank Mark Liberman, Bill Poser, Satoshi Tomioka, and Jennifer Venditti for valuable discussions and their insights. I am also grateful to Daniel Büring, Elsi Kaiser, and Kazuaki Maeda. Thanks are also due to the audience at the Topic, Focus and Intonation Workshop (University of California, Santa Barbara, July, 2001).
1 To be precise, the distinction between the thematic and the contrastive wa is valid only when the topic marker is attached to the subject in canonical word order, which is S-O-V. When the topic marker is attached to the object in canonical position, the object is exclusively interpreted as a contrastive element, as shown in (i).
(i) John-ga ringo-wa tabe-ta.
John-NOM apple-TOP eat-PAST
‘John ate apples (but there was some food that he didn’t eat).’
For this reason, I only consider the examples where wa is attached to a canonical subject.
2 An earlier version of this section was presented in Nakanishi (2002).
3 See Haraguchi (1999) for a recent survey of the Japanese pitch accent system.
4 ‘ň’ marks a low-high sequence of pitch, and ‘ʼn’ marks a high-low sequence of pitch.
5 For how to determine Major Phrases, see Selkirk and Tateishi (1991).
6 Finn (1984) claimed that thematic wa and contrastive wa were differentiated by pauses as well as fundamental frequency (F0). Her claim is based on experimental studies in which she measured the peak of F0 contours before wa and the valley F0 of wa, and also the pause between wa and the following word. Her experimental methods, however, are problematic; unfortunately, I do not have space to discuss them here.
7 Voiced segments exhibit smoother F0 contours than other consonants, without being disturbed much by segmental effects.
8 The predicate nonbiri-site-iru can describe either the current state ‘be relaxing’ or the permanent state ‘be laid-back’. In other words, it can be either a stage-level or an individual-level predicate (Carlson 1977). To avoid possible prosodic effects of this ambiguity, the participants were informed that the sentences used in the experiment mean ‘be relaxing’, not ‘be laid-back’.
9 For the contrastive case, a question arises as to whether the low value of P2 is a result of Downstep or a reduction of range for another reason. The result of the experiment suggests that the drop of P2 is not due to Downstep, since the difference between P1 and P2 is much larger than in the case of Downstep. I thank Jennifer Venditti for discussions of this issue.
10 The first and second arrows in F0 contours indicate P1 and P2, respectively.
11 The negative morpheme is just -na, as we can see in forms such as -na-i ‘-NEG-PRES’. The status of a suffix after negation -kat (or arguably, and certainly historically, -kar) is admittedly a problem. Bill Poser (p.c.) pointed out to me that, synchronically, -kat has to be analyzed as obligatorily affixed to adjectives when certain suffixes, such as -ta ‘-PAST’, are added. Following Poser’s suggestion, for the purpose of this study, I assume that -nakat is a suppletive form of the negative required by suffixes like -ta.
12 The first and second arrows in F0 contours indicate P1 and P2, respectively.
13 Related to this is Noda’s (1996) claim that, when a sentence with the contrastive wa is conjoined with another sentence, the predicates of these two sentences tend to express opposite states. For example, in (i) below, the predicate didn’t sleep is most naturally conjoined with the opposite predicate slept.
(i) John-wa ne-nakat-ta-ga Mary-wa ne-ta.
John-TOP sleep-NEG-PAST-but Mary-TOP sleep-PAST
‘John didn’t sleep, but Mary slept.’
14 In Japanese, the negation is a morpheme attached to a verb. For this reason, it seems impossible for the negation alone to be accented. Thus, I assume that, although the negation is a Focus, it does not have a special prosodic pattern as in German, where the focused negation is realized with a falling accent.
6. REFERENCES
Bolinger, Dwight. Forms of English: Accent, Morpheme, Order. Cambridge: Harvard University Press, 1965.
Büring, Daniel. “The Great Scope Inversion Conspiracy.” Linguistics and Philosophy 20 (1997): 175−194.
Carlson, Gregory. Reference to Kinds in English. Ph.D. dissertation, University of Massachusetts, Amherst, 1977. [New York: Garland, 1980].
Féry, Caroline. German Intonational Patterns. Tübingen: Niemeyer, 1993.
Finn, A.N. “Intonational accompaniments of Japanese morphemes wa and ga.” Language and Speech 27:1 (1984): 47−57.
Fodor, Janet, and Ivan Sag. “Referential and Quantificational Indefinites.” Linguistics and Philosophy 5 (1982): 355−398.
Halliday, M.A.K. “Notes on Transitivity and Theme in English, Part II.” Journal of Linguistics 3 (1967): 199−244.
Haraguchi, Shosuke. The Tone Pattern of Japanese: An Autosegmental Theory of Tonology. Tokyo: Kaitakusha, 1977.
Haraguchi, Shosuke. “Accent.” In N. Tsujimura (ed.), An Introduction to Japanese Linguistics, pp. 1−61. Cambridge: Blackwell, 1999.
Hirst, Daniel, and A. Di Cristo. Intonation Systems: A Survey of Twenty Languages. Cambridge: Cambridge University Press, 1998.
Hoji, Hajime. Logical Form Constraints and Configurational Structures in Japanese. University of Washington: Doctoral dissertation, 1985.
Jackendoff, Ray. Semantic Interpretation in Generative Grammar. Cambridge: MIT Press, 1972.
Kato, Yasuhiko. “Negation and the Discourse-Dependent Property of Relative Scope in Japanese.” Sophia Linguistica (1988): 23−24.
Krifka, Manfred. “Scope Inversion under Rise-Fall Contour in German.” Linguistic Inquiry 29:1 (1998): 75−112.
Kubozono, Haruo. The Organization of Japanese Prosody. Tokyo: Kuroshio, 1993.
Kuno, Susumu. The Structure of the Japanese Language. Cambridge: MIT Press, 1973.
Ladd, D. Robert. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Lambrecht, Knud. Information Structure and Sentence Form: Topic, Focus and the Mental Representations of Discourse Referents. Cambridge: Cambridge University Press, 1994.
May, Robert. Logical Form: Its Structure and Derivation. Cambridge: MIT Press, 1985.
McCawley, James. The Phonological Component of a Grammar of Japanese. The Hague: Mouton, 1968.
Mihara, Ken-ichi. Nihongo-no Toogo Koozoo [Syntactic Structures in Japanese]. Tokyo: Syohakusya, 1996.
Nakanishi, Kimiko. “Prosody and Information Structure in Japanese: a Case Study of Topic Marker wa.” Japanese/Korean Linguistics 10 (2002): 434−447. Stanford: CSLI.
Noda, Hisashi. Wa to Ga [Wa and Ga]. Tokyo: Kuroshio Syuppan, 1996.
Pierrehumbert, Janet, and Mary Beckman. Japanese Tone Structure. Cambridge: MIT Press, 1988.
Poser, William. The Phonetics and Phonology of Tone and Intonation in Japanese. MIT: Doctoral dissertation, 1984.
Rooth, Mats. Association with Focus. University of Massachusetts, Amherst: Doctoral dissertation, 1985.
Saito, Mamoru. Some Asymmetries in Japanese and their Theoretical Implications. MIT: Doctoral dissertation, 1985.
Selkirk, Elizabeth, and Koichi Tateishi. “Syntax and Downstep in Japanese.” In C. Georgopoulos and R. Ishihara (eds.), Interdisciplinary Approaches to Language: Essays in Honor of S.-Y. Kuroda, pp. 519−543. Dordrecht: Kluwer, 1991.
Stalnaker, Robert. “Assertion.” In P. Cole (ed.), Syntax and Semantics 9: Pragmatics, pp. 315−332. New York: Academic Press, 1978.
Steedman, Mark. “Information Structure and the Syntax-Phonology Interface.” Linguistic Inquiry 31:4 (2000): 649−689.
Tateishi, Koichi. The Syntax of ‘Subjects’. Stanford: CSLI, 1994.
HO-HSIEN PAN
FOCUS AND TAIWANESE UNCHECKED TONES
Abstract. This study investigated how focus influences the f0 contour and duration of Taiwanese lexical tones. F0 and duration values were taken from pitch tracks and spectrograms generated from SVO sentences with different focus conditions. The four focus conditions included a broad focus condition with focus on the entire sentence, and three narrow focus conditions with narrow focus falling on the first, second, and third words. Results of the duration data revealed that (1) the duration of narrow focused syllables was longer than that of syllables in other focus conditions and (2) the duration of narrow focused syllables varied as a function of their position within the phrase; penultimate focused syllables were longest. Analysis of f0 minimum and maximum indicated that (1) the f0 range of narrow focused syllables was expanded and (2) together, mean f0 value and expansion of f0 range distinguish focus conditions. Comparison between f0 and duration data showed that duration was more consistently used to distinguish focus conditions than f0 range and mean f0 value in Taiwanese.
1. INTRODUCTION
Focus, tone, and intonation are all manifested through fundamental frequency (f0) contours and duration in Taiwanese. There is no one-to-one correspondence between the surface acoustical realization and the deeper structure, nor do surface f0 contours and duration directly reflect underlying features. To improve our understanding of surface f0 and duration formation, the contribution of underlying global or local factors to surface f0 and duration patterns must be investigated.
The global factors that contribute to f0 modulation can be divided into two categories, i.e., declination and final lowering. The gradual decline of f0 over the course of an utterance is called declination, while the f0 decline at the end of an utterance or phrase is called final lowering (Liberman & Pierrehumbert, 1984; Pierrehumbert & Beckman, 1988; Shih, 1988). Global effects also affect duration. For example, the duration of a syllable varies according to the syllable’s position relative to a prosodic boundary. Studies have shown that phrase-medial segments are shorter than those in phrase-initial and phrase-final positions (Lindblom & Rapp, 1973).
In addition to global effects, f0 and duration are also affected by local factors such as tone and focus (Ho, 1976; Lin, 1988). The contribution of tone to the duration of a tone-bearing unit has been observed in languages such as Taiwanese. In Taiwanese, the rising tone is longer, and the duration of checked syllables (CVC structures with final voiceless stops) is shorter than the duration of unchecked syllables (CV or CVN structures) (Cheng, 1968, 1973; Lin, 1988). Focus also influences syllable duration. It was found that narrow focus syllables are longer than broad focus syllables, which in turn are longer than post-focus syllables (Jin 1996; Xu 1999).
195 C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 195–213. © 2007 Springer.
HO-HSIEN PAN
Turning to f0, it was observed that local factors such as tone and focus both affect the surface f0 pattern. Each Taiwanese lexical tone possesses unique intrinsic tonal targets. These tonal targets determine the f0 height (register) and f0 shape (contour) of tone-bearing syllables. For example, a high level tone has an intrinsic high and level f0 contour which coarticulates with surrounding tones (Lin, 1988; Shih, 1988; Gandour et al., 1994; Xu, 1993, 1997). The contribution of focus to surface f0 patterns has been reported in various languages (Pierrehumbert, 1980; Cooper, Eady & Muller, 1985; Eady & Cooper, 1986; Eady, Cooper, Klouda, Mueller & Lotts, 1986; Jin, 1996; Xu, 1999). Jin (1996) found that in Mandarin the f0 range of narrow focus syllables was expanded. In that study he varied the four lexical tones of the first two syllables (or words) in sentences with the following structure, /___ miŋ15 nien15 liau15 yaŋ15/, ‘X is going to the sanitarium next year.’ Each sentence was produced in four different focus conditions, including broad focus with focus on the entire sentence, and three narrow focus conditions with focus placed on the first, second, or third word. Results showed that (1) the duration of narrow focus syllables was the longest, (2) the f0 range of the narrow focus syllable was expanded, and (3) the f0 contours of the final word in broad focus sentences were perceptually indistinguishable from those of narrow focus final syllables. Xu (1999) investigated local factors including tone and focus. He varied the lexical values of the first three words in a sentence. Each of the three words carried four Mandarin lexical tones, i.e. level, rising, falling, and falling rising tones. For the sentence /mao55 mi35 mai51 mao55 mi55/ ‘Cat fan sells kitty.’ four questions were asked to elicit production with broad focus, or narrow focus on word one, two, or three.
For example, when the question ‘What is kitty doing?’ was asked, the narrow focus was appropriately produced on word three, /mai/, in the target sentence. Results further confirmed that duration increased and f0 range was expanded for narrow focus syllables in Mandarin. A tone language, with its clear specification of local tonal targets on each syllable, is suitable for studying the contribution of global and local effects to surface f0 realization and duration. This study followed the line of research on the influence of local effects, i.e. tone and focus, on surface f0 formation and syllable duration in a tone language, by controlling the intonation of each utterance and its syntactic composition, while varying lexical tone and focus condition (Jin, 1996; Xu, 1999). Lexical tones in a tone language are contrastive in terms of f0 height, contour, and duration. In Mandarin, there are four lexical tones, namely high level (55), low rising (15), high falling (51), and falling rising tones (315). These four tones are distinguished mainly through f0 shapes. Each Mandarin tone has its own distinctive f0 contour not shared by other tones. However, little is known about a tone language like Taiwanese, whose tones are distinguished not only by f0 contour but also by f0 height. There are seven lexical tones in Taiwanese, i.e. high level (55), low rising (24), high falling (51), mid falling (21), mid level (33), high falling checked (51), and mid falling checked tones (21), as shown in Table 1. There are pairs of lexical tones that differ only in tonal height. For example, high falling and mid falling tones differ only in their relative f0 levels, as do high and mid
FOCUS AND TAIWANESE UNCHECKED TONES
level tones. Compared with Mandarin, Taiwanese has a richer tone inventory. This study adds to the limited data on the realization of focus in a tone language.

Table 1. Taiwanese lexical tones

Tone                    Value   Example
High level              55      /kun/ ‘army’
Low rising              24      /kun/ ‘skirt’
High falling            51      /kun/ ‘boiling’
Mid falling             21      /kun/ ‘batons’
Mid level               33      /kun/ ‘near’
High falling checked    51      /kut/ ‘slippery’
Mid falling checked     21      /kut/ ‘plow’
The present study reports on how focus contributes to the realization of f0 and syllable duration of lexical tones in Taiwanese. Words containing different lexical tones were produced in short sentences that controlled for the global effects of intonation and prosodic tonal grouping, while varying the local effects of tonal value and focus pattern. The purpose of this study was to examine the surface realization of f0 and duration of Taiwanese lexical tones under different focus conditions, with attention drawn to the following issues: (1) the effect of narrow focus on duration, (2) the effect of a narrow focus syllable’s position in an utterance on its duration, (3) the effect of narrow focus on f0 range, and (4) the influence of focus on tone height between high and mid falling tones, and between high and mid level tones.

2. METHOD
2.1. Corpus
Each Taiwanese syllable has two different lexical tones, i.e. a juncture (underlying) tone and a context (sandhi) tone. The surface realization of tonal values depends on a syllable’s position in a tone group. The end of a tone group is the juncture position, where the juncture (underlying) tone surfaces. Any syllable that is not last in a tone group carries a context tone. The juncture and context tone values that each syllable possesses form a circular chain. For example, a syllable that surfaces with either tone 55 or tone 24 at the juncture position of a tone group has the context (sandhi) tone value 33 at non-juncture positions. The context tone for a syllable with a juncture tone 33 is tone 21, while the context tone for a syllable with a juncture tone 21 is tone 51. A syllable with a juncture tone 51 carries tone 55 in a non-juncture position, as shown in Table 2. It should be noted that tone 24 only surfaces at juncture positions, and not in initial or medial positions of a tone group. The domain of the tone group boundary is prosodically determined and closely related to syntactic structures in Taiwanese (Cheng, 1968, 1973; Chan, 1987; Lin, 1988).
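The sandhi chain for unchecked tones described above can be sketched as a simple lookup. This is only an illustration, not part of the original study: the names CONTEXT_TONE and surface_tones are hypothetical, tones are written as strings, and checked tones are left out because the text does not spell out their rules.

```python
# Juncture (underlying) tone -> context (sandhi) tone for unchecked
# syllables, as stated in the text. Names are hypothetical illustrations.
CONTEXT_TONE = {
    "55": "33",
    "24": "33",
    "33": "21",
    "21": "51",
    "51": "55",
}


def surface_tones(tone_group):
    """Return the surface tones of one tone group.

    tone_group lists the juncture (underlying) tones of its syllables.
    Every non-final syllable surfaces with its context tone; the final
    syllable sits at the juncture position and keeps its juncture tone.
    """
    if not tone_group:
        return []
    non_final = [CONTEXT_TONE[tone] for tone in tone_group[:-1]]
    return non_final + [tone_group[-1]]


if __name__ == "__main__":
    # A two-syllable tone group followed by a three-syllable tone group.
    print(surface_tones(["55", "24"]))
    print(surface_tones(["51", "51", "51"]))
```

Because tone 24 never appears as a value in the mapping, the sketch also reflects the observation that a low rising tone cannot surface in a non-juncture position.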
In the corpus, the sentence type was a statement with SVO structure. The tone group boundaries for these short sentences were located between the first and second words. That is, the first word (first and second syllables) formed a tonal group, while the second word (third syllable) and third word (fourth and fifth syllables) formed another tone group.

Table 2. Taiwanese tonal sandhi rules
[Sandhi diagram with the column headings Unchecked and Checked and the tone values 55, 51, 24, 21, 53, 33, 21; the links between juncture and context tones did not survive extraction.]

(1) [ σcntxt σjnctr ]tone group [ σcntxt σcntxt σjnctr ]tone group
According to tone sandhi patterns, the second and fifth syllables, which are the last syllables in these tone groups, carried a juncture tone, while the first, third, and fourth syllables carried context tones. Since a low rising tone is not a possible context tone, it was not used in the third and fourth syllables, as shown in Table 3. In the corpus, only sonorants were used as initial consonants, to minimize perturbation in vocal fold vibration and thus ensure smooth pitch tracks. The subject, comprising the first and second syllables of the sentence, was a surname. The first syllable of the subject was a diminutive morpheme /a- 55/. The second syllable carried one of five juncture tones: high level (55), low rising (24), high falling (51), mid falling (21), and mid level (33). The third syllable carried one of four context tones: high level (55), high falling (51), mid falling (21), and mid level (33). The fourth and fifth syllables formed the object. The fourth syllable carried one of the tones 55, 51, 21, and 33. The fifth syllable was the diminutive affix /-a 51/. Since it was not possible to find an object carrying a high falling tone for the fourth and fifth syllables (e.g., 51 51), the lexical item /a 51 ´ŋ 33/ ‘duck egg’ was chosen for the object with a high falling tone in the fourth syllable. Checked syllables were not investigated in this study.
Table 3. Tones and syllables used as corpus. Tones are in underlying form within / / and in surface form within [ ].

Tone   Word 1 (1st and 2nd syllables)   Word 2 (3rd syllable)        Word 3 (4th and 5th syllables)
55     /55 55/ [33 55] [a me]           /51/ [55] [lam] ‘hug’        /51 51/ [55 51] [liu a] ‘button’
24     /55 24/ [33 24] [a mõ]           (none)                       (none)
51     /55 51/ [33 51] [a mã]           /21/ [51] [liam] ‘pinch’     /21 33/ [51 33] [´׀ŋ] ‘duck egg’
21     /55 21/ [33 21] [a lun]          /33/ [21] [mã] ‘scold’       /33 51/ [21 51] [lua a] ‘comb’
33     /55 33/ [33 33] [a liaŋ]         /24/ [33] [law] ‘save’       /24 51/ [33 51] [nĩũ a] ‘silkworm’
Nine hundred and sixty sentences (5 first words × 4 second words × 4 third words × 4 focus conditions × 3 repetitions) were formed by crossing the five words in position 1 with the four words in position 2 and the four words in position 3. There were four focus conditions. Narrow focus was placed either on the first word (first and second syllables), the second word (third syllable), or the third word (fourth and fifth syllables), while broad focus was placed on the entire sentence. Each sentence was repeated three times. The order in which the 960 sentences were produced was randomized. The sentences were written on a list with no specification of the placement of focus. A question list corresponding to the order of the corpus list was created to elicit focus on the desired part of the sentence, as shown in (2). For example, to elicit broad focus on the sentence ‘A-mei holds buttons’, the precursor question listed on the question list would be ‘What happened?’, as shown in (2) d.

(2) a. Who holds buttons?                   ‘A-MEI holds buttons.’
    b. What did A-mei do to the buttons?    ‘A-mei HOLDS buttons.’
    c. What did A-mei hold?                 ‘A-mei holds BUTTONS.’
    d. What happened?                       ‘A-mei holds buttons.’
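The corpus size stated above follows from fully crossing the design factors. A minimal sketch of that enumeration is given below; the function names, the dictionary keys, the focus labels, and the seeded shuffle are illustrative assumptions, not the procedure actually used to prepare the lists.

```python
import random
from itertools import product

# Design factors taken from the text; the labels are hypothetical.
WORD1_TONES = ["55", "24", "51", "21", "33"]  # surface tone of syllable 2
WORD2_TONES = ["55", "51", "21", "33"]        # surface tone of syllable 3
WORD3_TONES = ["55", "51", "21", "33"]        # surface tone of syllable 4
FOCUS_CONDITIONS = ["broad", "narrow-word1", "narrow-word2", "narrow-word3"]
REPETITIONS = 3


def build_corpus():
    """Cross all factors: 5 x 4 x 4 x 4 x 3 items."""
    return [
        {"word1": w1, "word2": w2, "word3": w3, "focus": focus, "rep": rep}
        for w1, w2, w3, focus, rep in product(
            WORD1_TONES,
            WORD2_TONES,
            WORD3_TONES,
            FOCUS_CONDITIONS,
            range(1, REPETITIONS + 1),
        )
    ]


def randomize(corpus, seed=0):
    """Return a shuffled copy; the seed only makes the sketch repeatable."""
    shuffled = list(corpus)
    random.Random(seed).shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    corpus = build_corpus()
    print(len(corpus))
    print(randomize(corpus)[0])
```

Keeping the focus condition as a field of each item mirrors the use of a separate question list: the sentence itself carries no focus marking, and the precursor question is chosen from the item's focus label.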
2.2. Speaker
Four male native Taiwanese speakers, CYS, LWS, LYK, and HYH, participated in the experiment. They were all trilingual speakers of Taiwanese Min, Mandarin, and English. HYH spoke a dialect variety in which the underlying low rising tone changes into a mid falling surface tone. All speakers were students at National Chiao Tung University at the time of the recordings. They were paid for their participation.

2.3. Instrumentation
Recordings were made in a sound-treated booth in the Department of Foreign Languages and Literatures at National Chiao Tung University in Hsinchu, Taiwan. A TEV TM-728II unidirectional dynamic microphone was placed 40 cm in front of each speaker’s mouth and 1 m from the experimenter. A SONY MZS-R4ST Mini Disk recorder captured the acoustical signals digitally. The digital signal was transferred from the Mini Disk to a PC through an optical fiber, at 22 kHz, to the digital input of a Creative Sound Blaster Live sound card, and saved in .wav format. The ESPS xwaves program was used to generate fundamental frequency tracks for each sentence.

2.4. Procedure
During the recording a female experimenter and a speaker were present in the sound booth. Short dialogues between the experimenter and speaker were exchanged to ensure that each speaker produced the corpus in a conversational rather than a citation manner, and that each speaker placed focus in the target position naturally, as opposed to reading the sentence directly from the list. During the recording, speakers read the sentences from a randomized corpus list that gave no indication of the placement of focus. Speakers waited until the experimenter read a precursor question from a question list and then responded by producing the sentence, read from the corpus list, with focus on the specific part of the sentence. Different questions elicited focus on different parts of the sentence, as shown in (2).
The experimenter judged whether each utterance carried focus at the targeted position. If the experimenter decided that the desired focus condition was not produced, she repeated the precursor question and asked for another production.

2.5. Data Analysis
An Emu labelling program (http://www.shlrc.mq.edu.au/emu/) was used to display fundamental frequency (f0) patterns, spectrograms, and waveforms and to provide a means for labelling relevant tonal and intonational aspects of the utterance using the Taiwanese ToBI annotation conventions, currently under development. Syllabic boundaries were determined by identifying spectrographic cues, such as the energy difference between nasals and vowels and the formant transitions between consonants and vowels. After identifying and labelling syllable boundaries, labelling
words, phones, tones, and the location of focused elements, another Emu program (Emuquery) was used to obtain the time at the onset and offset of the second, third, and fourth syllables. The duration of each syllable was calculated by subtracting the time at the syllable onset from the time at the syllable offset. Next, the fundamental frequency was extracted for each syllable using get_track and the Emu pitch extraction program. Fundamental frequency values at the 5%, 20%, 40%, 60%, 80%, and 95% time points in the target syllables were obtained from these pitch tracks. The average f0 and duration for the second, third, and fourth syllables carrying the same tone in different focus conditions were compared. One-way ANOVAs (focus position) were used to determine the effect of focus position on peak f0, f0 range expansion, and duration.

3. RESULTS

3.1. Duration
3.1.1. Effect of focus
Table 4 shows the results of 51 one-way ANOVAs (focus position) on the duration of syllables carrying the same tone at the same position produced by the same speaker. For speaker CYS, syllables carrying the same tone at the same position differed significantly in duration across focus conditions, and the mean duration of the narrow focus syllable was the longest among them. For speaker HYH, syllables carrying the same tone at the same position differed significantly in duration across focus conditions, except for tone 55 in the fourth syllable, tone 51 in the third syllable, and tone 21 in the second syllable. Mean duration showed that the narrow focus syllable was the longest among syllables carrying the same tone at the same position, except for tone 55 in the fourth syllable, tone 21 in the second syllable, and tone 33 in the fourth syllable. For speaker LWS, there was a significant effect of focus on the duration of syllables carrying the same tone in the same position. Mean duration showed that the narrow focus syllable was the longest among syllables carrying the same tone in the same position, except for tones 55, 51, and 33 in the fourth syllable. For speaker LYK, the durations of the same syllable in different focus conditions were significantly different, except for tone 21 in the fourth syllable. Mean duration showed that, apart from tones 55 and 21 in the fourth syllable and tone 33 in the second syllable, narrow focus syllables were the longest among syllables carrying the same tone at the same position under different focus conditions. Overall, narrow focus syllables tended to be the longest.
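For readers who want to see the computation behind each of these tests, the sketch below derives syllable durations from onset and offset times and computes a one-way ANOVA F ratio over the four focus conditions. It is an assumption-laden illustration: the function names are hypothetical, the numbers in the usage example are invented and are not data from this study, and only the F ratio is computed. Obtaining a p value would additionally require the F distribution (for instance, scipy.stats.f_oneway returns both).

```python
def syllable_duration(onset_ms, offset_ms):
    """Duration = time at syllable offset minus time at syllable onset."""
    return offset_ms - onset_ms


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of observation groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and n_total > k")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups and within-groups sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


if __name__ == "__main__":
    # Hypothetical (onset, offset) times in ms for one tone at one position.
    tokens = {
        "broad": [(100, 310), (95, 312), (102, 309)],
        "narrow_on_2": [(100, 350), (98, 345), (101, 352)],
        "narrow_on_3": [(100, 300), (99, 296), (103, 305)],
        "narrow_on_4": [(100, 292), (97, 290), (100, 288)],
    }
    durations = [
        [syllable_duration(on, off) for on, off in pairs]
        for pairs in tokens.values()
    ]
    print(one_way_anova_f(durations))
```

With four focus conditions the between-groups degrees of freedom are always 3; the within-groups degrees of freedom depend on how many tokens of a given tone and position each speaker produced.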
3.1.2. Effect of syllable position on duration
Table 4 displays the mean duration of syllables in the same position carrying the same lexical tone with different focus conditions across speakers. As shown in Table 4, narrow focus second syllables were longer than broad focus, pre-narrow focus, or post-narrow focus second syllables. Narrow focus third syllables were also longer than broad focus, pre-narrow focus, and post-narrow focus third syllables.

Table 4. One-way ANOVAs (4 focus conditions) on mean duration (ms), ** p < .001, * p < .05, NF: Narrow focus, bold face: narrow focus syllable
24 51
24
Tone
55
33
21
51
Tone
55
Speaker
Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Speaker Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4
CYS Syllable position 2 3 211.8 ** 240.6 ** 248.4 ** 240.2 199.0 ** 306.2 191.2 ** 250.4 ** 236.1 * 248.3 226.2 * 226.9 * 213.3 ** 234.7 ** 239.8 ** 230.4 204.9 ** 280.6 207.8 ** 243.9 ** 227.3 ** 246.7 ** 257.6 ** 244.9 222.7 ** 328.6 211.3 ** 257.0 ** 219.4 * 244.7 ** 248.4 ** 232.2 221.4 * 310.2 201.7 * 254.3 ** LWS 2 3 265.3 ** 233.5 ** 255.2 ** 294.0 246.4 ** 320.6 252.3 ** 273.4 ** 255.1 ** 296.3 249.3 ** 250.7 ** 252.0 ** 245.8 ** 265.4 ** 295.8 244.3 ** 329.5 242.1 ** 253.0 **
4 257.4 ** 255.9 ** 280.6 ** 322.5
199.3 ** 197.7 ** 223.3 ** 251.4 223.3 ** 227.1 ** 220.2 ** 257.4 268.7 ** 273.6 ** 281.9 ** 320.9 4 227.5 * 226.1 * 250.3 * 234.0
182.7 ** 186.7 ** 219.8 ** 202.8
HYH Syllable position 2 3 195.8 ** 219.1 * 210.7 * 242.1 202.3 ** 233.8 205.6 ** 225.7 * 214.5 ** 270.8 228.1 ** 224.1 ** 205.8 ** 193.3 196.4 243.2 214.8 ** 207.4 216.5 ** 206.0 262.4 210.4 ** 217.4 ** 262.2 255.9 255.4 261.1 218.6 ** 167.6 ** 211.4 177.0 ** 166.9 ** LYK 2 3 225.7 ** 233.7 ** 229.9 ** 241.3 221.8 ** 263.8 207.3 ** 245.4 ** 242.0 ** 280.0 239.5 ** 218.4 ** 234.4 ** 246.4 ** 231.2 ** 258.6 228.0 ** 266.0 208.6 ** 251.9 **
4 210.2 216.8 224.8 221.3
209.8 ** 198.8 ** 216.0 ** 225.5 207.1 * 190.1 * 205.5 * 216.9 219.9 223.3 224.3 219.4 4 276.3 ** 268.7 ** 290.9 ** 297.7
193.4 ** 177.1 ** 199.4 ** 212.4
33
21
Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4
251.3 ** 278.0 246.7 ** 252.5 ** 223.7 * 251.8 232.3 * 242.9 *
250.8 ** 271.8 ** 342.2 251.2 ** 221.6 ** 251.9 ** 309.7 231.8 **
213.2 * 218.4 * 225.0 * 231.6 223.0 ** 214.8 ** 259.8 ** 237.3
189.3 ** 205.2 186.6 ** 165.6 ** 199.3 ** 213.2 215.0 ** 185.7 **
245.1 ** 230.9 ** 295.7 241.1 ** 225.0 ** 213.8 ** 255.1 228.8 **
247.6 239.4 264.6 255.0 292.6 * 274.8 * 296.9 * 302.7
Among the narrow focus syllables carrying the same tone in different syllable positions, the narrow focus third syllables were the longest, compared with the narrow focus syllables in the second and fourth syllable positions. The effect of position on syllable duration was confounded by vowel quality and syllable structure (closed vs. open), which were not controlled in the corpus. Although narrow focus second and third syllables were longer than the same syllables in other focus conditions, the narrow focus fourth syllable was not always the longest, as shown in Table 4. According to Table 4, narrow focus tones 55 and 33 in the fourth syllable produced by HYH were not the longest when compared to the same syllable produced in the other focus conditions. This was also the case for tones 55, 51, and 33 in the fourth syllable produced by LWS, and for tones 55 and 21 in the fourth syllable produced by LYK. The duration of the narrow focus fourth syllable was similar to that of the post-focus fourth syllable, as shown in Table 4. In summary, increased duration for narrow focus syllables was most obvious in the second and third syllable positions and least noticeable in the fourth syllable position.

3.2. F0

3.2.1. Tonal register (f0 level) contrast
The f0 contours were averaged across speakers to reveal a potential contrast in tonal register between high level vs. mid level tones and between high falling vs. mid falling tones, as shown in Figure 1. A comparison of the f0 onset and f0 peak of narrow focus tones 55 and 33 revealed that the f0 onset of both tones was between 140 and 160 Hz. However, the f0 peak was between 170 and 190 Hz for tone 55 and remained below 160 Hz for tone 33. The only exceptions were the 20% point of tone 33 in the second syllable and the 95% point of tone 33 in the fourth syllable, which were slightly above 160 Hz. Turning to tones 51 and 21, the f0 peak of the narrow focus tone 51 was between 180 and 200 Hz, while the f0 peak of the narrow focus tone 21 was between 120 and 140 Hz. As for the lowest point of f0, it was between 150 and 170 Hz for tone 51 and below 140 Hz for tone 21. The f0 level difference between tones 51 vs. 21 and between tones 33 vs. 55 was maintained for narrow focus syllables even after the f0 range was expanded under the narrow focus condition.
[Figure 1: five panels (HH, ML, MM, LH, and HL tone f0 average), each plotting f0 (Hz, 100 to 220) against syllable position (syllables 2, 3, and 4). Legend: 0 = broad focus; 2 = narrow focus on syllable 2; 3 = narrow focus on syllable 3; 4 = narrow focus on syllable 4.]
Figure 1. F0 of five tones in the second, third, and fourth syllable position receiving four different focus conditions
3.2.2. Tonal shapes
Observation of f0 movement within the vowel nuclei revealed both assimilatory and anticipatory tonal coarticulation in Taiwanese (Peng, 1997). The corpus of the present study was composed of sonorants and vowels; therefore, f0 movements during the consonants surrounding the vowel nuclei were also included. Since surrounding lexical tones influenced the f0 movement of different lexical tones, the averaged tonal contexts for each syllable are discussed first.
Lexical tones in the second syllable were preceded by tone 33, with a mid offset at the first syllable, and followed by tones 55, 51, 21, and 33 at the third syllable, with an averaged onset realized slightly above the mid average. The third syllable was preceded by tones 55, 24, 51, 21, and 33 at the second syllable and produced with a slightly below mid average f0 onset. The third syllable was followed by tones 55, 51, 21, and 33 and produced with a slightly above mid average f0 offset. The fourth syllable was preceded by tones 55, 51, 21, and 33 and realized at a mid average onset. The fourth syllable was followed by the suffix /a51/ in seventy-five percent of the tokens and by the morpheme /ls´ŋ 33/ ‘egg’ in twenty-five percent of the tokens. On average, the offset of the fourth syllable was realized at an upper mid to high level.
Due to preservatory tonal coarticulation, tone 55 at the second, third, and fourth syllables started around the mid tonal range, following the averaged mid offset of the first, second, and third syllables. The f0 contours of tone 55 at the second and third syllables then gradually rose to a higher offset target at 80% into the syllables and then slightly declined to coarticulate anticipatorily with the following mid onset of the third and fourth syllables. The f0 contours of tone 55 at the fourth syllable did not decline at the end of the syllable, since they were followed by an upper mid to high onset at the fifth syllable. Both preservatory and anticipatory tonal coarticulation was observed on tone 55. The gradual decrease of the high offset of tone 55 from the second to the third and fourth syllables was a sign of global declination.
The onset of tone 24 at the second syllable started from the mid offset of the preceding syllable and then moved downward to the low onset target of rising tone 24. The low onset target of tone 24 was reached around the 60% time point into the syllable, and then the f0 pattern began to take on the rising contour of tone 24. Preservatory tonal coarticulation can be observed at the beginning of tone 24.
The onset of tone 51 at the second, third, and fourth syllables began around the mid tonal range and then began to rise toward the high onset target. The high target was reached at the 60% time point in the second syllable and the 40% time point in the third and fourth syllables. After this, the f0 pattern began to move downward toward the low offset target of falling tone 51. Effects of declination can be observed by comparing the f0 height of the high onset targets, which gradually decreased from the second to the third and to the fourth syllable. Preservatory tonal coarticulation was observed at the beginning of tone 51.
The onset of tone 21 began around the mid tonal range for the second and third syllables. The onset of tone 21 in the fourth syllable was much lower due to global declination and the lower averaged offset of the third syllable. F0 moved downward toward the target and then began to rise at the 95% time point for the third syllable and the 60% time point for the fourth syllable. Effects of declination were observed on the f0 height of the low offset target between the second and third syllables. The rising contour of tone 21 at the fourth syllable was due to anticipatory tonal coarticulation with the high to mid onset of the following fifth syllable.
The onset of tone 33 gradually declined from the second and third to the fourth syllable. The rising f0 of tone 33 at the fourth syllable was due to anticipatory tonal
coarticulation with the averaged upper mid to high onset of the following fifth syllable. Both anticipatory and preservatory tonal coarticulation was observed here.

3.2.3. Effect of focus on f0 range
Fifty-one one-way ANOVAs (focus position) were used to analyse individual speakers’ f0 range of syllables carrying the same tone in the same sentence position. F0 range was the difference between the highest and lowest f0 values for a given syllable. Results are shown in Table 5. Data were missing for the narrow focus tone 55 in the second syllable, since HYH produced this syllable with tone 33. Results indicated that a significant effect of focus on f0 range was consistently observed on tones 24 and 51, but not on the level tones (55, 33) or the mid falling tone (21). The syllables showing no significant effect were tone 55 in the second and fourth syllables, tone 21 in the third syllables, and tone 33 in the third and fourth syllables produced by CYS; tone 55 in the second syllables produced by HYH; tone 55 in the third syllables and tone 33 in the second syllables produced by LWS; and tone 55 in the second syllables and tones 21 and 33 in the third syllables produced by LYK. The mean f0 range of syllables carrying the same tone in the same position and produced by the same speaker, but under different focus conditions, revealed that the f0 range of narrow focus syllables was the greatest, as shown in Table 5. However, there were some exceptions. These included the f0 range of tone 24 in the second syllables and tone 33 in the fourth syllables produced by CYS; tones 55 and 33 in the second syllables, tone 21 in the third syllables, and tone 33 in the fourth syllables produced by HYH; tones 24 and 33 in the second syllables produced by LWS; and tones 24 and 33 in the second syllables, tone 33 in the third syllables, and tones 21 and 33 in the fourth syllables produced by LYK.

3.2.4. Effect of focus on mean f0
In addition to differences in f0 range, a significant effect of focus was observed in the mean f0 value of syllables carrying the same tone in the same position but with different focus conditions. For syllables that did not show a significant effect of focus on f0 range, a significant effect of focus on mean f0 height was usually observed, as shown in Table 6. This is illustrated by tone 55 in the second and fourth syllables and tones 21 and 33 in the third syllables produced by CYS; tone 55 in the second syllables produced by HYH; tone 55 in the second syllables and tones 21 and 33 in the third syllables produced by LYK; and tone 55 in the third syllables and tone 33 in the second syllables produced by LWS. Table 7 summarizes the significant effects of focus on duration, f0 range, and mean f0 for each syllable. Duration was more consistent than f0 range and mean f0 in distinguishing focus conditions produced by CYS, LWS, and LYK, but not HYH. A significant effect of focus was found on either f0 range or mean f0, and sometimes on both, for most syllables. The exceptions occurred mainly on tones 33 and 21 in either the third or fourth syllable.
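The two f0 measures compared in Sections 3.2.3 and 3.2.4 can be sketched as follows. The sampling proportions and the definition of f0 range come from the text; everything else is an assumption made for illustration: the function names are hypothetical, linear interpolation between pitch-track frames is assumed, and both measures are computed here from the six sampled values, whereas the original analysis may have used every frame of the pitch track.

```python
# Proportional time points named in the text.
TIME_POINTS = (0.05, 0.20, 0.40, 0.60, 0.80, 0.95)


def interpolate(times, values, t):
    """Linearly interpolate a pitch track (sorted times) at time t."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return values[-1]


def sample_f0(times, f0, onset, offset, points=TIME_POINTS):
    """Sample f0 at fixed proportions of the syllable duration."""
    duration = offset - onset
    return [interpolate(times, f0, onset + p * duration) for p in points]


def f0_range(samples):
    """Difference between the highest and lowest f0 value of a syllable."""
    return max(samples) - min(samples)


def mean_f0(samples):
    """Arithmetic mean of the f0 values of a syllable."""
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Hypothetical pitch track: a straight rise from 100 Hz to 200 Hz.
    track_times = [0.0, 100.0]
    track_f0 = [100.0, 200.0]
    samples = sample_f0(track_times, track_f0, onset=0.0, offset=100.0)
    print(samples, f0_range(samples), mean_f0(samples))
```

The sketch makes one property of the two measures visible: a contour tone and a level tone can share the same mean f0 while differing in range, which is consistent with the finding that focus was sometimes reflected in one measure but not the other.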
Table 5. One-way ANOVAs (4 focus conditions) on f0 range (Hz), ** p < .001, * p < .05, NF: Narrow focus, bold face: narrow focus syllable
51 21
33
Tone
24
55
33
21
51
Tone
24
55
Speaker
Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Speaker Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4
CYS Syllable position 2 3 10.95 11.11* 13.08 12.31* 12.14 15.15 11.52 12.47* 7.22 ** 13.11 16.90 ** 10.62 ** 20.49 ** 23.19 ** 26.88 24.71 ** 22.49 ** 34.17 20.11 ** 29.04 ** 18.35 * 19.72 21.12 22.17 19.43 * 24.90 17.54 * 22.98 7.22 ** 13.11 10.62 13.30 6.63 ** 15.58 6.39 ** 13.06 LWS 2 3 27.91 ** 19.47 27.68 19.56 33.72 ** 25.16 30.69 ** 23.19 9.31 ** 14.86 26.38 ** 7.97 ** 29.25 ** 26.36 ** 35.13 24.24 ** 34.33 ** 45.66 29.86 ** 28.36 ** 22.59 ** 25.81 ** 27.23 25.51 ** 20.08 ** 38.62 17.36 ** 24.60 ** 9.31 14.86 * 7.97 13.71 * 8.63 18.44 9.89 15.07 *
4 10.89 11.08 11.56 12.56
17.56 ** 21.11 ** 26.05 ** 29.44 14.73 * 16.57 * 15.20 * 13.28 16.90 17.29 19.27 17.65 4 24.21 ** 21.68 ** 29.80 ** 41.93
18.73 ** 16.41 ** 21.25 ** 27.69 17.33 * 17.64 * 22.10 * 23.76 26.38 ** 19.10 ** 29.66 ** 33.98
HYH Syllable position 2 3 49.08 22.25** 35.56 25.05** 37.71 40.71 33.82 25.50** 13.97 **
36.89 ** 14.34 ** 37.52** 56.04 38.23 ** 37.18 ** 30.06 ** 40.11 25.02 ** 26.74 ** 13.97 * 14.34 15.05 * 10.69 * LYK 2 29.54 30.80 34.39 30.13 12.17 ** 18.05 30.96 ** 10.52 ** 29.25 ** 35.13 34.33 ** 29.86 ** 23.85 ** 31.00 23.62 ** 16.96 ** 12.17 * 10.52 11.55 * 13.50 *
4 22.81** 22.42** 24.29** 41.05
34.04 ** 42.61 ** 49.38 33.59 ** 34.85 * 47.18 * 38.80 32.33 *
40.37 * 37.65 * 45.86 * 52.81 36.24 ** 28.21 ** 36.01 ** 42.04 36.89 * 33.48 * 42.00 * 41.56
3 25.63* 27.60* 37.44 26.25*
4 38.09 ** 25.85 ** 36.50 ** 47.08
26.36 ** 24.24 ** 45.66 28.36 ** 39.20 37.38 41.77 37.76 18.05 19.22 16.67 15.63
18.73 ** 16.41 ** 20.91 ** 27.69 25.88 24.17 29.53 26.30 30.96 25.80 30.52 29.86
Table 6. One-way ANOVAs (4 focus conditions) on mean f0, ** p < .001, * p < .05, NF: Narrow focus, bold face: narrow focus syllable
51 21 33
Tone
24
55
33
21
51
Tone
24
55
Speaker
Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Speaker
Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4 Broad focus NF on syllable 2 NF on syllable 3 NF on syllable 4
CYS Syllable position 2 3 145.0 ** 139.2 133.9 141.2 150.5 ** 141.9 147.4 ** 140.7 137.3 ** 131.3 129.7 ** 142.8 ** 151.3 ** 144.0 157.8 145.7 155.5 ** 146.5 155.1 ** 147.1 127.1 123.9 * 127.7 124.6 * 127.7 127.7 128.1 127.6 * 137.3 ** 131.3 ** 142.8 132.7 ** 139.6 ** 126.1 139.1 ** 126.0 ** LWS 2 3 180.3 178.0 ** 180.7 173.2 ** 180.7 184.1 176.9 178.6 ** 170.1 ** 164.9 171.0 ** 166.5 ** 180.7 * 182.1 ** 182.6 171.8 ** 184.7 * 179.1 181.5 * 179.7 ** 162.1 * 154.2 159.4 150.0 160.4 * 154.2 159.0 * 152.7 170.1 ** 164.9 * 166.5 161.8 * 166.4 ** 164.4 164.3 ** 162.2 *
HYH Syllable position 4 2 3 133.9 ** 184.7 * 176.7 * 134.8 ** 177.5 169.6 * 137.1** 176.0 * 176.5 138.5 171.4 * 166.5 * 171.6 ** 151.8 ** 178.6 **
134.0 ** 136.2 ** 135.8 ** 139.1 125.6 * 127.5 * 127.9 * 129.4 129.7 129.1 130.3 131.8
4 180.7 ** 173.9 ** 176.5 ** 188.8
176.3 ** 167.5 ** 170.7 ** 178.6 162.7 160.1 159.4 161.6 171.0 ** 161.4 ** 165.0 ** 170.8
187.4 ** 189.0 174.4 ** 172.6 ** 138.6 ** 144.2 130.8 ** 135.4 ** 171.6 ** 178.6 152.6 ** 153.5 ** LYK 2 170.9 * 170.9 166.4 * 164.6 * 157.6 ** 144.2 141.7 ** 155.9 ** 180.7 * 182.6 184.7 * 181.5 * 136.6 ** 137.7 133.3 ** 134.7 ** 157.6 ** 155.9 151.9 ** 150.6 **
4 177.5 ** 159.2 ** 171.1 ** 178.0
179.9 ** 169.1 ** 177.7 165.7 ** 144.0 * 136.1 * 143.3 136.6 *
163.2 ** 145.4 ** 153.8 ** 173.7 147.8 ** 132.0 ** 141.3 ** 142.3 151.8 ** 138.4 ** 145.2 ** 152.7
3 168.1 * 161.7 * 167.0 158.7 *
4 170.9 * 162.5 * 169.3 * 174.0
182.1 ** 171.8 ** 179.1 179.7 ** 138.1 133.9 138.9 136.3 144.2 144.0 142.1 143.9
176.3 ** 167.5 ** 170.7 ** 178.6 138.7 133.1 137.2 137.2 157.6 ** 155.9 ** 151.9 ** 150.6
Table 7. Summary of significant effect of focus on duration (D), f0 range (R), and mean f0 (M) CYS
HYH
LWS
2
3
4
2
3
4
55
DM
DR
DM
DM
DRM RM
24
DRM
51
DRM DR
DRM DRM RM
21
DR
DRM RM
33
DRM DM
DRM
DM
D
LYK
2
3
4
DR
DM
DRM DM
DRM
3
4
DRM DRM
DRM
DRM DRM DRM DRM DRM DRM DRM
DRM DRM DRM DR
DRM D
2
RM
DM
DR
DRM D
DRM DRM DRM D
D
DM
3.2.5. Mandarin vs. Taiwanese

Jin (1996) found that it is perceptually difficult to distinguish broad focus sentences from sentences with narrow focus on the last word. However, Xu (1999) found that the duration of the same syllable under broad focus and under narrow focus differed significantly at all five syllable positions. In Mandarin, the f0 range of the same word also differed significantly between broad and narrow focus conditions. The discrepancy between production and perceptual data in Mandarin focus studies was not investigated.

To examine the production distinctiveness between broad focus, narrow focus, and post-focus final words in Taiwanese, a post-hoc Duncan test was used to analyse the duration and f0 range of the penultimate syllables of the final words carrying the same tone under different focus conditions, as shown in Table 8. The results in Table 8 indicated that the duration difference between narrow and broad focus penultimate syllables was significant, except in the following cases: tone 55 produced by HYH and LWS, tone 21 produced by HYH and LYK, and tone 33 produced by HYH and LYK. As for the f0 range, narrow and broad focus penultimate syllables were distinct, except for tone 55 produced by CYS and LYK, tone 21 produced by CYS and LYK, and tone 33 produced by HYH and LYK. In summary, both the duration and the f0 range of broad focus and narrow focus penultimate syllables in the final word were significantly different when the penultimate syllable carried tone 51, but not when it carried a level tone (55, 33) or the low falling tone (21). Speaker-wise, either the duration or the f0 range was significantly different between narrow focus and broad focus fourth syllables produced by CYS and LWS.
Narrow focused and broad focused penultimate syllables carrying tone 33 produced by HYH and LYK, or carrying tone 21 produced by LYK, were not significantly different from each other in terms of either duration or f0 range. In Mandarin, narrow focused final words and final words in broad focus sentences were perceptually indistinguishable but acoustically distinguishable.
According to the Taiwanese acoustical data observed here, narrow focus final words were distinguishable from final words in broad focus sentences for CYS and LWS, but not for LYK and HYH. The discrepancy between production and perceptual data in Mandarin can be further explored by comparing the results of future production and perceptual studies in Taiwanese.

Table 8. Post-hoc Duncan tests on the mean duration and f0 range of the penultimate (fourth) syllable. Means of the fourth syllable in different focus conditions produced by the same speaker were significantly different from each other when followed by different letters. Means followed by the same letter were not significantly different from each other. p < .05.

DURATION
Tone  Focus condition   CYS        HYH        LWS        LYK
55    Narrow Focus      322.5 A    221.3 A    234.0 B    297.7 A
      Post-Focus        280.6 B    224.8 A    250.3 A    290.9 AB
      Broad Focus       257.4 C    210.2 A    227.5 B    276.3 B
51    Narrow Focus      251.4 A    225.5 A    202.8 B    212.4 A
      Post-Focus        223.3 B    216.0 AB   219.8 A    199.4 B
      Broad Focus       199.3 C    209.8 B    182.7 C    193.4 B
21    Narrow Focus      257.4 A    216.9 A    231.6 A    255.0 A
      Post-Focus        220.2 B    205.5 A    225.0 AB   264.6 A
      Broad Focus       223.3 B    207.1 A    213.2 B    247.6 A
33    Narrow Focus      320.9 A    219.4 A    237.3 B    302.7 A
      Post-Focus        281.9 B    224.3 A    259.8 A    296.9 A
      Broad Focus       268.7 B    219.9 A    222.9 C    292.6 A

F0 RANGE
Tone  Focus condition   CYS        HYH        LWS        LYK
55    Narrow Focus      12.6 A     41.0 A     41.9 A     47.1 A
      Post-Focus        11.6 A     24.3 B     29.8 B     36.5 B
      Broad Focus       10.9 A     22.8 B     24.2 B     38.1 AB
51    Narrow Focus      29.4 A     52.8 A     27.7 A     27.7 A
      Post-Focus        26.0 B     45.9 AB    21.2 B     20.9 B
      Broad Focus       17.6 C     40.4 B     18.7 B     18.7 B
21    Narrow Focus      13.3 B     42.0 A     23.8 A     26.3 A
      Post-Focus        15.2 B     36.0 B     22.1 A     29.5 A
      Broad Focus       14.7 B     36.2 B     17.3 B     25.9 A
33    Narrow Focus      17.7 A     41.6 A     34.0 A     29.7 A
      Post-Focus        19.3 A     42.0 A     29.7 B     30.5 A
      Broad Focus       16.9 B     36.9 A     26.4 B     31.0 A
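The letter groupings in Table 8 can be read mechanically: two means differ significantly only if they share no grouping letter. The following is a minimal Python sketch of that reading, not part of the original analysis. The function names and the dictionary layout are illustrative assumptions; the letters themselves are copied from the DURATION half of Table 8.

```python
# Illustrative sketch: reading Duncan grouping letters from Table 8.
# Names (differ, narrow_vs_broad, DURATION_LETTERS) are hypothetical helpers,
# not part of the original study.

def differ(letters_a: str, letters_b: str) -> bool:
    """Two means differ significantly iff they share no grouping letter."""
    return not (set(letters_a) & set(letters_b))

# Grouping letters for fourth-syllable duration, copied from Table 8.
# Each tuple is (narrow focus, post-focus, broad focus).
DURATION_LETTERS = {
    "55": {"CYS": ("A", "B", "C"), "HYH": ("A", "A", "A"),
           "LWS": ("B", "A", "B"), "LYK": ("A", "AB", "B")},
    "51": {"CYS": ("A", "B", "C"), "HYH": ("A", "AB", "B"),
           "LWS": ("B", "A", "C"), "LYK": ("A", "B", "B")},
    "21": {"CYS": ("A", "B", "B"), "HYH": ("A", "A", "A"),
           "LWS": ("A", "AB", "B"), "LYK": ("A", "A", "A")},
    "33": {"CYS": ("A", "B", "B"), "HYH": ("A", "A", "A"),
           "LWS": ("B", "A", "C"), "LYK": ("A", "A", "A")},
}

def narrow_vs_broad(table):
    """Per tone, list the speakers whose narrow- and broad-focus means differ."""
    return {
        tone: sorted(spk for spk, (nf, _pf, bf) in rows.items() if differ(nf, bf))
        for tone, rows in table.items()
    }

result = narrow_vs_broad(DURATION_LETTERS)
for tone, speakers in result.items():
    print(tone, speakers)
```

Under this reading, tone 51 is the only tone for which all four speakers separate narrow from broad focus in duration, which matches the summary given in the text.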
4. DISCUSSION

The f0 and duration data produced by Taiwanese speakers in the present study revealed five major results. First, the duration of narrow focus syllables was longer than that of syllables under other focus conditions. Second, the degree of lengthening due to narrow focus was affected by a syllable’s position in a sentence. Third, the f0 range of the narrow focus syllable was expanded. Fourth, the tonal register (f0 level) contrasts between narrow focus high falling vs. mid falling tones, and between narrow focus high level vs. mid level tones, were maintained even when f0 range was
expanded. Fifth, duration was a more consistent cue than either f0 range or mean f0 values in signaling focus condition in Taiwanese. F0 range and mean f0 value complement each other in distinguishing focus conditions.

In addition to the effect of focus, tonal coarticulation also influenced the f0 contour in Taiwanese. In Taiwanese the f0 offset target of a dynamic tone occurred after the offset boundary of a tone bearing unit, while the f0 offset target of a level tone occurred before the syllable boundary (Pan, 2002). By using only sonorants at either the beginning or end of a syllable, both anticipatory and perseveratory tonal coarticulation were observed in this study. Perseveratory tonal coarticulation was observed in tones 55, 24, and 51, while anticipatory tonal coarticulation was found in tones 55, 21, and 33. It was proposed that the perseveratory tonal coarticulation took place during the initial consonant of the syllable, as found in Mandarin (Xu, 1999). To support the claim that perseveratory tonal coarticulation occurred during the initial consonant of the syllable in Taiwanese, further studies with various syllable structures are necessary.

Among narrow focus second, third, and fourth syllables, the narrow focus third syllable was the longest, while the fourth syllable was the shortest. In Mandarin the narrow focus third syllable was also the longest; however, the shortest syllable was the second syllable (Xu, 1999). The effect of focus lengthening was the strongest on the third syllable in both Mandarin and Taiwanese. According to global final lengthening rules, the duration of the narrow focus fourth syllable should be longer than the duration of the narrow focus third syllable; however, local focus lengthening interacts with final lengthening here to determine the surface syllable duration. Focus lengthening exerts a strong effect on the third syllable but not on the fourth syllable.
Narrow focused fourth syllables appeared to be shorter than narrow focused third syllables in both Taiwanese and Mandarin data. Further investigations with more varied sentence structures are needed to explore possible factors such as syllable position, part of speech, and syntactic or prosodic structures that contribute to the longer duration of the narrow focus third syllable.

In Mandarin, with four distinctive f0 contours for each lexical tone, f0 range expansion was used as the major cue for signaling narrow focus. In Taiwanese, duration lengthening is a more consistent cue for narrow focus. The fact that there are two tonal pairs in Taiwanese contrasted mainly by f0 height and not by f0 contour may contribute to the limited manipulation of f0 range in different focus conditions. To further explore this potential cause, studies on other tonal languages with tonal pairs contrasting mainly by f0 height are needed.

The study here concentrated only on the effect of focus on Taiwanese unchecked tones. Taiwanese checked tones are known for their shorter syllable duration and glottalized voice quality in contrast with unchecked tones. To fully understand the influence of focus on duration contrasts between checked and unchecked syllables in Taiwanese and the influence of focus on voice quality in Taiwanese, further studies are necessary. The interaction between focus conditions, final and initial lengthening in different prosodic domains, and tonal coarticulation should also be investigated to fully understand the interaction of prosodic effects on surface duration and f0 contour in tonal languages.
HO-HSIEN PAN
Department of Foreign Languages and Literatures, National Chiao Tung University, Hsinchu, TAIWAN.
NOTES

This research was supported by grants from the National Science Council in Taiwan. Thanks to Professor Anne Chao and Pi-chiang Li for assistance in statistical analysis.
REFERENCES

Beckman, Mary E., and Jan Edwards. (1990) “Lengthenings and Shortenings and the Nature of Prosodic Constituency.” In Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech (J. Kingston and M. E. Beckman, editors): 152-178. Cambridge: Cambridge University Press.
Berkovits, Rochele. (1993) “Utterance-Final Lengthening and Duration of Final-Stop Closures.” Journal of Phonetics 21 (4): 479-489.
Chao, Yuen Ren. (1968) A Grammar of Spoken Chinese. University of California Press.
Cheng, Robert. (1968) “Tone Sandhi in Taiwanese.” Linguistics 41: 19-42.
Cheng, Robert. (1973) “Some Notes on Tone Sandhi in Taiwanese.” Linguistics 100: 5-25.
Cooper, William E., Stephen J. Eady, and Pamela R. Mueller. (1985) “Acoustical Aspects of Contrastive Stress in Question-Answer Contexts.” Journal of the Acoustical Society of America 77: 2142-2156.
Eady, Stephen J., and William E. Cooper. (1986) “Speech Intonation and Focus Location in Matched Statements and Questions.” Journal of the Acoustical Society of America 80: 402-416.
Eady, Stephen J., William E. Cooper, Gayle V. Klouda, Pamela R. Mueller, and Dan W. Lotts. (1986) “Acoustic Characteristics of Sentential Focus: Narrow vs. Broad and Single vs. Dual Focus Environments.” Language and Speech 29: 233-251.
Fougeron, Cecile. (1999) “Articulatory Properties of Initial Segments in Several Prosodic Constituents in French.” UCLA Working Papers in Phonetics 97: 74-99.
Gandour, Jack, Siripong Potisuk, and Sumalee Dechongkit. (1994) “Tonal Coarticulation in Thai.” Journal of Phonetics 22: 477-492.
Ho, Aichen T. (1976) “Mandarin Tones in Relation to Sentence Intonation and Grammatical Structure.” Journal of Chinese Linguistics 4: 1-13.
Jin, Shunde. (1996) An Acoustic Study of Sentence Stress in Mandarin Chinese. Ph.D. dissertation, The Ohio State University.
Liberman, Mark, and Janet Pierrehumbert. (1984) “Intonational Invariance under Changes in Pitch Range and Length.” In Language Sound Structure (M. Aronoff and R. T. Oehrle, editors): 157-233. Cambridge, MA: MIT Press.
Lin, Hui-Bin. (1988) Contextual Stability of Taiwanese Tones. Ph.D. dissertation, The University of Connecticut.
Lindblom, Bjorn, and K. Rapp. (1973) “Some Temporal Regularities of Spoken Swedish.” Papers from the Institute of Linguistics, University of Stockholm 21: 1-58.
Pan, Ho-hsien. (2002) “The Location of F0 Offset for Taiwanese Long Tones.” In Speech Prosody 2002: Proceedings of the First International Conference on Speech Prosody: 555-558.
Peng, Shu-hui. (1997) “Production and Perception of Taiwanese Tones in Different Tonal and Prosodic Contexts.” Journal of Phonetics 25 (3): 371-400.
Pierrehumbert, Janet. (1980) The Phonology and Phonetics of English Intonation. Ph.D. dissertation, Massachusetts Institute of Technology.
Pierrehumbert, Janet, and Mary E. Beckman. (1988) Japanese Tone Structure. Cambridge, MA: MIT Press.
Shen, Xiaonan Susan. (1992) “A Pilot Study on the Relation between the Temporal and Syntactic Structures in Mandarin.” Journal of the International Phonetic Association 22 (1-2): 35-43.
Shih, Chi-Lin. (1988) “Tone and Intonation in Mandarin.” Working Papers of the Cornell Phonetics Laboratory No. 3: 83-109.
Shih, Chi-Lin, and Benjamin Ao. (1994) “Duration Study for the AT&T Mandarin Text-to-Speech System.” In Conference Proceedings of the Second ESCA/IEEE Workshop on Speech Synthesis: 29-32.
Xu, Yi. (1997) “Contextual Tonal Variations in Mandarin.” Journal of Phonetics 25: 61-83.
Xu, Yi. (1999) “Effects of Tone and Focus on the Formation and Alignment of f0 Contours.” Journal of Phonetics 27: 55-107.
ELISABETH SELKIRK
BENGALI INTONATION REVISITED: An Optimality Theoretic Analysis in which FOCUS Stress Prominence Drives FOCUS Phrasing*
1. INTRODUCTION

In this paper, I want to investigate the consequences of an idea about focus prosody that was first put forward by Jackendoff 1972, namely the hypothesis that the focus-phonology interface in grammar is expressed as a relation between focus-marked syntactic constituents on the one hand, and prosodic stress prominence on the other. A strong form of the hypothesis, advocated in Truckenbrodt’s 1995 thesis and pursued here and in other recent work of mine (e.g. Selkirk 2002), is that the focus-phonology interface consists only of interface constraints on the relation between syntactic focus and prosodic prominence. All the other predictable, nonmorphological, phonological properties of focus are claimed to be derived as a consequence of phonological markedness constraints on the relation between prosodic prominence and other aspects of phonological representation. This proposal can be called the Focus-Prominence theory of the focus-phonology interface. I think this theory provides an insightful account of the array of phonological properties that are associated with focus crosslinguistically, and at the same time explains the observed generalizations about focus projection and the distribution of focus-related prominence within the sentence. The question of focus projection is not addressed in this paper (but see Selkirk 1999, 2000; Selkirk and Katz, in preparation). What I want to show here is that Focus Prominence theory provides the basis for an understanding of focus-related phonological phrasing. In this I am following a path first charted out by Truckenbrodt 1995.

Focus constituents are claimed to display a variety of prosodic properties crosslinguistically:

i. appearance of special tonal morphemes¹
ii. appearance of default pitch accent²
iii. demarcation by a prosodic phrase edge³
iv. presence of main stress of a prosodic phrase⁴
v. appearance in a higher pitch range⁵
vi. vowel length under main phrasal stress⁶

(This list should not be taken to be exhaustive.)
215 C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 215–244. © 2007 Springer.
Should there be distinct focus-prosody interface constraints to account for each of the diverse non-morphemic prosodic properties listed above? I think not. The Focus Prominence hypothesis holds that there is a prevalent commonality to the phonological expression of focus, in languages of diverse types, and that it lies in the level of stress prominence assigned within a focus constituent. The appeal of this hypothesis is that stress prominence, at the appropriate designated level, is quite plausibly responsible for the various other reported phonological reflexes of focus, be it the appearance of default pitch accents to mark stress prominence, the lengthening of vowels under that prominence, or the appearance of a phonological phrase edge adjacent to that prominence. So under the Focus Prominence theory there are no constraints directly relating predictable pitch accent or prosodic phrasing to the focus-marking of constituents in the interface syntactic structure. For example, there would be no constraints of the form Align L/R (Focus, π), where π is a prosodic constituent of a selected level. Rather, following Truckenbrodt’s 1995 proposal, the presence of a π edge flanking a focus would be the consequence of a constraint calling for the focus constituent to contain a prosodic prominence, together with a prosodic alignment constraint calling for a prominence to be located at the edge of the prosodic constituent of which it is the head⁷.

Bengali presents an apparent counterexample to the claim made by Focus Prominence theory that the phonological phrase edge alignment that appears with focus can be derived through the markedness-driven alignment of a prosodic phrase edge with the stress prominence of that phrase. The prominence-based theory of focus phrasing predicts a phonological phrase edge at only one edge of a focus constituent, the edge where the focus prominence is located.
But according to Hayes and Lahiri in their classic 1991 article on Bengali intonation, a focus constituent in Bengali is flanked by phonological phrase edges at both the right and the left edges of the focus. The stress prominence of a phonological phrase in Bengali is claimed by Hayes and Lahiri to be located at the left of the phrase. So within the Focus Prominence theory, the appearance of a phonological phrase edge at the left edge of a focus constituent could be derived through an instance of the familiar sort of surface phonological markedness constraint Align R/L (π-prom, π), which aligns a π-prominence with a π-edge (π-prom is the prominent daughter constituent of π, its head). It is the right phrase edge with focus that poses the problem. There is no evidence elsewhere in the language for the alignment of a phonological phrase with the right edge of a constituent. So Hayes and Lahiri propose a focus interface alignment constraint, formulable as Align R (Focus, ϕ), to account for the right phrase edge (ϕ stands for phonological phrase).

The present theory, which seeks to eliminate focus-phrasing alignment constraints from the universal interface constraint repertoire and to reduce all nonmorphemic phonological reflexes of focus to reflexes of stress prominence, will require some principled non-prominence-based explanation for the right phrase edge with focus in Bengali. The purpose of this paper is to put forward such an explanation.
An example of the focus phrasing seen in Bengali appears in sentence (2) below. (2) is a sentence with a contrastive focus appearing on a medial constituent within a left-branching object noun phrase. The surface syntactic structure which we tentatively assume for this focus-marked sentence is as in (1). The prosodic phrasing in (3), which is an all-new, out-of-the-blue utterance of the same sentence structure, but minus the focus marking, should be contrasted with that in (2)⁸.

(1)  [S [NP ami] [VP [NP [PP [NP [NP raǰa-r] [N-FOC čhobi-r]] [P ǰonno]] [N ʈaka]] [V anlam]]]
     I  king’s  PICTURES  for  money  gave
     ‘I gave money for the king’s PICTURES.’

                        ↓ phrase-edge not prominence-related
       L*      HP  L*   HP                LI
(2)  ((ami raǰar) (čhobir) ǰonno ʈaka anlam)IP
       I   king’s  PICTURES for   money gave

       L*HP  L*                        H*  LI
(3)  ((ami) (raǰar čhobir ǰonno ʈaka) (anlam))IP
       I     king’s pictures for money  gave
These are both declarative utterances. The phrasing of the neutral focus sentence (3) puts the subject, the complex NP object, and the verb each in a separate phonological phrase. (Nonfinal phonological phrases are in general marked by two tonal events: the presence of a L* pitch accent on the main stressed syllable in the phrase and the presence of a HP peripheral tone at the right edge of the phrase.) The focus sentence (2) alters the otherwise default phrasing by flanking the focus constituent, here a head noun internal to the complex noun phrase, with the left and right edges of a phonological phrase. The arrow marks the problematic right phonological phrase edge found at the right edge of the focus, the phrase edge that the Focus Prominence hypothesis can’t account for.

Aside from the flanking of a contrastive focus constituent with phonological phrase edges, there is another important property of sentences with focus in Bengali, namely the absence of any phonological phrase following the focus constituent. This is visible in (2) through the absence of any pitch accent or nonfinal peripheral tones following the focus. We will see that this apparent “dephrasing” can also be given
an explanatory account by Focus Prominence theory, in terms already suggested by Truckenbrodt 1995 for Japanese (section 2).

2. SKETCHING OUT THE FOCUS PROMINENCE THEORY

I am going to assume that an utterance is simultaneously analyzed in terms of two discrete types of structure: morphosyntactic and phonological. Specifically, the assumption is that the two output representations defined by the grammar for a sentence, namely the surface morphosyntactic representation (PF) and the surface phonological representation (PR), share a terminal string. This assumption about the interfacing output representations predicts three general types of constraint that would be defined on output representations alone: morphosyntactic markedness constraints, phonological markedness constraints, and interface constraints relating morphosyntactic and phonological properties of the output. Syntactic structure-prosodic structure alignment constraints such as Align-L (XP, MaP) are a classic type of interface constraint. They have a demarcative function, in calling for the edge of a designated category in the syntax to correspond to the edge of a designated category of prosodic structure (cf. Selkirk 1986 et seq). In addition, the family of Wrap constraints proposed by Truckenbrodt 1995, 1999 has a cohesive function in requiring that a syntactic constituent of a particular level be entirely contained within a prosodic phrase of a particular level. These Align and Wrap constraints on the syntax-phonology interface clearly have the function of carrying over salient, landmark properties of the morphosyntactic phrase structure constituency into the hierarchically organized phonological/prosodic representation of the sentence. These constraints, which are apparently cross-categorial, ignore any featural properties of the morphosyntactic representation.
Constraints relating focus and prosodic prominence of the sort being proposed in Focus-Prominence theory belong to a distinct class of syntax-phonology interface constraints. A morphosyntactic constituent with the property of being a focus is assumed to be focus-marked (Jackendoff 1972, Selkirk 1984, 1995, Rooth 1992, 1995, Schwarzschild 1999 and many others), so that any constraint calling for a focus-marked constituent in PF to contain a certain level of prosodic prominence in PR is a syntax-phonology interface constraint. But the difference between this and the Align/Wrap constraints is that information structural salience, represented as a property of individual morphosyntactic constituents, is being translated into prosodic structure salience, or prominence. Thus the two defining properties of prosodic structure (prosodic grouping structure and prosodic prominence, or headedness) correspond to the two faces of surface morphosyntactic structure: phrase structural grouping and an encoding of information structural prominence.

What then might be the formulation of the Focus-Prominence interface constraints that are being given responsibility for at least some of the focus phrasing properties of Bengali? Providing a fully motivated answer to this question is a topic of ongoing research, but it is possible to say something here of the ideas under consideration. Truckenbrodt 1995 and Rooth 1996 propose a Focus Prominence constraint that is essentially syntagmatic in character:
(4) Focus Prominence Constraint, syntagmatic (Truckenbrodt 1995, Rooth 1996):
    A focus is more prominent than any other element within the focus domain.
    [where focus and focus domain are syntactic/semantic constituents]

In its definition of focus prominence this theory does not distinguish between types of focus (e.g. contrastive vs. presentational) and their associated types of domain constituent. Nor does the theory assure a regular prosodic level of prominence for the different focus types. In other work, however, this simplicity is shown to be problematic for the characterization of at least a certain range of focus phenomena (see Selkirk 2000, 2002; Sugahara 2002, 2003). So in what follows I will assume a paradigmatic theory of Focus Prominence, leaving open the question whether the syntagmatic version above is also required in grammar. The paradigmatic theory of Focus Prominence that I am entertaining posits a family of Focus Prominence constraints of the general form in (5), according to which a focused constituent of a particular morphosyntactic structure type must contain a phonological prominence of a particular prosodic structure type:

(5) Focus-Prominence Constraint Family, paradigmatic (Selkirk 2000a, 2002)
    ƒ(Xⁿ) ⊂ ∆(π)
    “The terminal string of an ƒ-focussed syntactic constituent of level Xⁿ in PF (the interface morphosyntactic representation) is a terminal string of PR (the interface phonological representation) which contains the designated terminal element ∆ of a prosodic constituent of level π.”
    i. ƒ is a variable over focus types (contrastive, presentational, …)
    ii. Xⁿ is a variable over syntactic constituent types (word, phrase, …)
    iii. ∆ stands for “designated terminal element of” (see below), and
    iv. π is a variable over prosodic constituent types

Of particular relevance to the current paper is a constraint relating the presence of a contrastively focused constituent in the syntax to the presence of a prosodic prominence of the Intonational Phrase level in the phonology. The formulation in (6) appears to achieve the correct results for Bengali.

(6) FOCUS Prominence: FOCUS(α) ⊂ ∆IP
    “The terminal string of a contrastively focused (“big” FOCUS) constituent of level α in PF (= α FOC) is a terminal string of PR which contains the designated terminal element ∆ of an Intonational Phrase.”

Contrastive focus invokes a set of alternatives and its semantics can be characterized by alternatives semantics (Rooth 1992, 1995). I am suggesting here
that it is an intonational phrase-level prominence that is called for in a contrastive focus constituent (notated with big caps as FOCUS and referred to as ‘big’ focus). As for other Focus Prominence constraints, they would include, at a minimum, constraint(s) relating words or phrases that are in presentational focus to presumably lower levels of prosodic prominence, for example:

(7) Focus XP Prominence: Focus(XP) ⊂ ∆MaP
    “The terminal string of a presentationally focussed (“small” Focus) constituent of level XP in PF (= XP Foc) is a terminal string of PR which contains the designated terminal element ∆ of a Major Phrase.”

A presentational focus has the property of newness in the discourse, and its semantics is characterizable in terms of the theory proposed by Schwarzschild (1999). It will sometimes be notated with initial caps as Focus and nicknamed ‘small’ focus. As for the prosodic category name ‘major phrase’, this is the level of phrasing immediately below the intonational phrase, sometimes also referred to as ‘intermediate phrase.’ I have chosen the term ‘major phrase’ for its mnemonic value, since the level of prosodic major phrase is identified by its alignment with the morphosyntactic maximal projection phrase.

Notice that these hypothesized constraints of the paradigmatic Focus Prominence theory make the felicitous prediction that the phonological properties of big, contrastive, FOCUS are either a superset of those of small, presentational, Focus, or, if different, are characteristic of a higher level of prominence than those of small focus. This is because, given the nature of prosodic structure, the ∆IP called for in a big focus constituent is necessarily also a ∆MaP, and ∆MaP is what is called for in a presentational focus phrase. That is, both contrastive and presentational focus will be called on by constraints to show the properties of a ∆MaP, but only contrastive focus will be called on to show the properties of the higher level ∆IP. Call this prediction big focus-small focus containment. This point becomes clear when we examine the definitions for designated terminal element and prosodic head and apply them to an example.

(8) Def: A head of a prosodic constituent π is (i) the most prominent prosodic constituent immediately dominated by π (the π-prom of π) or (ii) the most prominent prosodic constituent immediately dominated by a head of π.

(9) Def: The designated terminal element (DTE, or ∆) of a prosodic constituent π is that mora in the terminal string of π that is dominated by the chain of heads of π.

Note that the sample representation (10) satisfies the Focus Prominence constraint in (6), which requires that the contrastively focused word Mississippi contain the designated terminal element of an intonational phrase IP.
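Definitions (8) and (9) amount to a simple procedure: starting from a prosodic constituent, step repeatedly to its most prominent daughter until a mora is reached. The following is a minimal Python sketch of that procedure, applied to the phrase "visit Mississippi" with Mississippi in contrastive focus. The class `Node`, the helpers `dte`, `terminals` and `satisfies_focus_prom`, and the choice of which daughter is the head at each level are illustrative assumptions, not the author's formalization.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str                                  # prosodic category: "IP", "MaP", "MiP", "PWd", "Ft", "σ", "µ"
    children: List["Node"] = field(default_factory=list)
    head: int = 0                               # index of the most prominent daughter (the π-prom)
    word: Optional[str] = None                  # on moras only: the word the mora belongs to

def dte(node: Node) -> Node:
    """Follow the chain of heads down to a terminal mora, as in definition (9)."""
    while node.children:
        node = node.children[node.head]
    return node

def terminals(node: Node) -> List[Node]:
    """Left-to-right terminal string (moras) of a prosodic constituent."""
    if not node.children:
        return [node]
    return [t for child in node.children for t in terminals(child)]

def syll(word: str) -> Node:
    """A light syllable with a single mora (a simplifying assumption)."""
    return Node("σ", [Node("µ", word=word)])

# Assumed structure: (vi.sit) and (Mi.ssi)(ssi.ppi), with the second foot
# of "Mississippi" as head of its prosodic word.
visit = Node("PWd", [Node("Ft", [syll("visit"), syll("visit")])])
mississippi = Node("PWd", [
    Node("Ft", [syll("Mississippi"), syll("Mississippi")]),
    Node("Ft", [syll("Mississippi"), syll("Mississippi")]),
], head=1)
mip = Node("MiP", [visit, mississippi], head=1)
major = Node("MaP", [mip])
ip = Node("IP", [major])

def satisfies_focus_prom(root: Node, focus_word: str) -> bool:
    """FOCUS-Prom as in (6): the focused word contains the DTE of the root."""
    return dte(root).word == focus_word

print(satisfies_focus_prom(ip, "Mississippi"))
print(dte(ip) is dte(major))   # the ∆IP is also the ∆MaP of the head major phrase
```

Because the search only ever descends through heads, the mora it returns for IP is the same mora it returns for the head MaP. That is the sense in which a ∆IP is necessarily also a ∆MaP.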
According to the recursive definition of head given above, the boldfaced head constituents are all heads of IP. Assuming that moras are part of the terminal string, the penultimate mora in Mississippi is the designated terminal element of IP. This is because it is the head mora of the head syllable of the head foot of the head prosodic word of the head minor phrase of the head major phrase of the intonational phrase. Turning to Bengali, we will assume that the focus type whose prosodic properties are being described in the Hayes and Lahiri paper is big, contrastive, focus. Their examples of focus involve cases of explicitly contrastive focus or answers to wh-questions. So we will be investigating in Bengali the consequences of assuming that a big focus (FOCUS) constituent contains the DTE of an Intonational Phrase, as called for by the big FOCUS Prom constraint in (5). The properties of presentational focus in Bengali have not yet been submitted to a systematic investigation. (10)
[Prosodic tree for “visit [Mississippi]FOC”: IP dominates MaP, which dominates MiP; MiP dominates two PWds (visit, with one Ft, and Mississippi, with two Ft); each of the six syllables σ dominates one mora µ.]
π-prom of IP = MaP; π-prom of MaP = MiP; π-prom of MiP = PWd; π-prom of PWd = Ft; π-prom of Ft = σ; π-prom of σ = µ
The penultimate mora of Mississippi = ∆IP (dom. by the head σ, Ft, PWd, MiP, MaP of IP)
(Underlining will be consistently taken to denote head status.)
Let’s look at the general shape of the analysis I am proposing for the flanking of a contrastive FOCUS-marked syntactic constituent by phonological phrase edges in Bengali. First, FOCUS-Prom (6) calls for a ∆IP within the FOCUS constituent. This has the consequence that the ∆IP is dominated by the head MaP of IP, the head of that head MaP, and so on, as seen in the partial representation in (11) below:
ELISABETH SELKIRK
(11) [Partial prosodic structure 1] FOCUS-Prom ⇒
[Partial prosodic tree: the head chain IP, MaP, PWd, Ft, σ, µ, whose terminal µ = ∆IP lies within the FOCUS word c‡hobi-r.]
[[ami] [[[[raj‡a-r] [c‡hobi-r]FOC] j‡onno] ˇaka] [anlam]]]
I king’s PICTURES for money gave
‘I gave money for the king’s PICTURES’
In meeting the requirements of the FOCUS-Prominence constraint, head constituents are defined at all prosodic levels lower than IP. Now, the grammar contains a class of prosodic markedness constraints that call for the alignment of these prosodic head constituents with the right or left edge of their mother prosodic constituents (McCarthy and Prince 1993), such as the well-attested Align R/L (Ft, PWd). Hayes and Lahiri argue that a phonological phrase has its head at the left edge of the phrase, giving a pattern of left edge phonological phrase prominence. I will express this constraint as Align L (PWd, MaP), assuming that the phonological phrase appealed to in the constraint is at the level of the major phrase and that it is a prosodic word level head-constituent that is aligned with the MaP left edge. (This analysis ignores, for reasons of expository convenience, the possibility that there may be an additional level of phonological phrase (the Minor Phrase) intervening between PWd and MaP, as does the analysis of Hayes and Lahiri.) Following Truckenbrodt’s (1995) analysis of the left phonological phrase edge that appears with FOCUS in Japanese, my analysis of Bengali gives this prosodic alignment constraint the responsibility for the flanking of Bengali FOCUS with a left phonological phrase edge, as shown in (12a). [Note that (12a) is only a partial prosodic tree and (12b) is a partial prosodic labelled bracketing.]
(12) [Partial prosodic structure 2] Align L (PWd, MaP) ⇒
a. [Partial prosodic tree: IP dominates two MaPs; the head chain PWd, Ft, σ, µ of the second MaP lies over the FOCUS word, which stands at that MaP’s left edge.]
[H]FOC [L]DECL
[[ami] [[[[raj‡a-r [c‡hobi-r]FOC] j‡onno] ˇaka] [anlam]]]
I king’s PICTURES for money gave
‘I gave money for the king’s PICTURES’
b. IP((ami raj‡ar)MaP MaP(c‡hobir j‡onno ˇaka anlam)IP
I king’s PICTURES for money gave.
On this proposal, then, a constraint like Align L (PWd, MaP) has in general two functions. Here it induces the presence of a phonological phrase edge at the edge of a prosodic prominence whose position with respect to the syntactic structure is fixed by the FOCUS-Prom constraint. In cases where the location of the prominence is not fixed by an interface constraint, the same constraint predicts that the prominence will fall wherever the grammar determines that the left edge of a phonological phrase might appear. This two-fold effect follows from the fact that the locus of prosodic prominence may either be fixed independently, in which case the edge comes to align with it, or not be fixed independently, in which case the prominence locates itself wherever the grammar may call for a phrase edge. As for the appearance of the right phonological phrase edge seen in (13) at the right edge of FOCUS, I argue in the following section that it is to be ascribed to the presence of the tonal morpheme [H]FOC at the right edge of the FOCUS constituent in morphosyntactic structure.
(13) [Partial prosodic structure 3] Align R ([H]FOC, MaP) ⇒
a. [Partial prosodic tree: IP dominates two MaPs; the head chain PWd, Ft, σ, µ of the second MaP lies over the FOCUS word; further PWds follow.]
[H]FOC [L]DECL
[[ami] [[[[raj‡a-r [c‡hobi-r]FOC] j‡onno] ˇaka] [anlam]]]
I king’s PICTURES for money gave
‘I gave money for the king’s PICTURES’
b. [H]FOC [L]DECL
IP(ami raj‡ar) MaP(c‡hobir)MaP j‡onno ˇaka anlam)IP
The (a) examples of these partial representations contain the morphosyntactic labelled bracketing of the sentence, which includes the marking for contrastive big FOCUS on the phrase-medial noun pictures, as well as what I will argue below are the tonal morphemes for FOCUS and DECLARATIVE, [H]FOC and [L]DECL respectively. The hypothesis is that the morphemic contrastive FOCUS tone is lexically specified to appear at the right edge of a phonological phrase, through the effect of a morpheme-specific alignment constraint Align R ([H]FOC, MaP). This constraint induces the presence of the phonological phrase edge at the right edge of the FOCUS constituent, the position that the FOCUS morpheme is hypothesized to occupy in morphosyntactic structure. Hayes and Lahiri in fact argue that the H phrase-edge tone of FOCUS is morphemic in Bengali; the present proposal simply draws the consequences of that morphemic status within the framework of assumptions adopted here. In section 3 I give arguments for the morphemic status of the focus H tone. There is a final phrasing property of big FOCUS sentences, one that is also arguably a prosodic prominence alignment effect, namely a “dephrasing” to the right of the FOCUS constituent. No tones appear between the right edge of the FOCUS constituent and the end of the sentence in Bengali. This can be seen in example (2). The post-FOCUS stretch is demarcated at the beginning by the [H] morphemic tone that appears at the right edge of the FOCUS and at the end by the sentence-final illocutionary tonal morpheme. Between them, there are no prominence-marking
pitch accents, nor any phrase-edge-marking H peripheral tones. Since tones mark these prosodic structure landmarks of a phonological phrase by default, the absence of the tones is most straightforwardly explained by the post-FOCUS absence of the phonological phrasing and prominence that trigger the presence of these tones. This sort of post FOCUS “dephrasing” is argued by Truckenbrodt 1995 to result from a constraint which calls for the prosodic head of an intonational phrase to align with the right edge of the IP. Any phonological phrase intervening between the FOCUS phrase and the right edge of the intonational phrase would be disaligning and so produce a non-optimal prosodic representation for the sentence. In particular, after FOCUS one never sees the appearance of the phrasing normally associated with the matrix verb. So the provisional constraint “Verb-ϕ Align” (see footnote 8) must be dominated by the IP-level prosodic alignment constraint. I will assume that the “dephrasing” observed in the optimal candidate moreover constitutes a violation of Exhaustivity (IP), hence:
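(14) Align R (MaP, IP) >> Exhaustivity (IP), “Verb-ϕ Align” ⇒
[Prosodic tree: IP dominates two MaPs, (ami raj‡ar) and (c‡hobi-r), each containing its PWds; the PWds j‡onno, ˇaka and anlam follow the second MaP and are not contained in any MaP.]
Tones: L* H on (ami raj‡ar); L* [H]FOC on (c‡hobi-r); [L]DECL at the end of IP.
IP((ami raj‡ar)MaP MaP(c‡hobi-r)MaP j‡onno ˇaka anlam)IP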
The section below is devoted to establishing that the account I have proposed for the appearance of a phonological phrase edge at the right of the FOCUS constituent is well founded. It will rely on establishing the morpheme status of the H tone that flanks the FOCUS constituent on the right as well as establishing the existence of a morpheme-specific alignment constraint that may induce the presence of a phonological phrase edge at the edge of the FOCUS tonal morpheme.
3. TONAL MORPHEMES IN BENGALI SENTENCE TONOLOGY
The preceding analysis of Bengali FOCUS prosody has adopted many of the assumptions of Hayes and Lahiri’s masterful (1991) account: the notion that phrasing is central to an account of the distribution of tones; the notion that phrase stress is leftmost in the phonological phrase, while stress prominence in the intonational phrase is rightmost; the notion that a [H]FOC tonal morpheme must be posited. Where the account proposed here crucially differs from Hayes and Lahiri’s is in giving the [H]FOC morpheme responsibility for the FOCUS-related phrasing. A more general difference is that the account offered here is a constraint-based optimality-theoretic account which seeks to provide an explicit, exhaustive, analysis of all the relevant tonal patterns in Bengali as well as of all the relevant phrasing
patterns in the language. Specifically, the aim is to provide a complete account of the tonological differences between declarative and question utterances under both “neutral” and contrastive focus conditions. We will see that the H tone that appears at the right edge of a FOCUS constituent has a significantly different behavior from the peripheral default H tone that is the regular marker of the right edge of a phonological phrase.
3.1. The intonation of neutral focus sentences
3.1.1. The default L* HP pattern for phonological phrases
To begin, we will look at Bengali sentences with so-called neutral or broad focus, starting with a treatment of the default L* pitch accent and H edge tones that mark nonfinal phonological phrases, as seen in (3). A pitch accent is simply a tone whose distribution is defined with respect to a prosodic prominence. The insertion of a default, non-lexically specified, pitch accent is a type of prosodic enhancement and must be the consequence of a phonological markedness constraint calling for the designated terminal element of a prosodic constituent to be associated with some tone. Constraints of this type are known to play a role in the world’s languages.⁹ The insertion of a default edge tone is also a variety of prosodic enhancement, this time serving to demarcate prosodic phrase edges. It must be the result of a markedness constraint calling for the edge of a phrase to be aligned with a tone. Again, such constraints are attested crosslinguistically.¹⁰ It is likely not a coincidence that the pitch accent and peripheral tone of the Bengali phonological phrase have polar tonal values, and indeed Hayes and Lahiri argue that the Obligatory Contour Principle (OCP) has a central place in Bengali tonology. For reasons to be seen right below, the OCP-based analysis offered here picks out the pitch accent as the tonal element whose polar value is a function of the other.
The High value of the peripheral tone will be specified by an edge-tone alignment constraint, as in (15a). With the High edge specified by constraint, the introduction of the polar Low value for the pitch accent can be achieved by a combination of the prominence-tone association constraint in (15b) and the OCP.
(15) a. Align R (MaP, H tone) “Align the R edge of a major phonological phrase with the R edge of a H tone.” [= the source of the default High edge tone]
b. Associate (∆MaP, Tone) “Associate the designated terminal element of a major phonological phrase, i.e. the head mora of the MaP, with some Tone.” [= the source of a pitch accent on ∆MaP, which is realized as either L or H, as dictated by the OCP]
The tableau in (16) illustrates the role for these constraints in deriving the tones of the initial phonological phrase from the sentence in (2):
(16) Input: … [ami] [[raj‡a-r] …

   Candidate                              | OCP | Align R (MaP, H) | Assoc (∆MaP, T) | *Tone
   a. … (ami raj‡ar)MaP …, no tones       |     | *!               | *               |
⇒ b. … (ami raj‡ar)MaP …, tones L H      |     |                  |                 | **
   c. … (ami raj‡ar)MaP …, tones H H      | *!  |                  |                 | **
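For readers less familiar with tableaux, the evaluation behind (16) can be sketched as a lexicographic comparison of violation profiles: candidates are compared on the highest-ranked constraint first, and lower constraints matter only among candidates that tie on everything above. The sketch below is a minimal illustration under stated assumptions. The violation counts are those read off tableau (16), while the dictionary keys, the candidate labels and the function name `optimal` are ad hoc, and the three higher constraints, which the text leaves unranked among themselves, are tried in every possible order.

```python
from itertools import permutations

# Violation counts as read off tableau (16); keys and labels are ad hoc.
TABLEAU_16 = {
    "a. no tones": {"OCP": 0, "AlignR(MaP,H)": 1, "Assoc(DTE-MaP,T)": 1, "*Tone": 0},
    "b. L H": {"OCP": 0, "AlignR(MaP,H)": 0, "Assoc(DTE-MaP,T)": 0, "*Tone": 2},
    "c. H H": {"OCP": 1, "AlignR(MaP,H)": 0, "Assoc(DTE-MaP,T)": 0, "*Tone": 2},
}


def optimal(tableau, ranking):
    """Return the candidate with the lexicographically smallest violation profile."""
    def profile(candidate):
        return tuple(tableau[candidate][constraint] for constraint in ranking)
    return min(tableau, key=profile)


# The higher constraints are not crucially ranked among themselves,
# so every ordering of them (with *Tone at the bottom) is tried.
TOP = ["OCP", "AlignR(MaP,H)", "Assoc(DTE-MaP,T)"]
winners = {
    optimal(TABLEAU_16, list(order) + ["*Tone"]) for order in permutations(TOP)
}
print(winners)  # {'b. L H'}

# Promoting *Tone to the top instead selects the toneless candidate.
print(optimal(TABLEAU_16, ["*Tone"] + TOP))  # a. no tones
```

The second call shows why the tone-demanding constraints must outrank *Tone: with the opposite ranking the toneless candidate a. would win.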
The two constraints in (15), which call for the presence of tone in the representation, crucially outrank the constraint *Tone, which minimizes the presence of tone in the representation. The OCP adjudicates the choice of tone, and is not crucially ranked with respect to the others. (The non-ranking among the higher constraints is provisional.) There is another role for the OCP. In addition to assuring the non-identical character of the tones introduced by default into the representation, as here, Hayes and Lahiri also propose that it is responsible for the failure of the default H edge tone to appear in the first place, when it is followed by another H tone in the utterance. This effect is seen in (3), where the bracketed peripheral tone is actually not realized, because of the H* that follows in the next phrase. The absence of that peripheral H will be analyzed below.
3.1.2. The tonal patterns of final phrases
The default L* H phrasal tone pattern is preempted in the final phrase of the sentence by the tonal morphemes expressing the illocutionary force of the sentence. The patterning of tones in the final phonological phrase of the sentence is contrastive, and is a function of the declarative vs. interrogative status of the utterance, together with the FOCUS status of the elements within the final phrase. The basic, neutral focus, declarative ends in a H* pitch accent followed by a L boundary tone, while the basic, neutral focus, yes-no interrogative ends in a L* plus HL boundary tone combination. Compare the declarative non-FOCUS sentence in (17) with the non-FOCUS interrogative in (18).
(17) Tones: L* H on (ami); L* on (raj‡ar c‡hobir j‡onno ˇaka); H* [L]DECL on (anlam)
(ami) (raj‡ar c‡hobir j‡onno ˇaka) (anlam)
I king’s pictures for money gave
“I gave money for the king’s pictures.”
(18) Tones: L* H on (ami); L* H on (raj‡ar c‡hobir j‡onno ˇaka); L* [HL]QUES on (anlam)
(ami) (raj‡ar c‡hobir j‡onno ˇaka) (anlam)
I king’s pictures for money gave?
“Did I give money for the king’s pictures?”
According to Hayes and Lahiri, the H* L of the declarative is composed of an underlying H* declarative morpheme followed by a L sentence-final ‘neutral’ morpheme, while the L* HL of the yes-no interrogative is composed of an underlying L* interrogative morpheme and a peripheral HL ‘yes-no’ morpheme. Operating within a pre-OT framework of assumptions, Hayes and Lahiri propose that the OCP has a role to play in determining possible tonal contours in Bengali, but they do not exploit this idea fully in the analysis of these contours. I would like to suggest, as an alternative OT-based analysis, that the final boundary L of the declarative is the declarative morpheme itself, and that the H* pitch accent preceding the declarative morpheme [L]DECL is not morphemic. Rather, that H* is a default pitch accent whose quality is determined by the OCP on the basis of the L quality of the declarative morpheme, as shown schematically in (19a). Similarly, the HL boundary combination can be taken to be the morpheme for the yes-no interrogative and the preceding L* pitch accent in the final phrase can be determined by OCP-respecting default, as shown in (19b):
(19) PF (morphosyntactic interface) | PR (surface phonological representation)
a. [[…….]InflP[ L ]DECL ]ForceP | (….. ( ∆ ….. µ)ϕ)IP, with H on ∆ and L on the final µ
b. [[…....]InflP[ HL ]QUES]ForceP | (….. ( ∆ ….. µ)ϕ)IP, with L on ∆ and HL on the final µ
In other words, I am suggesting that the illocutionary tonal morphemes in Bengali consist only of boundary tones, as is the case in many tone languages, for example. In the interface PF representation, these illocutionary morphemes are the functional heads of a syntactic projection that, following Rizzi (1997), will be referred to as a Force Phrase. In the surface phonological representation PR, the presence of the default pitch accent is determined by Assoc (∆MaP, Tone) and the quality of the pitch accent tone is determined by the quality of the illocutionary force morpheme and the OCP:
(20) Declarative. Input: […...[anlam]V ]InflP[ L ]DECL ]ForceP

   Candidate                          | Realize [L]DECL | Realize [HL]QUES | OCP | Assoc (∆MaP, T) | *Tone
⇒ a. H* [L]  (…… (anlam)MaP)IP       |                 |                  |     |                 | **
   b. L* [H]  (…… (anlam)MaP)IP       | *!              |                  |     |                 | **
   c. L* [L]  (…… (anlam)MaP)IP       |                 |                  | *!  |                 | **
   d. [L]     (…… (anlam)MaP)IP       |                 |                  |     | *!              | *

Interrogative. Input: […..[anlam]V]InflP[ HL ]QUES ]ForceP

   Candidate                          | Realize [L]DECL | Realize [HL]QUES | OCP | Assoc (∆MaP, T) | *Tone
⇒ a. L* [HL] (…… (anlam)MaP)IP       |                 |                  |     |                 | ***
   b. H* [L]  (…… (anlam)MaP)IP       |                 | *!               |     |                 | **
   c. H* [HL] (…… (anlam)MaP)IP       |                 |                  | *!  |                 | ***
   d. [HL]    (…… (anlam)MaP)IP       |                 |                  |     | *!              | **
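The polarity pattern that emerges from (16) and (20) can be summarized in a few lines: the default pitch accent takes the value opposite to the first tone that follows it. The function below is a deliberately simplified sketch, not the analysis itself. It treats the OCP as an inviolable rule rather than a ranked constraint, and the function name and the encoding of tone sequences as strings of `H` and `L` are assumptions made for the illustration.

```python
def default_pitch_accent(following_tones: str) -> str:
    """Return the default pitch accent polar to the first following tone.

    `following_tones` is the tone sequence to the right of the accent,
    written as a string over {"H", "L"}, e.g. "L" for [L]DECL.
    """
    if not following_tones or following_tones[0] not in ("H", "L"):
        raise ValueError("expected a non-empty sequence of H and L tones")
    opposite = "L" if following_tones[0] == "H" else "H"
    return opposite + "*"


print(default_pitch_accent("L"))   # H*  (final phrase of a neutral declarative)
print(default_pitch_accent("HL"))  # L*  (final phrase of a yes-no interrogative)
print(default_pitch_accent("H"))   # L*  (nonfinal phrase with default edge H)
```

The three calls correspond to the winning candidates of the declarative and interrogative tableaux in (20) and to the nonfinal phrase in (16).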
The constraints Realize [L]DECL and Realize [HL]QUES mentioned in the tableau assure that the tones of a tonal morpheme in the input are maintained in the output, in the quality specified in the input; these constraints are members of the family of constraints which require that a morpheme have some phonological realization in the output. I will assume that the general character of these Realize constraints for tonal morphemes is as in (21).
(21) Realize [Tone(s)]M (= a constraint schema)
The tone(s) of a tonal morpheme [T1 (T2)]M in the morphosyntactic input representation must be realized as such in the output phonological representation.
Together with the OCP, these faithfulness constraints assure that the default pitch accent in the final phrase is the polar opposite of the following lexically specified boundary tone morpheme. So just as the quality of the L* pitch accent in nonfinal phrases is determined by constraint, so is the quality of the pitch accents in the final phrase. Note that the constraint MaxTone, which calls for an input tone to have a corresponding tone in the output (McCarthy and Prince 1995), cannot be given the function of maintaining the tonal morphemes in the output. Bengali is not a tone
language, in which tonal contrasts in morphemes which also have segmental content are preserved on the surface. Rather, assuming Richness of the Base (Prince and Smolensky 1993), *Tone must be ranked above MaxTone in order to ensure that any nonmorphemic tones are eliminated in the output. But *Tone must be ranked below the morpheme realization constraints of the form Realize [Tone]M. An intonational language, which lacks lexical tone contrasts except for those found in tonal morphemes, is thus characterized by the ranking Realize [Tone]M >> *Tone >> MaxTone.
3.1.3. The absence of default edge H in the penultimate phrase in declaratives
A minor ranking adjustment to the constraint system developed so far will allow an account of a further property of declarative intonation. Hayes and Lahiri report that no default peripheral H tone appears on the penultimate phonological phrase in the case of the declarative, as seen in (17) (=(3)). They ascribe this to the OCP, which disallows a H tone sequence consisting of a phrase-final H followed by the pitch accent H* of the declarative. Since, by hypothesis, both of these tones are default, none of the faithfulness constraints seen above can decide which one of the H tones is realized. Rather, the ranking of Assoc (∆MaP, T) and the OCP over Align R (MaP, H) will derive the result that it is the edge tone, not the pitch accent, which fails to appear. In other words, it is more important in Bengali to maintain a pitch accent than to maintain a peripheral tone, when the identical qualities of these would produce an OCP violation. The tableau in (22), which illustrates the analysis, contains a version of sentence (17), which, for the sake of the exposition, lacks the overt subject noun phrase:
(22) Input: [[raj‡a-r c‡hobi-r j‡onno ˇaka] [anlam]][ L ]DECL]
Candidates all have the phrasing (raj‡ar c‡hobir j‡onno ˇaka) (anlam)); the tones of the first phrase are given before the slash, those of the second phrase after it.

   Candidate               | Realize α | OCP | Assoc (∆MaP, Tone) | Align R (MaP, H) | *Tone
   a. L* H / H* [L]DECL    |           | *!  |                    |                  | ****
   b. L* H / [L]DECL       |           |     | *!                 |                  | ***
⇒ c. L* ø / H* [L]DECL    |           |     |                    | *                | ***
   d. H* L / H* [L]DECL    |           |     |                    | *                | ****!
The optimal candidate c. shows a violation of the constraint Align R (MaP, H); the ranking of this constraint below the OCP and Assoc (∆MaP, Tone) allows this candidate to emerge as the winner. Candidate c. shares an Align R (MaP, H) violation with the nonoptimal candidate d., because both the absence of a H tone and the appearance of a L tone instead of H constitute violations of this constraint. Candidate d. is therefore ruled out by its greater number of violations of the structure-minimizing constraint *Tone. No higher ranked constraint calls for the presence of a default peripheral L at the edge of a major phrase, so *Tone rules it out.
3.1.4. The absence of default edge H in the final phrase in declaratives
There is one last property of the tonal patterns of the final phrase that remains to be explained, namely the absence of the default H right edge tone that is normally found in nonfinal phrases (except, of course, for the case just described). The default peripheral H tone simply does not appear preceding the L boundary tone of the declarative, as shown in (23a). As for the interrogative, which ends in a [HL] tonal morpheme, as in (18), there is no way of telling whether the default peripheral H tone is present as well.
(23) Neutral focus declaratives lack a phrase-peripheral High tone in the final phrase:
a. ok: …….. ( ∆ ….. µ)MaP)IP with tones H* [L]
b. *: …….. ( ∆ ….. µ)MaP)IP with tones L* H[L]
If the default H were to surface, the tonal pattern to be predicted by the OCP would be identical to that found in interrogatives, namely a L* pitch accent followed by a HL boundary sequence, as in (23b). Homophony avoidance is transparently not a factor in ruling out this candidate for the declarative pattern, however, since homophony of distinct sentence types is not avoided in Bengali. As we will see below, a declarative with a contrastive FOCUS in the final phrase has exactly the L* HL pitch pattern found in the interrogative. Rather, the impossibility of the pattern in (23b) is analyzable as a consequence of the constraint system. Basically, the proposal is that the tonal alignment constraint Align R (MaP, H), which is violated in the optimal candidate (17), is dominated by the morpheme-specific alignment constraint Align R ([L]DECL, IP) and the well-known tonal markedness constraint *Contour Tone. (24) gives the ranking that will derive the absence of the default peripheral H tone in the final phrase and (25) is the tableau that shows it. (The pitch accents of the final phrase are not shown in the schematic phrase-final representations in (25).) (24) Realize [L]DECL, Align R ([L]DECL, IP), *Contour Tone >> Align R (MaP, H) >> *Tone
(25) Input: …[anlam]] [ L]DECL ]ForceP
Candidates all have the shape …. (… µ µ)MaP)IP; only the tones associated to the final moras are listed.

   Candidate                                 | Realize [L]DECL | Align R ([L]DECL, IP) | *Contour Tone | Align R (MaP, H) | *Tone
   a. H                                      | *!              |                       |               |                  | *
   b. [H]                                    | *!              |                       |               |                  | *
   c. [L] H ([L] on the penultimate µ)       |                 | *!                    |               |                  | **
   d. H[L] (both on the final µ)             |                 |                       | *!            |                  | **
   e. H [L]                                  |                 |                       |               | *                | **!
⇒ f. [L]                                    |                 |                       |               | *                | *
Candidate f., with its simple declarative [L] morpheme, is the optimal one. It violates Align R (MaP, H), but does not show the violations of the higher ranked constraints seen in candidates a.-d., and has fewer violations of *Tone than candidate e. has. The constraint *Contour Tone introduced here is a tonal markedness constraint familiar from much previous research. Its essential role is to disallow the case where both the peripheral default H and the tonal morpheme [L]DECL are associated to the same tone-bearing unit, i.e. the same mora. As for the constraint Align R ([L]DECL, IP), it has the function of ruling out candidate c. in this tableau, in which the declarative morpheme is associated to the penultimate mora of the phrase rather than to the edge mora, to which the default H edge tone is associated here. Morpheme-specific subcategorizational alignment constraints like Align R ([L]DECL, IP) are made explicit or presupposed in the literature (Gussenhoven 2000, Grice et al. 2000), where they are given the function of linearizing tonal morphemes within the prosodic representation.
(26) Align R ([L]DECL, IP) Align [L]DECL with the rightmost tone-bearing unit of an Intonational Phrase.
Note that an alternative analysis based on the metathesis-banning input-output correspondence constraint Linearity (McCarthy and Prince 1995) cannot do the job of ruling out candidate c., since the H, as a default tone, is not in the input representation, and so its position with respect to input tones is not regulated by the constraint. This completes my constraint-based analysis of the tonal contours found in declarative and interrogative sentence types under conditions of neutral focus. The full constraint ranking motivated so far, (27), shows the role for totally familiar types of constraints from the tonal and intonational literature in accounting for neutral intonation in Bengali.
(27) Realize [L]DECL, AlignR ([L]DECL, IP), *ContourTone, OCP, Assoc (∆MaP, Tone) >> AlignR (MaP, H) >> *Tone
In the next section an analysis of tonal contours in sentences with contrastive FOCUS will be provided which draws on this constraint ranking and adds to it just the constraints relevant to realizing and linearizing the FOCUS morpheme, namely Realize [H]FOC and Align R ([H]FOC, MaP).
3.2 The intonation of sentences with contrastive FOCUS
A declarative sentence containing a FOCUS constituent lacks the H* LI contour of the basic declarative. Instead what one finds in the FOCUS declarative is a final contour consisting of a L* pitch accent followed by a H peripheral tone followed by the L peripheral tone of the declarative morpheme. There are two cases to distinguish:
(28) FOCUS constituent is final in the declarative sentence (on the verb)
Tones: L* HP on (ami); L* HP on (raj‡ar c‡hobir j‡onno ˇaka); L* H [L] on (anlam)
(ami) (raj‡ar c‡hobir j‡onno ˇaka) (anlam)
I king’s pictures for money GAVE
(29) FOCUS constituent is not final in the declarative sentence
Tones: L* HP on (ami raj‡ar); L* H on (c‡hobir); [L] at the end of the sentence
(ami raj‡ar) (c‡hobir) j‡onno ˇaka anlam)
I king’s PICTURES for money gave.
The H peripheral tone of a FOCUS, marked in bold italics, always flanks the right edge of the morphosyntactic FOCUS constituent and so differs in its distribution from the declarative morpheme [L], which is confined to the right edge of the sentence. If the FOCUS constituent is not final in the sentence, the H appears at its non-final right edge, at a distance from the L tone at sentence end.
3.2.1. Final FOCUS
Let’s review the Hayes and Lahiri argument that the H tone appearing with final FOCUS in declaratives is a morpheme, rather than merely the default H peripheral
tone seen in nonfinal phonological phrases. Hayes and Lahiri base the argument on the contrast between the final tonal patterns of nonFOCUS declaratives like (17) and FOCUS declaratives like (28). The contrast is not in the final L tone, which is common to both forms of the declarative. The contrast is also not in the tonal value of the pitch accents, which are predictable on the basis of the quality of the following peripheral tone. It is the presence of the peripheral H tone in final FOCUS declaratives like (28) which is contrastive. That peripheral H in (28) must be morphemic. As we saw above, it cannot be an instance of the default peripheral H tone, which simply fails to appear in nonFOCUS declaratives like (17). So we must posit a FOCUS tonal morpheme, [H]FOC, an entity whose presence in the representation can be assured by a morpheme-realization constraint. As we will see, it is this morphemic status which permits an explanation for the distribution of this H tone in (28) and (29), and for the appearance of the right edge of a phonological phrase at the right edge of the FOCUS constituent. A contour tone consisting of the [H]FOC morpheme and the [L]DECL morpheme is formed at sentence edge in the final FOCUS case. The simple presence of these tones in the representation is guaranteed by morpheme realization constraints, but faithfulness does not guarantee the joint positioning of the tonal morphemes at the right extreme of the utterance, in violation of *Contour Tone. The creation of the contour tone must be forced by constraints requiring that these morphemes appear at a phrase edge. Such an alignment constraint was proposed above for the declarative morpheme, namely (26), Align R ([L]DECL, IP). For the FOCUS morpheme, the constraint should be formulated as an alignment with the edge of a phonological phrase:
(30) Align R ([H]FOC, MaP) Align [H]FOC with the rightmost tone-bearing unit of a Major Phrase.
The ranking of these two morpheme-specific alignment constraints above *Contour Tone, given in (31), explains why the two morphemes form an otherwise illicit contour at phrase edge, as we see in tableau (32).
(31) Realize [H]FOC, Realize [L]DECL, Align R ([H]FOC, MaP), Align R ([L]DECL, IP) >> *Contour Tone
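(32) Input: ... [ anlam][H]FOC] [ L]DECL ]ForceP
Candidates all have the shape …. (… µ µ)MaP)IP; only the tones associated to the final moras are listed.

   Candidate                                  | Realize [L]DECL | Realize [H]FOC | Align R ([H]FOC, MaP) | Align R ([L]DECL, IP) | *Contour Tone
⇒ a. [H] [L] (both on the final µ)           |                 |                |                       |                       | *
   b. [H]                                     | *!              |                |                       |                       |
   c. [L]                                     |                 | *!             |                       |                       |
   d. [H][L] ([H] on the penultimate µ)       |                 |                | *!                    |                       |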
Note that this analysis assumes that the alignment constraints for both [H] and [L] tonal morphemes are satisfied by an association to the final tone-bearing unit of the phrase, as seen in candidate a. In other words, the [H] in the optimal candidate a. is considered to be right-aligned even if it precedes the [L] within the phrase. Candidate d. shows a real misalignment of the [H], however, in being associated to the penultimate mora. In candidates b. and c., it is the disappearance of the input tonal morphemes, in violation of the morpheme realization constraints, which accounts for the ungrammaticality of the forms. What we don’t yet have an explanation for is the ungrammaticality of an additional candidate where the order of the morphemes in the final contour tone is simply the opposite of what we see in candidate a. Some additional principle would be required to account for the optimality of candidate a. over this alternative. In the spirit of Pierrehumbert and Beckman (1988), one might assume that an IP-aligned edge tone must lie outside a MaP-aligned edge tone. But there is also a possible explanation based on the positioning of these tonal morphemes in the morphosyntactic structure, where the sentence-peripheral [L] declarative tone lies higher up and to the right of the focus [H] tone, which marks a constituent lower down in the sentence. To sum up, the two morpheme-specific constraints for the FOCUS morpheme, Realize [H]FOC and AlignR ([H]FOC, MaP), have been brought into play in this section and the constraint ranking has been refined. The full constraint ranking is now as in (33).
(33) Realize [L]DECL, AlignR ([L]DECL, IP), Realize [H]FOC, AlignR ([H]FOC, MaP), Assoc (∆MaP, Tone) >> *Contour Tone
*Contour Tone, OCP >> Align R (MaP, H) >> *Tone >> MaxTone
In the next section we will see that the constraints motivated here will also enable us to account for the characteristic tone and phrasing properties of nonfinal FOCUS in Bengali.
3.2.2. Nonfinal FOCUS
Now we are at the point where we can understand the apparently problematic fact with which this paper began, namely the fact that a FOCUS is always flanked by a MaP edge on its right, even when it is not final in the sentence. Sentence (2), repeated here in (34), contains an example of a non-final FOCUS. The interface syntactic representation is (35).
(34) Tones: L* H on (ami raj‡ar); L* H on (c‡hobir); LI at the end of IP
((ami raj‡ar) (c‡hobir) j‡onno ˇaka anlam)IP
(35) [Syntactic tree with the node labels S, VP, NP, PP, P, N and V over the words of the sentence; the FOCUS morpheme [H]FOC is adjoined to the noun c‡hobi-r under the node NFOC.]
ami raj‡a-r c‡hobi-r j‡onno ˇaka anlam
I king’s PICTURES for money gave
‘I gave money for the king’s pictures’
What immediately meets the eye (and ear) is that the right edge of the nonfinal FOCUS constituent is marked by a H tone. We must presume that this is the same focus morpheme [H]FOC that is observed when the FOCUS is final in the sentence. For explicitness, let’s take the FOCUS morpheme to be adjoined as a
suffix to a word (as in (35)) or a larger phrase, where it licenses the FOCUS property on the dominating node, which in turn gets interpreted as FOCUS in the semantics. Given the position of the [H]FOC morpheme as a suffix of the FOCUS constituent in the syntax, the interface and markedness constraints in (33) will guarantee that in the phonological representation of the sentence the right edge of the FOCUS constituent will correspond to the right edge of a major phrase in the declarative case given in (34). The constraint AlignR ([H]FOC, MaP) plays a crucial role in deriving this result. The analysis goes as follows. The FOCUS morpheme [H]FOC is forced by faithfulness to the syntactic representation (call this “Syntax Faith” for short) to remain in its syntactically specified position as a suffix at the right edge of the FOCUS constituent. Confined to that position, the FOCUS morpheme is nonetheless required to satisfy its own morpheme-specific interface alignment constraint, AlignR ([H]FOC, MaP), which calls for the morpheme to appear at the right edge of a major phrase in phonological representation. Since the position of the [H]FOC is fixed by the syntax in a context in which the right edge of a phonological phrase may not be called for, satisfaction of the alignment constraint may require that the phrase edge be introduced into the representation. In other words, AlignR ([H]FOC, MaP) may in effect induce the presence of the phrase edge. This is the case in (34)/(35), as seen in (36):

(36) Nonfinal FOCUS in the declarative

Input: [[raj‡a-r [[c‡hobi] [H]FOC-r] j‡onno ˇaka] [anlam]] [L]DECL]

Candidates (tone string, then prosodic phrasing):
⇒ a. L* [H]FOC [L]DECL: ( ….. (c‡hobir)MaP j‡onno ˇaka anlam)IP
   b. L* [H]FOC [L]DECL: ( ….. (c‡hobir j‡onno ˇaka anlam)MaP)IP
   c. L* [H][L]: ( ….. (c‡hobir j‡onno ˇaka anlam)MaP)IP
   d. L* [H]FOC L* H* [L]DECL: ( ….. (c‡hobir)MaP (j‡onno ˇaka)MaP (anlam)MaP)IP

        | Syntax Faith | AlignR ([H]FOC, MaP) | AlignR (MaP, IP) | Exh (IP)
⇒ a.    |              |                      |                  | *
   b.   |              | *!                   |                  |
   c.   | *!           |                      |                  |
   d.   |              |                      | *!               |
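The selection of candidate a. in (36) can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it assumes the four constraints are strictly ranked in the left-to-right column order of the tableau, and it encodes the violation marks as read off the tableau. The names RANKING, TABLEAU_36, profile and optimal are illustrative only.

```python
# Constraints of tableau (36), highest-ranked first.
# Assumption: the column order of the tableau is treated as a strict ranking.
RANKING = ["Syntax Faith", "AlignR([H]FOC, MaP)", "AlignR(MaP, IP)", "Exh(IP)"]

# Violation counts per candidate, as read off the tableau.
TABLEAU_36 = {
    "a": {"Exh(IP)": 1},
    "b": {"AlignR([H]FOC, MaP)": 1},
    "c": {"Syntax Faith": 1},
    "d": {"AlignR(MaP, IP)": 1},
}


def profile(violations, ranking):
    """Return a candidate's violation counts as a tuple in ranking order."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)


def optimal(tableau, ranking):
    """Return the candidates whose profile is lexicographically smallest.

    Comparing tuples lexicographically implements strict domination: a single
    violation of a higher-ranked constraint outweighs any number of
    violations of lower-ranked ones.
    """
    best = min(profile(v, ranking) for v in tableau.values())
    return sorted(name for name, v in tableau.items() if profile(v, ranking) == best)


winners = optimal(TABLEAU_36, RANKING)
print(winners)
```

Because each losing candidate violates one of the three higher constraints and candidate a. violates only Exh (IP), the outcome does not depend on how the three higher constraints are ordered among themselves, provided that Exh (IP) is ranked below them.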
Candidate c. moves the [H]FOC to coincide with a MaP edge at the end of the sentence, and so violates Syntax Faith. Candidate b. lacks a right edge of MaP at the [H]FOC in situ position, and so violates AlignR([H]FOC,MaP). Candidate a., which respects both these constraints, is the optimal one. As for candidate d., it contains the phonological phrase edge that is otherwise always present at the left edge of the verb, as well as the edge induced by the FOCUS morpheme, all organized into a prosodic structure respecting Exhaustivity. But, as was proposed in section 2, this post-FOCUS phrasing is ruled out by the markedness constraint which aligns the head MaP with the right edge of IP. The optimal candidate a. lacks any Major
Phrase intervening between the head MaP of the FOCUS and the end of the sentence, and so is not considered to count as a violation of AlignR (MaP, IP). It does show a violation of the lower ranked constraint Exhaustivity (IP) (Selkirk 1995), which requires the Intonational Phrase to immediately dominate only major phrases, i.e. constituents at the next level down in the prosodic hierarchy. So this, then, is the explanation for the presence of the right edge of phonological phrase at the right edge of a nonfinal FOCUS constituent in Bengali. The FOCUS morpheme, through its own, independently motivated, subcategorizational prosodic alignment constraint AlignR ([H]FOC, MaP), induces the presence of the phrase edge observed. This means that there is no reason to follow Hayes and Lahiri in positing a FOCUS-prosody interface constraint which aligns the right edge of a FOCUS syntactic constituent with the right edge of phonological phrase. The Hayes and Lahiri analysis is incompatible with the Focus Prominence theory of the interface of focus and phonology, so it is a welcome result that there is an alternative to that theory which falls out from the independently motivated analysis of Bengali intonation that has been proposed here. While the current proposal might be preferable on the grounds of theoretical economy, given that it successfully excludes the class of Focus-Phrasing interface alignment constraints from universal grammar, it would be desirable to clinch the case on the basis of empirical fact. Fortunately, the facts are in principle available, though they have not yet been investigated. In a current collaborative project with Aditi Lahiri, we hope to bring the facts to light. The theory proposed here predicts that if the [H]FOC morpheme is for some reason absent at the right edge of a FOCUS constituent in surface representation, there should be no right edge of phonological phrase at that location.
The Hayes and Lahiri theory predicts on the other hand that, regardless of the presence or absence of the FOCUS morpheme, a phonological phrase edge should appear at the right edge of a syntactic FOCUS constituent. Now there happens to be a case of nonfinal FOCUS in Bengali where the [H]FOC morpheme fails to be realized in the output. This occurs in interrogatives such as (37), where the FOCUS [H] tone is deleted:
(37) L* H L* [HL]QUES
     (ami raj‡ar) (c‡hobir j‡onno ˇaka anlam)
     I king’s PICTURES for money gave
     ‘Did I give money for the PICTURESFOC of the king?’
The tonal morpheme for interrogatives is [HL]QUES, and, as Hayes and Lahiri point out, the absence of the FOCUS morpheme [H]FOC at the right edge of the FOCUS constituent could be attributed to the OCP. Given the Hayes and Lahiri analysis of FOCUS phrasing, there are no implications of this tonal deletion for the phrasing. But in the analysis of Bengali intonation that I have proposed, the loss of the tonal morpheme implies an absence of phonological phrase edge at the right edge of the FOCUS constituent, since there is no other constraint that would produce that phrasing. Now it turns out that there is a way of probing this difference in phrasing predictions in Bengali.
As Hayes and Lahiri demonstrate with admirable systematicity, the phonological phrase organization of Bengali is reflected not just in the patterning of tones within the sentence, but also in the segmental phonology. Interword assimilations like the complete assimilation of final r to a word-initial coronal are found within the phonological phrase, but are blocked at phrase edges. So, for example, the sequence /c‡Hobi-r j‡onno / is differently realized in sentences (2) and (3). In (3), where the sequence is phrase-internal, the /r j‡/ sequence is realized on the surface as [j‡ j‡], while in (2), where the first word is a FOCUS and followed by a phonological phrase edge marked by the [H] focus morpheme, the sequence remains unchanged. Segmental assimilation patterns thus provide a means of diagnosing the presence of phonological phrase edges independent of tone, and it turns out that they may choose between my theory of the appearance of phonological phrase edge at FOCUS right edge and the one proposed by Hayes and Lahiri. My analysis predicts that there should be assimilation in the sequence /r j‡/ in (37), since the sequence is phrase-internal. Hayes and Lahiri predict that assimilation should be blocked, since they posit a phrase edge there even in the absence of [H]. The assimilation facts for this case are not reported in Hayes and Lahiri 1991, and are unavailable to me at this writing, but hopefully will emerge soon from the joint investigation of such cases that is currently planned. I want to complete this section by showing just how it is that my analysis will select the representation in (37) as optimal for a case of nonfinal FOCUS in an interrogative sentence. The input representation for (37) contains two tonal morphemes, [H]FOC and [HL]QUES.
If the nonrealization of the [H]FOC morpheme is the consequence of the OCP, then it must be the case that both the OCP and the constraint Realize [HL]QUES dominate the constraint Realize [H]FOC, which is violated in the representation, as in (38). The tableau in (39) illustrates the analysis.

(38) Realize [HL]QUES, OCP >> Realize [H]FOC

(39) Nonfinal FOCUS in the interrogative

Input: [….. […..[[c‡hobi][H]FOC-r]…..][HL]QUES]

Candidates (tone string, then prosodic phrasing):
   a. L* [H]FOC [HL]QUES: ( ….. (c‡hobir)MaP ….. )IP
   b. L* [H]FOC: ( ….. (c‡hobir)MaP ….. )IP
   c. L* [H]FOC [L]QUES: ( ….. (c‡hobir)MaP ….. )IP
⇒ d. L* [HL]QUES: ( ….. (c‡hobir ….. )MaP)IP
   e. L* [HL]QUES: ( ….. (c‡hobir)MaP ….. )IP

        | OCP | Realize [HL]QUES | Realize [H]FOC | AlignR ([H]FOC, MaP) | Exh (IP)
   a.   | *!  |                  |                |                      | *
   b.   |     | *!               |                |                      | *
   c.   |     | *!               |                |                      | *
⇒ d.    |     |                  | *              |                      |
   e.   |     |                  | *              |                      | *!
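The ranking argument in (38) can be illustrated by enumeration. The sketch below is an illustration under stated assumptions rather than the author's procedure: the violation marks are those read off tableau (39), Exh (IP) is assumed to stay lowest-ranked (as its column position suggests), and AlignR ([H]FOC, MaP) is omitted because no candidate is taken to violate it. The names TABLEAU_39, UPPER, LOWEST and winners are hypothetical.

```python
from itertools import permutations

# Violation counts for candidates a-e, as read off tableau (39).
TABLEAU_39 = {
    "a": {"OCP": 1, "Exh(IP)": 1},
    "b": {"Realize [HL]QUES": 1, "Exh(IP)": 1},
    "c": {"Realize [HL]QUES": 1, "Exh(IP)": 1},
    "d": {"Realize [H]FOC": 1},
    "e": {"Realize [H]FOC": 1, "Exh(IP)": 1},
}

# The three constraints whose relative ranking is at issue in (38).
UPPER = ["OCP", "Realize [HL]QUES", "Realize [H]FOC"]
# Assumption: Exh(IP) is held fixed at the bottom of the hierarchy.
LOWEST = "Exh(IP)"


def winners(tableau, ranking):
    """Return the candidates that are optimal under a strict ranking."""
    def key(name):
        return tuple(tableau[name].get(c, 0) for c in ranking)
    best = min(key(name) for name in tableau)
    return sorted(name for name in tableau if key(name) == best)


# Try every ordering of the upper constraints and keep those under which
# candidate d. is the unique winner.
rankings_selecting_d = [
    order for order in permutations(UPPER)
    if winners(TABLEAU_39, list(order) + [LOWEST]) == ["d"]
]

for order in rankings_selecting_d:
    print(" >> ".join(order))
```

Under these assumptions, candidate d. is the unique winner only in the orderings that place both the OCP and Realize [HL]QUES above Realize [H]FOC, which is the configuration stated in (38). If Realize [H]FOC outranks the OCP, candidate a. wins instead; if it outranks Realize [HL]QUES but not the OCP, candidates b. and c. tie.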
The optimal candidate d. lacks the FOCUS morpheme, and in so doing respects the higher ranked OCP and Realize [HL]QUES, while incurring a violation of Realize [H]FOC. In this optimal candidate, there is no phrase edge at the right of the FOCUS since there is no [H]FOC to require it. Note that candidate e. has the same tones as the optimal d. but differs in having a phrase edge present at the right edge of the FOCUS. In this particular case a phrase edge in that medial position would be ruled out by the constraint Exhaustivity (IP), since the stretch between the major phrase it demarcates and the end of the intonation phrase is not itself parsed into a major phrase. Observe that the new ranking in (38) is consistent with the other rankings motivated above for Bengali tonology. (40) is the summary ranking in (33), modified in virtue of (38). Here the OCP is promoted from the lower rank it had been given in (33) for want of any further evidence.

(40) Realize [HL]QUES, OCP
     Realize [H]DECL, AlignR ([L]DECL, IP), Realize [H]FOC, AlignR ([H]FOC, MaP), Assoc (¨MaP, Tone)
     *Contour Tone, AlignR (MaP, H), *Tone, MaxTone
The claim embodied by exploiting a tonal grammar of this sort is that the tonal/intonational patterns of sentences, in any language, must be seen as deriving from the interaction of different types of constraints, including morpheme-specific realization and alignment constraints, generic faithfulness constraints like MaxTone, prosodic enhancement constraints calling for (default) pitch accent or edge tones, and classic tonal markedness constraints like the OCP and *Contour Tone. Of course these tonal constraints interact with the constraints of the grammar which define the prosodic structure of sentences. They may either collaborate within a prosodic structure that is independently defined, or, as in the case of the morpheme-specific constraint AlignR ([H]FOC, MaP), may in fact be responsible for the presence of some aspect of prosodic structure.

4. SUMMARY

In the early sections of the paper, I sketched out a theory of Bengali FOCUS-related phrasing that would be consistent with the Focus Prominence hypothesis, and in the last section this theory was further fleshed out, and shown to be viable. To summarize, the constraints and rankings crucially involved in the analysis of Bengali FOCUS phrasing patterns are:
(i) The FOCUS-Prominence interface constraint: FOCUS (α) ⊂ ¨IP
(ii) Phonological markedness constraints of the prosodic prominence / prosodic edge alignment family:
     -- AlignL (PWd, MaP)
     -- AlignR (MaP, IP)
(iii) The ranking hierarchy FOC-Prom, AlignR (PWd, MaP) >> *StrucMaP (collectively responsible for the phrase edge at the left of FOCUS)
(iv) The morpheme-specific alignment constraint AlignR ([H]FOC, MaP) (responsible for the phrase edge at the right of FOCUS)
(v) The ranking hierarchy FOC-Prom, AlignR (MaP, IP), AlignR ([H]FOC, MaP) >> Exh (IP) (collectively responsible for absence of phrasing to the right of the FOCUS phrase)

The FOCUS-Prominence interface constraint makes appeal to the FOCUS properties of syntactic constituents in the interface representation, and is seconded in producing its prosodic phrasing consequences by familiar prosodic markedness constraints, as proposed by Truckenbrodt 1995. The additional right-edge phrasing effect in Bengali is produced by a constraint which calls on a specific morpheme in the interface syntactic representation to be aligned with a prosodic phrase edge in phonological representation, namely AlignR ([H]FOC, MaP). The existence of this latter sort of constraint, which relates the FOCUS morpheme to prosodic phrasing, is consistent with the Focus Prominence theory of the focus-phonology interface. Focus Prominence theory does not exclude subcategorizational constraints that are restricted to specific morphemes like the FOCUS morpheme. The theory limits only the nature of interface constraints which appeal to the semantically interpreted focus feature marking of higher order constituents in the syntactic representation. This focus marking of higher order constituents may of course be projected from focus morphemes like that in Bengali, but the morpheme itself is not a focus(sed) constituent in this sense.
The facts of Bengali focus intonation therefore do not challenge the hypothesis that the only focus-phonology interface constraints in a grammar are those which relate focus-marked constituents of surface PF to prosodic stress prominence in surface PR. The Hayes and Lahiri claim for the centrality of the OCP is supported in the present optimality theoretic analysis, which relies on the OCP for an explanation of the polar character of pitch accents and following peripheral tones within a phrase, as well as for an explanation of the absence of peripheral tones (whether default tone or underlying tonal morpheme) when a following tone (whether default pitch accent or underlying boundary tone morpheme) would be of identical tone quality. This long-distance application of the OCP, between tones of disparate provenance and surface association type is noteworthy, and demands notice in a typology of possible conditions of OCP application across languages. In the particular case of Bengali,
assuming that the OCP governs possible output representations has permitted a pared down theory of what the tonal morphemes of Bengali are in the first place, restricting them in this language to sentence-final morphemes, as in the case of the declarative [L] and interrogative [HL] illocutionary force morphemes, or to the constituent-final [H] FOCUS morpheme. All other tones in Bengali intonation are analyzable as default tones, whose presence, and quality, is determined by phonological markedness constraints.

University of Massachusetts Amherst

5. NOTES

* The research for this paper was supported in part by National Science Foundation grant BCS000438 The Reflexes of Focus in Phonology, Principal Investigator: Elisabeth Selkirk.
1 [H*+L]FOC pitch accent in European Portuguese (Frota 2000), [H]FOC phrase-edge tone in Bengali (Hayes and Lahiri 1991), [H]FOC accent-tropic tone in Swedish (Bruce 1977)
2 Selkirk 1984, 1995 proposes that pitch accents are a default reflex of the presentational Focus status of a word in English. Selkirk 2002 suggests that it is the L+H* which appears by default with contrastive FOCUS.
3 Hungarian (Vogel and Kenesei 1987), Japanese (Pierrehumbert and Beckman 1988), Chichewa (Kanerva 1989), Shanghai Chinese (Selkirk and Shen 1990), and others
4 Jackendoff 1972, Hayes and Lahiri 1991, Reinhart 1995, Roberts 1996
5 Pierrehumbert and Beckman 1988, Inkelas and Leben 1990
6 European Portuguese (Frota 2000)
7 Note that I am not saying that there are no alignment constraints at all which characterize the syntax-phonology interface. Indeed, there is evidence that, independent of focus, you do need interface constraints aligning the edges of syntactic constituents defined in X-bar level terms with prosodic constituents at a designated level, e.g. Align R/L (XP, MaP) (see Selkirk 1986 et seq, Nespor and Vogel 1986, Chen 1987, Truckenbrodt 1998, Sugahara 2003, among others).
8 The position of the verb in the surface representation of these sentences is particularly in need of clarification. Given the structure in (1), there can be no principled explanation for the systematic appearance of a phonological phrase break at the left edge of the verb, seen in nonFOCUS sentences such as (3). But since this aspect of Bengali phonological phrasing is not of immediate concern, I will continue to assume the structure in (1). It at least shows the analysis in terms of noun phrases that will survive regardless of the ultimate decision about their position in a higher order syntactic structure.
9 For example, in some languages with lexical pitch accent, words lacking pitch accents in their input form receive a default pitch accent on the main stressed syllable of the output form. See Zec 1999 on Serbo-Croatian, Lahiri 2002 on Swedish.
10 Beginning with the analysis of “initial lowering” in Japanese as an alignment of L and H peripheral tones (Poser 1984, Pierrehumbert and Beckman 1988), there have been a variety of languages analyzed as showing default, constraint-introduced edge tones, including the medial MaP-edge L phrase tone of English (Selkirk 2000), the LH phrase edge tone of Korean (Jun 1993), etc.
6. REFERENCES

Bruce, Gösta. Swedish Word Accents in Sentence Perspective. Lund: Gleerup, 1977.
Chen, Matthew. “The syntax of Xiamen tone sandhi.” Phonology 4 (1987): 109-150.
Frota, Sonya. Prosody and Focus in European Portuguese: Phonological Phrasing and Intonation. New York: Garland Publishing, 2000.
Grice, Martine, D.R. Ladd and Amalia Arvaniti. “On the Place of Phrase Accent in Intonational Phonology.” Phonology 17 (2000): 143-185.
Gussenhoven, Carlos. “The Lexical Tone Contrast of Roermond Dutch in Optimality Theory.” In M. Horne (ed.), Prosody: Theory and Experiment. Dordrecht: Kluwer Publishing, 2001.
Hayes, Bruce and Aditi Lahiri. “Bengali Intonational Phonology.” Natural Language and Linguistic Theory 9 (1991): 47-96.
Inkelas, Sharon and W. R. Leben. “Where Phonology and Phonetics Intersect: The Case of Hausa Intonation.” In J. Kingston and M. Beckman (eds.), Papers in Laboratory Phonology 1: Between the Grammar and Physics of Speech, pp. 17-34. Cambridge: Cambridge University Press, 1990.
Jackendoff, Ray. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press, 1972.
Jun, Sun-Ah. The Phonetics and Phonology of Korean Prosody. New York: Garland Publishing, 1995.
Kanerva, Jonni. Focus and Phrasing in Chichewa Phonology. Stanford University: Doctoral dissertation, 1989.
Kanerva, Jonni. “Focusing on Phonological Phrases in Chichewa.” In S. Inkelas and D. Zec (eds.), The Phonology-Syntax Connection, pp. 145-162. Chicago: University of Chicago Press, 1990.
Lahiri, Aditi, A. Wetterlin, and E. Steiner. “Unmarked Tone in Scandinavian.” Manuscript. Fachbereich Allgemeine Sprachwissenschaft, University of Konstanz, 2002.
McCarthy, John and Alan Prince. “Generalized alignment.” In G. Booij and J. van Marle (eds.), Yearbook of Morphology, pp. 79-153. Dordrecht: Kluwer, 1993.
Nespor, Marina and Irene Vogel. Prosodic Phonology. Dordrecht: Foris, 1986.
Pierrehumbert, Janet and Mary Beckman. Japanese Tone Structure. Cambridge, Mass.: MIT Press, 1988.
Poser, William. The Phonetics and Phonology of Tone and Intonation in Japanese. MIT: Doctoral dissertation, 1984.
Prince, Alan and Paul Smolensky. Optimality theory: Constraint Interaction in Generative Grammar. Manuscript, Rutgers University and Johns Hopkins University, 1993.
Reinhart, Tanya. “Interface Strategies.” OTS Working Papers, OTS-WP-TL-95-002, Utrecht University, 1995.
Rizzi, Luigi. “The Fine Structure of the Left Periphery.” In L. Haegeman (ed.), Elements of Grammar. Handbook of Generative Syntax, pp. 281-337. Dordrecht: Kluwer, 1997.
Roberts, Craige. “Focus, Information Flow and Universal Grammar.” In P. Culicover and L. McNally (eds.), The Limits of Syntax, pp. 109-160. New York: Academic Press, 1998.
Rooth, Mats. “A Theory of Focus Interpretation.” Natural Language Semantics 1 (1992): 75-116.
Rooth, Mats. “Focus.” In S. Lappin (ed.), The Handbook of Contemporary Semantic Theory. London: Blackwell, 1996a.
Rooth, Mats. “On the Interface Principles for Intonational Focus.” Proceedings of SALT VI, pp. 202-226. Ithaca, NY: Cornell University, 1996b.
Schwarzschild, Roger. “Givenness, Avoid F, and Other Constraints on the Placement of Accent.” Natural Language Semantics 7 (1999): 141-177.
Selkirk, Elisabeth. Phonology and Syntax: The Relation between Sound and Structure. Cambridge, Mass.: MIT Press, 1984.
Selkirk, Elisabeth. “Sentence Prosody: Intonation, Stress, and Phrasing.” In John Goldsmith (ed.), The Handbook of Phonological Theory, pp. 550-569. Cambridge: Blackwell Publishers, 1995.
Selkirk, Elisabeth. “Interface Constraints on Focus.” Talk delivered at the Workshop on Syntax-Phonology Interface, Linguistic Society of Japan, Tokyo, November 1999.
Selkirk, Elisabeth. “Focus Types and Tone.” Paper presented at the First North American Phonology Conference, Concordia University, Montreal, 2000.
Selkirk, Elisabeth. “The Interaction of Constraints on Prosodic Phrasing.” In M. Horne (ed.), Prosody: Theory and Experiment. Dordrecht: Kluwer Publishing, 2001.
Selkirk, Elisabeth. “Contrastive FOCUS vs. Presentational focus: Prosodic Evidence from Right Node Raising in English.” In B. Bel and I. Marlin (eds.), Speech Prosody 2002: Proceedings of the First International Speech Prosody Conference, pp. 643-646. Laboratoire Parole et Langage, Université de Provence, Aix-en-Provence, 2002.
Selkirk, E. and J. Katz (in preparation). Phrasal stress and focus types. Ms. UMass Amherst and MIT.
Selkirk, Elisabeth and Tong Shen. “Prosodic domains in Shanghai Chinese.” In S. Inkelas and D. Zec (eds.), The Phonology-Syntax Connection, pp. 313-338. Chicago: University of Chicago Press, 1990.
Selkirk, E. and K. Tateishi. “Syntax and downstep in Japanese.” In C. Georgopoulos and R. Ishihara (eds.), Interdisciplinary Approaches to Language. Essays in Honor of S.-Y. Kuroda. Dordrecht: Kluwer, 1991.
Sugahara, M. Downtrends and Post-FOCUS Intonation in Tokyo Japanese. University of Massachusetts, Amherst: Doctoral dissertation, in preparation.
Truckenbrodt, Hubert. Phonological Phrases: Their Relation to Syntax, Focus and Prominence. MIT: Doctoral dissertation, 1995.
Truckenbrodt, Hubert. “On the relation between syntactic phrases and phonological phrases.” Linguistic Inquiry 30 (1999): 219-255.
Vogel, Irene and István Kenesei. “Syntax and Semantics in Phonology.” In S. Inkelas and D. Zec (eds.), The Phonology-Syntax Connection, pp. 339-364. Chicago: University of Chicago Press, 1990.
Zec, Draga. “Footed Tones and Tonal Feet: Rhythmic Constituency in a Pitch-Accent Language.” Phonology 16 (1999): 225-264.
MARK STEEDMAN
INFORMATION-STRUCTURAL SEMANTICS FOR ENGLISH INTONATION*
C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 245–264. © 2007 Springer.

1. INTRODUCTION

Selkirk (1984), Hirschberg and Pierrehumbert (1986), Pierrehumbert and Hirschberg (1990), and the present author, have offered different but related accounts of intonation structure in English and some other languages. These accounts share the assumption that the system of tones identified by Pierrehumbert (1980), as modified by Pierrehumbert and Beckman (1988) and Silverman et al. (1992), has as transparent and type-driven a semantics in these languages as do their words and phrases. While the semantics of intonation in English concerns information structure and propositional attitude, rather than the predicate-argument relations and operator-scope relations that are familiar from standard semantics in the spirit of the papers collected as Montague 1974, this information-structural semantics is fully compositional, and can be regarded as a component of the same semantic system.

The present paper builds on Steedman (1991) and Steedman (2000a) to develop a new semantics for intonation structure, which shares with the earlier versions the property of being fully integrated into Combinatory Categorial Grammar (CCG, see Steedman 2000b, hereafter SP). This grammar integrates intonation structure into surface derivational structure and the associated Montague-style compositional semantics, even when the intonation structure departs from the restrictions of traditional surface structure. Many of the diverse discourse meanings that have been attributed to intonational tunes are shown to arise via conversational implicature from more primitive literal meanings distinguished along the three dimensions of information structure, speaker/hearer commitment, and contentiousness.

2. TONES AND INFORMATION STRUCTURE

It is standard to assume, following Bolinger (1958, 1961) and Halliday (1963, 1967a,b), that pitch-accents, high or low, simple or compound, are in the first place properties of the words that they fall on, and that they mark the interpretations of those words as contributing to the distinction between the speaker’s actual utterance and other things that they might be expected to have said in the context to hand, as in the “Alternative Semantics” of Karttunen (1976), Karttunen and Peters (1979), Rooth (1985, 1992), and Büring (1997a,b).1 In this sense, all pitch accents are contrastive. For example, in response to the question “Which finger did he bite?”, the word that contributes to distinguishing the following answer from other possible answers via reference is the deictic “this”, so the following intonation is appropriate.2

(1) He bit THIS one.
           H*   LL%
It is important to be clear from the start that the set of alternative utterances from which the actual utterance is distinguished by the tune is in no sense the set of all possible utterances appropriate to this context, a set which includes infinitely many things like “Mind your own business,” “That was no finger,” “What are you talking about?” and “Lovely weather we’re having.” Rather, the presupposed set of (presumably, ten) alternative utterances is accommodated by the hearer in the sense of Lewis (1979) and Thomason (1990), like any speaker presupposition that is not actually inconsistent with their beliefs. This does not imply that such alternative sets are confined to things that have been mentioned, or that they are mentally enumerated by the participants, or indeed that they are even finite.

In terms of Halliday’s given/new distinction pitch-accents are markers of “new” information, although the words that receive pitch-accents may have been recently mentioned, and it might be better to call them markers of “not given” information. That seems a little cumbersome, so I will use the term “kontrast” from Vallduví and Vilkuna 1998 for this property of English words bearing pitch-accents, spelling the corresponding verb “k-contrast”.3

I’ll further attempt to argue that there are just two independent semantic binary-valued dimensions along which the literal meanings of the various pitch-accent types are further distinguished. The first of these dimensions has been identified in the literature under various names, and distinguishes between what I’ll continue to call “theme” and “rheme” components of the utterance, using these terms in the sense of Bolinger (1958, 1961) rather than Halliday. Theme can be thought of informally as the part of the sentence corresponding to a question or topic that is presupposed by the speaker, and rheme is the part of the utterance that constitutes the speaker’s novel contribution on that question or topic. However, it will become clear below that the notion of theme differs from that of topic as defined by, for example, Gundel (1974); Gundel and Fretheim (2001) in being speaker-defined rather than text-based.

A great deal of the huge and ramifying literature on information structure can be summarized as distinguishing two dimensions corresponding to the given/kontrast and theme/rheme distinctions, although the consensus has tended to be obscured by the very different nomenclatures that have been applied. (See discussion by Steedman and Kruijff-Korbayová (2001), which summarizes the terminology and its lines of descent, along with some contiguous semantic influences.)

However, there is a further dimension of discourse meaning along which the pitch-accent types are distinguished which has not usually been identified in this literature. It concerns whether or not the particular theme or rheme to hand is mutually agreed, that is, uncontentious. This notion is related to various notions of Mutual Belief or Common Ground proposed by Lewis (1969), Cohen (1978), Clark and Marshall (1981) and Clark (1996).4
Both of these components of meaning are projected by the process of grammatical derivation from the words that carry the pitch-accent to the prosodic phrase corresponding to these information units, following Steedman 2000a, along lines briefly summarized in section 5. I’ll also try to argue that the intonational boundaries such as those sometimes referred to as “continuation rises,” which delimit the prosodic phrase, fall into two classes respectively distinguishing the speaker or the hearer as responsible for, or (in terms of the related accounts of Gussenhoven (1983, p. 201) and Gunlogson (2001, 2002)) committed to, the corresponding information unit.5

I’ll assume that the speaker’s knowledge can be thought of as a database or set of propositions in a logic (second-order, since themes etc. may be functions), divided into two subdomains, namely: a set S of information units that the speaker claims to be committed to, and a set H of information units which the speaker claims the hearer to be committed to. Information units are further distinguished on a dimension ±AGREED according to whether the speaker claims them to be uncontentious or contentious. The set of +AGREED information units is not merely the intersection of S and H: the speaker may attribute uncontentiousness to an information unit and responsibility for it to the hearer whilst knowing that in fact they do not regard themselves as so committed. In Steedman 2000a, S and H are treated as modalities [S] and [H] of a modal logic, and Stone (1998) has proposed a similar modality for mutual belief. In the present paper we will combine the feature ±AGREED with the speaker/hearer modalities, writing it as a superscript ±, as in [H+]. These classifications can be set out diagrammatically as in Tables 1 and 2, in which θ signifies theme, ρ signifies rheme, + indicates +AGREED, – indicates –AGREED, and [S] and [H] respectively denote speaker and hearer commitment.
If a theme or a rheme is marked as agreed, then it’s in AGREED, whoever is explicitly claimed to be committed to it. If it is not so marked, then it is not in AGREED, even if speaker and hearer in fact both believe it. This last possibility arises because H is only the speaker’s attribution of commitment to the hearer, not the hearer’s actual belief. It follows that a theme or rheme may be believed by the speaker, and asserted by the speaker to be something that the hearer is committed to, without the hearer’s actually agreeing to it. We will come to a case of this later on.

Table 1: The Meanings of the Pitch-accents

      | +           | –
θ     | L+H*        | L*+H
ρ     | H*, (H*+L)  | L*, (H+L*)

Table 2: The Meanings of the Boundaries

[S]   | L, LL%, HL%
[H]   | H, HH%, LH%
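The classification in Tables 1 and 2 amounts to two lookup tables, which can be sketched as follows. This is an illustrative encoding only: the assignment of L*+H, L* and H+L* to the non-agreed column rests on a reading of Table 1, and the names PITCH_ACCENTS, BOUNDARIES and interpret are hypothetical.

```python
# Table 1: pitch accent -> (information unit, +AGREED?).
# Assumption: the second column of Table 1 is the non-agreed column.
PITCH_ACCENTS = {
    "L+H*": ("theme", True),
    "L*+H": ("theme", False),
    "H*": ("rheme", True),
    "H*+L": ("rheme", True),
    "L*": ("rheme", False),
    "H+L*": ("rheme", False),
}

# Table 2: boundary -> participant claimed to be committed to the unit.
BOUNDARIES = {
    "L": "speaker", "LL%": "speaker", "HL%": "speaker",
    "H": "hearer", "HH%": "hearer", "LH%": "hearer",
}


def interpret(accent, boundary):
    """Combine the literal meanings of one pitch accent and one boundary."""
    unit, agreed = PITCH_ACCENTS[accent]
    return {"unit": unit, "agreed": agreed, "committed": BOUNDARIES[boundary]}


# A tune consisting of an L+H* accent followed by an LH% boundary.
print(interpret("L+H*", "LH%"))
# A tune consisting of an H* accent followed by an LL% boundary.
print(interpret("H*", "LL%"))
```

The lookup deliberately returns only the three literal components (theme or rheme, agreed or not, speaker or hearer commitment); on the account given here, richer discourse effects are left to implicature rather than encoded in the tables.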
At first glance, this proposal might appear to miss the point entirely. Where are notions like “topic continuation” (Brown, Currie and Kenworthy 1980) and “evaluation with respect to subsequent material” (Pierrehumbert and Hirschberg 1990), or the latter authors’ scales of commitment and belief? I’m going to argue that many of the effects that have been associated with intonational tunes arise as conversational implicatures from the interaction with context of literal meanings made up of the above simple components. To consider this claim we need some examples.

3. AN EXAMPLE: PITCH-ACCENTS

The first example commemorates Miles Davis’ response to Dave Brubeck’s question concerning his reason for playing E♮ as the final note of In Your Own Sweet Way, in place of E♭ as written by Brubeck:6

(2) DB: Why did you play E-natural?
    MD: (Why didn’t   YOU)       (WRITE      E-natural?)
                      L+H* LH%    H*         LL%
         background   kontrast    kontrast   background
         theme                    rheme
The LH% boundary splits the utterance into two intonational phrases and two information units. The L+H* accent marks the first of these units as theme (L*+H would also be appropriate). It falls on the word you because its referent (Brubeck) is the element that distinguishes this theme from the other themes that are available. (Lambrecht and Michaelis (1998) in a related approach call such “marked” or contrastive themes “ratified topics”. Ratification certainly presupposes some alternative. However, the example to hand suggests that ratification is only one of many things that you can do with a contrastive theme or topic.) The set of available themes, which we will call the “Theme Alternative Set” (ThAS), is presupposed by Davis and accommodated by Brubeck as including just two possible themes. These can be thought of informally as “Why did/didn’t Davis do x?” and “Why did/didn’t Brubeck do x?” More formally we can think of the Theme Alternative Set as a set of λ-terms, which for this context is as follows, in which ± stands for polarity:

(3) λvp.λreason.cause′ reason(± do′ vp brubeck′)
    λvp.λreason.cause′ reason(± do′ vp davis′)

(It’s assumed here that the fragment Why didn’t you is assigned a meaning which is a function from VP interpretations to why-question interpretations, the latter being themselves functions from adverbial interpretations to causal propositions. This is in fact what the CCG grammars outlined in SP and below actually deliver, given an appropriate lexicon.) Other themes and Theme Alternative Sets are possible. For example, a further L+H* pitch-accent on didn’t is possible:

(4) (Why   DIDN’T   YOU)        (WRITE   E-natural?)
           L+H*     L+H* LH%     H*      LL%
By saying (4), Davis presupposes, and Brubeck accommodates, a Theme Alternative Set which informally can be thought of as “Why did Davis do x” and “Why didn’t Brubeck do x”, and can be written as:
(5)
{ λvp.λreason.cause′ reason(– do′ vp brubeck′),
  λvp.λreason.cause′ reason(+ do′ vp davis′) }
In both cases, words whose interpretation distinguishes the intended theme from the others (which is how “k-contrasted” or “not given” is defined in the present system) bear pitch-accents, while those that do not contribute to the distinction (which is how we define “background” or “given”) do not. (See Prevost and Steedman 1994; Prevost 1995 for further detail on the determination of pitch-accent placement in sentence generation.)
We do not need to think of the Theme Alternative Sets as closed under terms that are already in play in the conversation. A more general representation of the ThAS for (2), reminiscent of the “Structured Meaning” approach of Cresswell (1973, 1985) and von Stechow (1981), can be obtained by abstracting over the element(s) corresponding to accented words, thus:
(6)
λsubj.λvp.λreason.cause′ reason(± do′ vp subj)
Similarly, the ThAS for (4) can be written as follows:
(7)
λpolarity.λsubj.λvp.λreason.cause′ reason(polarity(do′ vp subj))
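The relation between the enumerated sets in (3) and (5) and the abstractions in (6) and (7) can be illustrated with a small sketch. It is not part of the formal apparatus of the theory: the string rendering of logical forms and the names `theme_lf` and `alternative_set` are assumptions made purely for illustration. The idea is that elements corresponding to accented words are the parameters that vary across the set, while background elements are held fixed.

```python
from itertools import product

def theme_lf(polarity, subj):
    """Render a theme logical form as a string, in the style of (3) and (5)."""
    return f"λvp.λreason.cause′ reason({polarity} do′ vp {subj}′)"

def alternative_set(polarities, subjects):
    """Enumerate the themes obtained by varying polarity and subject."""
    return {theme_lf(p, s) for p, s in product(polarities, subjects)}

# (2): accent on YOU only, so only the subject varies; polarity stays open (±).
thas_2 = alternative_set(["±"], ["brubeck", "davis"])

# No accent in the theme: nothing varies, so the set is a singleton.
thas_unmarked = alternative_set(["–"], ["brubeck"])

# (4): accents on DIDN'T and YOU; the context restricts the set to the pair in (5).
thas_4 = {theme_lf("–", "brubeck"), theme_lf("+", "davis")}

# Abstraction (7) ranges over every polarity/subject combination in principle;
# the contextually supported set (5) is a subset of these.
thas_7 = alternative_set(["–", "+"], ["brubeck", "davis"])

for lf in sorted(thas_2):
    print(lf)
print(thas_4 <= thas_7)
```

On this reading, adding a pitch-accent adds a dimension along which the alternatives may differ, and the context then decides which of the resulting alternatives are actually in play.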
Of course, themes including this one may not, and in fact usually do not, bear any pitch-accent at all, as in:
(8)
(Why didn’t you)  (WRITE  E-natural ?)
                   H*               LL%
MARK STEEDMAN
Such noncontrastive or “unmarked themes” presuppose or are accommodated to a singleton ThAS, in this case the following:
(9)
{ λvp.λreason.cause′ reason(– do′ vp brubeck′) }
Thus according to the present theory, as Halliday and Brown insisted, what is “new”, “not given,” or k-contrasted vs. what is “given” or background is in part determined by the speaker, not a property of a text or context alone (Brown 1983:67). By the same token, the notion of theme is also partly speaker-determined, not text-based as is the notion of topic of Gundel (1974) and Gundel and Fretheim (2001).
Similar considerations govern the effect of the rheme-tune in (2) and (4). The H* accent marks the second information unit as a rheme, and it falls on the word write because it is the interpretation of that word that distinguishes this rheme from the others that the context affords. This set of available rhemes, which we will call the “Rheme Alternative Set” (RhAS), is again presupposed/accommodated by the participants to include only doing things to E♮. In this particular case we can think of the RhAS as being closed under the things that have actually been mentioned, that is, as
(10)
{ λx.play′ e′ x,
  λx.write′ e′ x }
Again, we can think of the RhAS more generally by abstracting over the transitive predicate in structured-meaning style:
(11)
λtv.λx. tv e′ x
We have so far passed over the role of the particular boundary tones in (2) and (4). Earlier we identified this role as assigning responsibility for theme/rheme status to either speaker or hearer. Thus the claim must be that in the above examples the theme is marked by Miles Davis as Brubeck’s responsibility, whereas the rheme is marked as his own. To see what this means, and to understand the implication of table 1 that both are “agreed”, we must look more broadly at the function of the boundary tones.
4. AN EXAMPLE: BOUNDARIES
Brown (1980:30) identifies the role of high boundaries as indicating that there is more to come on the current topic from some participant. Pierrehumbert and Hirschberg (1990:304-308), from whom the following example is adapted, make a related claim concerning interpretation with respect to succeeding material (again, this may come from either participant):
(12)
a. Attach the jumper cables to the car that’s running,
   L+H* L+H* LH%
b. Attach them to the car you want to start,
   L+H* L+H* LH%
c. Try the ignition,
   L+H* L+H* LH%
d. If you’re lucky,
   L+H* LH%
e. You’ve started your car.
   H* LL%
Pierrehumbert and Hirschberg don’t actually specify the pitch-accent types for this example, but L+H* seems appropriate for all accents except those in the last clause; in fact, H* accents sound quite odd, for reasons we’ll come to. In present terms, this means that the earlier clauses are all themes, and illustrates the fact that multiple themes, and in fact isolated themes without any rheme, are all possible.
It is interesting to consider the effect of replacing the LH% boundaries by LL% boundaries, retaining the L+H* accents. This manipulation does not affect the coherence of the example very much. The main effect is to make the speaker’s prescription seem somewhat abrupt and discouraging of any interruption, and to be generally unconcerned with whether or not it is making any sense to the hearer. In comparison, the original (12) seems more attentive, and to invite the hearer to take control of the discourse if they want to.
I’m going to claim that in both cases the forward motion of the discourse is the same, and is brought about, not by the inclusion of high boundaries as such, but by the rheme-expectation stemming from the theme-marking L+H* pitch-accents. The specific “kinder, gentler” effect of the version with LH% boundaries arises from their primary meaning of marking hearer-commitment. By marking the themes as, in the speaker’s view, the hearer’s responsibility (although in fact they may be completely new to the hearer), the possibility of the latter taking control of the discourse is maintained at every turn.
These claims are borne out by considering the effect of substituting H* rheme accents for L+H* in both high- and low-boundary versions. With high boundaries, the instructions become quite irritating, and seem to imply that the hearer knows all this already. With low boundaries, the effect is again abrupt and not hearer-oriented. In both cases, coherence (though inferable from world knowledge) is reduced.
I’m further going to claim that all the related effects of high boundaries, which have been variously described in the descriptive literature as “other-directed”, “turn-yielding”, “discourse-structuring,” or “continuation”, are similarly indirect implicatures that follow from the basic sense of high boundaries, which is to identify the hearer as in the speaker’s view committed to the relevant information unit.
5. THE FULL SYSTEM
We are now ready to look at the entire system laid out in tables 1 and 2, via some simpler minimal pairs of examples in which tones including the L* pitch-accents and boundaries are systematically varied across the same text. If we limit ourselves for the sake of simplicity to tunes with a single pitch-accent, assume that H*+L and H+L* are not distinct from H* and L*, and take LL% and LH% as representative of the two classes of boundary, then the classification in tables 1 and 2 allows eight tunes, which exemplify the 2^3 = 8 possible combinations of these three binary features. It is instructive to consider the effect of these tunes when applied to the same sentence “I’m a millionaire,” uttered in response to various prompts. It’s important to realize that all these responses are indirect, and their force depends on whether the participants regard being a millionaire as counting as being rich.
(13)
H: You appear to be rich.
S: I’m a MILLIONAIRE.
         H*          LL%
[S+]ρ millionaire′ me′
(S committed to an agreed rheme.)
(14)
H: You appear to be poor.
S: I’m a MILLIONAIRE.
         L*          LL%
[S-]ρ millionaire′ me′
(S committed to a non-agreed rheme.)
(15)
H: Congratulations. You’re a millionaire.
S: I’m a MILLIONAIRE?
         H*          LH%
[H+]ρ millionaire′ me′
(H committed to an agreed rheme.)
(16)
H: Congratulations. You’re a millionaire.
S: I’m a MILLIONAIRE?
         L*          LH%
[H-]ρ millionaire′ me′
(H committed to a non-agreed rheme.)
The above four responses can be assumed to consist of a single rheme.7 The ones involving an L* pitch-accent mark the rheme as being not agreed. However, the pitch-accent itself does not distinguish who the opposition is coming from. This is not an ambiguity in the pitch-accent itself. Rather, the identification of the source of the conflict and the entire illocutionary force of the response depends on inference on the basis of what else is known about the participants’ beliefs. Thus, in (14), the one who appears to doubt the proposition in the second utterance is the hearer, but in (16) it is the speaker. In different contexts, the difference could be reversed or eliminated.
A similar pattern can be observed for the theme pitch-accents:
(17)
H: You appear to be rich.
S: I’m a MILLIONAIRE.
         L+H*        LL%
[S+]θ millionaire′ me′
(S committed to an agreed theme.)
(18)
H: You appear to be poor.
S: I’m a MILLIONAIRE.
         L*+H        LL%
[S-]θ millionaire′ me′
(S committed to a non-agreed theme.)
(19)
H: You appear to be a complete jerk.
S: I’m a MILLIONAIRE.
         L+H*        LH%
[H+]θ millionaire′ me′
(H committed to an agreed theme.)
(20)
H: You appear to be a complete jerk.
S: I’m a MILLIONAIRE.
         L*+H        LH%
[H-]θ millionaire′ me′
(H committed to a non-agreed theme.)
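The eight tunes in (13) to (20) can be regenerated mechanically from the three binary features. The following is a minimal sketch under the simplifications stated in the text (one pitch-accent per tune, LL% and LH% standing for the two boundary classes); the dictionary layout and the function names `tune` and `gloss` are illustrative assumptions, not notation from the theory.

```python
from itertools import product

# Pitch-accents indexed by (information unit, agreement), as in (13)-(20).
ACCENTS = {
    ("rheme", "+"): "H*",
    ("rheme", "-"): "L*",
    ("theme", "+"): "L+H*",
    ("theme", "-"): "L*+H",
}
# Boundaries indexed by the participant marked as committed.
BOUNDARIES = {"S": "LL%", "H": "LH%"}

def tune(unit, agreed, committed):
    """Spell out the tune for one combination of the three features."""
    return f"{ACCENTS[(unit, agreed)]} {BOUNDARIES[committed]}"

def gloss(unit, agreed, committed):
    """Spell out the corresponding marker, e.g. [S+]ρ or [H-]θ."""
    symbol = "ρ" if unit == "rheme" else "θ"
    return f"[{committed}{agreed}]{symbol}"

# 2 x 2 x 2 = 8 combinations of unit, agreement, and commitment.
TUNES = {
    tune(u, a, c): gloss(u, a, c)
    for u, a, c in product(("rheme", "theme"), "+-", "SH")
}

for t, g in TUNES.items():
    print(f"{t:10} {g} millionaire′ me′")
```

The table only encodes the literal meaning of each tune. Who is actually disagreeing, and what the response implicates, is left to inference from context, as the surrounding discussion stresses.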
At first encounter, it may appear that these tunes must mark rhemes, like those in (13) to (16). However in Steedman 2000a, I show that these are in fact isolated themes, of the kind we have already noticed in connection with example (12). These isolated themes achieve the effect of a response (as well as various other implicatures of impatience, diffidence, incompleteness, etc.) via the indirect speech act of leaving the hearer to generate the rheme for themselves. As before, the tunes involving L*+H accents imply disagreement or absence from mutual belief. Once again, the source of the disagreement can only be identified from the full discourse context. In the case of (19) and (20), it is important to remember that the speaker’s LH% boundary means only that the speaker views the hearer as committed to these themes. As far as the hearer is concerned, that is not the same as an actual commitment. Thus the L*+H in (20) simply has the effect of correctly excluding from the mutual belief set AGREED this theme which the boundary marks as in H, in spite of the fact that it can also be inferred to be in the speaker’s own beliefs S. This is the possibility that was noticed in the discussion of tables 1 and 2: it seems a fundamental property of the system that there is a distinction between a proposition merely being in both S and H and it actually being in AGREED. The former amounts to a claim by the speaker that both participants ought to be committed to it. The latter is a claim by the speaker that both actually are committed. Example (20) is identical in information-structural terms to the following example, extensively discussed by Ward and Hirschberg (1985) (see Pierrehumbert and Hirschberg 1990:295, (26)):
(21)
H: Harry’s such a klutz.
S: He’s a good BADMINTON player.
               L*+H             LH%
In terms of the present theory, the response is an isolated theme, which achieves its effect of contradiction by: a) claiming via an LH% boundary that the hearer is committed to the proposition (even though in fact they may not be); b) claiming via the L*+H pitch-accent that the theme is not (yet) mutually agreed (even though the hearer may in fact believe its content already); and c) leaving the hearer to infer for themselves on the basis of their world knowledge about badminton players the implicated rheme, that Harry is not in fact a total klutz. The contradiction is particularly effective, because a and b between them further implicate that H’s original remark was pretty stupid, and thereby force the hearer to infer this intended further conclusion for themselves, without the speaker needing to utter it explicitly. However, this effect of the utterance is an indirect speech-act or conversational implicature, not part of the literal meaning of the words or the tones.
As an aside, it is striking that within the present theory, such conversational implicatures can be analyzed solely in terms of knowledge and modality, without appealing explicitly to notions of cooperation, flouting, or to speech-act types and illocutionary force recognition. Many of the examples discussed by Grice (1975) and Searle (1975) seem to be susceptible to similar knowledge-based analysis, making speech-act-theoretic analyses merely emergent, as in Steedman and Johnson-Laird (1980) and Cohen and Levesque (1990). For example, consider Grice’s famous analysis of the sarcastic or ironic conversational implicature achieved by saying “You’re a fine friend!” in a situation where the hearer has actually done the speaker a disservice.
His analysis requires the hearer to detect that the speaker has flouted a conversational maxim (Quality), to assume that the speaker is still cooperating, and therefore (by a step that is not quite clear) to infer that the speaker must mean the opposite of what they said. It is interesting, however, to observe that one intonation contour with which such sarcastic comments are characteristically uttered is the following:
(22)
You’re a FINE  FRIEND !
         L*    L*     LH%
This all-rheme utterance is marked by the L* pitch-accents as not agreed or in Mutual Belief, and by the LH% boundary as being something the hearer is in the speaker’s view committed to. It is the latter marking that makes the hearer compare the speaker’s proposition with their own beliefs, and identify the Rheme Alternative Set as something like the following:
(23)
{ (– fine′(friend′ self′)),
  (+ fine′(friend′ self′)) }
At this point, the speaker has achieved their goal of making the hearer aware of their own misdeed, and the indirect speech-act is complete, without any appeal to cooperation, maxims, or rules explicitly associating maxim-violating utterances with their negation. Indeed the effectiveness of the indirect accusation is greatly increased by the fact that the speaker has, so to speak, got under the hearer’s guard, forcing them into coming up with this thought for themselves, rather than stating it as a speaker commitment, which the hearer might reject. We as linguists may identify this as illocutionary uptake of an act of sarcasm, but the participants don’t need to know about any of this.
6. INTONATION IN COMBINATORY CATEGORIAL GRAMMAR
CCG is a form of lexicalized grammar in which grammatical categories are made up of a syntactic type defining valency and order of combination, and a logical form. For example, the English intransitive verb walks has the following category, which identifies it as a function from (subject) NPs (which the backward slash identifies as on the left, and the feature-value indicated by subscript SG identifies as bearing singular agreement) into sentences S:
(24)
walks := S\NP_SG : λx.walk′ x
Its interpretation is written as a λ-term associated with the syntactic category by the operator “:”. The transitive verb admires has the category of a function from (object) noun phrases (which the forward slash identifies as on the right) into predicates or intransitive verbs:
(25)
admires := (S\NP_SG)/NP : λx.λy.admire′ x y
In this case the syntactic type is simply the SVO directional form of the semantic type. (Juxtaposition of function and argument symbols in logical forms as in admire′x indicates function application. A convention of left association holds, according to which admire′xy is equivalent to (admire′x)y). In other cases categories may “wrap” arguments into the logical form, as in the analysis of Bach (1979, 1980), Dowty (1982), and Jacobson (1992). For example, the following is the category of the English ditransitive verb showed, which reverses the dominance/command relation of indirect and direct object x and y between syntactic derivation and the logical form:8 (26)
showed := ((S\NP_SG)/NP)/NP : λx.λy.λz.show′ y x z
(The reason for doing this is to capture at the level of logical form the binding theory and its dependence on the c-command hierarchy in which subject outscopes direct object, which outscopes indirect (dative) object, which outscopes more oblique arguments—see Steedman 1996 for discussion).
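The logical forms in (24) to (26) are curried functions, so they can be tried out directly. The sketch below models only the semantic side of each category; the tuple encoding of logical forms and the example sentence with showed are assumptions introduced for illustration. A tuple such as `("admire′", x, y)` stands for admire′ x y under the left-association convention.

```python
# (24) walks := S\NP_SG : λx.walk′ x
walks = lambda x: ("walk′", x)

# (25) admires := (S\NP_SG)/NP : λx.λy.admire′ x y
# The verb combines with its object first, then with its subject.
admires = lambda x: lambda y: ("admire′", x, y)

# (26) showed := ((S\NP_SG)/NP)/NP : λx.λy.λz.show′ y x z
# The first two arguments are "wrapped": x is taken first in the syntax,
# but y precedes x in the logical form.
showed = lambda x: lambda y: lambda z: ("show′", y, x, z)

# "Harry admires Louise"
print(admires("louise′")("harry′"))

# "Harry showed Louise Miles" (hypothetical sentence): the verb combines with
# the adjacent NP, then the next NP, then the subject.
print(showed("louise′")("miles′")("harry′"))
```

The second call makes the effect of the wrapping visible: the order in which the verb consumes its two objects in the derivation is the reverse of the order in which they appear in the logical form.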
The syntactic operations of CCG by which such interpretations are assembled are distinguished by being strictly type-dependent, rather than structure-dependent. For present purposes they can be regarded as limited to operations of type-raising (corresponding to the combinator T) and composition (corresponding to the combinator B). Type-raising turns argument categories such as NP into functions over the functions that take them as arguments, such as the verbs above, into the results of such functions. Thus NPs like Harry can take on categories such as the following:
(27)
a. S/(S\NP_SG) : λp.p harry′
b. S\(S/NP) : λp.p harry′
c. (S\NP)/((S\NP)\NP) : λp.p harry′
d. etc.
This operation has to be strictly limited to argument categories. One way to do so is to specify it in the lexicon, in the categories for proper names, determiners, and the like. The inclusion of composition rules like the following, as well as simple functional application and lexicalized type-raising, yields a potentially very free “reordering and rebracketing” calculus, engendering a generalized notion of surface or derivational constituency.
(28)
Forward composition (>B)
X/Y : f    Y/Z : g    ⇒_B    X/Z : λx.f(g x)
For example, the simple transitive sentence of English has two equally valid surface constituent derivations, each yielding the same logical form: (29)
Harry            admires              Louise
------------>T   ------------------   --------------
S/(S\NP_SG)      (S\NP_SG)/NP         S\(S/NP)
: λf.f harry′    : λx.λy.admire′ x y  : λp.p louise′
----------------------------------->B
S/NP : λx.admire′ x harry′
----------------------------------------------------<
S : admire′ louise′ harry′
(30)
Harry            admires              Louise
------------>T   ------------------   ------------------
S/(S\NP_SG)      (S\NP_SG)/NP         (S\NP)\((S\NP)/NP)
: λf.f harry′    : λx.λy.admire′ x y  : λp.p louise′
                 ---------------------------------------<
                 S\NP_SG : λy.admire′ louise′ y
-------------------------------------------------------->
S : admire′ louise′ harry′
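That derivations (29) and (30) deliver the same logical form can be checked by treating type-raising and composition as ordinary higher-order functions. This is a minimal sketch of the semantic side only; syntactic categories and directionality are not modelled, and the tuple encoding of logical forms is an assumption made for illustration.

```python
def T(a):
    """Type-raising: turn an argument a into the function λp.p a."""
    return lambda p: p(a)

def B(f, g):
    """Composition: λx.f(g x)."""
    return lambda x: f(g(x))

# (25): λx.λy.admire′ x y, with ("admire′", x, y) standing for admire′ x y.
admires = lambda x: lambda y: ("admire′", x, y)

harry = T("harry′")     # λf.f harry′
louise = T("louise′")   # λp.p louise′

# (29): Harry and admires compose (>B) into S/NP, to which Louise then applies.
harry_admires = B(harry, admires)      # λx.admire′ x harry′
lf_29 = louise(harry_admires)

# (30): Louise applies to admires, giving S\NP, to which Harry then applies.
admires_louise = louise(admires)       # λy.admire′ louise′ y
lf_30 = harry(admires_louise)

print(lf_29)
print(lf_29 == lf_30)
```

The intermediate value `harry_admires` corresponds to the non-standard constituent of type S/NP, and `admires_louise` to the traditional verb phrase; both routes reduce to admire′ louise′ harry′.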
In the first of these, Harry and admires compose as indicated by the annotation >B to form a non-standard constituent of type S/NP. In the second, there is a more traditional derivation involving a verb phrase of type S\NP. Both yield identical logical forms, and both are legal surface or derivational constituent structures. More complex sentences may have many semantically equivalent derivations, a fact whose implications for processing are discussed in SP. This theory has been applied to the linguistic analysis of coordination, relativization, and intonational structure in English and many other languages. For example, since substrings like Harry admires are now fully interpreted derivational constituents, they can undergo coordination via the schematised rule (31), allowing a movement- and deletion-free account of right node raising, as in (32):
(31)
Simplified coordination rule (<Φ>)
X    CONJ    X′    ⇒_Φ    X′′
(32)
[Harry admires]   and    [Louise detests]   a saxophonist
-------------->B  -----  --------------->B  -------------
S/NP              CONJ   S/NP               S\(S/NP)
-----------------------------------------<Φ>
S/NP
---------------------------------------------------------<
S
This type-dependent account of extraction, as opposed to the standard account using structure-dependent rules, makes the across-the-board condition on extractions from coordinate structures a prediction or theorem, rather than a stipulation, as consideration of the types involved in the following examples will reveal:
(33)
a. A saxophonist [that_(N\N)/(S/NP) [[Harry admires]_S/NP and [Louise detests]_S/NP]_S/NP]_N\N
b. A saxophonist that_(N\N)/(S/NP) *[[Harry admires]_S/NP and [Louise detests him]_S]
c. A saxophonist that_(N\N)/(S/NP) *[[Harry admires him]_S and [Louise detests]_S/NP]
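The way the across-the-board condition falls out of the types in (33) can be sketched as a simple category check. Categories are represented here as plain strings, and the helper names `coordinate` and `relativize` are hypothetical; a real CCG parser would use structured categories and carry interpretations along as well.

```python
def coordinate(left, right):
    """Simplified rule (31): conjuncts must share a category X, which is also the result."""
    if left != right:
        raise TypeError(f"cannot coordinate {left} with {right}")
    return left

def relativize(body):
    """The relative pronoun 'that' is (N\\N)/(S/NP): it needs S/NP and yields N\\N."""
    if body != "S/NP":
        raise TypeError(f"'that' requires S/NP, got {body}")
    return "N\\N"

# (33a): both conjuncts are S/NP, so the coordination is S/NP and 'that' can apply.
print(relativize(coordinate("S/NP", "S/NP")))

# (33b) and (33c): one conjunct is a complete S, so the conjuncts differ in type.
for left, right in [("S/NP", "S"), ("S", "S/NP")]:
    try:
        relativize(coordinate(left, right))
    except TypeError as err:
        print("blocked:", err)
```

Nothing in the sketch mentions extraction as such. The ill-formed cases are excluded only because the two conjuncts fail to have the same type, which is the sense in which the condition is a theorem rather than a stipulation.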
The availability of fully interpreted nonstandard derivational constituents corresponding to substrings like Harry admires was originally motivated by their participation in constructions like relativization and coordination and the desire to capture those constructions with a grammar obeying a very strict form of the Constituent Condition on Rules (SP, chapter 1). However, a theory that allows alternative derivations like (29) and (30) is clearly immediately able to capture the fact that prosody can make exactly the same non-standard constituents into intonational phrases, as in (34a), as easily as the standard constituents in (34b):
(34)
a. HARRY  admires      LOUISE
   L+H*           LH%  H*      LL%
b. HARRY     admires  LOUISE
   H*    L            L+H*    LH%
The way that CCG derivation is made sensitive to the presence of tones is as follows (adapted from Steedman 1999). The presence of a pitch-accent on a word infects its whole category with themehood or rhemehood, via a pair of feature-values θ/ρ and ±AGREE, the latter here abbreviated as superscript +/-. For example, the transitive verb admires bearing an H* pitch-accent has the following category:9
(35)
admires := (S_ρ+\NP_ρ+)/NP_ρ+ : λx.λy.*admire′ x y
H*
The feature ρ ensures that a verb so marked can only combine with arguments that are compatible with rheme marking (that is, arguments which do not bear the theme marking feature value θ) and marks its result as rheme marked as well. The element in the logical form corresponding to the accented word itself is marked for k-contrast with the asterisk operator. Boundaries, by contrast, are not properties of words or phrases, but independent string elements in their own right. They bear a category which, by mechanisms parallel to those discussed in more detail in SP, “freezes” θ±/ρ±-marked constituents as complete information-/intonation-structural units, making them unable to combine further with anything except similarly complete prosodic units. For example, the hearer-responsibility signaling LH% boundary bears the following category:
(36)
LH% := S$_φ\S$_η± : λf.[H±]η′ f
—where S$ is a variable ranging over S and syntactic function categories into S, η is a variable ranging over syntactic features θ/ρ, η′ ranges over the corresponding semantic translation θ′/ρ′ defined in terms of the alternative semantics discussed in section 2, superscript ± is a variable ranging over ±AGREE, and φ marks the result as a complete phonological phrase.
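The effect of a boundary on an accent-marked constituent can be pictured with a small sketch, here applied to the two information units of (34a). This is an illustration under several assumptions: the `Unit` record, the `boundary` and `evaluate` helpers, and the treatment of LL% as the speaker-commitment counterpart of (36) are all introduced for the example; k-contrast marking (the asterisk) and the checking of presuppositions against the context are left out.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Unit:
    """A complete information unit of the form [C±]η′ f."""
    committed: str   # "S" (speaker) or "H" (hearer)
    agreed: str      # "+" or "-"
    kind: str        # "θ" (theme) or "ρ" (rheme)
    meaning: Any     # the λ-term f, modelled as a Python function

    def marker(self):
        return f"[{self.committed}{self.agreed}]{self.kind}′"

def boundary(committed):
    """A boundary takes an η±-marked meaning and freezes it as a complete unit."""
    return lambda kind, agreed, f: Unit(committed, agreed, kind, f)

LH = boundary("H")   # hearer commitment, as in (36)
LL = boundary("S")   # speaker commitment (assumed parallel to LH%)

admires = lambda x: lambda y: ("admire′", x, y)

# "HARRY admires" with L+H* and LH%: an agreed theme, hearer committed.
theme = LH("θ", "+", lambda x: admires(x)("harry′"))
# "LOUISE" with H* and LL%: an agreed rheme, speaker committed.
rheme = LL("ρ", "+", lambda p: p("louise′"))

def evaluate(rheme_unit, theme_unit):
    """Apply the rheme to the theme and reduce (presupposition checks omitted)."""
    return rheme_unit.meaning(theme_unit.meaning)

print(theme.marker(), rheme.marker())
print(evaluate(rheme, theme))
```

The point of the sketch is only that the boundary contributes the commitment marker and closes off the unit, while the core λ-terms remain available to combine into the ordinary proposition once the markers have been evaluated.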
The derivation of (34a) then appears as follows:
(37)
Harry              admires              LH%             Louise             LL%
L+H*                                                    H*
----------------->T ------------------  -------------   -----------------  -------------
S_θ+/(S_θ+\NP_θ+)  (S\NP)/NP            S$_φ\S$_η±      S_ρ+\(S_ρ+/NP_ρ+)  S$_φ\S$_η±
: λf.f *harry′     : λx.λy.admire′ x y  : λf.[H±]η′ f   : λp.p *louise′    : λg.[S±]η′ g
-------------------------------------->B
S_θ+/NP_θ+ : λx.admire′ x *harry′
-----------------------------------------------------<  --------------------------------<
S_φ/NP_φ : [H+]θ′ (λx.admire′ x *harry′)                S_φ\(S_φ/NP_φ) : [S+]ρ′ (λp.p *louise′)
----------------------------------------------------------------------------------------<
S_φ : [S+]ρ′ (λp.p *louise′) ([H+]θ′ (λx.admire′ x *harry′))
-----------------------------------------------------------------------------------------
S : admire′ louise′ harry′
In the last step of the derivation, the markers of speaker/hearer commitment, agreement/disagreement, and theme/rheme are evaluated with respect to the database, to check that the associated presuppositions hold or can be accommodated. In the latter case this includes support or accommodation for the relevant alternative sets, and will include updates corresponding to the new theme and rheme. If any of these presuppositions fails, then processing will block and incomprehension will result. If it succeeds, then the two core λ-terms can β-reduce to give the canonical proposition as the result of the derivation.
7. EMPIRICAL ISSUES
The present paper has laid a considerable burden of meaning on the distinction between pitch-accent types, and in particular that between H* and L+H*, which according to the present theory are respectively the most frequent rheme accent and theme accent. It might therefore appear to be an embarrassment that there is controversy in the literature over the reality of this distinction. Part of this controversy stems from the fact that trained ToBI annotators show quite low inter-annotator reliability in drawing this particular distinction (John Pitrelli, p.c.). When the characteristics of the actual pitch-accents annotated by them as H* and L+H* are plotted in terms of objective TILT parameters, there is very considerable overlap between the two categories (Taylor 2000). However, this seems to be a problem with the definitions of the relevant pitch contours that are provided in the ToBI annotation conventions (Beckman and Hirschberg 1999). The distinguishing characteristic of the L+H* accent is that the rise to the pitch maximum is late, typically beginning no earlier than onset of the vowel in the accented syllable. H* accents typically begin to rise earlier, in many cases much earlier. The definition of L+H* in the manual as “a high peak target on
the accented syllable which is immediately preceded by relatively sharp rise from a valley in the lowest part of the speaker’s pitch range” does not make this entirely clear. Indeed it is likely that the distinction can only be drawn reliably if syllable boundary alignment is taken into account, and this information is not provided in the ToBI annotation system. It is also important to recall in using ToBI-annotated material that the manual explicitly instructs the annotator to use H* as the “default” accent type, explicitly instancing L+H* accents as examples that when in doubt should be annotated as H*.10 These characteristics of the ToBI annotation scheme mean that, useful though it is for other purposes, extreme caution has to be exercised in drawing strong conclusions concerning the reality of the H*/L+H* distinction from ToBI annotated corpora. In particular, although Taylor concludes that the H*/L+H* distinction as drawn in the annotation to the relevant section of the Boston News Corpus is not phonetically real, it does not follow that the pitch-accent types themselves are not distinct.
It is similarly unsafe to assess the present claim that L+H* is distinctively associated with theme by applying text-based criteria for identifying topics in free text such as those proposed by Gundel (1988). The only definition of a theme that is possible under the present proposal is in terms of contextually established or accommodated alternative sets. While the definitions in Steedman 2000a would allow restricted contexts to be manipulated to control the available alternatives, and allow the predictions concerning tune to be tested, identifying themes in free discourse is not easy, because of the pervading involvement of accommodation and inference in human discourse.
For example, as Hedberg notes in her paper in the present volume, some of the L+H* accents which she finds not to be associated with topics in Gundel’s sense would be classified as isolated themes in the terms of the present theory (see Hedberg and Sosa 2001, note 3; Hedberg and Sosa 2002).11
8. CONCLUSION
The system proposed here reduces the literal meaning of the tones to just three semantically grounded binary oppositions. Crucially, it grammaticalizes a distinction between the beliefs that the speaker claims by their utterance that the hearer is committed to, and those that the hearer actually is committed to. It is only the latter set that includes Mutual Beliefs. It is therefore consistent for the speaker to claim and/or implicate that both they and the hearer are committed to a proposition, but that it is not mutually believed. This is a move in the present theory that is forced by examples like (21) and the minimal pairs in (13)-(20). The theory places a correspondingly greater emphasis on the role of speaker-presupposition (and its dual, hearer-accommodation), and on inference and implicature. To that extent, the present theory follows the tradition of Halliday and Brown, in claiming that it is the speaker who, within the constraints imposed by the context and the participants’ beliefs and intentions, determines what is theme and rheme, and what contrasts they embody, and not the text.
University of Edinburgh
9. NOTES
* Thanks to Betina Braun, Daniel Büring, Klaus von Heusinger, Stephen Isard, Alex Lascarides, and Bonnie Webber for comments on the draft. An earlier version of some parts of the paper appears as Steedman (2002). The work was supported in part by EPSRC grants GR/M96889 and GR/R02450, and EU FET grant MAGICSTER and EU IST grant PACO-PLUS.
1 The term “pitch-accent” is here restricted to what Ladd (1996) calls “primary” pitch-accents, sometimes called “nuclear” pitch accents (although there may be more than one in a sentence). Ladd follows Bolinger and many others in distinguishing primary accents from certain other accents that arise from the interaction of lexical stress with the metrical grid. While there is still no objective measure to distinguish the two varieties, it is the primary accents that are perceived as emphatic or contrastive.
2 The notation for tunes is Pierrehumbert’s, see Pierrehumbert and Hirschberg 1990 for details including characteristic pitch-contours.
3 In Steedman 2000a and earlier work I called this property “focus”, following the “narrow” sense of Selkirk (1984). However this term invites confusion with the “broad” sense intended by Hajičová and Sgall (1988) and Vallduví (1990), which is closer to the term “rheme” as used in the present system, and in Steedman 2000a and Vallduví and Vilkuna 1998.
4 Hobbs (1990), who proposes a very different revision of Pierrehumbert and Hirschberg (1990) to the present one, also gives a central role to Mutual Belief.
5 In Steedman 2000a, I called this dimension “ownership”.
6 The story comes from Dave Brubeck. Miles was of course absolutely right. The tones shown in the example remain conjectural, however, given his complete lack of any trackable F0.
7 Under the proposal in Steedman 2000a, they could also be analyzed as an unmarked theme “I’m” and a rheme “a millionaire”. In this particular context it makes very little difference, and we’ll ignore these readings.
8 The present analysis differs from that of Bach and colleagues in making Wrap a lexical combinatory operation, rather than a syntactic combinatory rule. One advantage of this analysis, which is discussed further in Steedman 1996, is that phenomena depending on Wrap, such as anaphor binding and control, are immediately predicted to be bounded phenomena.
9 Number agreement is suppressed in the interests of reducing formal clutter.
10 “Implicit in our discussion of the five pitch-accents is the notion that H* is the ‘default’ accent type. So, if there is any uncertainty about how low the F0 is before the peak, as in some cases of possible L+H* near the beginning of an utterance, the transcriber should mark ‘H*’ rather than ‘L+H*’.” (Beckman and Hirschberg 1999).
11 Similarly, the fact that non-native speakers often obliterate pitch-accent type distinctions, and yet manage to be understood, should no more lead one to conclude that the distinctions are not real than does the possibility of written communication.
10. REFERENCES Bach, Emmon. “Control in Montague Grammar.” Linguistic Inquiry 10 (1979): 513–531. Bach, Emmon. “In Defense of Passive.” Linguistics and Philosophy 3 (1980): 297–341. Beckman, Mary, and Julia Hirschberg. “The ToBI Annotation Conventions.” Manuscript, URL http://ling.ohio-state.edu/ tobi/ame tobi/annotation conventions.html. Ohio State University, 1999. Bolinger, Dwight. “A Theory of Pitch Accent in English.” Word 14 (1958): 109–149. Reprinted in Bolinger (1965), pp. 17-56. Bolinger, Dwight. “Contrastive Accent and Contrastive Stress.” Language 37 (1961): 83–96. Reprinted in Bolinger (1965), pp. 101-117. Bolinger, Dwight. Forms of English. Cambridge, Mass.: Harvard University Press, 1965.
Brown, Gillian. “Prosodic Structure and the Given/New Distinction.” In Anne Cutler, D. Robert Ladd, and Gillian Brown (eds.), Prosody: Models and Measurements, pp. 67–77. Berlin: Springer-Verlag, 1983.
Brown, Gillian, Karen Currie, and Joanne Kenworthy. Questions of Intonation. London: Croom Helm, 1980.
Büring, Daniel. “The Great Scope Inversion Conspiracy.” Linguistics and Philosophy 20 (1997a): 175–194.
Büring, Daniel. The Meaning of Topic and Focus: The 59th Street Bridge Accent. London: Routledge, 1997b.
Clark, Herbert. Using Language. Cambridge: Cambridge University Press, 1996.
Clark, Herbert, and Catherine Marshall. “Definite Reference and Mutual Knowledge.” In Aravind Joshi, Bonnie Webber, and Ivan Sag (eds.), Elements of Discourse Understanding, pp. 10–63. Cambridge: Cambridge University Press, 1981.
Cohen, Philip. On Knowing What to Say: Planning Speech Acts. University of Toronto: Doctoral dissertation, 1978.
Cohen, Philip, and Hector Levesque. “Rational Interaction as the Basis for Communication.” In Philip Cohen, Jerry Morgan, and Martha Pollack (eds.), Intentions in Communication, pp. 221–255. Cambridge, Mass.: MIT Press, 1990.
Cresswell, M.J. Logics and Languages. London: Methuen, 1973.
Cresswell, M.J. Structured Meanings. Cambridge, Mass.: MIT Press, 1985.
Dowty, David. “Grammatical Relations and Montague Grammar.” In Pauline Jacobson and Geoffrey K. Pullum (eds.), The Nature of Syntactic Representation, pp. 79–130. Dordrecht: Reidel, 1982.
Grice, Herbert. “Logic and Conversation.” In Peter Cole and Jerry Morgan (eds.), Speech Acts, vol. 3 of Syntax and Semantics, pp. 41–58. New York: Seminar Press, 1975 [Written in 1967].
Gundel, Janet. The Role of Topic and Comment in Linguistic Theory. University of Texas, Austin: Doctoral dissertation, 1974.
Gundel, Janet. “Universals of Topic-Comment Structure.” In Michael Hammond, Edith Moravcsik, and Jessica Wirth (eds.), Syntactic Universals and Typology, pp. 209–242. Amsterdam: John Benjamins, 1988.
Gundel, Janet, and Torsten Fretheim. “Topic and Focus.” In Laurence Horn and Gregory Ward (eds.), Handbook of Pragmatic Theory. Oxford: Blackwell, 2001.
Gunlogson, Christine. True to Form: Rising and Falling Declaratives in English. University of California at Santa Cruz: Doctoral dissertation, 2001.
Gunlogson, Christine. “Declarative Questions.” In Brendan Jackson (ed.), Proceedings of Semantics and Linguistic Theory XII, pp. 144–163. Ithaca, NY: Cornell University, 2002.
Gussenhoven, Carlos. On the Grammar and Semantics of Sentence Accent. Dordrecht: Foris, 1983.
Hajičová, Eva, and Petr Sgall. “Topic and Focus of a Sentence and the Patterning of a Text.” In János Petőfi (ed.), Text and Discourse Constitution, pp. 70–96. Berlin: de Gruyter, 1988.
Halliday, Michael. “The Tones of English.” Archivum Linguisticum 15 (1963): 1.
Halliday, Michael. Intonation and Grammar in British English. The Hague: Mouton, 1967a.
Halliday, Michael. “Notes on Transitivity and Theme in English, Part II.” Journal of Linguistics 3 (1967b): 199–244.
Hedberg, Nancy, and Juan Sosa. “The Prosodic Structure of Topic and Focus in Spontaneous English Dialogue.” This volume.
Hedberg, Nancy, and Juan Sosa. “The Prosody of Questions in Natural Discourse.” In Proceedings of Speech Prosody, Aix en Provence, April. To appear.
Hirschberg, Julia, and Janet Pierrehumbert. “Intonational Structuring of Discourse.” In Proceedings of the 24th Annual Meeting of the Association for Computational Linguistics, New York, pp. 136–144. San Francisco, CA: Morgan Kaufmann, 1986.
Hobbs, Jerry. “The Pierrehumbert-Hirschberg Theory of Intonational Meaning Made Simple: Comments on Pierrehumbert and Hirschberg.” In Philip Cohen, Jerry Morgan, and Martha Pollack (eds.), Intentions in Communication, pp. 313–323. Cambridge, Mass.: MIT Press, 1990.
Jacobson, Pauline. “Flexible Categorial Grammars: Questions and Prospects.” In Robert Levine (ed.), Formal Grammar, pp. 129–167. Oxford: Oxford University Press, 1992.
Karttunen, Lauri, and Stanley Peters. “Conventional Implicature.” In Choon-Kyu Oh and David Dinneen (eds.), Syntax and Semantics 11: Presupposition, pp. 1–56. New York: Academic Press, 1979.
Karttunen, Lauri. “Discourse Referents.” In J. McCawley (ed.), Syntax and Semantics, vol. 7, pp. 363–385. New York: Academic Press, 1976.
Ladd, D. Robert. Intonational Phonology. Cambridge: Cambridge University Press, 1996.
Lambrecht, Knud, and Laura Michaelis. “Sentence Accent in Information Questions: Default and Projection.” Linguistics and Philosophy (1998): 477–544.
Lewis, David. Convention: a Philosophical Study. Cambridge, Mass.: Harvard University Press, 1969.
Lewis, David. “Scorekeeping in a Language Game.” Journal of Philosophical Logic 8 (1979): 339–359.
Montague, Richard. Formal Philosophy: Papers of Richard Montague. Richmond H. Thomason (ed.). New Haven, CT: Yale University Press, 1974.
Pierrehumbert, Janet. The Phonology and Phonetics of English Intonation. MIT: Doctoral dissertation, 1980.
Pierrehumbert, Janet, and Mary Beckman. Japanese Tone Structure. Cambridge, Mass.: MIT Press, 1988.
Pierrehumbert, Janet, and Julia Hirschberg. “The Meaning of Intonational Contours in the Interpretation of Discourse.” In Philip Cohen, Jerry Morgan, and Martha Pollack (eds.), Intentions in Communication, pp. 271–312. Cambridge, Mass.: MIT Press, 1990.
Prevost, Scott. A Semantics of Contrast and Information Structure for Specifying Intonation in Spoken Language Generation. University of Pennsylvania: Doctoral dissertation, 1995.
Prevost, Scott, and Mark Steedman. “Specifying Intonation from Context for Speech Synthesis.” Speech Communication 15 (1994): 139–153.
Rooth, Mats. Association with Focus. University of Massachusetts, Amherst: Doctoral dissertation, 1985.
Rooth, Mats. “A Theory of Focus Interpretation.” Natural Language Semantics 1 (1992): 75–116.
Searle, John. “Indirect Speech Acts.” In Peter Cole and Jerry Morgan (eds.), Speech Acts, vol. 3 of Syntax and Semantics, pp. 59–82. New York: Seminar Press, 1975.
Selkirk, Elisabeth. Phonology and Syntax. Cambridge, Mass.: MIT Press, 1984.
Silverman, Kim, Mary Beckman, John Pitrelli, Marie Ostendorf, Colin Wightman, Patti Price, Janet Pierrehumbert, and Julia Hirschberg. “ToBI: A Standard for Labeling English Prosody.” In Proceedings of the International Conference on Spoken Language Processing, Banff, Alberta, pp. 867–870. Edmonton: University of Alberta, 1992.
Steedman, Mark. “Structure and Intonation.” Language 67 (1991): 262–296.
Steedman, Mark. Surface Structure and Interpretation. Cambridge, Mass.: MIT Press, 1996.
Steedman, Mark. “Connectionist Sentence Processing in Perspective.” Cognitive Science 23 (1999): 615–634.
Steedman, Mark. “Information Structure and the Syntax-Phonology Interface.” Linguistic Inquiry 31 (2000a): 649–689.
Steedman, Mark. The Syntactic Process. Cambridge, Mass.: MIT Press, 2000b.
Steedman, Mark. “Towards a Compositional Semantics for English Intonation.” Manuscript, URL http://www.cogsci.ed.ac.uk/~steedman/papers.html. University of Edinburgh, 2002.
Steedman, Mark, and Philip Johnson-Laird. “Utterances, Sentences, and Speech-Acts: Have Computers Anything to say?” In Brian Butterworth (ed.), Language Production 1: Speech and Talk, pp. 111–141. London: Academic Press, 1980.
Steedman, Mark, and Ivana Kruijff-Korbayová. “Two Dimensions of Information Structure in Relation to Discourse Semantics and Discourse Structure.” Journal of Logic, Language, and Information, Introduction to the Special Issue on Information Structure, Discourse Semantics, and Discourse Structure, to appear.
Stone, Matthew. Modality in Dialogue: Planning Pragmatics and Computation. University of Pennsylvania: Doctoral dissertation, 1998.
Taylor, Paul. “Analysis and Synthesis of Intonation Using the Tilt Model.” Journal of the Acoustical Society of America 107 (2000): 1697–1714.
Thomason, Richmond. “Accommodation, Meaning, and Implicature.” In Philip Cohen, Jerry Morgan, and Martha Pollack (eds.), Intentions in Communication, pp. 325–363. Cambridge, Mass.: MIT Press, 1990.
Vallduví, Enric. The Information Component. University of Pennsylvania: Doctoral dissertation, 1990.
Vallduví, Enric, and Maria Vilkuna. “On Rheme and Kontrast.” In Peter Culicover and Louise McNally (eds.), Syntax and Semantics, Vol. 29: The Limits of Syntax, pp. 79–108. San Diego, CA: Academic Press, 1998.
Von Stechow, Arnim. “Topic, Focus and Local Relevance.” In Wolfgang Klein and Willem Levelt (eds.), Crossing the Boundaries in Linguistics, pp. 95–130. Dordrecht: Reidel, 1981.
Ward, Gregory, and Julia Hirschberg. “Implicating Uncertainty: the Pragmatics of Fall-Rise Intonation.” Language 61 (1985): 747–776.
KLAUS VON HEUSINGER
DISCOURSE STRUCTURE AND INTONATIONAL PHRASING*
1. INTRODUCTION
Theories that relate discourse structure and intonational structure often concentrate on the discourse functions of pitch accents and boundary tones. Intonational phrasing, however, is less prominently investigated. This paper focuses on intonational phrasing and its contribution to the construction of a discourse representation. I argue that intonational phrasing determines minimal discourse units which serve as the building blocks in a discourse representation. Even though minimal discourse units often correspond to syntactic constituents, sometimes they cross constituent boundaries. The problem can be illustrated by the very first sentence from the novel Das Parfum by Patrick Süskind, in (1).
(1) H* !H* H* L% H*
    [Im achtzehnten Jahrhundert | lebte in Frankreich] [ein Mann, |
    ‘In the eighteenth century lived in France a man
    H* H* !H*
    der zu den genialsten | und abscheulichsten Gestalten dieser an
    who was one of the most gifted and abominable personages
    (H*) (H*) !H* H* !H* L%
    genialen und abscheulichen Gestalten nicht armen Epoche gehörte.]
    in an era that knew no lack of gifted and abominable personages.’
We analyzed a read version of the novel with respect to intonational clues. The novel was professionally read by the artist Gert Westphal in 1995. The text was analyzed and intonationally segmented by Braunschweiler et al. (1988ff) in a project on spoken text in Konstanz. Parts of the text were then labeled for the following intonational properties: pitch accents (H*, L* or bitonal versions of it), boundary tones (H%, L%), and intonational phrasing (intonational phrases “[...]”, and intermediate phrases: “|...|”). We checked part of the labeling with Jennifer Fitzpatrick.1 (1) is phrased into two intonational phrases, and both further into intermediate phrases. The length of the different phrases differs quite remarkably. For example, the second intonational phrase consists of the three intermediate phrases | ein Mann | der zu den genialsten | und abscheulichsten Gestalten dieser an genialen und
265 C. Lee et al. Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation, 265–290. © 2007 Springer.
abscheulichen Gestalten nicht armen Epoche gehörte |. At first glance, it is not straightforward to assign well-formed syntactic constituents to these intonational units, e.g. | der zu den genialsten |. Intonational phrasing depends on different parameters, including Selkirk’s (1984) “sense unit”. For Selkirk, an intonational phrase must be a sense unit. However, she does not give a definition of sense unit. The paper presents a new approach that defines sense units in terms of discourse structure. A sense unit corresponds to a discourse unit that establishes a certain discourse relation to the already established discourse universe. The paper is organized as follows: In section 2, I discuss different elements of discourse representation in terms of Discourse Representation Theory (DRT) and extend the formalism to segmented DRT, which is an attempt to integrate discourse relations into DRT. In section 3, I discuss the different elements of the intonational structure and their function with respect to the discourse structure. While pitch accents and boundary tones have received various functions, the discourse function of intonational phrasing has rarely been investigated. In section 4, I discuss the different parameters that determine the intonational phrasing. Besides metrical, phonological and syntactic parameters, semantics plays an important role. This function has been termed differently: Halliday (1967) introduced the term informational unit, while Selkirk (1984) uses sense unit. However, there is no semantic account of these terms. I argue that the semantics of intonational phrasing can be best accounted for in terms of discourse units. Discourse units are defined by their function to serve as arguments in discourse relations. In section 5, I describe different discourse relations, in particular I introduce new discourse relations that are relations between subclausal units. 
While discourse relations are usually defined between propositions, I show that there are also discourse relations between smaller units. Section 6 gives a short summary. Throughout this paper, I try to illustrate the arguments with examples from the novel Das Parfum. Die Geschichte eines Mörders (‘Perfume: The Story of a Murderer’) by Patrick Süskind.2 Examples from the novel are quoted by chapter and sentence, e.g. 13-022. The intonational phrasing always relates to the German text, even though the English translation is often used for the discourse representation. The translation itself is from the English version of the novel.
2. DISCOURSE STRUCTURE
Discourse structure is a cover term for different properties of a coherent text or discourse. In the following I focus on (i) reference and anaphora, (ii) information structure (topic-comment, or focus-background), and (iii) discourse relations between different discourse units. There are different families of theories treating discourse structure, each of which focuses on a different aspect. Discourse Representation Theory (Kamp 1981, Kamp & Reyle 1993) concentrates on representing the conditions for anaphoric reference. The discourse is incrementally (re)constructed. There is in principle no difference between parts of sentences and whole sentences since the construction algorithm does not recognize a special category of sentences (even though such a category is determined by the syntactic categories of the input). A second family of approaches (Klein & von Stutterheim 1987, Hobbs 1990, van Kuppevelt 1995, Roberts 1996, Büring 1997, 2003) understands a discourse structure as
representing the relations between propositions. Here the structure is represented as a tree of propositions. Such theories focus on the relation between sentences (or clauses), rather than on the relation between parts of sentences (or clauses). Neither view, except for Roberts (1996) and Büring (1997), integrates aspects of information structure (topic-comment, or focus-background) in the analyses. These concepts are often used in the description of an additional level of sentence structure. Only the Prague School (Sgall & Hajičová & Benešová 1973) integrates information structure into the analysis of texts and discourses (see von Heusinger 2004 for a discussion of different approaches to information structure).
2.1 Reference and anaphora in discourse
The initial problem that motivated discourse representation theories is the interpretation of nominal and temporal anaphora in discourse. The phenomenon of cross-sentential anaphora forces semantics to extend its limits from the sentence to the discourse. The key idea in the approach to semantics of discourse, exemplified in Heim (1982) and Kamp (1981), is that each new sentence or phrase is interpreted as an addition or ‘update’ of the context in which it is used. This update often involves connections between elements from the sentence or phrase and elements from the context. Anaphoric relations and definite expressions are captured by links between objects in this representation. In order to derive the truth condition of the sentence, the representation is embedded into a model. The best way to get acquainted with DRSs is to look at the example (2).
(2) Im achtzehnten Jahrhundert lebte in Frankreich ein Mann.
    ‘In eighteenth-century France there lived a man.’
(2a) t, u, x
     ------------
     18th cent(t)
     France(u)
     Man(x)
     live(x,u,t)
(2b) {t,u,x | 18th cent(t) & France(u) & Man(x) & live(x,u,t)}
The box in (2a) graphically describes a discourse representation structure (DRS) with two parts. One part is called the universe of the DRS, the other its condition set. A DRS is an ordered pair consisting of its universe and condition set, which can also be represented in set notation as in (2b); this set describes all possible instances for the discourse referents such that the conditions hold of them. The DRS in (2a) or (2b) has three discourse referents t, u, x in its universe and the conditions that the discourse referent t is a time point in the 18th century, the discourse referent u a location in France, the discourse referent x a man, and that the predicate live holds of x at the location u and at the time t. To obtain the truth condition, we have to map the DRS onto a model M by an embedding function f that maps the discourse referents onto elements of the domain of M such that the elements are in the
extension of the predicates that are ascribed to the discourse referents. For example, the DRS (2a) or (2b) is true just in case f(t) is in the 18th century, f(u) is in France, f(x) is a man and f(x) lives in f(u) at f(t). The sequence or conjunction of two sentences as in (3) receives a DRS incrementally. We start with the already established DRS for the first conjunct in (2a), repeated as (3a), and build the new DRS (3b) by inserting the new discourse referents for the pronoun er and the NP Jean-Baptiste Grenouille, and a condition for the predicate hieß. The anaphoric link of the pronoun is graphically represented as y = ?, indicating that the reference of the pronoun is still unresolved. The discourse referent which stands for an anaphoric expression must be identified with another accessible discourse referent in the universe. In the given context, y is identified with x, as in (3c). This mini-discourse is true if there is an embedding function f onto a model such that f(t) is in the 18th century, f(u) is in France, f(x) is a man, f(x) lives in f(u) at f(t), f(y) = f(x), f(z) is Jean-Baptiste Grenouille, and f(y) was named f(z).
(3) Im achtzehnten Jahrhundert lebte in Frankreich ein Mann. Er hieß Jean-Baptiste Grenouille.
    ‘In eighteenth-century France there lived a man. His name was Jean-Baptiste Grenouille.’
(3a) t, u, x
     ------------
     18th cent(t)
     France(u)
     Man(x)
     live(x,u,t)
(3b) t, u, x, y, z
     ------------
     18th cent(t)
     France(u)
     Man(x)
     live(x,u,t)
     y = ?
     z = J.B. Grenouille
     name(y,z)
(3c) t, u, x, y, z
     ------------
     18th cent(t)
     France(u)
     Man(x)
     live(x,u,t)
     y = x
     z = J.B. Grenouille
     name(y,z)
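The notions of DRS, embedding function, and incremental update can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class `DRS`, the functions `holds`, `is_true` and `update`, and the toy model with its entity names are hypothetical and not part of DRT's formal apparatus. The condition z = J.B. Grenouille is modelled here as a one-place predicate, which is a simplification.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class DRS:
    """A DRS as an ordered pair: a universe and a condition set."""
    universe: list = field(default_factory=list)    # discourse referents
    conditions: list = field(default_factory=list)  # (predicate, referents) pairs


def holds(drs, f, model):
    """True if the embedding f (referent -> entity) verifies every condition."""
    for pred, args in drs.conditions:
        if pred == "=":  # identity between two referents, e.g. y = x
            if f[args[0]] != f[args[1]]:
                return False
        elif tuple(f[a] for a in args) not in model[pred]:
            return False
    return True


def is_true(drs, domain, model):
    """A DRS is true if at least one embedding of its universe verifies it."""
    return any(
        holds(drs, dict(zip(drs.universe, values)), model)
        for values in product(domain, repeat=len(drs.universe))
    )


def update(drs, referents, conditions):
    """Incremental construction: add new referents and conditions."""
    return DRS(drs.universe + referents, drs.conditions + conditions)


# Toy model (hypothetical entity names): predicate -> set of tuples.
domain = ["c18", "france", "grenouille", "jbg"]
model = {
    "18th cent": {("c18",)},
    "France": {("france",)},
    "Man": {("grenouille",)},
    "live": {("grenouille", "france", "c18")},
    "J.B. Grenouille": {("jbg",)},
    "name": {("grenouille", "jbg")},
}

# (2a): the DRS for the first sentence.
drs_2a = DRS(
    ["t", "u", "x"],
    [("18th cent", ("t",)), ("France", ("u",)), ("Man", ("x",)),
     ("live", ("x", "u", "t"))],
)

# (3c): update with the second sentence, the pronoun already resolved (y = x).
drs_3c = update(
    drs_2a,
    ["y", "z"],
    [("=", ("y", "x")), ("J.B. Grenouille", ("z",)), ("name", ("y", "z"))],
)

print(is_true(drs_2a, domain, model))
print(is_true(drs_3c, domain, model))
```

The search over all embeddings is exhaustive and only suitable for toy models; it is meant to mirror the definition of truth given above, not to suggest an efficient procedure.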
The new discourse referent introduced by the pronoun must be linked with an already established and accessible discourse referent. DRT defines accessibility in terms of structural relations, i.e. the discourse referent must be in the same (or in a higher) DRS. With this concept of accessibility, the contrast between (4) and (5) can be described by the difference in the set of discourse referents that are accessible for the discourse referent v of the pronoun er in (4) and (5). The construction rule for the negation in (4) creates an embedded discourse universe with the discourse referent u and the conditions scent(u) and x gave u to the world. The anaphoric pronoun er in the third (hypothetical) sentence cannot find a suitable discourse referent since it has no access to the embedded discourse universe with the only fitting discourse referent u. In (5a), however, the pronoun er in the second sentence is represented by the discourse referent v and the condition v = ?. This referent can be linked to the accessible discourse referent x, licensing the anaphoric link.
(4) So ein Zeck war das Kind Grenouille. An die Welt gab es nichts ab (...) nicht einmal einen Duft₁. (04-061) #Er₁ war stark.
    ‘The young Grenouille was such a tick. He gave the world nothing (...) not even his own scent. #It was strong.’
(4a) x, y, z, v
     ------------
     Tick(x)
     young Gr(y)
     x is y
     z = x
     not: u
          ------------
          scent(u)
          z gave u to the world
     v = ?
     strong(v)
(5) Ein anderes Parfum aus seinem Arsenal war ein mitleiderregender Duft₁, der sich bei Frauen mittleren und höheren Alters bewährte. Er₁ roch nach dünner Milch und sauberem weichem Holz. (38-015)
    ‘Another perfume in his arsenal was a scent for arousing sympathy that proved effective with middle-aged and elderly women. It smelled of watery milk and fresh soft wood.’
(5a) x, y, v
     ------------
     scent for arousing sympathy that proved effective with middle-aged and elderly women(x)
     Another perfume in his arsenal(y)
     x is y
     v = x
     v smelled of watery milk and fresh soft wood
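The contrast between (4) and (5) follows from the structural definition of accessibility and can be sketched in a few lines. The sketch assumes only the rule stated above (a referent is accessible if it is in the same or in a higher DRS); the class `Box` and the function `can_resolve` are hypothetical names chosen for illustration, and further accessibility rules of DRT are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    """A DRS universe with a link to the DRS that contains it (if any)."""
    universe: List[str]
    parent: Optional["Box"] = None

    def accessible(self) -> List[str]:
        """Referents of this box and of every box that contains it."""
        refs: List[str] = []
        box: Optional[Box] = self
        while box is not None:
            refs = box.universe + refs
            box = box.parent
        return refs


def can_resolve(pronoun_box: Box, antecedent: str) -> bool:
    """A pronoun can be linked only to an accessible referent."""
    return antecedent in pronoun_box.accessible()


# (4a): the scent referent u lives in the box embedded under negation.
main_4 = Box(["x", "y", "z", "v"])
negated_4 = Box(["u"], parent=main_4)

# (5a): the scent referent x lives in the main box.
main_5 = Box(["x", "y", "v"])

print(can_resolve(main_4, "u"))     # pronoun v in (4): u is out of reach
print(can_resolve(main_5, "x"))     # pronoun v in (5): x is accessible
print(can_resolve(negated_4, "x"))  # from inside the negation, x is visible
```

The asymmetry is the point of the design: lookup walks upward from the pronoun's box, so a referent introduced inside the negated box is never found from the main box.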
2.2 Information structure and discourse structure
Information structure is generally understood as an additional linguistic level to describe sentence structure. Information structure often does not map onto syntactic structure, and this was the main reason for introducing this level of description in the
19th century. It subsequently received different terms, such as theme-rheme, topic-comment, focus-background (see Sgall et al. 1973 for an overview). The theoretical basis for this additional structure varies according to the background theory of the researcher. But in most approaches information structure is defined by the contribution of the informational units to the sentence meaning. This is illustrated by the next two examples. In (6) the time of the reported event is fronted; since the time was already introduced, one can also say that this phrase is discourse-linked or backgrounded. In (7), however, the exclamation gut ‘good’ is fronted for focusing, while the given reference of the pronoun is backgrounded.
(6) Zu der Zeit, von der wir reden, herrschte in den Städten ein für uns moderne Menschen kaum vorstellbarer Gestank.
    ‘In the period of which we speak, there reigned in the cities a stench barely conceivable to us modern men and women.’
(7) Gut schaut er aus.
    ‘He looks good.’
In general, theories assume that one unit is linked to the established discourse, while the other is said to express the new information in the sentence. Because of space limitations, I cannot present a full survey of the different approaches and a general criticism (see von Heusinger 2004). I only want to stress the point that information structure is often understood as a sentence structure and not as part of a discourse structure. Therefore, it is not included in discourse representation theories.
2.3 Sentence and discourse relation
A discourse consists of sentences that are related to each other by discourse relations such as causation, explanation, coherence, elaboration, and continuation. This can be illustrated in the following two discourse segments. In (8) the question is followed by a continuation, which in itself consists of a causation and a conjunction. This is best represented in an annotated tree, as in (8a). Similarly, the sentence (9) can be split into its clauses, which can then be represented in a tree, as in (9a).
(8) “Was ist das?” sagte Terrier und beugte sich über den Korb und schnupperte daran, denn er vermutete Eßbares. (02-002)
    ‘“What’s that?” asked Terrier, bending down over the basket and sniffing at it, in the hope that it was something edible.’
(8a) Continuation
       What’s that?
       asked Terrier
       Causation
         Conjunction
           bending down over the basket
           sniffing at it
         in the hope that it was something edible.
(9) Technische Einzelheiten waren ihm sehr zuwider, denn Einzelheiten bedeuteten immer Schwierigkeiten, und Schwierigkeiten bedeuteten eine Störung seiner Gemütsruhe, und das konnte er gar nicht vertragen. (02-015)
    ‘He despised technical details, because details meant difficulties, and difficulties meant ruffling his composure, and he simply would not put up with that.’
(9a) Causation
       He despised technical details,
       Elaboration
         because details meant difficulties
         Elaboration
           and difficulties meant ruffling his composure
           and he simply would not put up with that.
Recent approaches to discourse structure (Hobbs 1990, van Kuppevelt 1995, Roberts 1996, Büring 1997, 2003) use annotated trees that relate propositions to each other. However, such approaches neither relate the internal structure of the propositions to the discourse nor assume discourse units smaller than propositions.
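An annotated tree of this kind can be sketched as a recursive data structure whose inner nodes carry a discourse relation and whose leaves are clauses. The sketch is illustrative only: the names `Node`, `leaves` and `relations` are hypothetical, and the nesting used for (9a) is one reading of the tree (Causation over two Elaborations), stated here as an assumption.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """An inner node labelled with a discourse relation."""
    relation: str
    children: List[Union["Node", str]]  # subtrees or clause strings


def leaves(tree: Union[Node, str]) -> List[str]:
    """The clauses of the discourse in their linear order."""
    if isinstance(tree, str):
        return [tree]
    result: List[str] = []
    for child in tree.children:
        result.extend(leaves(child))
    return result


def relations(tree: Union[Node, str]) -> List[str]:
    """The relation labels, top-down and left to right."""
    if isinstance(tree, str):
        return []
    result = [tree.relation]
    for child in tree.children:
        result.extend(relations(child))
    return result


# Assumed nesting for (9a).
tree_9a = Node("Causation", [
    "He despised technical details,",
    Node("Elaboration", [
        "because details meant difficulties",
        Node("Elaboration", [
            "and difficulties meant ruffling his composure",
            "and he simply would not put up with that.",
        ]),
    ]),
])

print(leaves(tree_9a))
print(relations(tree_9a))
```

Note that every leaf in this encoding is a whole clause; the structure offers no place for a unit below the clause level, which is exactly the limitation pointed out above.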
Only Asher (1993, 2004) combines insights from DRT and discourse relation in his theory of segmented DRT (= SDRT), which is not confined to the incremental composition of DRSs, but also captures discourse relations between the sentences in the discourse. He revises the classical DRT of Kamp (1981) and Kamp & Reyle (1993). The classical version describes the dynamic meaning of words or phrases with respect to a discourse structure. There is, however, no means to compare the dynamic potential of a full sentence with the discourse so far established. Asher (1993, 256) notes that

    the notion of semantic updating in the original DRT fragment of Kamp (1981) (...) is extremely simple, except for the procedures for resolving pronouns and temporal elements, which the original theory did not spell out. To build a DRS for the discourse as a whole and thus to determine its truth conditions, one simply adds the DRS constructed for each constituent sentence to what one already had. (...) This procedure is hopelessly inadequate, if one wants to build a theory of discourse structure and discourse segmentation.
In SDRT, each sentence Si is first represented as a particular segmented DRS for that sentence. The segmented DRS can then interact with the already established DRS reconstructing a discourse relation R, such as Causation, Continuation, Conjunction, Elaboration, etc. as informally sketched in (8b) and (8c) for the tree structure (8a). First the clause receives its DRS, which can then be related to the already established DRS, and then the representation can be integrated into the already established representation. In (8b), the already established DRS contains among other elements the discourse referents for the basket and for Terrier. The first two sentences from the tree (8a) are translated into DRSs which establish the discourse relation of Continuation, while the rest remains in the tree. In (8c) these two DRSs are integrated into the main DRS and the other three clauses are translated into segmented DRSs which again establish certain discourse relations with the main DRS. In sum, (8b) shows the main DRS with the discourse information, the two segmented DRSs related by Continuation (Cont), and the remaining tree structure; (8c) shows the DRSs for that remaining structure:
(8) “Was ist das?” sagte Terrier und beugte sich über den Korb und schnupperte daran, denn er vermutete Eßbares. (02-002)
(8b) Main DRS:
       x, y, z, ...
       ------------
       basket(x)
       Terrier(y)
       .....
     Segmented DRS (Cont):
       u, p
       ------------
       What(u) = p
       u = ?
     Segmented DRS (Cont):
       v
       ------------
       v = ?
       Terrier(v)
       asked(v,p)
     Remaining tree:
       Causation
         Conjunction
           bending down over the basket
           sniffing at it
         in the hope that it was something edible.
(8c) Main DRS:
       x, y, z, u, p, v
       ------------
       basket(x)
       Terrier(y)
       .....
       What(u) = p
       u = x
       v = y
       Terrier(v)
       asked(v,p)
     Segmented DRS (Cont):
       w
       ------------
       y bending down over the basket(w)
       w = ?
     Segmented DRS (Conj):
       k
       ------------
       y sniffing at k
       k = ?
     Segmented DRS (Caus):
       l
       ------------
       in the hope that l was something edible
       l = ?
To summarize this very short presentation of DRT: DRT not only provides a new structure but also introduces new semantic objects: discourse referents, conditions, and discourse domains (“boxes”). DRT explains semantic categories such as definiteness and anaphora in terms of interaction between these representations. Furthermore, the extension to SDRT allows us to express discourse relations between whole propositions, as well. These new tools, objects, and representations form the basis for a new semantic analysis of information structure. In the next section, this approach is sketched briefly.
3. INTONATIONAL STRUCTURE
Intonation contours are represented by phonologists as a sequence of abstract tones consisting of pitch accents and two types of boundary tones. Pierrehumbert & Hirschberg (1990, 308) assign discourse functions to the particular tones: “Pitch accents convey information about the status of discourse referents (...). Phrase accents [= boundary tones of intermediate phrases] convey information about the relatedness of intermediate phrases (...). Boundary tones convey information about
the directionality of interpretation for the current intonational phrase (...).” The status of discourse referents can be accounted for in terms of given vs. new; the boundary tones of intonational phrases indicate how the proposition expressed by the whole phrase is integrated into the discourse. Similarly, boundary tones of intermediate (or phonological) phrases that correspond to a full proposition indicate the way these propositions are interpreted with respect to the linguistic context, as illustrated in (10) and (11). While in (10), the L-boundary tone indicates that the two clauses have no relation to each other, the H-boundary tone in (11) indicates that the first clause is related to the second, suggesting a discourse relation of causation.
(10) L L L%
     [(George ate chicken soup) | (and got sick)]
(11) H L L%
     [(George ate chicken soup) | (and got sick)]
However, in this view there is no way of treating phrases that correspond to units below the clause level, such as the modification im achtzehnten Jahrhundert (‘in the eighteenth century’), the unsaturated phrase lebte in Frankreich (‘lived in France’) or the first part of the complex noun der zu den genialsten (‘one of the most gifted’) in example (1), repeated as (12).
(12) [Im achtzehnten Jahrhundert | lebte in Frankreich] [ein Mann, | der zu den genialsten | und abscheulichsten Gestalten dieser an genialen und abscheulichen Gestalten nicht armen Epoche gehörte.]
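The bracket notation used in (12) is regular enough to be read mechanically, which makes the mismatch between phrases and syntactic constituents easy to inspect. The following sketch assumes only the conventions introduced in section 1 (square brackets for intonational phrases, a vertical bar between intermediate phrases); the function name `parse_phrasing` is hypothetical.

```python
import re
from typing import List


def parse_phrasing(text: str) -> List[List[str]]:
    """Split '[a | b] [c | d]' into intonational phrases,
    each given as a list of its intermediate phrases."""
    phrases = []
    for content in re.findall(r"\[([^\]]*)\]", text):
        parts = [part.strip() for part in content.split("|")]
        phrases.append([part for part in parts if part])
    return phrases


example_12 = (
    "[Im achtzehnten Jahrhundert | lebte in Frankreich] "
    "[ein Mann, | der zu den genialsten | und abscheulichsten Gestalten "
    "dieser an genialen und abscheulichen Gestalten nicht armen Epoche gehörte.]"
)

phrasing = parse_phrasing(example_12)
for number, intonational_phrase in enumerate(phrasing, start=1):
    print(number, intonational_phrase)
```

Applied to (12), the sketch yields two intonational phrases, the second of which contains the three intermediate phrases discussed in section 1, including the non-constituent der zu den genialsten.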
All these phrases can constitute intermediate phrases in German. Even though English and many other languages mark their intermediate phrases by boundary tones, in German there is no evidence for boundary tones for intermediate phrases (Féry 1993, 59-79). Evidence for intermediate phrases in German must be taken from other criteria. I argue on the basis of discourse structure and discourse relations that intonational phrasing (intonational and intermediate phrases) can sufficiently be defined by its function in building a discourse structure. Before I give a characterization of intonational phrasing for intonational phrases and intermediate phrases, I first present some approaches to the functions of pitch accents and boundary tones.
3.1 Pitch accents and reference
Each intonational unit (intermediate phrase or intonational phrase) must have at least one pitch accent. Pitch accents are associated with prosodically prominent expressions in that phrase. Often they are associated with focus and thus indicate new (or not-given) information. Pitch accents themselves are often said to express the discourse status of their associated expressions (Hobbs 1990, Gussenhoven
1984, Ladd 1996). This can be illustrated by (13) and (14). (13) is the first sentence of the novel and introduces the time, the place and the person by phrases marked with an H* pitch accent. (14) is the first sentence of the second chapter. The wet nurse Jeanne Bussie was already introduced in the first chapter; so the L* indicates that she is discourse-old.
(13) H* !H* H* L% H*
     [Im achtzehnten Jahrhundert | lebte in Frankreich] [ein Mann,
     In the eighteenth century lived in France a man
(14) L* H% H* L* LH* H%
     [Einige Wochen später] [stand die Amme | Jeanne Bussie] ... (02-001)
     Few weeks later stood the wet nurse Jeanne Bussie
The pitch accent can also indicate contrast between two referents or unexpected relations between two referents, as illustrated in the often quoted example (15) and a sentence from our novel (16):

(15) First HE called HIM a Republican and then HE offended HIM.

(16) Grenouille folgte ihm, mit bänglich pochendem Herzen, denn er ahnte, daß nicht ER DEM DUFT folgte, sondern daß DER DUFT IHN gefangengenommen hatte und nun unwiderstehlich zu sich zog. (08036)
     “Grenouille followed it, his fearful heart pounding, for he suspected that it was not he who followed the scent, but the scent that had captured him and was drawing him irresistibly to it.”
3.2 Tune representing information structure

Steedman (1991, 2000) interprets Halliday's thematic structure (see section 4.2) in terms of combinatory categorial grammar (CCG). This can be illustrated with the following example, in which the answer is assigned an information structure of theme and rheme. Both thematic units are further divided into given material and new material; the latter is associated with a pitch accent.

(17) Q: I know that Mary's FIRST degree is in PHYSICS. But what is the subject of her DOCTORATE?
     A: [Mary's   DOCTORATE   |   is in   CHEMISTRY]
                  L+H* LH%                H* LL%
         Given    New             Given   New
         Theme                    Rheme
The basic informational units are the theme and the utterance. All other parts are defined with respect to these basic elements. For example, the rheme is a function that takes the theme as an argument to yield the utterance. Steedman now defines the syntactic function of the pitch accent L+H* as a theme that lacks a boundary tone, i.e. as a function that needs a boundary tone to yield a theme. Analogously, the pitch accent H* indicates a function that needs a boundary tone in order to yield a rheme. Thus in the description of tones, Steedman assumes the boundary tones and the whole tune as the primary units, while the pitch accents define the informational status as theme or rheme (cf. Hayes & Lahiri 1991 for a similar approach with respect to sentence type).

(18) Categorial functions of tones for English (Steedman 1991)
     a   LH%        boundary tone   simple argument
     b   LL%        boundary tone   simple argument
     c   L+H*       pitch accent    function from boundary tone into theme
     d   H*         pitch accent    function from boundary tones into rheme
     e   L+H* LH%   contour         simple argument: theme
     f   H* LL%     contour         function from themes into utterance
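The functor-argument reading of table (18) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class names `Theme` and `Utterance` and the function `pitch_accent` are hypothetical labels introduced here, not Steedman's CCG notation, and the sketch models only the types of the tones, not their phonology.

```python
from dataclasses import dataclass

BOUNDARY_TONES = {"LH%", "LL%"}  # (18a, b): simple arguments


@dataclass(frozen=True)
class Theme:
    tune: str


@dataclass(frozen=True)
class Utterance:
    tune: str


def pitch_accent(accent):
    """Return the function denoted by a pitch accent, following (18c, d)."""
    def needs_boundary(boundary):
        if boundary not in BOUNDARY_TONES:
            raise ValueError(f"not a boundary tone: {boundary}")
        contour = f"{accent} {boundary}"
        if accent == "L+H*":
            # (18c, e): accent plus boundary tone yields a theme
            return Theme(contour)
        # (18d, f): accent plus boundary tone yields a rheme,
        # i.e. a function from themes into utterances
        def rheme(theme):
            return Utterance(f"{theme.tune} + {contour}")
        return rheme

    if accent not in ("L+H*", "H*"):
        raise ValueError(f"accent not covered by (18): {accent}")
    return needs_boundary


# Example (17): [Mary's DOCTORATE | is in CHEMISTRY]
theme = pitch_accent("L+H*")("LH%")
rheme = pitch_accent("H*")("LL%")
utterance = rheme(theme)
```

The point of the sketch is the order of application: a pitch accent is incomplete until it receives a boundary tone, and a rheme is incomplete until it receives a theme.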
Steedman uses the terms theme and rheme as well as given and new. The first pair can be defined with respect to the sentence under analysis. Yet the second pair can only be defined by the discourse in which the sentence is embedded. Even though the tones and their functions are different for German, the following example from our novel may illustrate Steedman’s analysis. The first phrase ends with a H% boundary tone representing the theme (with the global contour of L*H%, cf. (18e)), while the second intonational phrase ends with L% expressing the rheme (with the global contour ...H*L%, cf. (18f)).
(19) L* H%
     [Zu der Zeit, von der wir reden,] [herrschte in den Städten
     ‘In the period of which we speak, there reigned in the cities
     H*L H* !H* L%
     ein für uns moderne Menschen | kaum vorstellbarer Gestank.]
     to us modern men and women a stench barely conceivable’
However, not all sentences can be divided into one theme and one rheme, as in (20):
(20a) L* H% H* L* LH* H%
      [Einige Wochen später] [stand die Amme | Jeanne Bussie]
      ‘Few weeks later stood the wet nurse Jeanne Bussie
   b  H* H%
      [mit einem Henkelkorb in der Hand]
      with a market basket in the hand
   c  L* H* H%
      [vor der Pforte des Klosters von Saint-Merri]
      at the gate of the cloister of Saint-Merri
   d  H* !H* !H* L%
      [und sagte dem öffnenden Pater Terrier,]
      and said to the opening Father Terrier’
The first four intonational phrases end with an H% boundary tone, and only the last phrase with an L% boundary tone. This is difficult to explain in terms of a view of information structure that is sentence bound. In such a view we must assume several themes before we get to the rheme, and the final sentence. The example suggests that the boundary tones indicate the relation of the phrase to the already established discourse on the one hand, and to the subsequent discourse on the other.
3.3 Tones representing different discourse functions

Pierrehumbert & Hirschberg (1990) give a list of functions of pitch accents and boundary tones. The latter indicate whether the phrase to which the boundary tone is associated should be interpreted with respect to the preceding discourse or to the following discourse. Pierrehumbert & Hirschberg (1990, 304) illustrate this point in the following contrast between (21) and (22). The low boundary tone L% in (21a) indicates that this sentence as a unit is related to the discourse on its own, while the high boundary tone H% in (22a) indicates that it is to be interpreted with respect to the following sentence forming a large unit which then can be inserted into or related to the discourse. This difference influences the choice of the antecedent of the pronoun it in (21b) and (22b). In (21) it refers to the following proposition I spent two hours figuring out how to use the jack, while in (22) it refers back to the new car manual.
(21a) L L%   My new car manual is almost unreadable.
   b  L H%   It's quite annoying.
   c  L L%   I spent two hours figuring out how to use the jack.

(22a) L H%   My new car manual is almost unreadable.
   b  L H%   It's quite annoying.
   c  L L%   I spent two hours figuring out how to use the jack.
Pierrehumbert & Hirschberg (1990, 308) assign the following discourse functions to the particular tones:

   Pitch accents convey information about the status of discourse referents, modifiers, predicates, and relationships specified by accented lexical items. Phrase accents convey information about the relatedness of intermediate phrases–in particular, whether (the propositional content of) one intermediate phrase is to form part of a larger interpretative unit with another. Boundary tones convey information about the directionality of interpretation for the current intonational phrase–whether it is “forward-looking” or not.
In explaining the function of intonational phrasing (intonational and intermediate phrases), they refer to the “propositional content” of the corresponding phrase. This can also be illustrated by the following fragment from our novel. The low boundary tones in (23a) and (23b) indicate that the content of the utterance can be added to the discourse without relating it to subsequent utterances. However, the high boundary tone in (23c) indicates that the utterance (“But I've put a stop to that”) must be related to the next utterance (23d) (“Now you can feed him yourselves”).

(23a) H* L%
      [Weil er sich an mir vollgefressen hat.]
      ‘Because he himself on me stuffed has
   b  H* L% H* L%
      [Weil er mich leergepumpt hat] [bis auf die Knochen.]
      Because he's pumped me dry down to the bones.
   c  H* H%
      [Aber damit ist jetzt Schluß.]
      But with that is now end
   d  H* !H* L%
      [Jetzt könnt Ihr ihn selber weiterfüttern]
      Now can you him yourselves feed.’
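The "forward-looking" reading of boundary tones can be made concrete with a small Python sketch. It assumes a deliberately simplified rule that is not stated in this form by Pierrehumbert & Hirschberg: a phrase ending in H% is held open and grouped with the following phrase, while a phrase ending in L% closes the current interpretive unit. The function name `group_by_boundary_tone` is hypothetical.

```python
def group_by_boundary_tone(phrases):
    """Group (text, boundary_tone) pairs into interpretive units.

    Simplifying assumption: H% means "interpret together with what
    follows", L% means "this unit is complete".
    """
    units, current = [], []
    for text, tone in phrases:
        current.append(text)
        if tone == "L%":
            units.append(current)
            current = []
    if current:  # a trailing H% phrase with nothing after it
        units.append(current)
    return units


# Intonational phrases of (23) with their boundary tones
example_23 = [
    ("Weil er sich an mir vollgefressen hat.", "L%"),
    ("Weil er mich leergepumpt hat", "L%"),
    ("bis auf die Knochen.", "L%"),
    ("Aber damit ist jetzt Schluß.", "H%"),
    ("Jetzt könnt Ihr ihn selber weiterfüttern", "L%"),
]

units = group_by_boundary_tone(example_23)
for unit in units:
    print(" + ".join(unit))
```

Under this rule only (23c) and (23d) end up in the same unit, which matches the description above.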
However, not all intonational phrases can be associated with a propositional content; some intonational units might only refer to modifications such as im achtzehnten Jahrhundert (‘in the eighteenth century’) or the unsaturated phrase lebte in Frankreich (‘lived in France’) of example (1), repeated as (12). Thus, the functions of boundary tones must be redefined with respect to these “sub-propositional” units. Intonational phrasing doesn’t always correspond to propositions or to simple discourse referents. Therefore, we need a more fine-grained discourse structure that allows us to construct corresponding discourse segments.
Summarizing, pitch accents may indicate the discourse status of their respective discourse referents. They can also form the nucleus of an informational unit, as in Steedman's approach, which is, however, limited to the sentence. Pierrehumbert & Hirschberg define the function of boundary tones with respect to the relations between clauses. However, they can only deal with phrases that are associated with propositions. None of these approaches accounts for the discourse function of subclausal units. Before I develop such an approach in section 5, I give a sketch of the description of intonational phrasing in the next section.
4. INTONATIONAL PHRASING AND ITS FUNCTION
4.1 Phrasing

The term intonational phrase (IP) is usually applied to spans of the utterance which are delimited by boundary tones: “Like other researchers, we will take the melody for an intonational phrase to be the tune whose internal makeup is to be described. As a rule of thumb, an intonational phrase boundary (transcribed here as %) can be taken to occur where there is a non-hesitation pause or where a pause could be felicitously inserted without perturbing the pitch contour” (Pierrehumbert 1980, 19). In (24) from Selkirk (1995, 566), there are three intonational phrases: the relative clause corresponds to one, and the parts of the matrix sentence to its left and to its right each constitute one. In (25) from the novel Das Parfum (02-125), one intonational phrase marks the direct speech, while the two others are associated with the two conjuncts of the assertion. The second conjunct is further divided into two intermediate phrases.

(24) H% H% L%
     [Fred,]IP [who's a volunteer fireman,]IP [teaches third grade]IP

(25) H* L% L* H% (L*) H* L%
     [“Na?”] [bellte Terrier] [und knipste ungeduldig | an seinen Fingernägeln.]
     ‘“Well?” barked Terrier, clicking his fingernails impatiently.’
The terms in which we can define an intonational phrase are not very clearly understood. There are phonetic, syntactic and semantic criteria for forming an intonational phrase:
(26) Linguistic criteria for defining an intonational phrase (IP)
     (i)   Timing: An IP can be preceded and followed by a pause.
     (ii)  Metrical: The metrical structure provides an additional clue, viz., the presence of a most prominent accent.
     (iii) Tonal: The boundary of an IP is sometimes tonally marked by a boundary tone. Pitch range adjustment plays a role, as well.
     (iv)  Junctural: The boundary of an IP can block certain junctural phenomena (cf. Nespor & Vogel (1986)).
     (v)   Syntactic-prosodic: The boundaries of an IP correspond to those of some syntactic constituents.
     (vi)  Semantic: The material in the IP must constitute an informational unit or sense unit.
The conflict between different criteria can be illustrated with the first sentence of our novel (1), repeated as (27).

(27) H* H* !H* H* L%
     [Im achtzehnten Jahrhundert |_580 lebte in Frankreich]_300 [ein Mann, |_590
     In the eighteenth century lived in France a man
The subscript indicates the duration of the pauses, which is shorter between the two intonational phrases than inside either of them. We rather assume the boundary tone as a robust criterion for an intonational phrase. Unfortunately, German does not show boundary tones for intermediate phrases (Féry 1993, 59-79). They can, however, be detected by other criteria such as pauses, lengthening of the final syllable and a pitch accent for each intermediate phrase. I argue that the discourse function of the intermediate phrase is one of the most reliable criteria. There are very short and very long intonational phrases, which means that the phrases do not depend on length. They rather depend on their appropriateness for building a coherent discourse. A discourse is coherent if at least the following two requirements are met: (i) anaphoric relations can be established; (ii) discourse relations hold between the discourse units, as argued in section 4.4.
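The conflict between the timing criterion and the tonal criterion in (27) can be checked mechanically. The following Python sketch uses the three pause durations given in (27); the tuple layout and the labels "intermediate" and "intonational" are illustrative assumptions, and the durations are kept in whatever unit the subscripts of (27) use.

```python
# (phrase before the boundary, pause duration from (27), boundary type)
boundaries = [
    ("Im achtzehnten Jahrhundert", 580, "intermediate"),
    ("lebte in Frankreich", 300, "intonational"),
    ("ein Mann,", 590, "intermediate"),
]

# If pause length alone defined intonational phrases, the longest pauses
# should sit at intonational phrase boundaries.
longest = max(boundaries, key=lambda b: b[1])
shortest = min(boundaries, key=lambda b: b[1])

timing_agrees_with_tones = longest[2] == "intonational"
print(longest, shortest, timing_agrees_with_tones)
```

Here the shortest pause falls on the only intonational phrase boundary, so the timing criterion (26i) taken alone would point to the wrong boundary.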
4.2 Halliday: information units and information structure

Halliday postulates an independent level for information structure. He is the first to introduce the term “informational unit”, and the first to use the term information structure and to establish it as an independent concept. His main preoccupation was to account for the structure of intonation in English. Since phrasing does not always correspond to syntactic constituent structure, Halliday (1967, 200) postulates a different structural level as the correlate to phrasing (his “tonality”):

   Any text in spoken English is organized into what may be called ‘information units’ (...) this is not determined (...) by constituent structure. Rather could it be said that the distribution of information specifies a distinct structure on a different plane. (...) Information structure is realized phonologically by ‘tonality’, the distribution of the text into tone groups.
The utterance is divided into different tone groups, which are roughly equivalent to intermediate phrases. These phrases exhibit an internal structure. Analogously, Halliday assumes two structural aspects of information structure: the informational partition of the utterance, and the internal organization of each informational unit. He calls the former aspect the thematic structure (theme-rheme), and the latter aspect is treated under the title givenness. The thematic structure organizes the linear ordering of the informational units, which corresponds to the Praguian view of theme-rheme (or topic-comment, or topic-focus, see section 2.2). The theme refers to that informational unit that comprises the object the utterance is about, while the rheme refers to what is said about it. Halliday (1967, 212) assumes that the theme always precedes the rheme. Thus theme and rheme are closely connected with word order, theme being used as a name for the first noun group in the sentence, and rheme for what follows: “The theme is what is being talked about, the point of departure for the clause as a message; and the speaker has within certain limits the option of selecting any element in the clause as thematic.”

The second aspect refers to the internal structure of an informational unit, where elements are marked with respect to their discourse anchoring. Halliday (1967, 202) writes: “At the same time the information unit is the point of origin for further options regarding the status of its components: for the selection of point of information focus which indicates what new information is being contributed.” Halliday calls the center of informativeness of an information unit information focus. The information focus contains new material that is not already available in the discourse. The remainder of the intonational unit consists of given material, i.e. material that is available in the discourse or in the shared knowledge of the discourse participants.
Halliday (1967, 202) illustrates the interaction of the two systems of organization with the following example (using bold type to indicate information focus; // to indicate phrasing). Sentence (28a) contrasts with (28b) only in the placement of the information focus in the second phrase. The phrasing, and thus the thematic structure, is the same. On the other hand, (28a) contrasts with (28c) in phrasing, but not in the placement of the information focus. However, since the information focus is defined with respect to the information unit, the effect of the information focus is different.

(28a) //Mary//always goes to town on Sundays.//
   b  //Mary//always goes to town on Sundays.//
   c  //Mary always goes to //town on Sundays.//
Halliday does not connect the sentence perspective with the discourse perspective, even though he makes some vague comments on it:

   The difference can perhaps be best summarized by the observation that, while ‘given’ means ‘what you were talking about’ (or ‘what I was talking about before’), ‘theme’ means ‘what I am talking about’ (or ‘what I am talking about now’); and, as any student of rhetoric knows, the two do not necessarily coincide. (Halliday 1967, 212)
The main progress initiated by the work of Halliday is the assumption of an independent level of information structure. This structure is closely related to the discourse and assigns the features given or new to the expressions in a sentence. However, he does not provide a criterion for informational units in terms of discourse.
4.3 Selkirk: sense units and argument structure

Selkirk (1984) has argued that the intonational phrase (IP) constitutes a domain relevant to various aspects of the phonetic implementation of the sentence, including timing effects like constituent-final lengthening. Selkirk (1984, 286) employs the notion of sense unit, since she argues that the intonational phrase cannot be defined by phonetics or by syntax alone but needs additional semantic constraints:

   Our position, then – again following Halliday 1967 – is that there are no strictly syntactic conditions on intonational phrasing. Any apparently syntactic conditions on where breaks in intonational phrasing may occur are, we claim, ultimately to be attributed to the requirement that the elements of an intonational phrase must make a certain kind of semantic sense.

Selkirk (1984, 286ff) defines the correlation between the intonational phrase and the sense unit in (29), and in (30) she determines the sense unit as a complex of constituents that stand either in a modifier-head or argument-head relation:

(29) The Sense Unit Condition on intonational phrasing
     The immediate constituents of an intonational phrase must together form a sense unit.

(30) Two constituents Ci, Cj form a sense unit if (a) or (b) is true of the semantic interpretation of the sentence:
     (a) Ci modifies Cj (a head)
     (b) Ci is an argument of Cj (a head)
This can be illustrated with (31). The first intermediate phrase im achtzehnten Jahrhundert modifies the head lebte in Frankreich, and the argument ein Mann...is an argument of the complex predicate im achtzehnten Jahrhundert lebte in Frankreich.
(31) Modifier                       Head                    Argument
     [Im achtzehnten Jahrhundert  | lebte in Frankreich]    [ein Mann, ...
     Modifier + Head: licensed by (30a)
     Argument + (complex) Head: licensed by (30b)
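Definition (30) amounts to a simple test on pairs of constituents, which can be sketched as follows. The function name `forms_sense_unit` and the encoding of the semantic relations as sets of (dependent, head) pairs are assumptions made for illustration; Selkirk does not give an algorithm, and the relations themselves are supplied by hand here rather than derived from a parse.

```python
def forms_sense_unit(ci, cj, modifies, argument_of):
    """(30): Ci and Cj form a sense unit if one of them modifies the
    other (30a) or is an argument of the other (30b), the other being
    the head. Relations are sets of (dependent, head) pairs."""
    pairs = {(ci, cj), (cj, ci)}
    return bool(pairs & modifies) or bool(pairs & argument_of)


# Constituents of (31)
modifier = "Im achtzehnten Jahrhundert"
head = "lebte in Frankreich"
complex_head = modifier + " " + head
argument = "ein Mann, ..."

modifies = {(modifier, head)}             # licensed by (30a)
argument_of = {(argument, complex_head)}  # licensed by (30b)

first_phrase_ok = forms_sense_unit(modifier, head, modifies, argument_of)
sentence_ok = forms_sense_unit(argument, complex_head, modifies, argument_of)
unrelated = forms_sense_unit(modifier, argument, modifies, argument_of)
```

The last call shows the limit of the condition: a pairing that is neither modifier-head nor argument-head is not a sense unit, which is why other discourse relations need a separate treatment.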
Selkirk herself (1984, 295f) notes that the Sense Unit Condition is very closely related to argument structure, so it does not cover cases of preposed material or of nonrestrictive modifiers such as nonrestrictive relative clauses. The latter is a typical instance of backgrounding, which expresses a discourse relation rather than an argument-head relation, as illustrated by (32):

(32) [und sagte dem öffnenden Pater Terrier,] [einem etwa fünfzigjährigen | kahlköpfigen, | leicht nach Essig riechenden Mönch:] [“Da!”]
     ‘... and the minute they were opened by Father Terrier, a bald monk of about fifty, with a faint odour of vinegar about him, she said “There!”’
While the background information about Father Terrier is “embedded” into an independent intonational phrase, this phrase itself is divided into three intermediate phrases that each give one characteristic property of the person. Thus, it is not the argument structure that triggers the intonational phrasing, but rather the discourse relation of backgrounding.
4.4 Intonational phrasing and discourse units

The discussion in the last two sections has shown that intonational phrasing is partly determined by informational units. However, neither Halliday's concept of informational unit nor Selkirk's definition of sense unit succeeded in covering all cases. It already became clear that intonational phrasing must be described in terms of discourse units, which serve as arguments for discourse relations. This can be illustrated in the discourse tree (32a) for the sentence (32).

(32a) Backgrounding
         [und sagte dem öffnenden Pater Terrier,]
         Enumeration
            [einem etwa fünfzigjährigen
            | kahlköpfigen,
            | leicht nach Essig riechenden Mönch]
We can assign different discourse relations to the discourse units associated with the intonational phrasing. A discourse unit is defined by its appropriateness to serve as an argument in a discourse relation, rather than by its content or some other intrinsic property. This means that we can only define discourse units by defining discourse relations that operate on them.

5. DISCOURSE UNITS AND DISCOURSE RELATIONS

Discourse relations are generally described in terms of relations between propositions. Therefore, the arguments for discourse relations are associated with clauses (or other linguistic phrases that express a proposition). This can be illustrated with (8), repeated as (33).

(33) “Was ist das?” sagte Terrier und beugte sich über den Korb und schnupperte daran, denn er vermutete Eßbares. (02-002)
     ‘“What's that?” asked Terrier, bending down over the basket and sniffing at it, in the hope that it was something edible.’

(33a) Continuation
         What's that? asked Terrier
         Causation
            Conjunction
               bending down over the basket
               sniffing at it
            in the hope that it was something edible.

The relation between the first two sentences can be described by Continuation, while the relation between the last clauses is Causation. Approaches to discourse or text structure that use this kind of discourse relation are fairly widespread (e.g. Mann & Thompson 1987, 1988 for Rhetorical Structure Theory (RST) or Asher 1993, 2004 for segmented DRT).
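A discourse tree of the kind shown in (33a) is a nested structure of relations whose arguments are either further relations or discourse units. The following Python sketch is one possible encoding; the class names `Unit` and `Relation` and the helper functions are hypothetical, the nesting follows the reading of (33a) given in the text, and nothing here reproduces the formal machinery of RST or SDRT.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Unit:
    """A discourse unit (a leaf of the tree)."""
    text: str


@dataclass
class Relation:
    """A discourse relation over units or further relations."""
    name: str
    args: List[Union["Relation", Unit]]


def leaves(node):
    """Discourse units in left-to-right order."""
    if isinstance(node, Unit):
        return [node.text]
    return [text for arg in node.args for text in leaves(arg)]


def relation_names(node):
    """Relation labels in top-down, left-to-right order."""
    if isinstance(node, Unit):
        return []
    names = [node.name]
    for arg in node.args:
        names.extend(relation_names(arg))
    return names


tree_33a = Relation("Continuation", [
    Unit("What's that? asked Terrier"),
    Relation("Causation", [
        Relation("Conjunction", [
            Unit("bending down over the basket"),
            Unit("sniffing at it"),
        ]),
        Unit("in the hope that it was something edible."),
    ]),
])

print(relation_names(tree_33a))
print(leaves(tree_33a))
```

Because a leaf is just any unit that can serve as an argument of a relation, the same encoding carries over unchanged to units smaller than a clause.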
None of these approaches allows for subclausal discourse units and relations between them. However, we have seen in the last sections that intonational phrasing often corresponds to subclausal units. We have also said that discourse units are defined by the relations they establish. If we assume subclausal discourse units, we must also define discourse relations that hold between them. In the following I discuss five discourse relations: (i) non-restrictive modification, (ii) backgrounding, (iii) enumeration, (iv) topicalization, and (v) frame-setting. While the first four are discussed in the literature, the relation of frame-setting is new.
5.1 Non-restrictive modification

The relative clause in (34) consists of two intermediate phrases which correspond to der zu den genialsten (Gestalten gehörte) and to (der zu den) abscheulichsten Gestalten... gehörte. These two modifications are independent of each other, even though they both modify the same discourse referent x for a man. The point is that the main character of the book is not only one of the most gifted and abominable personages, but he is at the same time one of the most gifted personages and one of the most abominable personages. This is difficult to express in a purely linear way. However, if we assume two independent discourse representations, we can capture these two relations.

(34) [ein Mann | der zu den genialsten | und abscheulichsten Gestalten .... gehörte]
     “a man who was one of the most gifted and abominable personages”

(34a) [t, u, x | 18th cent(t), France(u), Man(x), live(x,u,t)]
         non-restr. Mod: [y | y = x, y ∈ most gifted personages]
         non-restr. Mod: [y | y = x, y ∈ most abominable personages]
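The idea of two independent representations attached to one main DRS, as in (34a), can be sketched with a small data structure. The class `DRS` and its `attach` method are hypothetical names for illustration; conditions are kept as plain strings, so the sketch shows only the shape of the representation, not DRT's construction rules or its model-theoretic interpretation.

```python
from dataclasses import dataclass, field


@dataclass
class DRS:
    """A discourse representation: referents plus conditions, with
    subordinate representations attached by a named discourse relation."""
    referents: list
    conditions: list
    attached: list = field(default_factory=list)  # (relation, DRS) pairs

    def attach(self, relation, sub):
        self.attached.append((relation, sub))


# Main DRS of (34a)
main = DRS(
    referents=["t", "u", "x"],
    conditions=["18th cent(t)", "France(u)", "Man(x)", "live(x,u,t)"],
)

# One independent representation per intermediate phrase of the relative clause
for group in ("most gifted personages", "most abominable personages"):
    main.attach("non-restr. Mod", DRS(["y"], ["y = x", f"y ∈ {group}"]))

for relation, sub in main.attached:
    print(relation, sub.referents, sub.conditions)
```

Each modifier gets its own small DRS that is linked back to x, so neither modification depends on the other.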
5.2 Backgrounding

In the example (35) below, a more general type of backgrounding can be found. Actually, there are even two levels of backgrounding: first the phrase in contrast to the names of other gifted abominations, and second the actual names. The discourse relation of backgrounding relates these discourse units directly to the already established main DRS; there is no need to wait for the interpretation of the actual sentence. This is informally represented in (35a).

(35) [Er hieß | Jean-Baptiste Grenouille,] [und wenn sein Name]
     His name was Jean-Baptiste Grenouille, and if his name –
     [im Gegensatz zu den Namen | anderer genialer Scheusale,]
     in contrast to the names of other gifted abominations,
     [wie etwa de Sades, | Saint-Justs, | Fouchés, | Bonapartes | undsoweiter,]
     de Sade’s, for instance, or Saint-Just’s, Fouché’s, Bonaparte’s etc. –
     [heute in Vergessenheit geraten ist,]
     has been forgotten today,

(35a) [t, u, x, y, z | 18th cent(t), France(u), Man(x), live(x,u,t), y = x, z = J.B. Grenouille, name(y,z)]
         [l, m | name of l(m), l = x]
         ?  in contrast to the names of other gifted abominations
            [a, b, c, d | de Sade(a), Saint-Just(b), Fouché(c), Bonaparte(d)]
5.3 Enumeration

A classical case of independent units is enumeration, which is here illustrated by (36). The intonational phrasing suggests that the discourse structure is constructed via independent representations for each predicate NP with goat's milk, with pap, and with beet juice, as given in (36a).

(36) [Jetzt könnt Ihr ihn selber weiterfüttern]
     ‘Now can you him yourselves feed
     [mit Ziegenmilch, | mit Brei, | mit Rübensaft.]
     with goat's milk, with pap, with beet juice.’

(36a) [x, y | feed(x,y,z), y = a, x = you]
         [goat's milk(z)]
         [pap(z)]
         [beet juice(z)]
Once we have an independent representation of each of the conjuncts, we can compare them and establish additional relations of gradation between them. This works particularly well for the following example (02-121), where we can compare the different representations according to a scale of intimacy.
(37) H* H* H* H* H*
     [sie hatte doch schon Dutzende | genährt, | gepflegt, | geschaukelt, | geküsst...|
     after all she had fed, tended, cradled and kissed dozens of them...

(37a) [x, y | nurse(x), dozen-babies(y)]
         fed(x,y) < tended(x,y) < cradled(x,y) < kissed(x,y)
         (scale: more intimate activity)
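Once each conjunct of the enumeration has its own representation, the gradation in (37a) is just an ordering check over those representations. The sketch below assumes that the scale of intimacy is given as an ordered list, taken directly from (37a); the names `INTIMACY_SCALE` and `is_gradation` are hypothetical.

```python
# Scale of intimacy as ordered in (37a), least intimate first
INTIMACY_SCALE = ["fed", "tended", "cradled", "kissed"]


def is_gradation(predicates, scale=INTIMACY_SCALE):
    """True if the enumerated predicates strictly rise along the scale."""
    ranks = [scale.index(p) for p in predicates]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


# One entry per intermediate phrase of (37), in spoken order
conjuncts = ["fed", "tended", "cradled", "kissed"]

print(is_gradation(conjuncts))
print(is_gradation(["kissed", "fed"]))
```

The comparison is only possible because the conjuncts are kept apart; a single merged representation would have nothing to order.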
5.4 Topicalization
Topicalization or thematization is one of the central concepts of the functional sentence perspective of the Prague School, which was later adapted by Halliday and others (see section 4.2). Steedman's analysis of the thematic structure of a sentence focuses exactly on this aspect (see section 2.2 for discussion). The fragment (38) (02-126) illustrates this. The theme-rheme or topic-comment partition establishes a functor-argument structure on a sentence that is independent of the grammatical relations. Since this issue has been discussed repeatedly, I move on to the next subclausal discourse relation.

(38) [also an den Füßen zum Beispiel | da riechen sie wie ein glatter | warmer | Stein]
     ‘Their feet for instance, they smell like a smooth warm stone
     [wie frische Butter riechen sie.] [Und am Körper] [riechen sie wie... ]
     They smell like fresh butter. And their bodies smell like...’

5.5 Frame-setting
The discourse relation of “frame-setting” is illustrated by the first sentence of the second chapter (14), repeated as (39). The phrase einige Wochen später cannot be the topic, since the topic is the introduced person or the thing the sentence is about. However, it stands in its own phrase. I therefore assume the discourse relation of frame-setting. The phrase “sets the frame” for what is to come. Here it shifts the reference time. The phrase can be integrated into the already established discourse before the rest of the sentence is interpreted, as illustrated in (39a) (see Maienborn 2003 for a related concept with the same name):
(39) [Einige Wochen später] [stand die Amme | Jeanne Bussie] ... (02-001)
     ‘Few weeks later stood the wet nurse Jeanne Bussie...’

(39a) [x, y, z, t1, ... | wet nurse(x), ...]
         [t2 | t2 = few weeks later as t1]
         [u | stood(u), u = x, the wet nurse Jeanne Bussie(u)]

6. SUMMARY
The presented analysis associates intonational phrasing with discourse units. I have proposed an extension of Asher's SDRT with smaller discourse representations and new relations between subclausal discourse representations. This allows us to assign discourse functions to intonational phrases, including phrases that do not correspond to entire clauses. Many more discourse relations must be defined, and I am convinced they can be defined in terms of discourse construction rules.

Universität Stuttgart

7. NOTES

* The paper is a revised version of a talk given at the Topic/Focus Workshop at UC Santa Barbara, July 2001, and at the Linguistic Circle at the University of Edinburgh, October 2002. I would like to thank the audiences for the comments. In particular I would like to thank Jennifer Fitzpatrick, Carlos Gussenhoven, Bob Ladd, Aditi Lahiri, and Mark Steedman for discussion of earlier versions of this paper, and Daniel Büring, Matthew Gordon, and Chungmin Lee for editing this volume and for the very helpful and constructive review of this paper. The research was supported by a Heisenberg-Fellowship of the German Science Foundation and by a research grant (HE 2259/9-2).

1 An intonational phrase boundary always coincides with an intermediate phrase boundary; therefore we shorten “[|...|...|]” to “[...|...]”. Even though English and many other languages mark their intermediate phrases by boundary tones, in German it is very controversial whether there is evidence for boundary tones for intermediate phrases (Féry 1993, 59-79).

2 A short summary of the novel: “In the slums of 18th-century Paris a baby is born and abandoned, passed over to monks as a charity case. But the monks can find no one to care for the child – he is too demanding, and he doesn't smell the way a baby should smell. In fact, he has no scent at all. Jean-Baptiste Grenouille clings to life with an iron will, growing into a dark and sinister young man who, although he has no scent of his own, possesses an incomparable sense of smell. Never having known human kindness, Grenouille lives only to decipher the odors around him, the complex swirl of smells – ashes and leather, rancid cheese and fresh-baked bread – that is Paris. He apprentices himself to a perfumer, and quickly masters the ancient art of mixing flowers, herbs, and oils. Then one day he catches a faint whiff of something so exquisite he is determined to capture it. Obsessed, Grenouille follows the scent until he locates its source – a beautiful young virgin on the brink of womanhood. As his demented quest to create the “ultimate perfume” leads him to murder, we are caught up in a rising storm of terror until his final triumph explodes in all of its horrifying consequences.” (Short description of the English translation of the novel, Süskind 1987)
Studies in Linguistics and Philosophy

1. H. Hiż (ed.): Questions. 1978 ISBN 90-277-0813-4; Pb: 90-277-1035-X
2. W. S. Cooper: Foundations of Logico-Linguistics. A Unified Theory of Information, Language, and Logic. 1978 ISBN 90-277-0864-9; Pb: 90-277-0876-2
3. A. Margalit (ed.): Meaning and Use. 1979 ISBN 90-277-0888-6
4. F. Guenthner and S.J. Schmidt (eds.): Formal Semantics and Pragmatics for Natural Languages. 1979 ISBN 90-277-0778-2; Pb: 90-277-0930-0
5. E. Saarinen (ed.): Game-Theoretical Semantics. Essays on Semantics by Hintikka, Carlson, Peacocke, Rantala, and Saarinen. 1979 ISBN 90-277-0918-1
6. F.J. Pelletier (ed.): Mass Terms: Some Philosophical Problems. 1979 ISBN 90-277-0931-9
7. D. R. Dowty: Word Meaning and Montague Grammar. The Semantics of Verbs and Times in Generative Semantics and in Montague’s PTQ. 1979 ISBN 90-277-1008-2; Pb: 90-277-1009-0
8. A. F. Freed: The Semantics of English Aspectual Complementation. 1979 ISBN 90-277-1010-4; Pb: 90-277-1011-2
9. J. McCloskey: Transformational Syntax and Model Theoretic Semantics. A Case Study in Modern Irish. 1979 ISBN 90-277-1025-2; Pb: 90-277-1026-0
10. J. R. Searle, F. Kiefer and M. Bierwisch (eds.): Speech Act Theory and Pragmatics. 1980 ISBN 90-277-1043-0; Pb: 90-277-1045-7
11. D. R. Dowty, R. E. Wall and S. Peters: Introduction to Montague Semantics. 1981; 5th printing 1987 ISBN 90-277-1141-0; Pb: 90-277-1142-9
12. F. Heny (ed.): Ambiguities in Intensional Contexts. 1981 ISBN 90-277-1167-4; Pb: 90-277-1168-2
13. W. Klein and W. Levelt (eds.): Crossing the Boundaries in Linguistics. Studies Presented to Manfred Bierwisch. 1981 ISBN 90-277-1259-X
14. Z. S. Harris: Papers on Syntax. Edited by H. Hiż. 1981 ISBN 90-277-1266-0; Pb: 90-277-1267-0
15. P. Jacobson and G. K. Pullum (eds.): The Nature of Syntactic Representation. 1982 ISBN 90-277-1289-1; Pb: 90-277-1290-5
16. S. Peters and E. Saarinen (eds.): Processes, Beliefs, and Questions. Essays on Formal Semantics of Natural Language and Natural Language Processing. 1982 ISBN 90-277-1314-6
17. L. Carlson: Dialogue Games. An Approach to Discourse Analysis. 1983; 2nd printing 1985 ISBN 90-277-1455-X; Pb: 90-277-1951-9
18. L. Vaina and J. Hintikka (eds.): Cognitive Constraints on Communication. Representation and Processes. 1984; 2nd printing 1985 ISBN 90-277-1456-8; Pb: 90-277-1949-7
19. F. Heny and B. Richards (eds.): Linguistic Categories: Auxiliaries and Related Puzzles. Volume I: Categories. 1983 ISBN 90-277-1478-9
20. F. Heny and B. Richards (eds.): Linguistic Categories: Auxiliaries and Related Puzzles. Volume II: The Scope, Order, and Distribution of English Auxiliary Verbs. 1983 ISBN 90-277-1479-7
21. R. Cooper: Quantification and Syntactic Theory. 1983 ISBN 90-277-1484-3
22. J. Hintikka (in collaboration with J. Kulas): The Game of Language. Studies in Game-Theoretical Semantics and Its Applications. 1983; 2nd printing 1985 ISBN 90-277-1687-0; Pb: 90-277-1950-0
23. E. L. Keenan and L. M. Faltz: Boolean Semantics for Natural Language. 1985 ISBN 90-277-1768-0; Pb: 90-277-1842-3
24. V. Raskin: Semantic Mechanisms of Humor. 1985 ISBN 90-277-1821-0; Pb: 90-277-1891-1
Volumes 1–26 formerly published under the Series Title: Synthese Language Library.
25. G. T. Stump: The Semantic Variability of Absolute Constructions. 1985 ISBN 90-277-1895-4; Pb: 90-277-1896-2
26. J. Hintikka and J. Kulas: Anaphora and Definite Descriptions. Two Applications of Game-Theoretical Semantics. 1985 ISBN 90-277-2055-X; Pb: 90-277-2056-8
27. E. Engdahl: Constituent Questions. The Syntax and Semantics of Questions with Special Reference to Swedish. 1986 ISBN 90-277-1954-3; Pb: 90-277-1955-1
28. M. J. Cresswell: Adverbial Modification. Interval Semantics and Its Rivals. 1985 ISBN 90-277-2059-2; Pb: 90-277-2060-6
29. J. van Benthem: Essays in Logical Semantics. 1986 ISBN 90-277-2091-6; Pb: 90-277-2092-4
30. B. H. Partee, A. ter Meulen and R. E. Wall: Mathematical Methods in Linguistics. 1990; Corrected second printing of the first edition 1993 ISBN 90-277-2244-7; Pb: 90-277-2245-5
31. P. Gärdenfors (ed.): Generalized Quantifiers. Linguistic and Logical Approaches. 1987 ISBN 1-55608-017-4
32. R. T. Oehrle, E. Bach and D. Wheeler (eds.): Categorial Grammars and Natural Language Structures. 1988 ISBN 1-55608-030-1; Pb: 1-55608-031-X
33. W. J. Savitch, E. Bach, W. Marsh and G. Safran-Naveh (eds.): The Formal Complexity of Natural Language. 1987 ISBN 1-55608-046-8; Pb: 1-55608-047-6
34. J. E. Fenstad, P.-K. Halvorsen, T. Langholm and J. van Benthem: Situations, Language and Logic. 1987 ISBN 1-55608-048-4; Pb: 1-55608-049-2
35. U. Reyle and C. Rohrer (eds.): Natural Language Parsing and Linguistic Theories. 1988 ISBN 1-55608-055-7; Pb: 1-55608-056-5
36. M. J. Cresswell: Semantical Essays. Possible Worlds and Their Rivals. 1988 ISBN 1-55608-061-1
37. T. Nishigauchi: Quantification in the Theory of Grammar. 1990 ISBN 0-7923-0643-0; Pb: 0-7923-0644-9
38. G. Chierchia, B.H. Partee and R. Turner (eds.): Properties, Types and Meaning. Volume I: Foundational Issues. 1989 ISBN 1-55608-067-0; Pb: 1-55608-068-9
39. G. Chierchia, B.H. Partee and R. Turner (eds.): Properties, Types and Meaning. Volume II: Semantic Issues. 1989 ISBN 1-55608-069-7; Pb: 1-55608-070-0
    Set ISBN (Vol. I + II) 1-55608-088-3; Pb: 1-55608-089-1
40. C.T.J. Huang and R. May (eds.): Logical Structure and Linguistic Structure. Cross-Linguistic Perspectives. 1991 ISBN 0-7923-0914-6; Pb: 0-7923-1636-3
41. M.J. Cresswell: Entities and Indices. 1990 ISBN 0-7923-0966-9; Pb: 0-7923-0967-7
42. H. Kamp and U. Reyle: From Discourse to Logic. Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. 1993 ISBN 0-7923-2403-X; Student edition: 0-7923-1028-4
43. C.S. Smith: The Parameter of Aspect. (Second Edition). 1997 ISBN 0-7923-4657-2; Pb: 0-7923-4659-9
44. R.C. Berwick (ed.): Principle-Based Parsing. Computation and Psycholinguistics. 1991 ISBN 0-7923-1173-6; Pb: 0-7923-1637-1
45. F. Landman: Structures for Semantics. 1991 ISBN 0-7923-1239-2; Pb: 0-7923-1240-6
46. M. Siderits: Indian Philosophy of Language. 1991 ISBN 0-7923-1262-7
47. C. Jones: Purpose Clauses. 1991 ISBN 0-7923-1400-X
48. R.K. Larson, S. Iatridou, U. Lahiri and J. Higginbotham (eds.): Control and Grammar. 1992 ISBN 0-7923-1692-4
49. J. Pustejovsky (ed.): Semantics and the Lexicon. 1993 ISBN 0-7923-1963-X; Pb: 0-7923-2386-6
50. N. Asher: Reference to Abstract Objects in Discourse. 1993 ISBN 0-7923-2242-8
51. A. Zucchi: The Language of Propositions and Events. Issues in the Syntax and the Semantics of Nominalization. 1993 ISBN 0-7923-2437-4
52. C.L. Tenny: Aspectual Roles and the Syntax-Semantics Interface. 1994 ISBN 0-7923-2863-9; Pb: 0-7923-2907-4
53. W.G. Lycan: Modality and Meaning. 1994 ISBN 0-7923-3006-4; Pb: 0-7923-3007-2
54. E. Bach, E. Jelinek, A. Kratzer and B.H. Partee (eds.): Quantification in Natural Languages. 1995 ISBN Vol. I: 0-7923-3128-1; Vol. II: 0-7923-3351-9; set: 0-7923-3352-7; Student edition: 0-7923-3129-X
55. P. Lasersohn: Plurality, Conjunction and Events. 1995 ISBN 0-7923-3238-5
56. M. Pinkal: Logic and Lexicon. The Semantics of the Indefinite. 1995 ISBN 0-7923-3387-X
57. P. Øhrstrøm and P.F.V. Hasle: Temporal Logic. From Ancient Ideas to Artificial Intelligence. 1995 ISBN 0-7923-3586-4
58. T. Ogihara: Tense, Attitudes, and Scope. 1996 ISBN 0-7923-3801-4
59. I. Comorovski: Interrogative Phrases and the Syntax-Semantics Interface. 1996 ISBN 0-7923-3804-9
60. M.J. Cresswell: Semantic Indexicality. 1996 ISBN 0-7923-3914-2
61. R. Schwarzschild: Pluralities. 1996 ISBN 0-7923-4007-8
62. V. Dayal: Locality in WH Quantification. Questions and Relative Clauses in Hindi. 1996 ISBN 0-7923-4099-X
63. P. Merlo: Parsing with Principles and Classes of Information. 1996 ISBN 0-7923-4103-1
64. J. Ross: The Semantics of Media. 1997 ISBN 0-7923-4389-1
65. A. Szabolcsi (ed.): Ways of Scope Taking. 1997 ISBN 0-7923-4446-4; Pb: 0-7923-4451-0
66. P.L. Peterson: Fact Proposition Event. 1997 ISBN 0-7923-4568-1
67. G. Păun: Marcus Contextual Grammars. 1997 ISBN 0-7923-4783-8
68. T. Gunji and K. Hasida (eds.): Topics in Constraint-Based Grammar of Japanese. 1998 ISBN 0-7923-4836-2
69. F. Hamm and E. Hinrichs (eds.): Plurality and Quantification. 1998 ISBN 0-7923-4841-9
70. S. Rothstein (ed.): Events and Grammar. 1998 ISBN 0-7923-4940-7
71. E. Hajičová, B.H. Partee and P. Sgall: Topic-Focus Articulation, Tripartite Structures, and Semantic Content. 1998 ISBN 0-7923-5289-0
72. K. von Heusinger and U. Egli (eds.): Reference and Anaphoric Relations. 1999 ISBN 0-7923-6070-2
73. H. Bunt and R. Muskens (eds.): Computing Meaning. Volume 1. 2000 ISBN 0-7923-6108-3; Pb: 1-4020-0290-4
74. S. Rothstein (ed.): Predicates and their Subjects. 2000 ISBN 0-7923-6409-0
75. K. Kabakčiev: Aspect in English. A "Common-Sense" View of the Interplay between Verbal and Nominal Referents. 2000 ISBN 0-7923-6538-0
76. F. Landman: Events and Plurality. The Jerusalem Lectures. 2000 ISBN 0-7923-6568-2; Pb: 0-7923-6569-0
77. H. Bunt, R. Muskens and E. Thijsse: Computing Meaning. Volume 2. 2001 ISBN 0-7923-0175-4; Pb: 1-4020-0451-6
78. R. Musan: The German Perfect. Its Semantic Composition and Its Interactions with Temporal Adverbials. 2002 ISBN 1-4020-0719-1
79. G. Grewendorf and G. Meggle (eds.): Speech Acts, Mind, and Social Reality. Discussions with R. Searle. 2002 ISBN 1-4020-0853-8; Pb: 1-4020-0861-9
80. G.-J.M. Kruijff and R.T. Oehrle (eds.): Resource-Sensitivity, Binding and Anaphora. 2003 ISBN 1-4020-1691-3; Pb: 1-4020-1692-1
81. R. Elugardo and R.J. Stainton (eds.): Ellipsis and Nonsentential Speech. 2005 ISBN 1-4020-2299-9; Pb: 1-4020-2300-6
82. C. Lee, M. Gordon and D. Büring (eds.): Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation. 2006 ISBN 1-4020-4795-9