
Author: Colin J. Ewen | Ellen M. Kaisse (editors)


Cambridge Journals Online For further information about this journal please go to the journal website at:

journals.cambridge.org/pho


PHONOLOGY

Editors
Colin Ewen (University of Leiden)
Ellen Kaisse (University of Washington)

Review editor
Andrew Nevins (University College London)

Associate editors
Bruce Hayes (University of California, Los Angeles)
Elizabeth Hume (Ohio State University)
Larry Hyman (University of California, Berkeley)
William Idsardi (University of Maryland)
René Kager (University of Utrecht)
D. Robert Ladd (University of Edinburgh)
Joe Pater (University of Massachusetts, Amherst)
Keren Rice (University of Toronto)

Editorial board
John Alderete (Simon Fraser University)
Diana Archangeli (University of Arizona)
Amalia Arvaniti (University of California, San Diego)
Ellen Broselow (State University of New York at Stony Brook)
Andries Coetzee (University of Michigan)
Matthew Goldrick (Northwestern University)
Laura Downing (Research Centre for General Linguistics, Berlin)
Gregory Iverson (University of Wisconsin-Milwaukee)
Yoonjung Kang (University of Toronto Scarborough)
Scott Myers (University of Texas at Austin)
Marc van Oostendorp (Meertens Institute, Amsterdam)
Tobias Scheer (CNRS/University of Nice)
Richard Wright (University of Washington)

Members of the editorial board are appointed for terms of five years.

Subscriptions
Phonology (ISSN 0952–6757) is published three times a year, in May, August and December. The subscription price of Volume 27 (2010) for institutions, which includes print and electronic access, is £170.00 (US $300.00 in the U.S.A., Canada and Mexico). The electronic-only price available to institutional subscribers is £146.00 (US $255.00). The print-only price available to institutional subscribers is £152.00 (US $265.00). The price to individuals ordering direct from the publishers and certifying that the journal is for their personal use is £30.00 (US $45.00). This includes both a print subscription and online access.
Orders, which must be accompanied by payment, may be sent to a bookseller, subscription agent or direct to the publisher: Cambridge University Press, The Edinburgh Building, Shaftesbury Road, Cambridge CB2 8RU. Orders from the U.S.A., Canada and Mexico should be sent to: Cambridge University Press, Journals Fulfillment Department, 100 Brook Hill Drive, West Nyack, NY 10994-2133, U.S.A. Japanese prices for institutions are available from Kinokuniya Company Ltd, P.O. Box 55, Chitose, Tokyo 156, Japan. Prices include delivery by air. Orders may also be placed through the website: http://titles.cambridge.org/journals.

Copying
This journal is registered with the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, U.S.A. (www.copyright.com). Organisations in the U.S.A. who are registered with the CCC may therefore copy material (beyond the limits permitted by sections 107 and 108 of U.S.A. copyright law), subject to payment to the CCC. This consent does not extend to multiple copying for promotional or commercial purposes. Organisations authorised by the Copyright Licensing Agency may also copy material subject to the usual conditions. ISI Tear Sheet Service, 3501 Market Street, Philadelphia, PA 19104, U.S.A. is authorised to supply single copies of separate articles for private use only. For all other use, permission must be sought from the Cambridge or American branch of Cambridge University Press.

Policy
Phonology is concerned with all aspects of phonology and related disciplines. Preference is given to papers which make a substantial theoretical contribution, irrespective of the particular theoretical framework employed, but the submission of papers presenting new empirical data of general theoretical interest is also encouraged. One of the three issues of a volume is occasionally devoted to a particular theme. The editors welcome proposals for themes and offers to act as guest editors for thematic issues.

Submission of papers
Submissions should be sent to the editors in PDF format, preferably by e-mail. The editorial addresses are: Colin J. Ewen, Opleiding Engels, Universiteit Leiden, Postbus 9515, 2300 RA Leiden, The Netherlands ([email protected]); Ellen M. Kaisse, Department of Linguistics, University of Washington, Box 354340, Seattle, WA 98195-4340, U.S.A. ([email protected]). An abstract (no longer than 150 words) should be e-mailed to both editors when the manuscript is submitted. The author’s name should not appear on the paper itself, and, as far as possible, should not be identifiable from references in the text. A full set of notes for contributors is published on pp. 545–548 of Volume 26, and can also be found on the journal website. The language of submission and publication is English.

Internet access
Phonology is included in the Cambridge Journals Online service, which can be found at www.journals.cup.org. Information on other Press titles may be accessed at www.journals.cambridge.org or www.cambridge.org.

This journal issue has been printed on FSC-certified paper and cover board. FSC is an independent non-governmental, not-for-profit organization established to promote the responsible management of the world’s forests. Please see www.fsc.org for information.

Printed in the United Kingdom at the University Press, Cambridge. © Cambridge University Press, 2010

PHONOLOGY VOLUME 27 NUMBER 1 2010

Edited by Colin J. Ewen and Ellen M. Kaisse

Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP, United Kingdom

CAMBRIDGE UNIVERSITY PRESS
The Edinburgh Building, Cambridge CB2 8RU, United Kingdom
32 Avenue of the Americas, New York, NY 10013–2473, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa
http://www.cambridge.org

© Cambridge University Press 2010
First published 2010
Printed in the United Kingdom at the University Press, Cambridge
ISSN 0952–6757

CONTENTS

1 Typological implications of Kalam predictable vowels
Juliette Blevins (Max Planck Institute for Evolutionary Anthropology) and Andrew Pawley (Australian National University)

45 Prosodic fusion and minimality in Kabardian
Matthew Gordon and Ayla Applebaum (University of California, Santa Barbara)

77 Harmonic Grammar with linear programming: from linear systems to linguistic typology
Christopher Potts (Stanford University), Joe Pater, Karen Jesney, Rajesh Bhatt (University of Massachusetts, Amherst) and Michael Becker (Harvard University)

119 A test case for the phonetics–phonology interface: gemination restrictions in Hungarian
Anne Pycha (University of Pennsylvania)

153 Testing the role of phonetic knowledge in Mandarin tone sandhi
Jie Zhang (University of Kansas) and Yuwen Lai (National Chiao Tung University)

203 List of contributors

Phonology 27 (2010) 1–44. © Cambridge University Press 2010 doi:10.1017/S0952675710000023

Typological implications of Kalam predictable vowels*

Juliette Blevins
Max Planck Institute for Evolutionary Anthropology

Andrew Pawley
Australian National University

Kalam is a Trans New Guinea language of Papua New Guinea. Kalam has two distinct vowel types: full vowels /a e o/, which are of relatively long duration and stressed, and reduced central vowels, which are shorter and often unstressed, and occur predictably within word-internal consonant clusters and in monoconsonantal utterances. The predictable nature of the reduced vowels has led earlier researchers, e.g. Biggs (1963) and Pawley (1966), to suggest that they are a non-phonemic ‘consonant release’ feature, leading to lexical representations with long consonant strings and vowelless words. Here we compare Kalam to other languages with similar sound patterns and assess the implications for phonological theory in the context of Hall’s (2006) typology of inserted vowels. We suggest that future work on predictable vowels should explore the extent to which clusters of properties are explained by evolutionary pathways.

1 Introduction

This paper presents an analysis of predictable vowels in Kalam, a Trans New Guinea language of the Bismarck and Schrader Ranges in Madang Province, Papua New Guinea. Kalam sound patterns are of interest in presenting two distinct vowel types: full vowels /a e o/, which are of relatively long duration and always stress-bearing, and predictable vowels, which occur word-internally between consonants and in monoconsonantal utterances. Predictable vowels, in contrast to full vowels, are short, have contextually predictable qualities, are only stressed in certain positions and alternate with zero in certain contexts. Here we compare Kalam predictable vowels to similar sound patterns in other languages and assess the implications for phonological theory.

* We are grateful to Bernard Comrie, four anonymous referees and audiences at the 2nd Sydney Papuanists’ Workshop and the Max Planck Institute for Evolutionary Anthropology for comments on earlier versions of this paper. Pawley’s fieldwork on Kalam was supported by grants from the Wenner-Gren Foundation, the University of Auckland and the University of Papua New Guinea.

In a recent treatment of inserted vowels, Hall (2006) presents a two-way classification based on phonological status and distribution: ‘epenthetic’ vowels are phonologically visible and serve to repair illicit phonotactics; ‘intrusive’ vowels are phonologically invisible and can be viewed as predictable transitions from one consonant to another. One central finding is that Kalam predictable vowels do not fit neatly into this classification: they have some properties of epenthetic vowels and other properties of intrusive vowels. If, as argued here, Kalam predictable vowels are treated as non-lexical, lexical representations will contain long strings of consonants, and even vowelless words. Our suggestion is that the seemingly mixed typological status of Kalam predictable vowels and the long strings of consonants found in the lexicon are both related to the historical origins of these enigmatic vowels. In Kalam, and other languages with similar sound patterns, synchronic vowel insertion results from inversion of historical vowel reduction and loss. Historical rule inversion can result in vowels whose gestural properties are similar to those of intrusive vowels, but whose distribution likens them to epenthetic vowels. The reduction and loss of all but a single stressed vowel within the phrase or word give rise to characteristically long consonant strings in the lexicon.

§2 begins with an overview of predictable vowels, and reviews Hall’s (2006) typology. §3 provides an overview of Kalam sound patterns, and a detailed description of Kalam predictable vowels. These vowels fail to fit into the simple two-way classification proposed by Hall, and motivate a reconsideration of the typology of inserted vowels in terms of multiple pathways of evolution. For Kalam, we demonstrate that many synchronic predictable vowels are the remnants of historical vowel reduction and deletion. At the same time, synchronic patterns show predictable vowels in non-historical positions, suggesting a reanalysis of historical vowel reduction/deletion as synchronic insertion. §4 highlights other languages with predictable vowels similar to those in Kalam. Historical explanations best account for the mixed set of synchronic phonological properties they exhibit, including long consonant clusters and vowelless words.

2 Predictable vowels

In many languages, sound patterns are characterised by predictable vowels within the phonological word or phrase. Predictable vowels are those whose quality, quantity and position can be determined from phonological context.1 In most languages, predictable vowels alternate with zero in at least some contexts, motivating vowel-insertion processes within classical generative accounts and constraints yielding surface vowels within optimality treatments.2 There are many different types of predictable vowels. One way of classifying these is by relevant phonological domain or context, as in (1)–(3). In this classification, three types of predictable vowels are distinguished: those based on the form of phonological words (1), syllables (2) and consonants (3).

1 Many languages have epenthetic vowels which can only be predicted on the basis of morphological or morphosyntactic information. At the word level, Edo (Dunn 1968, Elugbe 1989) and Oko (Atoyebi, in progress), two Benue-Congo languages, show a pattern where all nouns begin with vowels. This pattern is extended to derived verbs and to loanwords via vowel epenthesis. In these languages, there will be an initial vowel inserted if the word is known to be a noun.

(1) Word-based predictable vowels: word-final schwa in Eastern and Central Arrernte (Henderson & Dobson 1994)

a. ake ‘head’                   cf. ak-urrknge ‘brain’
                                    ak-aparte ‘mind, thinking’
   alknge ‘eye’                 cf. alkng-ultye ‘tears’
                                    alkng-intyeme ‘look out of corner of eye’
   ime ‘corpse’                 cf. im-atyewennge ‘a curse of death’
b. parrike ‘fence’              < Eng. paddock
   thayete, thayte ‘area, side’ < Eng. side
   pwelerte ‘really fast’       < Eng. bullet

In (1), words from Eastern and Central Arrernte are represented in the native orthography. In many Australian languages, including Arrernte, all phonological words end in vowels.3 In Eastern and Central Arrernte (1a), words end in a schwa-like vowel (spelled <e>), though this schwa is not found medially before another vowel (1a). Final schwa in Arrernte is a predictable feature of phonological words, and this sound pattern characterises loanwords as well (1b). Word-final inserted vowels are often referred to as paragogic vowels.
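The word-based pattern in (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `realise_word`, the set of orthographic vowel letters and the consonant-final stem shapes (`ak`, `alkng`, `im`, `parrik`) are hypothetical choices made here to mirror the forms in (1), not an analysis given by Henderson & Dobson (1994).

```python
# Orthographic vowel letters assumed for this sketch only; this is not a
# full statement of Arrernte orthography.
VOWELS = set("aeiu")


def realise_word(*morphemes: str) -> str:
    """Concatenate morphemes and add word-final <e> after a final consonant.

    Stems are assumed (hypothetically) to be consonant-final, so the schwa
    never appears medially before a vowel-initial morpheme.
    """
    word = "".join(morphemes)
    if word and word[-1] not in VOWELS:
        word += "e"  # predictable word-final schwa, spelled <e>
    return word


print(realise_word("ak"))             # bare stem, final <e> added
print(realise_word("ak", "urrknge"))  # no schwa between stem and suffix
print(realise_word("parrik"))         # loanword treated like a native stem
```

On this sketch the final vowel is a property of the whole word rather than of any morpheme, which is why it is absent before a following vowel-initial element.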

2 Within models where underlying/lexical and surface forms are distinguished, predictable vowels are typically analysed as absent underlyingly but present on the surface. Within exemplar models (see Gahl & Yu 2006), where underlying/lexical forms can be viewed as generalisations over phonetic surface forms, the mappings relating generalisations to surface forms will involve zero-to-vowel mappings. For the remainder of this paper, we frame the analysis in generative underlying/surface terms, though it is equally amenable to treatment within an exemplar model in which phonology consists of a speaker’s generalisations from sound patterns within the exemplar space.

3 Nearly all Arrernte phonological words end in a central vowel, though this vowel need not be pronounced, and is often absent in sandhi when another vowel follows (Henderson & Dobson 1994: 23). A small set of emphatics with distinctive intonation patterns seem to lack the final central vowel (Henderson & Dobson 1994: 23). Since these exceptional forms have distinctive intonation patterns, the distribution of word-final central vowels can still be predicted on phonological grounds alone. Other Australian languages which require phonological words to end in vowels are Panyjima (Dench 1991: 133) and dialects of Western Desert, like Pitjantjatjara (Goddard 1992: ix). In both of these languages, consonant-final native stems and loans are augmented by the word-final syllable /-pa/.

A better-studied epenthesis type is that triggered by constraints on syllable structure.4 In many languages, the maximal word-medial syllable is CV(V)C, where onsets and codas constitute single consonants. If morphology or syntax yields consonant clusters which cannot be syllabified in this way, predictable epenthetic vowels surface (Itô 1989, Blevins 1995: 224–227). Well-studied examples of this kind include the Yawelmani dialect of Yokuts (see note 4), and a range of Semitic languages (Rose 2000), including many Arabic dialects (Selkirk 1981, Broselow 1992, Kiparsky 2003). The data in (2) is from Mapuche (also known as Araucanian and Mapudungan), an isolate of Chile.

(2) Syllable-based predictable vowels: cluster-splitting central vowel in Mapuche (Smeets 2008)

a. kîTa’wîñmun   /kîTaw-ñmu-n/   ‘I worked for my own pleasure’
b. le’fîn        /lef-n/         ‘I ran’
c. kî’lafo       ‘nail’          < Sp. clavo
d. fî’laN        ‘white’         < Sp. blanco

In Mapuche, a high central vowel is obligatorily inserted in triconsonantal clusters (2a), and in word-final biconsonantal clusters (2b), and can be stressed as in these examples (Smeets 2008: 51). Due to this process, all suffixes of the form -C- or -CC º can be seen as having two allomorphs: one with an initial high central vowel, occurring after C-final stems, and one without, occurring elsewhere. As shown in (2c, d), epenthesis is also apparent in Spanish loans which do not conform to the maximal Mapuche CV(V)C syllable template.5

4 As with word-based epenthesis (see note 1), morphological information is sometimes necessary to predict locations of vowel insertion in syllable-based epenthesis. In fact, this is true of one of the best known cases in the literature. Yawelmani Yokuts /i/-epenthesis in pairs like /lOgiw-hin/ ‘(he) pulverised (it)’, /lOgw-it/ ‘(it) was pulverised’ or /?ilik-hin/ ‘(he) sang’, /?ilk-en/ ‘(he) will sing’ (Newman 1944: 25, 27) is analysed by many, including Kenstowicz & Kisseberth (1979: 85–89) and Archangeli (1991), as a purely phonologically conditioned alternation. However, Newman (1944: 25) describes the predictable (or ‘dulled’) vowel as occurring only within stem-final consonant clusters in reduced stems. Further, he makes it clear that there are other strategies for eliminating unsyllabifiable consonants. These include consonant deletion (Newman 1944: 30) and insertion of a ‘protective’ vowel in nouns. In nouns, the protective vowel can be other than /i/ (e.g. /pil-/ ‘road’ + /w/ ‘oblique’, realised as /pilaw/), and is determined, in part, by noun class (Newman 1944: 172–173).

5 Mapuche allows triconsonantal clusters ending in /fw/ and /pw/ (Smeets 2008: 45). /Cw/ may be treated as single complex consonants, or the /w/ as part of the following vowel.

The least-studied patterns of predictable vowels are those which can be linked to consonant transitions. Transition vowels have been referred to variously as excrescent, intrusive, invisible, moraless, paragogic, parasitic, svarabhakti, transitional and weightless (Harms 1976, Levin 1987, Dell & Elmedlaoui 1996a, Warner et al. 2001, Hall 2006). In some languages, like Piro (Matteson 1965: 22–47) and Imdlawn Tashlhiyt Berber (Dell & Elmedlaoui 1985), these fleeting vowels are interpreted as phonetic realisations of syllabic consonants. Even so, one characteristic that sets them apart from predictable vowels of the type illustrated in (1) and (2) is a clear dependency between phonetic vowel quality and the quality of adjacent consonants (Dell & Elmedlaoui 1996b, Coleman 2001). The association of vowels of this sort with consonant transitions is clear in the data from Sye (Erromangan), an Oceanic language of Vanuatu (Crowley 1998: 14).

(3) Consonant-based predictable vowels: schwa or copy vowel in CC clusters with /h G N/ in Sye (Crowley 1998: 14)

a. /nehkil/    [nehkil~neh@kil~nehekil]       ‘snake’
   /moGpon/    [moGpon~moG@pon~moGopon]       ‘his/her grandchild’
   /yaGpon/    [jaGpon~jaG@pon]               ‘egret’
   /elGavi/    [elGavi~el@Gavi~eleGavi]       ‘hold it’
b. /nempNon/   [nempNon~nemp@Non~nempeNon]    ‘time’
   /GanrNi/    [GandNi~Gand@Ni]               ‘(s)he will hear it’

As illustrated in (3), the predictable vowel is found between /h/ or /G/ and a following consonant (3a), or between /G/ or /N/ and a preceding consonant (3b); the predictable vowel is schwa in free variation with zero, or a copy of a mid vowel in a preceding syllable. Main stress in Sye is penultimate, but these predictable transitional vowels are never stressed, and do not count for the purposes of stress assignment.

Another way of classifying predictable vowels like those in (1)–(3) is by their phonological status. Vowels which function as syllabic nuclei for phonological processes are placed in one category, while those which do not appear to play any active role in the phonology are placed in another (Harms 1976, Levin 1987, Warner et al. 2001, Hall 2006). Hall’s (2006) recent cross-linguistic survey of ‘inserted vowels’, which are absent lexically, but present on the surface, is a prime example of this type of classification. Inserted vowels are divided into two basic types: EPENTHETIC vowels and INTRUSIVE vowels. Intrusive vowels are phonetic transitions between consonants and are generally phonologically invisible. In contrast, epenthetic vowels are not simple phonetic transitions, and are phonologically visible. Intrusive vowels do not seem to have the function of repairing universally rare or ‘marked’ structures (3), while epenthetic vowels do function in this way (2). The full range of properties generally associated with each predictable vowel type is given in (4) and (5) from Hall (2006: 391).


(4) Some properties of epenthetic (phonologically visible) vowels
a. The vowel’s quality may be fixed or copied from a neighbouring vowel. A fixed-quality epenthetic vowel does not have to be schwa.
b. If the vowel’s quality is copied, there are no restrictions as to which consonants may be copied over.
c. The vowel’s presence is not dependent on speech rate.
d. The vowel repairs a structure that is marked, in the sense of being cross-linguistically rare. The same structure is also likely to be avoided by means of other processes within the same language.

(5) Some properties of intrusive (phonologically invisible) vowels
a. The vowel’s quality is either schwa, a copy of a nearby vowel or influenced by the place of the surrounding consonants.
b. If the vowel copies the quality of another vowel over an intervening consonant, that consonant is a sonorant or guttural.
c. The vowel generally occurs in heterorganic clusters.
d. The vowel is likely to be optional, have a highly variable duration or disappear at fast speech rates.
e. The vowel does not seem to have the function of repairing illicit structures. The consonant clusters in which the vowel occurs may be less marked, in terms of sonority sequencing, than clusters which surface without vowel insertion in the same language.

In addition to offering new diagnostics for intrusive vowels, Hall (2006) provides new evidence that intrusive vowels are not phonological units and do not form syllable nuclei at any level of representation. An additional claim is that three general properties of intrusive vowels follow from the characterisation of vowel intrusion in terms of abstract articulatory gestures within the model of Articulatory Phonology (Browman & Goldstein 1986, 1992). By treating intrusive vowels as retimings of existing articulatory gestures without addition of a vowel articulation, their quality (copy vowels or schwa-like), distribution (typically restricted to heterorganic clusters) and variability (likely to be absent in fast speech) are accounted for. In contrast, epenthetic vowels are those which add a vowel articulation to the gestural score. To relate these two kinds of predictable vowels, Hall (2006: 422–423) invokes diachrony. The general claim is that intrusive vowels may become phonologised, and in doing so, shift from intrusive to epenthetic over time.

While it is clear that many intrusive and epenthetic vowels have their origins in this sort of articulatory retiming and subsequent phonologisation, other well-known pathways exist for the evolution of predictable synchronic vowel–zero alternations. Perhaps the best known, discussed further in §4, is the process of historical vowel loss. Regular vowel loss yields vowel–zero alternations, which can be reinterpreted as insertions via rule inversion. A simple case of this kind is found in Manam,

an Oceanic language of Manam Island off the north New Guinea coast, as analysed by Lichtenberk (1983: 35–39). In Manam, /i/-epenthesis occurs when an adnominal suffix is added to a consonant-final stem: /tama-gu/ ‘my father’, but /tamim-i-gu/ ‘my urine’, where the /i/ between stem and suffix is epenthetic. Historically, word-final high vowels /i/ and /u/ were lost after nasals: *tamimi > /tamim/ ‘urine’. However, when this form was suffixed, the high vowel was protected and retained, as in the reflex of *tamimi-gu. This vowel–zero alternation was reanalysed as /i/-insertion, a fact evident in *u-final stems: from Proto-Oceanic *danum ‘water’, Manam /daN/ < *danu, /mata-daN/ ‘tears (eye-water)’, but /mata-daN-i-gu/ ‘my tears’, with reanalysed epenthetic /i/, not **/mata-daN-u-gu/.

If articulatory retiming is not the sole source of predictable vowels, and if clusters of properties exhibited by predictable vowels are in part attributable to their source, then we should not be surprised if Hall’s synchronic typology appears incomplete. Given other pathways of synchronic vowel–zero alternations, like the inversion of vowel deletion sketched above, we might expect other predictable vowel types, with a mix of the properties in (4) and (5), or with additional properties of their own.

In the remainder of this paper, we describe and analyse predictable vowels which do not fit neatly into Hall’s typology. The sound patterns of interest are found in Kalam, and described in detail in §3. In Kalam, predictable vowels have the properties shown in (6), where ‘E’ indicates a property associated with Hall’s epenthetic class, ‘I’ a characteristic of Hall’s intrusive class and ‘N’ a new property not clearly associated with either of Hall’s predictable vowel types.

(6) Some properties of Kalam predictable vowels
a. The vowel’s quality is either central, a copy of a nearby vowel or influenced by the quality of surrounding consonants (I).
b. If the vowel’s quality is copied over an intervening consonant, that consonant need not be a sonorant or guttural (E).
c. The vowel’s presence is not dependent on speech rate (E).
d. The vowel does not generally occur in heterorganic clusters; it often occurs between homorganic consonants, including identical consonants (E).
e. The vowel does not seem to have the general function of repairing illicit structures (I).
f. The vowel is phonologically visible, since it can carry word stress (E).
g. The vowel’s presence may be associated with consonant release (N).
h. Lexical/underlying forms without predictable vowels may contain long strings of consonants, and may lack vowels altogether (N).
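The mixed profile in (6) can be made explicit with a small tally. The following Python sketch simply records the E/I/N label attached to each property (6a–h) in the text and counts them; the names `KALAM_PROPERTIES`, `profile` and `is_mixed` are hypothetical and introduced only for illustration, and the count is a bookkeeping aid, not a metric proposed by Hall (2006) or by the authors.

```python
from collections import Counter

# Labels as given in (6): "E" = associated with Hall's epenthetic class,
# "I" = associated with Hall's intrusive class, "N" = new property.
KALAM_PROPERTIES = {
    "a": "I",  # quality central, copied or consonant-influenced
    "b": "E",  # copying not limited to sonorants/gutturals
    "c": "E",  # presence not dependent on speech rate
    "d": "E",  # not generally limited to heterorganic clusters
    "e": "I",  # no general repair function
    "f": "E",  # phonologically visible (can carry stress)
    "g": "N",  # associated with consonant release
    "h": "N",  # lexical forms with long consonant strings
}


def profile(properties: dict[str, str]) -> Counter:
    """Count how many listed properties fall under each label."""
    return Counter(properties.values())


def is_mixed(properties: dict[str, str]) -> bool:
    """True if both epenthetic-like and intrusive-like properties occur."""
    counts = profile(properties)
    return counts["E"] > 0 and counts["I"] > 0


print(profile(KALAM_PROPERTIES))
print(is_mixed(KALAM_PROPERTIES))
```

Under this bookkeeping, four of the eight listed properties pattern with the epenthetic class, two with the intrusive class and two with neither, which is the sense in which Kalam predictable vowels fall outside a strict two-way classification.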

In §4 we note other languages with similar predictable vowels, and suggest how some of the properties in (6) can be explained in terms of parallel historical developments.


3 The Kalam language

Kalam is spoken by about 20,000 people living around the junction of the Bismarck and Schrader Ranges on the northern fringes of the central highlands of Papua New Guinea. Most Kalam speakers live in the high, mountainous valleys of the upper Simbai, Kaironk and Asai Rivers, and on the northern slopes of the Jimi Valley adjacent to the Kaironk and Simbai Valleys. Most of the Kalam-speaking area is in Madang Province, though on the southern fall of the Bismarck Range, Kalam speakers are found in the Western Highlands Province as well. Kalam is one of two members of the Kalamic group, the other being Kobon. Kobon is spoken by approximately 10,000 people just west of the Kalam area, in the Schrader Ranges. The Kalamic group, in turn, is a branch of the Madang group of approximately 100 languages, itself a subgroup of the large Trans New Guinea family.

The two major dialects of Kalam are Etp Mnm (‘Etp language’) and Ti Mnm (‘Ti language’), where /etp/ and /ti/ are the words for ‘what?’ in the respective dialects.6 Data cited below is from Etp Mnm unless noted otherwise. Where dialect differences are relevant to the discussion, they are noted in the text. After glosses, (PL) indicates a word from ‘Pandanus Language’, whose distinctive lexicon is used in certain ritually dangerous contexts, (T) indicates a Ti Mnm form and (L) a loan from Tok Pisin or English.7

Compared to most languages of the Trans New Guinea family, Kalam is fairly well studied. Descriptions include Biggs’ (1963) initial phonemic analysis, focusing on the vowel system, Pawley’s (1966) grammar of Etp Mnm, a Kalam–English and English–Kalam dictionary (Pawley & Bulmer 2003), various studies of syntax, semantics and speech processing, and a number of ethnographic and ethnobiological studies treating areas of Kalam lexical semantics. Historical work is much less extensive, but includes a first reconstruction of Proto-Kalamic phonology and lexicon (Coberly 2002) and some wider comparative studies which touch on Kalam historical phonology and subgrouping, including Pawley & Osmond (1998) and Pawley (2001, 2008).

6 Speakers of the Etp Mnm dialect occupy the Upper Simbai Valley, from the head eastwards as far as Sugup, and occupy some tributaries of the Middle Simbai, as far east as Kaynej. The largest numbers of Ti Mnm speakers live in the Upper Asai Valley. For more on Mnm dialect geography, see Pawley & Bulmer (2003). Although we refer here to Etp Mnm and Ti Mnm as ‘dialects’ of Kalam, they could be considered as different languages, roughly as divergent as Standard Italian is from Spanish.

7 Pandanus Language is a variety of Kalam with an almost completely distinct lexicon, spoken in the high mountain forest when people are gathering or eating the fruit of the mountain pandanus, or when preparing cassowary flesh. See Pawley (1992) for a full description. There are no clear differences between Pandanus Language phonology and the ordinary language where predictable vowels are concerned. Other differences can be noted. For example, /N/ is not found word-initially in the ordinary language, but it is in Pandanus Language. Kalam glosses in this paper are greatly abbreviated, since the primary focus is the form of lexemes, not their meaning. See Pawley & Bulmer (2003) for full dictionary entries. When a taxon, species or type of noun is involved, we use ‘sp.’ to abbreviate ‘species’ or ‘specific type’.

3.1 An overview of Kalam phonology

In this section we provide an overview of the segment inventory, syllable structure and stress patterns of Kalam as background to our analysis of predictable vowels. Our treatment of Kalam segmental phonology follows closely that of Pawley (1966). The segmental phonemes include 16 consonants, shown in (7a), and three vowels, as in (7b).8 The symbols in (7) are those of the practical orthography of Pawley & Bulmer (2003); we use them here to facilitate look-up of dictionary forms. Departures from approximate IPA values are: /c/ and /j/ for palatalised dentals, voiced symbols /b d j g/ for prenasalised stops and /y/ for the palatal glide. All non-sonorant consonants have a range of allophones. In word-initial position they have close to their IPA values (modulo the notes above); word-medially there is intervocalic voicing and spirantisation of the /p t k/ series; and word-finally the voiced prenasalised series is devoiced. Phonetic transcriptions are in square brackets and use IPA symbols.

(7) a. Kalam consonant phonemes
       Places of articulation: bilabial, denti-alveolar, palatalised dental, palatal, velar, labial-velar

       voiceless stop             p  t  c  k
       voiceless fricative        s
       voiced prenasalised stop   b  d  j  g
       nasal stop                 m  n  ñ  N
       lateral                    l
       semi-vowel                 y  w

    b. Kalam vowel phonemes

               front   central   back
       high    (i)               (u)
       mid     e                 o
       low             a

8 Throughout, we use the term ‘consonant’ to refer to true consonants and glides (semi-vowels). Where we wish to single out true consonants or glides respectively, this will be made explicit. Though the palatalised dental stops /c/ and /j/ have affricative release, this is treated as a redundant property, and not transcribed phonemically or phonetically.

While the consonant system is relatively straightforward, a few notes are in order concerning the vowel system in (7b). Though only three underlying vowels are posited, as shown in (8), on the surface there is a six-way contrast of vowel quality in CVC monosyllables. Following Pawley (1966), the high vowels i and u are analysed either as vocalised instances of /y/ or /w/, or as predictable vowels whose quality is determined by a preceding or following /y/, /w/ or palatalised dental consonant. When glides are vocalised, the resulting high vowels are heard and transcribed as half-long (and stressed) [ˈiˑ ˈuˑ], like the full vowels /a e o/, transcribed as [ˈaˑ ˈeˑ ˈoˑ]. When surface [i u] are the result of predictable vowel insertion, they are heard and transcribed as short, and can be unstressed. We follow Pawley’s (1966) analysis here, treating only /y/ and /w/ as underlying, and justifying this below. However, we emphasise that the central argument for Kalam predictable vowels as a novel type, with origins in vowel reduction/deletion, is independent of whether or not some instances of surface [i] and [u] derive from underlying vowels.9

(8) shows that, in addition to a contrast between /a e o/ and the vocalised glides /y w/, there is a surface high central short [ɨ], with a word-final [ə] allophone, and other qualities (e.g. [i u o k]) under optional assimilation to neighbouring segments. This high central vowel occurs predictably between consonants in words like /kn/ ‘sleep’. Predictable vowels of this kind are the focus of this study, with their quality and distribution detailed below. However, before turning to characteristics of predictable vowels, we briefly justify the analysis of the vowel system in (7b).

(8) Surface vowel contrasts in CVC monosyllables
    surface / C_C    phonemic
    [a]              /kan/     [ˈkaˑn]    ‘dodge’
    [e]              /ken/     [ˈkeˑn]    ‘yam sp.’
    [o]              /kon/     [ˈkoˑn]    ‘Jimi River’
    [i]              /kyn/     [ˈkiˑn]    ‘tree fern’
    [u]              /kwn/     [ˈkuˑn]    ‘like this’
    [ɨ, ə]10         /kn/      [ˈkɨn]     ‘sleep’

Arguments for the simple vowel system in (7b) are based on quantity, distribution and predictability.11 In general, the vowels /a e o/, as well as vocalised glides, have longer surface realisations than predictable vowels,

9 Similar analyses of surface front and back high vowels as vocalised glides are suggested for other Papuan languages, including Iatmul and Yessan-Mayo (Foley 1986: 49–52) and Haruai (Comrie 1991). Note that in Kalam vocalised /w/ has a fronted allophone IPA [y] before the palatal consonants /y/ and /j/, as in /gal-wj/ [ˈNgaˑlˈwyˑntj] ‘spider taxon’ or /kwy/ [ˈkyˑj] ‘smell, odour’.
10 [ə] is limited to word-final position.
11 As already noted, no aspect of the arguments regarding other predictable vowels would change significantly if one were to adopt an underlying /i u a e o/ vowel system. The central vowel would still be predictable and non-underlying.

and consistently attract word stress. Predictable vowels are shorter and in some contexts are unstressed. While /a e o/ are found word-initially in native words, there are no native words beginning with /i/, /u/ or any central vowel. Instead, words may begin with [ji] or [wu] phonetically. Examples of these word-initial patterns are shown in (9).

(9) Word-initial vowels
    a. /e/    /ebap/     [ˈeˑˈmbaˑp]    ‘one, a certain one’
       /a/    /aj/       [ˈaˑntj]       ‘husband’s sister’
       /o/    /omNal/    [ˈoˑmɨˈNaˑl]   ‘two’
    b. /y/    /ym/       [ˈjiˑm]        ‘plant crops’
       /w/    /wN/       [ˈwuˑN]        ‘hair, fur, feathers’
    c.        *[iˑ]                     —
              *[uˑ]                     —
              *[ɨ, ə]                   —

A third argument for the vowel system in (7b) is based on the fact that the end of the word shows a complementary pattern to that in (9). If a word begins with a phonetic vowel, it can only begin with [a], [e] or [o] (6), but if a word ends in a phonetic vowel, as in (11), it cannot end in [a], [e] or [o]. A general feature of the Kalam lexicon is that words end in consonants, including the glides /y w/. Representative examples of word-ﬁnal obstruents, nasals, liquids and glides are shown in (10).

(10) Examples of word-final consonants
     /p/    /gap/ ‘star’            /gep/ ‘acting’             /gop/ ‘hook’
     /b/    /kab/ ‘stone’           /keb/ ‘sweet potato sp.’   /kob/ ‘bird sp.’
     /s/    /kas/ ‘hair, fur’       /kes/ ‘heartburn’          /kos/ ‘fire-saw’
     /n/    /kan/ ‘dodge’           /ken/ ‘yam sp.’            /kon/ ‘Jimi River’
     /l/    /kal/ ‘fierce’          /kel/ ‘palm sp.’           /kol/ ‘sugar-cane sp.’
     /y/    /kay/ ‘group, gang’     /key/ ‘separately’         /koy/ ‘blind’
     /w/    /kaw/ ‘space’           —                          /gow/ ‘nest sp.’

Surface exceptions to this generalisation, shown in (11), involve either ﬁnal [i; u;] from /y w/ respectively, or predictable ﬁnal [@] in monosyllables. It is only when the vowels /a e o/ are distinguished from vocalised glides and predictable vowels that the consonant-ﬁnal phonotactics of the language can be viewed as exceptionless, as they are when underlying forms like those in (11) are adopted.


(11) a. Word-final surface vowels from underlying C-final words
        [iˑ]      /amy/    [ˈaˑˈmiˑ]    ‘mother’
        [uˑ]      /alw/    [ˈaˑˈluˑ]    ‘tree sp.’
        [ɨ, ə]    /m/      [ˈmə]        ‘taro’
     b. Word-final surface glides from underlying C-final words
        [aˑj]     /kay/    [kaˑj]       ‘group, gang’
        [eˑj]     /key/    [keˑj]       ‘separately’
        [oˑj]     /koy/    [koˑj]       ‘blind’

A fourth argument for the treatment of /a e o/ as underlying vowels and /y w/ as underlying consonants comes from allomorphy. In Kalam, there are two allomorphs of the negative prefix or pro-clitic: /ma-/ and /m-/ ‘not, not yet’. The choice of allomorph is phonologically determined: /m-/ occurs before vowels /a e o/ and /ma-/ occurs elsewhere: /m-ag-p/ ‘he did not speak’, /m-ow-p/ ‘he has not come’, /m-o-ng-gab/ ‘he will not come’ vs. /ma-pkp/ ‘it has not struck’, /ma-dan/ ‘don’t touch’, /ma-ynb/ ‘it is not cooked’, /ma-wkp/ ‘it is not cracked’. If instead of /y/ and /w/, /i/ and /u/ were posited, we would have no explanation for the absence of derived /m-i…/ or /m-u…/ forms.

A final argument for the vowel system in (7b) involves the distribution of surface hiatus. The only word-internal surface vowel sequences in Kalam are [i:a:, i:o:, i:e:, u:a:, u:o:, u:e:], as in /kyaw/ [ˈkiˑˈaˑw] ‘tree sp.’, /kyep/ [ˈkiˑˈeˑp] ‘excrement’, /kyon/ [ˈkiˑˈoˑn] ‘insect sp.’, /kwam/ [ˈkuˑˈaˑm] ‘tree sp.’, /kwel/ [ˈkuˑˈeˑl] ‘tree sp.’, /kuok/ [ˈkuˑˈoˑk] ‘bowl’. In these cases, the long high vowels are surface realisations of vocalised phonemic high glides /y w/. The absence of all other surface vowel sequences is explained by the fact that, word-internally, no phonological vowel sequences are permitted. Since the predictable central vowel [ɨ] is not underlying, and is inserted only between consonants, it never occurs in surface hiatus.

A summary of phonological arguments for distinguishing /a e o/ from underlying glides /y w/ and predictable vowels in Kalam is given in Table I.

§3.2 presents a full synchronic description and analysis of predictable vowels. Their typological status is discussed in §3.3, and their historical development in §3.4. Here, we expand on the basic distributional and qualitative properties of predictable vowels, along with further arguments that predictable vowels are not part of Kalam speakers’ lexical representations.

Kalam predictable vowels were analysed as ‘non-phonemic’ vocoids by Pawley (1966: 33ff.). Since the position and quality of these vowels was predictable, they were assumed to be absent in underlying representations. Pawley’s description of these vowels bears a striking resemblance to aspects of ‘intrusive’ vowels in the sense of Hall (2006):

A vowel occurs predictably between all adjacent consonant phonemes not separated by juncture, or following any consonant which occurs between junctures. Such a vowel is regarded as the release of the preceding consonant. Elsewhere, i.e. in the case of a consonant followed by

                       non-predictable vowels        non-predictable glides       predictable vowels
surface vowel          [aˑ]      [eˑ]      [oˑ]      [iˑ, jiˑ]      [uˑ, wuˑ]     [ɨ, ə]
phoneme                /a/       /e/       /o/       /y/            /w/           zero
half-long?             always    always    always    always         always        never
stressed?              always    always    always    always         always        sometimes
word-initial?          yes       yes       yes       [jiˑ] only     [wuˑ] only    no
word-final?            no        no        no        yes            yes           [ə] only
negative proclitic?    /m-/      /m-/      /m-/      /ma-/          /ma-/         never word-initial
initial in hiatus?     no        no        no        yes            yes           no

Table I Distinguishing underlying vowels, glides and predictable vowels.

a vowel, or a final consonant which is not preceded by juncture, consonant release is realized as zero … In most environments the consonant release vocoid is a short high central to mid central unrounded vowel (Pawley 1966: 33).

In addition to the properties listed in (6) and Table I, predictable vowels have two other notable characteristics that set them apart from other vowels. First, there are many words of four, five and six syllables where the only surface vowels are predictable vowels: /pkpnp/ [ˈFɨGɨBɨˈnɨp] ‘I could have hit’, /mdnknN/ [ˈmɨndɨˈnɨGɨnɨN] ‘while I was staying’, /pbtknknN/ [ˈFɨmbɨrɨGɨˈnɨGɨnɨN] ‘while I was fastening’. Second, predictable vowels appear to have the highest frequency of any vowels: a sample count from two Ti Mnm texts with a total of 2088 vowels yields 36.3% predictable vowels, 32.7% /a/, 14.5% /o/, 9.9% /e/, 4% /u/ and 2.4% /i/.

Some predictable vowels have already been exemplified. The last example in (8) shows a high central predictable vowel in /kn/ [ˈkɨn]. In (9a) a predictable vowel is found within the medial /mN/ consonant cluster in [ˈoˑmɨˈNaˑl], while in (11a) the predictable vowel [ə] surfaces word-finally in /m/ [ˈmə]. While the properties in Table I suggest that predictable vowels have a different phonological status from the vowels /a e o/ and the glides /y w/, it is the distribution, quality and alternation of predictable vowels with zero that argue most strongly for their absence in underlying forms. These properties are illustrated in (12), where predictable vowels are underlined in phonetic transcriptions. Monomorphemic words are shown in (12a–f). In (12a–d) a predictable vowel occurs between consonants within the word. When there is no glide or palatal consonant to colour the predictable vowel, it typically surfaces as a neutral central high vowel [ɨ].12

(12) Predictable vowels
     a. /kd/          [ˈkɨnt]                       ‘segment, part’
     b. /kdl/         [kɨˈndɨl]                     ‘sinew’
     c. /mlp/         [mɨˈlɨp]                      ‘dry’
     d. /mgn/         [mɨˈNgɨn]                     ‘vulva’
     e. /b/           [ˈmbə]                        ‘man’, cf. /b-ak/ [ˈmbaˑk] ‘that man’
     f. /m/           [ˈmə]                         ‘taro’, cf. /m-adeN/ [ˈmaˑˈndeˑN] ‘taro plant sp.’
     g. /an-ket/      [ˈaˑnɨˈGeˑr]                  ‘whose?’, cf. /an/ ‘who?’, /ket/ (poss cl)
     h. /anwak/       [ˈaˑnuˈwaˑk]                  ‘co-wives’
     i. /ap-tan-/     [ˈaˑBɨˈraˑn]                  ‘rise’, cf. /ap/ ‘movement toward’, /tan/ ‘ascend’
     j. /ap-yap/      [ˈaˑBiˈjaˑp]                  ‘fall’, cf. /ap/ ‘movement toward’, /yap/ ‘descend’
     k. /as-ket/      [ˈaˑsˈkeˑr, ˈaˑsɨˈGeˑr]       ‘leech sp.’, cf. /as/ ‘frog’, /ket/ (poss cl)
     l. /as-wad/      [ˈaˑsˈwaˑnt, ˈaˑsuˈwaˑnt]     ‘dewlap’, cf. /as/ ‘frog’, /wad/ ‘bag’
     but
     m. /aypot/       [ˈaˑjˈBoˑr]                   ‘lizard sp.’
     n. /ay-may/      [ˈaˑjˈmaˑj]                   ‘two sisters’, cf. /ay/ ‘sister’
     o. /kayNay/13    [ˈkaˑjiˈNaˑj]                 ‘tree sp.’
     p. /awleg/       [ˈaˑwˈleˑNk]                  ‘tadpole’
     q. /kaw-bap/     [ˈkaˑwˈmbaˑp]                 ‘several, a few’, cf. /kaw/ ‘several’
     r. /kowñak/13    [ˈkoˑwuˈnjaˑk]                ‘yam sp.’

In (12e–f) a predictable vowel occurs as schwa in monoconsonantal words. When these same stems occur in morphologically complex words followed by vowels, the predictable vowels do not surface, as in [ˈmbaˑk] ‘that man’, from /b-ak/, or [ˈmaˑˈndeˑN] ‘taro plant sp.’, from /m-adeN/. If underlying central vowels were posited for these stems, they would constitute a true anomaly in the language: they would be the only vowel-final lexemes, and they would be the only words with underlying central vowels.

Morphologically complex words are shown in (12g–j). For each pair (12g–h) and (12i–j) the initial morphemes /an, ap/ can occur in isolation. In isolation, each word is pronounced without a final vowel: [ˈaˑn, ˈaˑp]. However, in line with Pawley’s description above, a predictable vowel occurs between adjacent consonants within the word when these morphemes are followed by consonants. Furthermore, the predictable vowel can vary in quality, depending on surrounding consonants and vowels: compare [ɨ] in (12g, i) with [u] before /w/ in (12h) and [i] before /y/ in (12j). Finally, the predictable vowel is optional if the underlying sequence is Vs-CV (12k, l), and generally absent if the sequence is VG-CV, where G is the glide /y/ (12m, n) or /w/ (12p, q).

This pattern is robust across the entire language: whenever two intervocalic consonants come together within a word, and the first is not a glide or /s/, a predictable vowel is present on the surface, though the same vowel is absent when the first morpheme occurs word-finally. If /an, ap/ and every other consonant-final morpheme in the language were analysed instead as vowel-final (/anɨ, apɨ/, etc.), we would face similar anomalies to those noted above for a final underlying schwa: with the exception of glide-final words, all words would end in central vowels, though these vowels would delete in word-final position, no words would end in vowels /a e o/ and there would be no relationship between the predictable quality of these vowels and their distributional regularities.

Further evidence for the non-lexical status of Kalam predictable vowels can be found in loanword phonology and orthographic practice. Only a process of synchronic vowel insertion can account for the appearance of predictable vowels in loans.14 Some loans from Tok Pisin and English are given in (13). In (13a), the loan source has a word-internal consonant sequence, which, as in the native vocabulary, appears to be broken up by a predictable vowel. In (13b), VyCV and VsCV sequences are not split by an epenthetic vowel, in line with the generalisation above that predictable vowels are optional in these contexts.

12 Scholz (1995) transcribes all predictable central vowels as schwas. As discussed in §3.2, predictable vowels may be coloured by adjacent consonants, and may also assimilate partially or fully to neighbouring vowels.
13 Predictable vowels in word-internal VG.CV sequences are rare, as noted above.

(13) Predictable vowels in loans
     a. /alpim/       [ˈaˑlɨˈBiˑm]         ‘help someone’
        /balayn/      [ˈmbaˑˈlaˑˈjin]
        /dokta/       [ˈndoˑGɨˈraˑ]
        /gapman/      [ˈNgaˑBɨˈmaˑn]
        /spet/        [sɨˈBeˑr]
     but
     b. /aybiskes/    [ˈaˑjˈmbiˑsˈkeˑs]

In addition to evidence from native sound patterns and loan vocabulary, there is some complementary evidence from native speaker intuitions for the non-lexical status of predictable vowels.15 Few Kalam are literate in their own language, writing to each other mainly in Tok Pisin or English. Some have learnt to read the orthography developed by the Summer Institute of Linguistics team (Scholz 1976). This orthography is used in the Kalam translation of the New Testament and in texts used in Anglican church services. The SIL orthography uses i for the predictable vowel, while writing the vowel /i/ as iy. However, those few literate Kalam who have been regularly exposed to the orthography used here, which lacks predictable vowels, have had little difficulty using it. The most prolific native-speaking Kalam writer we know of, the late Saem Majnep, wrote hundreds of pages in Kalam,16 and was comfortable with this orthography. Of special interest here is the fact that Majnep had no particular problems with the lack of vowels in words like /kd/, /mnm/ ‘speech, language’, /b/ or /m/. Writing words with phonetic [u] and [i] with /w/ and /y/ respectively did not cause any problems either. We view this as additional evidence that predictable vowels are not part of lexical phonological representations in Kalam.

A fuller description and analysis of Kalam predictable vowels is provided in §3.2. The basic properties noted above allow us to provide a brief overview of syllable structure and stress patterns in the paragraphs that follow.

Syllable structure in Kalam is maximally CVC, and minimally V. The full range of syllable types is illustrated in Table II. As discussed above, words are underlyingly C-final; therefore, V and CV syllables are not found word-finally in the lexicon. Vowel-initial V and VC syllables (where V = /a e o/) are common word-initially, but not found medially in underlying forms. (Recall that glide vocalisation can result in V-initial surface syllables, as in /kyep/ [ˈkiˑˈeˑp].) There are also very few words with underlying medial CV syllables in V.CV.CV strings: it is likely that most sequences of open syllables have undergone historical syncope of *VCVCV > VCCV (see §4). Compare, for example, conservative Ti Mnm /pa.to.daN/, shown in the word-medial column of Table II, and its reduced Etp Mnm counterpart /patdoN/ [ˈFaˑrɨˈndoˑN]. As noted earlier, in word-medial intervocalic CC clusters where the first C is not a glide or /s/, a consonant is released, with a predictable vowel appearing. The forms in Table II with medial C.C clusters do not show predictable vowels, and

14 Many Kalam speakers are bilingual in Tok Pisin, and there is a great range of variation in the production of Tok Pisin loans. Forms in (13) represent the first generation of Kalam speakers of Tok Pisin, who learnt Tok Pisin as teenagers or adults, preserving Kalam sound patterns.
15 Hall (2006: 395) cites Pearce (2004: 19) on potential psychological evidence for the nature of intrusive vowels in Kera, a Chadic language. Kera native speakers were asked to choose between CVCVCV and CVCCV for words with suspect medial intrusive vowels. Speakers chose CVCCV spellings, suggesting that the medial vowel was not part of their phonological lexical representation. Comrie (1991) describes a very similar situation in Haruai, a Piawi language of the Mid-Ramu District of Madang Province, Papua New Guinea. In Haruai, [ɨ] is non-lexical, serving to break up word-internal consonant clusters. Comrie notes that ‘where Haruai writers have had to write down Haruai words (e.g. names on labour contracts), following the basic spelling conventions of Tok Pisin, they have not provided any orthographic representation of the ɨ’ (1991: 394). Comrie treats Haruai [ɨ] as part of the phonetic realisation of syllabic allophones of the relevant consonants. See §4 for further discussion of other New Guinea languages with predictable vowels.
16 Majnep’s writings in Kalam include extensive Kalam texts in Majnep & Bulmer (1983, 1990) and Majnep (n.d.).

syllable type    word-initial                     word-medial                          word-final
V      a         a.leb ‘tongue’                   —                                    —
       e         e.ñap ‘a bit’                    —                                    —
       o         o.nep ‘precisely’                —                                    —
CV     a         ka.may ‘tree sp.’                ko.la.leg ‘bird sp.’                 —
       o         ko.dal ‘centipede’               pa.to.daN ‘far across river’ (t)     —
       e         pe.sel ‘herb sp.’                a.ge.nak ‘when you said’             —
VC     a         aw.lan ‘ginger sp.’              —                                    aj ‘husband’s sister’
       e         ed.mas.ta ‘headmaster’ (l)       —                                    et~etp ‘what?’
       o         op.tin ‘can-opener’ (l)          —                                    ok ‘the, this, that’
CVC    a         kay.nam ‘grass sp.’              ka.may.gis ‘bird sp.’                kay.nam ‘grass sp.’
       e         key.kal ‘yam sp.’                kob.kaw.nan ‘spider sp.’             key.kal ‘yam sp.’
       o         koy.maN ‘coconut palm’           ko.dal.nop ‘scorpion’                kab dpyn ‘I’ve taken the stone’

Table II Syllable types.

were chosen to illustrate unambiguous phonological and phonetic CVC syllable types. Predictable vowels are absent in these forms either because the first consonant is a glide (/kay.nam/) or because there is a phonological word-boundary between the consonants: e.g. /kodal nop/ ‘scorpion’, from /kodal/ ‘centipede’, /nop/ ‘father’; /kab dpyn/ [ˈkamp.dɨ.ˈBiˑn], from /kab/ ‘stone’, /dpyn/ ‘I have taken’. In this latter example, the word-final position of /kab/ is evident both in the absence of a predictable vowel before the next consonant, and in the final devoicing of /b/.

A final aspect of Kalam sound patterns that needs to be introduced is word stress. Every Kalam phonological word has at least one word stress, and many have multiple stresses. There is no clear evidence for primary vs. secondary stress within the word. Basic rules of stress placement are shown in (14), and stated in terms of vowels as stress-bearing units.17

17 See Pawley (1966: 37–43) for an early treatment of stress, and Pawley & Bulmer (2003), where stress is marked on all lexemes. Stress is most prominent on vowels, but there are no strong arguments for vowels vs. syllables as stress-bearing units, except perhaps the minimal word constraint in (19). One reader asks whether it would be possible to posit abstract syllabic consonants or degenerate syllables, and apply stress rules to these representations, with the predictable vowels themselves inserted after stress assignment. While this would be possible, it would greatly weaken the nature of Hall’s typology, since any predictable surface-stressed vowel could be scratched from the epenthetic category by a derivation involving degenerate syllabification, stress and predictable vowel insertion, in that order.


(14) Basic stress rules
     a. Stress the last vowel of all words (including monosyllables).
     b. Stress all full (non-predictable) vowels.
     c. Stress the first vowel of a word, provided that the next vowel is not stressed.

The rules in (14) are well known from metrical stress theory (Hayes 1995): (14a) assigns stress to the last stress-bearing unit, ‘end-rule final’, (14b) assigns stress based on vowel quantity (see Table I), a case of quantity-sensitivity, or ‘weight-to-stress’, and (14c) assigns initial stress, an instance of ‘end-rule initial’ with clash avoidance. Since predictable vowels may be stressed by (14a) or (c), a derivational model must order vowel insertion before stress assignment.

The basic stress rules in (14) are illustrated in (15), where ‘v’ represents a predictable vowel. (15a–d) have non-predictable (full) vowels only. (15e–h) are words with only predictable (reduced) vowels. (15i–l) have a mix of vowel types.

(15) Stress patterns with different word types

     lexical      a. /aj/                b. /ebap/        c. /patayam/        d. /kolaleg/
     (14a)        ˈaj                    eˈbap            pataˈyam            kolaˈleg
     (14b)        ˈaj                    ˈeˈbap           ˈpaˈtaˈyam          ˈkoˈlaˈleg
     (14c)        —                      —                —                   —
     surface      [ˈaˑntj]               [ˈeˑˈmbaˑp]      [ˈFaˑˈraˑˈjaˑm]     [ˈkoˑˈlaˑˈleˑNk]
                  ‘husband’s sister’     ‘one’            ‘pandanus sp.’      ‘bird sp.’

     lexical      e. /m/       f. /kd/       g. /kdl/       h. /cmnm/
     insertion    mv           kvd           kvdvl          cvmvnvm
     (14a)        ˈmv          ˈkvd          kvˈdvl         cvmvˈnvm
     (14b)        —            —             —              —
     (14c)        —            —             —              ˈcvmvˈnvm
     surface      [ˈmə]        [ˈkɨnt]       [kɨˈndɨl]      [ˈtjimɨˈnɨm]
                  ‘taro’       ‘segment’     ‘sinew’        ‘tree sp.’

     lexical      i. /kabs/        j. /ksen/     k. /klNan/       l. /bemlgon/ (t)
     insertion    kabvs            kvsen         kvlvNan          bemvlvgon
     (14a)        kaˈbvs           kvˈsen        kvlvˈNan         bemvlvˈgon
     (14b)        ˈkaˈbvs          —             kvlvˈNan         ˈbemvlvˈgon
     (14c)        —                —             ˈkvlvˈNan        —
     surface      [ˈkaˑˈmbɨs]      [kɨˈsen]      [ˈkɨlɨˈNaˑn]     [ˈmbeˑmɨlɨˈNgoˑn]
                  ‘cleft stick’    ‘fresh’       ‘snake sp.’      ‘group of cousins’
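
The derivations in (15) can be sketched procedurally. The following Python fragment is a minimal illustration, not part of the original analysis: it assumes forms are written in the practical orthography with one character per phoneme, treats every segment other than a, e, o as a consonant, and uses the placeholder `v` for a predictable vowel, as in (15). The function names (`insert_predictable_vowels`, `assign_stress`, `derive`) are hypothetical. Glide vocalisation, the optional or absent vowel after /s/ and glides, and the unstressed suffix /-knN/ are deliberately not modelled, so forms containing /y w/ next to a consonant should not be fed to it.

```python
from typing import List, Tuple

FULL_VOWELS = set("aeo")   # the underlying vowels of (7b)
PREDICTABLE = "v"          # placeholder for a predictable vowel, as in (15)


def is_vowel(segment: str) -> bool:
    return segment in FULL_VOWELS or segment == PREDICTABLE


def insert_predictable_vowels(lexical: str) -> str:
    """Split every CC sequence with 'v'; give a vowelless word a final 'v'."""
    out = []
    for i, segment in enumerate(lexical):
        out.append(segment)
        has_next = i + 1 < len(lexical)
        if segment not in FULL_VOWELS and has_next and lexical[i + 1] not in FULL_VOWELS:
            out.append(PREDICTABLE)
    if not any(is_vowel(s) for s in out):
        out.append(PREDICTABLE)  # monoconsonantal words such as /m/
    return "".join(out)


def assign_stress(form: str) -> List[bool]:
    """Return one flag per vowel, left to right: True means stressed."""
    vowels = [s for s in form if is_vowel(s)]
    stressed = [False] * len(vowels)
    if not vowels:
        return stressed
    stressed[-1] = True                      # (14a): last vowel
    for k, v in enumerate(vowels):           # (14b): all full vowels
        if v in FULL_VOWELS:
            stressed[k] = True
    if len(vowels) > 1 and not stressed[1]:  # (14c): first vowel, unless a clash results
        stressed[0] = True
    return stressed


def derive(lexical: str) -> Tuple[str, List[bool]]:
    """Insertion is ordered before stress, as the text requires."""
    form = insert_predictable_vowels(lexical)
    return form, assign_stress(form)
```

Under these assumptions, `derive("cmnm")` returns `("cvmvnvm", [True, False, True])`, matching (15h), and `derive("kdl")` returns `("kvdvl", [False, True])`, matching (15g). Ordering insertion before stress is what allows the inserted vowels to be counted by (14a) and (14c).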

The rules in (14) account for the majority of surface stress patterns, including inflected verbs (16a) and compounds (16b).18

(16) Inflected verbs and compounds
     lexical      a. /pk-p-n-p/                        b. /wj+blp/
     insertion    pvkvpvnvp                            wvjvbvlvp
     (14a)        pvkvpvˈnvp                           wvjvbvˈlvp
     (14b)        —                                    —
     (14c)        ˈpvkvpvˈnvp                          ˈwvjvbvˈlvp
     surface      [ˈFɨGɨBɨˈnɨp]                        [ˈwiˑndjɨmbɨˈlɨp, ˈwuˑndjɨmbɨˈlɨp, ˈwyˑndjɨmbɨˈlɨp]
                  ‘I could have hit’ (/pk/ ‘hit’)      ‘bird (pl)’ (?
3.2 Kalam predictable vowels

As highlighted earlier, in addition to vocalised glides, two kinds of vowels can be distinguished in Kalam: the full vowels /a e o/, which are stable, of unpredictable quality, always stressed, of relatively long duration and limited to word-initial and word-medial positions, and predictable vowels. Predictable vowels, as already noted, are of relatively short duration, occur predictably between consonants within the word, never occur word-initially and are only stressed in final syllables (14a) or in initial syllables preceding unstressed vowels (14c). In this section we provide further details of the phonology of Kalam predictable vowels. Our aim is to show that Kalam predictable vowels are neither ‘epenthetic’ nor ‘intrusive’ in the sense of Hall (2006), and to explain aspects of their mixed status.

3.2.1 Why Kalam predictable vowels are not ‘intrusive’ vowels. Hall’s class of intrusive vowels are phonologically invisible. The most obvious property of Kalam predictable vowels that eliminates them from ‘intrusive’ vowel candidacy is the fact that they are stressed by the regular stress rules (14a) and (14c) in word-final and word-initial positions respectively. Examples of stressed predictable vowels were given in (15) and (16). Some of these are repeated in (17), along with other examples of lexically vowelless words. Although it is possible to view all the vowels in (17) as simple transitions from one consonant to the next, it is not possible to analyse all predictable vowels as phonologically invisible, since some of them carry stress.

18 One exception to these rules involves the verbal suffix (or enclitic) /-knN/ ‘simultaneous action by different subject’. Despite its final position in the word, it is never stressed. Instead, ‘final’ word stress falls on the vowel immediately preceding this suffix: /aw-a-knN/ [ˈaˑˈwaˑGɨnɨN] ‘while he was coming’. Though it is not stressed, /-knN/ is part of the preceding word for the purposes of predictable vowel insertion, as shown by examples like /g-n-knN/ [NgɨˈnɨGɨnɨN] ‘while I was doing’, not *[Ngɨn kɨnɨN].

(17) Stressed predictable vowels
     /kd/           [ˈkɨnt]               ‘segment’
     /kdl/          [kɨˈndɨl]             ‘sinew’
     /cmnm/         [ˈtjimɨˈnɨm]          ‘tree sp.’
     /pk-p-n-p/     [ˈFɨGɨBɨˈnɨp]         ‘I could have hit’
     /g-n-knN/      [NgɨˈnɨGɨnɨN]         ‘while I was doing’
     /wjblp/        [ˈwuˑndjɨmbɨˈlɨp]     ‘bird (pl)’

Another aspect of Kalam predictable vowels which would appear to eliminate them from the ‘intrusive’ vowel class is their dual function. Though in many cases, like (17), they can be viewed as simple transitions from one consonant to the next, this is not the case for all words. In one small class of words, predictable vowels are ‘epenthetic’, in the sense of serving to repair illicit structures. The small class of words in question, shown in (18), are those that Pawley (1966) analyses as phonologically monoconsonantal.19

(18) Predictable vowels and the minimal word constraint
     in isolation                              in composition
     /b/    [ˈmbə]    ‘man’                    /b adeN/    [ˈmbaˑˈndeˑN]    ‘man alone’
     /m/    [ˈmə]     ‘taro’                   /m agom/    [ˈmaˑˈNgoˑm]     ‘seasoned taro’
     /d/    [ˈndə]    ‘hold, get’              /d am/      [ˈndaˑm]         ‘take’
     /g/    [ˈNgə]    ‘happen’                 /g-ep/      [ˈNgeˑp]         ‘doing’
     /ñ/    [ˈnji]    ‘fit, give’              /ñ-an/      [ˈnjaˑn]         ‘put (it) on!’
     /l/    [ˈlə]     ‘stabilise’ (t)          /l-an/      [ˈlaˑn]          ‘put (it) down!’

When these morphemes are uttered as independent phonological words, predictable vowels occur. However, as shown in (18), when these morphemes are part of bigger words and followed immediately by a vowel, the monoconsonantal realisation is found. The predictable vowels in isolation forms cannot be interpreted as transitions from one consonant to the next, for the simple reason that there is no following consonant. In this case, it seems that what triggers the appearance of the stressed predictable vowel is the constraint stated in (19) that a minimal word must consist of at least one syllable which itself can carry word stress. Predictable vowels in (18) then serve to bulk subminimal words up to minimal words by adding a ﬁnal vowel.

(19) Minimal word constraint
     minimal word = minimal foot = σ

19 Pawley (1966: 23) states that ‘only nasals and prenasalised obstruents occur alone in minimal utterances … of the form #C#’. This is the case in the Etp Mnm dialect, but Ti Mnm has the verb root /l/ ‘stabilise’, shown in (18). Another word type with seemingly epenthetic predictable vowels is discussed in §3.2.2.

As noted in §3.1, accounting for the monosyllabic isolation forms in (18) in terms of predictable vowels seems preferable to positing underlying vowels for three reasons. First, as with other predictable vowels, the quality and position of these vowels is rule-governed. Second, if underlying vowels are posited in these words, they would be the only underlying vowel-final words in the language. Third, if underlying central vowels are posited in these words, they would be the only underlying central vowels in the language. We conclude, then, that the predictable vowels in (18) serve a clear ‘repair’ function: phonological words consisting of only a single consonant are too small to constitute minimal words, and are bulked up to stressable CV syllables by this process. In these cases, the predictable vowel has properties of an epenthetic vowel, not an intrusive vowel.

Stressability is the most salient property of Kalam predictable vowels that eliminates them from Hall’s intrusive vowel category. At the same time, the quality of Kalam predictable vowels is not precisely what one expects under Hall’s intrusive classification either. Recall from (5) two heuristics regarding the quality of intrusive vowels, repeated as (20).

(20) Intrusive vowel quality (Hall 2006)
     a. The vowel’s quality is either schwa, a copy of a nearby vowel or influenced by the place of the surrounding consonants.
     b. If the vowel copies the quality of another vowel over an intervening consonant, that consonant is a sonorant or guttural.

Kalam predictable vowels are sometimes schwa (the first four examples in (18)), and sometimes a copy of a nearby vowel, or influenced by the place of surrounding consonants (21i–r): (21i–l) show the regular pattern of palatals triggering a following short [i], (21m–n) show optional [i] preceding palatals, (21o–q) show [u] after /w/ and (21r) shows optional [u] before /w/. These assimilatory patterns are the expected type for intrusive vowels, in line with (20a). However, the most common realisation of the Kalam predictable vowel is [ɨ] (21a–d), a high central vowel.20 Further, when the vowel quality is a copy of another vowel over an intervening consonant, the intervening consonant need not be a sonorant or guttural, as per (20b). Rather, as shown in (21e–h), there are no clear restrictions on the nature of intervening consonants, and full anticipatory vowel copy or harmony is optional, and unbounded within the word (21g).

20 The status of [ɨ] as the ‘default’ quality of predictable vowels is what lies behind the choice of <ɨ> as the orthographic symbol for this vowel in Pawley & Bulmer’s (2003) dictionary. In many contexts, [ɨ] is the only form of the predictable vowel that occurs; in others, [ɨ] varies with a copy vowel. We treat schwa in examples like (18) as an allophone of the predictable vowel in stressed open monosyllabic words.


(21) Predictable vowel quality (Pawley 1966: 33–37)
                                                     default [ɨ]        full/partial V-copy
     a. /mlp/       ‘dry’                            [mɨˈlɨp]
     b. /kdl/       ‘sinew’                          [kɨˈndɨl]
     c. /mgn/       ‘vulva’                          [mɨˈNgɨn]
     d. /g-p-n-p/   ‘I might have done’              [NgɨBɨˈnɨp]
     e. /mlwk/      ‘nose’                           [mɨˈluˑk]          [muˈluˑk]
     f. /ykop/      ‘without cause’                                     [joˈGoˑp, j–ˈGoˑp]
     g. /kgoN/      ‘garden’ (pl)                                       [koˈNgoˑN, k–ˈNgoˑN]
     h. /bkdoN/     ‘yonder across valley’                              [ˈmboGoˈndoˑN, mb–G–ˈndoˑN]

     assimilation to adjacent C
     i. /cg/        ‘adhere’                         [ˈtjiNk]
     j. /jl/        ‘concave section’                [ˈndjil]
     k. /ñg/        ‘water, clear liquid’            [ˈnjiNk]
     l. /kayNay/    ‘tree sp.’                       [ˈkaˑjiˈNaˑj]
     m. /wcm/       ‘golden ringtail’                [ˈwiˑˈtjiˑm]
     n. /ap-yap/    ‘fall, drop’                     [ˈaˑBiˈjaˑp]
     o. /wdn/       ‘eye’                            [ˈwuˑˈndɨn]        [ˈwuˑˈndɨn]
     p. /wlk/       ‘mix things together’            [ˈwuˑˈlɨk]         [ˈwuˑˈlɨk]
     q. /kowñak/    ‘yam sp.’                        [ˈkoˑwuˈnjaˑk]
     r. /an-wak/    ‘co-wives’                       [ˈaˑnuˈwaˑk]
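
The obligatory part of the quality pattern in (21) can be summarised as a small decision procedure. The sketch below is only an approximation under stated assumptions: the function name `predictable_vowel_quality` and the set `PALATALS` are hypothetical, segments are given in the practical orthography, and only the regular progressive effects are covered ([i] after a palatal or palatalised consonant, [u] after /w/, schwa in open monosyllables as in (18), [ɨ] elsewhere). The optional anticipatory [i] and [u] before palatals and /w/, and the optional vowel copy in (21e–h), are left out.

```python
# Consonants that regularly trigger a following short [i], per (21i-l).
PALATALS = {"c", "j", "ñ", "y"}


def predictable_vowel_quality(preceding: str, open_monosyllable: bool = False) -> str:
    """Return the regular quality of a predictable vowel after `preceding`.

    `open_monosyllable` marks the isolation forms in (18), where the vowel
    is word-final in a monoconsonantal word.
    """
    if preceding in PALATALS:
        return "i"      # e.g. /cg/, /jl/, /ñg/, and /ñ/ in isolation
    if preceding == "w":
        return "u"      # e.g. /kowñak/
    if open_monosyllable:
        return "ə"      # e.g. /b/, /m/, /d/, /g/ in isolation
    return "ɨ"          # default quality, e.g. /kd/, /mlp/
```

For example, `predictable_vowel_quality("ñ", open_monosyllable=True)` returns `"i"`, in line with the isolation form of /ñ/ in (18), while `predictable_vowel_quality("m", open_monosyllable=True)` returns `"ə"`.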

A final aspect of Kalam predictable vowels which makes them unlike Hall’s intrusive vowels is their invariability. In (5d) intrusive vowels are described as ‘likely to be optional’ and ‘have a highly variable duration or disappear at fast speech rates’. However, apart from the VsCV and VGCV contexts discussed in §3.1, Kalam predictable vowels do not show this property. Though they are short, they do not disappear altogether at fast speech rates.

The fact that Kalam predictable vowels can be stressed seems to rule them out as canonical invisible ‘intrusive’ vowels. Where vowel quality and variability are concerned, Kalam predictable vowels do not pattern with intrusive vowels either. We conclude that Kalam predictable vowels are not instances of intrusive vowels in the sense of Hall (2006).

3.2.2 Why Kalam predictable vowels are not ‘epenthetic’ vowels. If Kalam predictable vowels are not intrusive vowels, then perhaps they are epenthetic vowels. Recall that epenthetic vowels are phonologically visible. In addition, they serve to repair otherwise illicit structures. The fact that Kalam predictable vowels can be stressed is consistent with phonological visibility. Further, we have seen in (18) that some predictable vowels in Kalam serve to repair illicit structures by bulking up subminimal words. In addition, there is another set of contexts where predictable vowels may be viewed as epenthetic. Recall from (12) and (13) that, though predictable vowels occur word-internally between nearly all CC sequences, they are rare in VC1.C2V when C1 is /y/ or /w/, and optional when C1 is /s/. However, in C1C2 clusters where C1 is /y/, /w/ or /s/, and one or both of the consonants are unsyllabifiable, a predictable vowel is obligatory. Examples are given in (22), and suggest that, at least in a limited set of contexts, predictable vowels in Kalam are epenthetic in the sense of creating well-formed CV or CVC syllables.

(22) Predictable vowels and well-formed syllables
                     predictable                               derived
                     vowel                                     syllable
     a. /ss/         oblig.     [ˈsɨs]                         CVC         ‘urine’
        /ssak/       oblig.     [sɨˈsaˑk]                      CV          ‘left unfinished’
        /as-set/     absent     [ˈaˑsˈseˑr, ˈaˑˈseˑr]          —           ‘leech sp.’, cf. /as/ ‘frog’, /set/ ‘leech’
     b. /sk/         oblig.     [ˈsɨk]                         CVC         ‘enter’
        /skap/       oblig.     [sɨˈkaˑp]                      CV          ‘wedge’
        /skask/      oblig.     [sɨˈGaˑˈsɨk]                   CV, CVC     ‘feel something ticklish’
        /as-ket/     opt.       [ˈaˑsˈkeˑr, ˈaˑsɨˈGeˑr]        —, CV       ‘leech sp.’, cf. /as/ ‘frog’, /ket/ (poss cl)
     c. /ym/         oblig.     [ˈjiˑm]                        CVC         ‘plant crops’
        /yman/       oblig.     [ˈjiˑˈmaˑn]                    CV          ‘louse’
        /ay-may/     opt.       [ˈaˑjˈmaˑj]                    —           ‘two sisters’
     d. /wk/         oblig.     [ˈwuˑk]                        CVC         ‘break open’
        /wkap/       oblig.     [ˈwuˑGaˈp]                     CV          ‘tree sp.’
        /awleg/      opt.       [ˈaˑwˈleˑNk]                   —           ‘tadpole’

Given that predictable vowels can be stressed, bulk up subminimal words and create well-formed syllables, in what way are they not instances of Hall’s class of epenthetic vowels? Recall from our earlier discussion that not all predictable vowels appear to serve the bulking or syllabification function. In particular, word-medially in VC1C2V strings, a predictable vowel is obligatory between the two consonants, provided that C1 is not /y/, /w/ or /s/. In this position, such vowels serve neither the bulking function nor the syllabification function, since coda consonants are the norm word-finally: /an/ [‘a;n] ‘who’ in isolation, /an kun agak/ [‘a;n’ku;’na;’Nga;k] ‘who said so?’ with [nk] in sandhi, but /an-ket/ [‘a;nî’Ge;r] ‘whose?’ (12g), /anwak/ [‘a;nu’wa;k] ‘co-wives’ (12h), where the word-internal CC cluster is split by a predictable vowel. Though Hall (2006: 407) says explicitly that ‘epenthesis … is a way of repairing syllables that violate a language’s abstract structural rules’, she also uses a broader definition, where an epenthetic vowel simply ‘removes a marked structure’ (Hall 2006: 393). Under this broad definition, Kekchi morphologically conditioned vowel insertion between C-final roots and certain C-initial verbal suffixes is treated as epenthesis: ‘CC clusters are avoided in many languages, so the epenthesis removes a marked structure’. However, apart from this vowel-insertion process itself, there is no evidence that CC clusters are marked in Kekchi.21 Furthermore, in the majority of languages where CC clusters are avoided, it is possible to restate the generalisation in syllabic terms: avoidance of medial CCC and final CC in Yokuts is attributed to a maximal CVC syllable template (see note 4). More generally, since this broader definition of epenthesis allows any process of vowel insertion to be functionally defined as epenthesis on the basis of some markedness constraint, it is not particularly useful in distinguishing vowel insertion types.

Nevertheless, we briefly consider two ways that all Kalam predictable vowels, including those inserted between medial VCCV clusters, might be analysed as repairing illicit or marked structures, and show how each falls short of descriptive adequacy. One possibility is to view Kalam predictable vowels as repairing not syllable structure, but word phonotactics. Under this view, Kalam predictable vowels are inserted to eliminate word-internal consonant clusters. A general statement of the constraint is *CC, where the relevant domain is the phonological word.22 Words illustrating the general nature of this constraint across different consonant types are shown in (23). Wherever possible, underlying VCCV strings are used for illustration, since predictable vowels inserted between clusters in this context cannot be viewed as serving a word-bulking or obligatory syllabification function. (O = obstruent, R = sonorant, N = nasal, Ci = place feature identity.)

(23) Predictable vowels as possible repairs for ill-formed CC sequences
a. OO
   /koptob/   [‘ko;Bî’ro;mp]     ‘sphagnum moss’
   /kajben/   [‘ka;ntji’mbe;n]   ‘sugar glider’
   /akdoN/    [‘a;Gî’ndo;N]      ‘yonder across river’
   /kabkol/   [ka;mbî’Go;l]      ‘house fly’
   /askom/    [‘a;sî’Go;m]       ‘feathers on crown of birds’
   /asday/    [‘a;sî’nda;j]      ‘light penetrating a barrier’
   /loksam/   [‘lo;Gî’sa;m]      ‘caterpillars sp.’
b. RR
   /koñmay/   [‘ko;nji’ma;j]     ‘herb sp.’
   /amlan/    [‘a;mîl’a;n]       ‘taro sp.’
   /anwak/    [‘a;nu’wa;k]       ‘co-wives’
   /alnay/    [‘a;lî’na;j]       ‘uncultivated pandanus’
   /alwag/    [‘a;lu’wa;Nk]      ‘taro sp.’

21 For more general arguments against universal phonological markedness constraints, see Blevins (2004, 2006, 2008).

22 This approach is similar to Comrie’s analysis (1991: 394) of Haruai: ‘in general, Haruai avoids phonetic consonant clusters. Where two consonants would occur in sequence, or where a word would consist only of a consonant, the phonetic 6 vowel is inserted after the first or (sole) consonant’.

c. OR
   /aknaN/     [‘a;Gî’na;N]        ‘large eel sp.’
   /aklaN/     [‘a;Gî’la;N]        ‘cuscus sp.’
   /akyaN/     [‘a;Gî’ja;N]        ‘down there’
   /agnoN/     [‘a;Ngî’no;N]       ‘tree sp.’
   /aglak/     [‘a;Ngî’la;k]       ‘wife’s sister’s husband’
   /kabyam/    [‘ka;mbi’ja;m]      ‘tobacco’
d. RO
   /amkab/     [‘a;mî’Ga;mp]       ‘cane wicker frame sp.’
   /mañtopy/   [‘ma;nji’ro;’Bi;]   ‘name of a string figure’
   /amgaj/     [‘a;mî’Nga;ntj]     ‘large flying insect sp.’
   /alsas/     [‘a;lî’sa;s]        ‘yam sp.’
e. NiCi (note 23)
   /-mb/       [-’mîmp]            (2pl subj)
   /ntp/       [nî’rîp]            (2dual obj)
   /ntk/       [nî’rîk]            (2dual subj)
f. CiCi (note 24)
   /abben/     [‘a;mbî’mbe;n]      ‘giant tree-rat sp.’
   /aññak/     [‘a;nji’nja;k]      ‘lightning’
   /ppal/      [Fî’Fa;l]           ‘shaking, jerking’
   /mmaly/     [‘mî’ma;’li;]       ‘vine sp.’
   /llmag/     [‘lîlî’ma;Nk]       ‘feel that one is going to get sick’

Recall, however, that there are two systematic exceptions to this statement: the glides /y w/ and the fricative /s/ need not be followed by predictable vowels within the word when preceding another consonant, as illustrated in (12k–r), with further examples in (24). After glides, the predictable vowel is more often absent (24a). After /s/, it is usually variable (24b). These exceptions make it difficult to state a general *C1C2 constraint, since for C1 /y w s/ must be excluded, but for C2 the same consonants must be included (24c).25

(24) Optional predictable vowel after /y w s/
a. /kaynam/   [‘ka;j’na;m]                ‘grass sp.’
   /ay-may/   [‘a;j’ma;j]                 ‘pair of sisters’
   /awleg/    [‘a;w’le;Nk]                ‘tadpole’
   /kaw-bap/  [‘ka;w’mba;p]               ‘several, a few’
b. /kaskam/   [‘ka;s’ka;m, ‘ka;sî’Ga;m]   ‘tree sp.’
   /as-ket/   [‘a;s’ke;r, ‘a;sî’Ge;r]     ‘leech sp.’

23 Morpheme-internal homorganic NC sequences are rare, and none occur in the V_V context. When homorganic NC sequences occur word-internally across morpheme boundaries, they tend to be eliminated by loss of the nasal component.

24 Morpheme-internal identical CC sequences are rare in the V_V context; for this reason, word-initial sequences of this kind are exemplified as well.

25 Recall that we cannot revert to a syllable-based constraint where /y w s/, but not other consonants, are possible codas, since, as noted earlier, all consonants are possible codas in word-final position. Haruai (see note 22) also has exceptions to predictable vowel insertion, which also appear to defy a syllable-based analysis.


c. /mokyaN/   [‘mo;Gi’ja;N]      ‘decorations’
   /ak-yaN/   [‘a;Gî’ja;N]       ‘down there’, cf. /ak-/ (loc)
   /atwak/    [‘a;ru’wa;k]       ‘silky cuscus sp.’
   /loksam/   [‘lo;Gî’sa;m]      ‘caterpillars sp.’
   /bet-wad/  [‘mbe;rî’wa;nt]    ‘breastplate of bird’, cf. /bet/ ‘platform’, /wad/ ‘bag’

We suggest that the generalisation being missed by *CC, or any structural constraint, is that /y w s/ are the only consonants in Kalam which lack a phonetic release. Since the glides are vowel-like, there is neither closure nor release. Similarly, the fricative /s/ involves a constriction, but no closure, and therefore no release. All other consonants in Kalam are stops, or in the case of /l/, involve central closure and release (7a). Once this generalisation is taken into account, the distribution of predictable vowels in words like those in (23) can be related to conditions on consonant release, as stated in (25).26

(25) Conditions on consonant release
a. Word-internally, a consonant is released.
b. Word-finally, a consonant is typically unreleased.

The simple conditions in (25) are the kind associated with Hall’s ‘intrusive’ vowel class. Vowels associated with consonant release do not seem to have the function of repairing illicit structures. Further, the consonant clusters in which the vowel occurs may be less marked (e.g. RO in VR.OV) than clusters which surface without vowel insertion in the same language (e.g. sR in Vs.RV).27 In addition, the conditions in (25) are natural: there are many languages in which consonants are released word-internally (e.g. Moroccan Arabic), and many others where they are unreleased word-finally (e.g. Cantonese).28 Under this analysis, predictable vowels appearing in words like (23) are not epenthetic, though other predictable vowels like those in (18) may serve an epenthetic function.

Before providing further arguments along these lines, we consider one other potential analysis where Kalam predictable vowels like those in (23) would serve to ‘repair marked structure’. Instead of a ban on CC clusters within the word, predictable vowels in forms like (23) could be attributed to a constraint demanding that ‘all syllables be open’, where this constraint holds word-internally, but not word-finally. Within an optimality treatment this would involve a ranking of FINALC (‘prosodic words must end in consonants’) over NOCODA (‘syllables should be open’) over DEP (‘no epenthesis’). There is at least one technical problem with this account: monoconsonantal words like those in (18) surface with final predictable vowels, suggesting that for these derivations it is NOCODA which dominates FINALC, not the reverse. Since technical problems within OT grammars can always be solved by invoking additional constraints, we turn to a more fundamental problem with the analysis. This, again, concerns data like those in (24a, b), where glides and /s/ can close word-medial syllables. Even if a mechanical solution is proposed where predictable vowels are inserted everywhere, but deleted optionally after glides and /s/, or where glides and /s/ are preferred word-medial codas, there is no explanation for why glides and /s/ form a natural class for optional deletion or preferred coda status. In contrast, under the analysis proposed in (25), release or ‘open transition’ between two segments will, in part, depend on the phonetic nature of the segment involved: if it does not involve closure, then there will be no release, or no significant or audible open transition.

In sum, though it is possible to view some predictable vowels in Kalam as epenthetic in the sense of Hall (2006), not all submit to analysis in these terms. In particular, seeming transition vowels in word-internal VC.CV sequences, like those in (23), seem best analysed as non-overlapping consonant gestures, where word-medial consonants are released (25).

A further property which sets Kalam predictable vowels apart from both canonical epenthetic and intrusive vowels is consonant-cluster splittability. Hall’s intrusive vowels generally appear in heterorganic clusters (5c), and regular rules of vowel epenthesis have been claimed to respect geminate integrity, failing to split morpheme-internal geminate clusters (Kenstowicz & Pyle 1973, Guerssel 1977).29 The examples in (23) show, however, that Kalam predictable vowels are found between any two word-internal consonants, including sequences of obstruent–obstruent (a), sonorant–sonorant (b), obstruent–sonorant (c), sonorant–obstruent (d), homorganic nasal–obstruent (e) and identical (geminate) morpheme-internal consonant sequences (f).

Finally, as noted in the preceding subsection, the quality of Kalam predictable vowels results in their classification as intrusive, not epenthetic vowels. As illustrated above in (21), Kalam predictable vowels can be central, a copy of a nearby vowel or influenced by the place of surrounding consonants, exactly as Hall describes for intrusive vowels (5a).

26 The notion of ‘release’ here is an articulatory one: any segment involving closure (including lateral closure or fleeting tap closure) has, by definition, a release phase. Note that if this constraint is posited for the synchronic grammar, then it must hold of the phonemes in (7a), not their phonetic realisations, since, on the surface, stops may be realised as fricatives intervocalically. There are other ways of stating the general constraint. For example, one could demand that word-internal syllable contact be ‘open’ (with consonant release), while contact across word-boundaries be ‘closed’ (without consonant release).

27 The markedness principle in question is the syllable-contact law, which ranks sonorant–obstruent sequences above obstruent–sonorant sequences at syllable boundaries (Vennemann 1988). However, numerous other ‘preferred’ cluster types, like homorganic NC sequences, are also split by Kalam predictable vowels.

28 See Davidson & Stone (2003) for experimental ultrasound evidence of transition vowels which result not from ‘epenthesis’, but from non-overlapping consonantal gestures of the kind suggested here for Kalam.

29 See Blevins (2004: 184–188) on exceptions to geminate integrity. We return to the significance of geminate integrity violations in §3.3.
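The release conditions in (25), together with the special behaviour of /y w s/ seen in (24), can be restated as a small classifier. What follows is a minimal sketch, not the authors' formalism: the function names, the one-character-per-phoneme reading of the practical orthography, and the three status labels are assumptions introduced here for illustration. It covers only word-internal C1C2 sequences flanked by the full vowels /a e o/, and it deliberately ignores the identity of C2.

```python
from typing import List, Tuple

# Assumption: each character of the practical orthography is one phoneme,
# and only /a e o/ count as lexical (full) vowels.
FULL_VOWELS = set("aeo")
GLIDES = {"y", "w"}       # no closure, hence no release
FRICATIVES = {"s"}        # constriction without closure, hence no release


def transition_status(c1: str) -> str:
    """Classify the predictable vowel expected after C1 in a V C1.C2 V string."""
    if c1 in GLIDES:
        return "usually absent"   # cf. (24a)
    if c1 in FRICATIVES:
        return "variable"         # cf. (24b)
    return "obligatory"           # released consonants, cf. (23) and (25a)


def medial_clusters(form: str) -> List[Tuple[str, str, str]]:
    """Return (C1, C2, status) for every word-internal VC1C2V sequence."""
    segs = [ch for ch in form if ch != "-"]   # drop morpheme boundaries
    found = []
    for i in range(1, len(segs) - 2):
        if (segs[i - 1] in FULL_VOWELS
                and segs[i] not in FULL_VOWELS
                and segs[i + 1] not in FULL_VOWELS
                and segs[i + 2] in FULL_VOWELS):
            found.append((segs[i], segs[i + 1], transition_status(segs[i])))
    return found


if __name__ == "__main__":
    for word in ["akdoN", "kaynam", "as-ket", "mokyaN"]:
        print(word, medial_clusters(word))
```

Because the status depends on C1 alone, /mokyaN/ is classified like /akdoN/ even though its C2 is a glide. This is the asymmetry that makes a general *C1C2 statement awkward.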

The quality of predictable vowels is not fixed, but highly variable and sensitive to phonetic context (16b). The fact that Kalam predictable vowels do not always serve to repair illicit structures seems to rule them out as classical epenthetic vowels. In particular, their word-internal function between consonants in VCCV contexts appears to be primarily one of release or open transition, regardless of cluster composition (25). Kalam predictable vowels have variable quality, determined by the phonetics of surrounding vowels and consonants, another characteristic atypical of epenthetic vowels. We conclude that, although Kalam predictable vowels are phonologically visible for stress, they cannot all be fruitfully analysed as instances of epenthetic vowels in the sense of Hall (2006).

In sum, Kalam predictable vowels appear to have mixed properties. In terms of quality, and certain aspects of their distribution, they mimic intrusive vowels. In terms of stressability, and a subset of their functions, they are more like epenthetic vowels. In the following section, we suggest that the synchronic mix of properties exhibited by Kalam predictable vowels is partly explained by their historical origins.

3.3 Kalam predictable vowels as ‘remnant’ vowels

Recall from §2 that Hall’s account of intrusive vowels is based on their source in articulatory retiming of consonant gestures. If clusters of properties exhibited by predictable vowels are in part attributable to their source, and if articulatory retiming is not the sole source of predictable vowels, then we should not be surprised if Hall’s synchronic typology appears inadequate or incomplete. Given other pathways of synchronic vowel–zero alternations, we might expect other predictable vowel types, with a mix of the properties in (4) and (5), or additional properties of their own.

We have demonstrated above that Kalam predictable vowels have a mix of properties in (4) and (5), as listed in (6a–f). In addition, they appear, in some cases, to be related to consonant-release features (6g). A final property of Kalam predictable vowels is that their absence in lexical representations results in long strings of consonants, and words that may lack vowels altogether (6h): /kslm/ ‘night, darkness’, /lknm/ ‘small frog sp.’, /plkd/ ‘wing’, /sblN/ ‘umbilical cord’, /ssnm/ ‘wild millet’, /pkcg/ ‘fasten’, /k8gld/ ‘vine sp.’, /pktbk/ ‘attach’, /pkpnp/ ‘I could have hit’, /mdnknN/ ‘while I was staying’, /pbtknknN/ ‘while I was fastening’. Though this property might seem unique to Kalam, we show in §4 that it is typical of languages with the same mixed type of predictable vowels, and therefore worthy of explanation.

In the remainder of this section, we explore the historical origins of Kalam predictable vowels, and suggest ways in which their mixed properties follow from these origins. We will refer to predictable vowels with Kalam-like properties as ‘remnant’ vowels. Remnant vowels are historical traces of vowel reduction and loss, found sometimes in their historical positions, and sometimes elsewhere. Though synchronically, their distribution can be predicted by insertion algorithms, diachronically they reflect inversion of unstressed reduced vowel loss. Since remnant vowels evolve from reduced vowels, they share many of the properties of reduced vowels: they are typically unstressed, very short and greatly influenced by coarticulatory effects.30 Unlike Hall’s ‘intrusive’ vowel category, remnant vowels are not a rephasing of existing gestures which result in vowel-like percepts. For this reason, they have none of the articulatory hallmarks of intrusive vowels: they are not generally limited to heterorganic clusters, and they do not have a highly variable duration. Like epenthetic vowels, remnant vowels do involve synchronic ‘insertion’ in the generative sense, leading to true vowel–zero alternations, as in data like (18) above. Unlike epenthetic vowels, remnant vowels may not serve any obvious function: as in Kalam, they may simply reflect former positions of unstressed reduced vowels, and nothing more.31

As noted earlier, Kalam is one of two members of the Kalamic group, the other being Kobon. Historical work on Kalamic is not extensive, but includes Pawley & Osmond (1998), Pawley (2001, 2008) and Coberly (2002). Of these works, only Pawley (2001) and Coberly (2002) deal with the phonological history of Kalamic as such. Of the two dialects of Kalam, Ti Mnm and Etp Mnm, Ti Mnm is more conservative phonologically, with more full (vs. predictable) vowels, and syllable-final /l/, which has often vocalised in Etp Mnm. Our working hypothesis is that historical vowel reduction/deletion led to a restructuring of parts of the Kalam phonological system, with its many predictable vowels. Some predictable vowels in Kalam are true remnants of once-present reduced vowels, while others are non-etymological consequences of reanalysis.

Vowel reduction should occur where vowels are unstressed. Although we described Kalam word stress above as falling on final vowels as well as on all full vowels, we did not mention an important fact about Kalam stress: in phrasal contexts, all but the last stress tends to be subordinated, weakened or lost. Since many morphemes and words in the language will often be in non-phrase-final position, they may be unstressed, and therefore targets of vowel reduction. It is this stress subordination at the phrasal level that seems to have given rise to significant vowel reduction in the Kalamic group.

As preliminary evidence for this hypothesis, we note that vowel reduction is an ongoing process in Etp Mnm, and, to a lesser extent, in Ti Mnm, as evidenced by full and reduced variants of many words. Consider the word /jwn/ ‘head’ in (26a). While this noun can occur alone in a noun phrase, it is often a modifier, as in: /jwn-bad/ ‘head-like appendage’, /jwn kas/ ‘head hair’, /jwn mok/ ‘brain’, etc. In fast speech, when /jwn/ is not in phrase-final position, it is often reduced to /jn/ [dyin]. The situation is similar for /swd/ ‘sword-grass sp.’. Though this word can occur alone, referring to the taxon, it is very common as a modifier, as in: /swd aydk/ ‘common sword-grass’, /swd yNleb/ ‘Thysanolena maxima’, /swd magi/ ‘seed-heads of sword-grass’, etc. In these contexts, it is often reduced to /sd/.

In (26) we list all words noted with fast-speech reduced forms from Pawley & Bulmer (2003). In most cases, it is the surface vowel [u] (from vocalised /w/) that is reduced (26a, b). In two examples, (26b), a vowel is reduced between identical consonants, suggesting a historical source for the many words like those in (23f) with initial identical consonants. Though this might seem a minor point, a cross-linguistic generalisation holding of epenthesis into seeming geminate clusters is that, in all known cases, the historical sound change in question giving rise to this sound pattern was unstressed vowel loss between identical consonants (Blevins 2004: 184–188, Blust 2007). In (26c), the first /a/ in CaCaC is reduced. Though we refer to this process as vowel reduction, based on the difference between surface forms in (26), following our account of predictable vowels above, it appears to involve vowel deletion at the lexical level.

(26) Synchronic vowel reduction/loss in Etp and Ti Mnm

   slow speech                     fast speech
a. /alwk-/   [‘a;’luG-]            /alk-/    [‘a;’lîG-]          (prefix (t))
   /jwj/     [‘ndju;ntj]           /jj/      [‘ndjintj]          ‘base’
   /jwn/     [‘ndju;n]             /jn/      [‘djin]             ‘head’
   /jnwp/    [ndji’nup]            /jnp/     [ndji’nîp]          ‘squeak’
   /kwd/     [‘ku;nt]              /kd/      [‘kînt]             ‘back’
   /kwn/     [‘ku;n]               /kn/      [‘kîn]              ‘like this’
   /kwneN/   [‘ku;’ne;N]           /kneN/    [kî’ne;N]           ‘upriver’
   /lwg/     [‘luNk]               /lg/      [‘lîNk]             ‘move smoothly’
   /pwb/     [‘Fu;mp]              /pb/      [‘Fîmp]             ‘sun’
   /swd/     [‘su;nt]              /sd/      [‘sînt]             ‘sword grass’
   /swgwn/   [‘su;’Ngu;n]          /sgun/    [sî’Ngu;n]          ‘tree sp.’
   /swjg-/   [‘su;ndjiNg-]         /sjg-/    [sî’ndjiNk]         ‘extract object’
   /swN/     [‘su;N]               /sN/      [‘sîN]              ‘in good health’
b. /gwgolN/  [‘Ngu;’Ngo;’lîN]      /ggolN/   [Ngî’Ngo;’lîN]      ‘herb sp.’
   /mwmlak/  [‘mu;mî’la;k]         /mmlak/   [‘mîmî’la;k]        ‘mould’
c. /palaj/   [‘Fa;’la;ntj]         /plaj/    [Fî’la;ntj]         ‘sliver of pearl’
   /pataj/   [‘Fa;’ra;ntj]         /ptaj/    [Fî’ra;ntj]         ‘unmarried’
   /yakam/   [‘ja;’Ga;m]           /ykam/    [‘ji;’Ga;m]         ‘group of people’

30 Recent work on vowel reduction includes Crosswhite (2004), Harris (2005) and Barnes (2006). In systems where vowel reduction results in contrast maintenance (e.g. reduction of /i u a e o/ to [i u a]), ‘remnant’ vowels will not evolve, since vowel quality remains unpredictable.

31 Remnant vowels resemble other sound patterns which arise from rule inversion in their resistance to explanations rooted solely in markedness constraints. See, for example, Blevins (2008) on patterns of consonant epenthesis arising from historical inversion of weak coda loss.

Stronger evidence for predictable vowels having sources in historical vowel reduction/deletion is found in comparative data from the two major Kalam dialects and Kalam’s closest relative, Kobon. In (27), where Ti Mnm has an underlying full vowel /a e o/ (in bold), the Etp Mnm cognate has no lexical vowel, but shows a predictable vowel in the same position, underlined in the broad phonetic transcription.

(27) Recent vowel reduction/loss in Etp Mnm

   Ti Mnm                            Etp Mnm
a. /pak/     [‘pa;k]                 /pk/      [‘pîk]               ‘hit, strike’
b. /pok/     [‘po;k]                 /pk/      [‘pîk]               ‘reddish brown, ripe’
c. /ped/     [‘pe;nt]                /pd/      [‘pînt]              ‘yam (generic)’
d. /tep/     [‘te;p]                 /tp/      [‘tîp]               ‘place (generic)’
e. /ctek/    [tji’re;k]              /ctk/     [tji’rîk]            ‘1dual subj’
f. /mlep/    [mî’le;p]               /mlp/     [mî’lîp]             ‘dry’
g. /ydek/    [‘ji;’nde;k]            /ydk/     [‘ji;’ndîk]          ‘tasty, sweet’
h. /pabtk-/  [‘Fa;mbî’rîk]           /pbtk-/   [‘Fîmbî’rîk]         ‘secure, fasten’
i. /kapok/   [‘ka;’Bo;k]             /kapk/    [‘ka;’Bîk]           ‘pit for an earth oven’
j. /na-sed/  [‘na;’se;nt]            /na-sd/   [‘na;’sînt]          ‘your grandfather’
k. /joNbaN/  [‘ndjo;Nî’mba;N]        /jNbaN/   [‘ndjiNî’mba;N]      ‘good quality stone axe blade’
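The correspondences in (27) are many-to-one: a Ti Mnm full vowel can be mapped to its Etp Mnm counterpart, but not the reverse. As a rough illustration, the sketch below groups the lexical forms of (27a–d) by their Etp Mnm shape and reports which Etp Mnm forms have more than one Ti Mnm source. The data are copied from (27); the function names and the choice of Python are assumptions made for this example only.

```python
from collections import defaultdict

# (Ti Mnm lexical form, Etp Mnm lexical form, gloss), copied from (27a-d).
COGNATES = [
    ("pak", "pk", "hit, strike"),
    ("pok", "pk", "reddish brown, ripe"),
    ("ped", "pd", "yam (generic)"),
    ("tep", "tp", "place (generic)"),
]


def etp_to_ti(cognates):
    """Map each Etp Mnm form to the set of Ti Mnm forms it corresponds to."""
    sources = defaultdict(set)
    for ti_form, etp_form, _gloss in cognates:
        sources[etp_form].add(ti_form)
    return dict(sources)


def mergers(cognates):
    """Etp Mnm forms that correspond to more than one Ti Mnm form."""
    return {
        etp_form: sorted(ti_forms)
        for etp_form, ti_forms in etp_to_ti(cognates).items()
        if len(ti_forms) > 1
    }


if __name__ == "__main__":
    print(mergers(COGNATES))
```

On these four items the only merger reported is Etp Mnm /pk/, which corresponds to both Ti Mnm /pak/ and /pok/, so the Ti Mnm vowel cannot be recovered from the Etp Mnm form alone.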

Since one can only predict the reduced vowel from the full vowel, and since minimal pairs like Ti Mnm (27a) and (b) are associated with homophones in Etp Mnm, the data support the hypothesis that predictable vowels are remnants of vowel reduction at the phonetic level, and vowel loss at the lexical level. For four of these forms, (a–c, f), Kobon cognates exist (namely Kobon /pak-, pa-/ ‘hit, strike’, /po/ ‘ripe’, /pöd/ ‘yam (generic)’, /m6lep/ ‘dry’) and in each case these support the hypothesis that Ti Mnm is conservative in retaining a full vowel.

Some of the cognate sets in (27) also lend support to the association between vowel reduction and phrasal stress subordination mentioned above. Recall our observation above that non-final words within the phrase may be produced without lexical stress. This means that words which are first elements of compounds or set phrases will occur unstressed at higher frequencies than other words. One class of words of this kind in Kalam is generics, like (27c): /ped/, /pd/. In Kalam, this lexeme is found as first member of a number of longer phrases referring to kinds of yams, yam parts, tools relating to yam cultivation and so on. Examples include Etp Mnm /pd kolem aydk/ ‘wild yam sp.’, /pd sgoy/ ‘wild yam sp.’, /pd magi/ ‘aerial tuber of wild yam’, /pd kot/ ‘yam pole (for staking vines)’, /pd sbel/ ‘narrow base of yam tuber’ and /pd yN/ ‘section cut from yam tuber for use as seed’. If this were an isolated example, it would not lend much support to the stress-subordination hypothesis, but many lexically vowelless words like (27c, d) appear to be more common in phrase-initial or medial position than in phrase-final position.

A further interesting fact which may lend support to the stress-subordination hypothesis is the existence of homophones in Ti Mnm, where one form is reduced in Etp Mnm and the other is not. Three quite common homophones in Ti Mnm are shown in (28) with their Etp cognates, as well as examples indicating common usage. The lexeme /tep/ ‘good’, which has not undergone reduction in Etp Mnm, is common in phrase-final position (28a). The two other homophones in Ti Mnm, /tep/ ‘place (generic)’ (28b) and the adverb /tep/ ‘again, once more’ (28c), have different syntactic distributions. As discussed above for /ped/, generics are common in initial position of phrases which refer to specific attributes of the generic. The examples in (28b) illustrate the same principle for this lexeme. Finally, the examples in (28c) show the positioning of the adverb /tp/ before the verb; in this construction type, the adverb is not in phrase-final position.

(28) The role of phrasal stress subordination in historical vowel reduction
a. Ti Mnm /tep/, Etp Mnm /tep/ ‘good, enough’
   i.   Mey tep.           ‘That’s enough.’
   ii.  Nad b tep.         ‘You are a good man.’
   iii. kayg-tep           ‘any valuable goods (pl)’
b. Ti Mnm /tep/, Etp Mnm /tp/ ‘place (generic)’
   i.   tp mdep            ‘place for staying’
   ii.  tp kneb            ‘sleeping place’
   iii. tp bsgep           ‘place for sitting’
c. Ti Mnm /tep/, Etp Mnm /tp/ ‘again, once more’
   i.   Tp agan!           ‘Say it again.’
   ii.  Tp adkd owak.      ‘He’s back home again.’
   iii. tp am-             ‘return, go back, go again’

We highlight these facts because they may support vowel reduction as a function of phrasal stress subordination. However, we should stress that the subordination hypothesis will remain speculative until a fuller study of Kalam phrasal prosody is carried out. If vowels which are more often subject to reduction are those which are lost first, then a more general aspect of sound change is supported: where sound change is due to variation along the hyper-to-hypoarticulation continuum, frequency of reduced tokens can play an important role in the reanalysis of lexical forms (Bybee 2001, Blevins 2004: 36–37).

Comparison of Kalam and its sister language Kobon also supports the view that Kalam predictable vowels were historically full vowels that have undergone reduction and (in some cases) loss. Cognate sets are presented in (29), with Kobon data from Davies (1980, 1981, 1985). Where more than one form is cited for Kobon, these reflect dialectal variants. A slash separates strings being compared. Kobon full vowels which are absent in Kalam are printed in bold. In reconstructions, ‘V’ indicates a vowel of indeterminable quality. (29a–j) show reduction/loss of vowels which are not in absolute initial or final position, and which, following Davies (1980), are not predictable in Kobon. In (29a–e) we see evidence of regular Proto-Kalamic *s > h in Kobon, while (29f–h) show Kobon regular lenition of Proto-Kalamic coda *k > Ø. The Kobon forms with final vowels in (29k–t) are cognate with C-final forms in Kalam. Kobon stress is synchronically penultimate (Davies 1980: 58–59), and it is clear, even in Kobon, that word-final vowels are being reduced and lost (29q), though again, these vowels are not predictable in Kobon.32 This sound change has come to completion in Kalam, where, as noted earlier, there is evidence that at the phonological level, all words are C-final. Vowel loss in Kalam has occurred initially as well, as exemplified in (29u).

There is also external evidence for some of the Proto-Kalamic reconstructions in (29). For example, Proto-Kalamic *kabV (29l) is part of a widespread cognate set for which Pawley (2008) reconstructs Proto-Trans New Guinea *ka(mb,m)u[CV]. Another reconstruction with external support is Proto-Kalamic *sib (29d), a reflex of Proto-Trans New Guinea *simb(i,u). Additional Proto-Trans New Guinea (PTNG) reconstructions with zero vowel reflexes in Kalam include PTNG *nVNg- ‘know, see, hear’ > Kalam /nN/, /ng-/ (T), PTNG *imbi ‘name’ > Kalam /yb/, PTNG *ambi ‘man’ > Kalam /b/, PTNG *pana(a,e) ‘woman, girl’ > Kalam /pa8/, PTNG *takVn(V) ‘moon’ > Kalam /takn/, PTNG *mundunmaNgV ‘heart’ > Kalam /md-magi/.

(29) Some Kobon–Kalam comparisons

   Kobon               Kalam         Proto-Kalamic
medial vowel loss
a. hab(î)ljiN          sblN          *sabVliN       ‘umbilical cord’
b. habö                sbek          *sab(o,e)k     ‘pimple’
c. wîhak-              wsk-          *wVsak-        ‘to loosen’
d. hib                 sb, cb        *sib           ‘intestines’
e. halañ               slañ          *salañ         ‘scab’
f. hagalj              sgal/b        *sVgal         ‘discharge from eyes’
g. kuñu, kîñu          kuñk, kñk     *kVñuk         ‘saliva’
h. lisön gp            lsen gp       *lisVn         ‘have a cold’
i. mulu                mluk          *muluk         ‘nose’
j. ado                 adk-          *adok          ‘to turn around’
final vowel loss
k. bi                  b             *bi            ‘man’
l. kabö                kab           *kabV          ‘stone’
m. maybö               mayb          *maybV         ‘shoulder’
n. ramö                tam           *tamV          ‘fork’
o. habaynö             sabayn        *sabaynV       ‘gall bladder’
p. gapî                gap           *gapV          ‘star’
q. haji, hajî, haj     haj           *sajV          ‘compensation’
r. rune                tun           *rune          ‘ashes’
s. nîme                -nm           *nVme          ‘mother’
t. gawbu, gabe         gawb          *gawbu         ‘jew’s harp’
initial vowel loss
u. ud-                 d-            *ud            ‘to hold’

32 There are other contexts where both Kobon and Kalam have predictable vowels. In Kobon, as in Kalam, syllables can end in single consonants, and CC clusters are common word-internally at syllable boundaries. In this context: ‘where consonant clusters occur across syllable boundaries within the phonological word there is a tendency for a very short non-phonemic transitional schwa to occur between the two consonants’ (Davies 1980: 57). Davies does not suggest that certain Kobon words are vowelless, but given that certain words contain only a short central vowel (e.g. /m6/ ‘taro’, /b6N/ ‘strongly’, /r6m6n/ ‘edible greens’, /k6d6l/ ‘root’), it is possible to analyse Kobon as having words whose lexical forms are C, CC, CCC, etc.

In sum, there is ample evidence that some Kalam predictable vowels are the remnants of once full vowels. When these vowels are in phrasal positions in which lexical stress is subordinated to phrasal stress they are reduced. If such reduced forms become frequent enough, they replace former lexemes with full vowels. At the stage where every (or nearly every) consonant-to-consonant transition within the word has a reduced transition vowel, the language learner may reverse the historical process of vowel loss/reduction, and assume that these transition vowels are inserted.33 We summarise the historical developments in Table III, with representative forms.

stage I     full vowels only                                     *sib-gac ‘large intestine’   *jubul ‘tree sp.’   *bi ‘man’
stage II    reduction of unstressed (non-phrase-final) vowels    svbgac                       jvbul               bv
stage III   reduced vowels reanalysed as consonant release       svbgac                       jvbvl < /jbl/       bv
stage IV    reduced vowels inserted where …                      svbvgac                      n.a.

Table III Vowel reduction reanalysed as vowel insertion, leading to predictable vowels.

33 In some languages with predictable vowels from historically reduced vowels, there is no evidence of historical rule inversion in the form of extension of the predictable vowel pattern. This seems to be the case for Diegueño and other Yuman languages (Langdon 1970: 37–41). For Diegueño, Langdon (1970: 37) posits an underlying /@/ phoneme which is ‘always unstressed, never long, and accounts for all cases of unstressed vowels whose quality is either [@] or is predictable from its environment’. However, she goes on to note that in word-initial position one need not assume an underlying vowel since ‘there are no initial clusters and the presence of the vowel is completely predictable in that position’. Interestingly, stress is phrase/word-final in Diegueño, as it is in Kalam.

The phrasal nature of reduction, combined with the fact that words can be pronounced as independent phrases, means that even without a shift in the stress system, vowels reduced in Table III are still potentially stress-bearing in some contexts. This allows us to explain why the seemingly predictable reduced vowel in Kalam can be stressed in word-final syllables.34 Stage III, where reduced vowels are analysed as consonant release, allows us to explain the seemingly odd distribution of non-historical vowels in VC.CV contexts: as in the reflex of *sabliN ‘umbilical cord’, Kalam /sblN/ [osɨmbɨplɨN], vowels appear in the context of consonants which have a closure release. Finally, the analysis sketched above makes it clear why long strings of consonants may arise in languages which have undergone this pathway of predictable vowel evolution: if all but phrase-final vowels are potentially unstressed and reduced, and if all reduced vowels are ultimately reinterpreted as lexically absent, strings of consonants and vowelless words are expected.

4 Remnant vowels in a broader perspective

Both Levin (1987) and Hall (2006) propose typologies of predictable vowels with two types: phonologically present (lexical) vowels, and phonologically absent (non-lexical) vowels. Within Hall’s typology, a further proposal is made that phonologically absent vowels are ‘intrusive’ vowels. In addition to offering new diagnostics for intrusive vowels, Hall (2006) claims that her gestural analysis is able to account for three general properties of intrusive vowels: their quality (copy vowels or schwa-like), their distribution (typically restricted to heterorganic clusters) and their variability (likely to be absent in fast speech). She also shows how intrusive vowels develop diachronically from retiming of consonant-to-consonant transitions.

However, retiming of consonant-to-consonant transitions is only one pathway by which predictable vowels can arise. The present study suggests that the typology of predictable vowels be expanded to include vowels from other historical sources. In Kalam, we have found predictable vowels which defy description within the previous typology. Kalam predictable vowels are similar to intrusive vowels in terms of their quality, and their distribution. But Kalam predictable vowels can be stressed, and so cannot be phonologically invisible. Furthermore, they are common between identical (homorganic) consonants. We have suggested that the seemingly mixed properties of Kalam predictable vowels follow from their history in vowel reduction and reanalysis. Unlike intrusive vowels, predictable vowels in Kalam do not have their origins in elongated consonant-to-consonant transitions. Rather, a clear historical process of vowel reduction has been documented, leading us to classify Kalam predictable vowels as remnant vowels with the properties described in (6), and repeated as (30).

34 Initial stress (14c) may be a later innovation of the prominence system.


(30) Some properties of Kalam predictable vowels
  a. The vowel’s quality is either central, a copy of a nearby vowel or influenced by the quality of surrounding consonants (I).
  b. If the vowel’s quality is copied over an intervening consonant, that consonant need not be a sonorant or guttural (E).
  c. The vowel’s presence is not dependent on speech rate (E).
  d. The vowel does not generally occur in heterorganic clusters; it often occurs between homorganic consonants, including identical consonants (E).
  e. The vowel does not seem to have the general function of repairing illicit structures (I).
  f. The vowel is phonologically visible, since it can carry word stress (E).
  g. The vowel’s presence may be associated with consonant release (N).
  h. Lexical/underlying forms without predictable vowels may contain long strings of consonants, and may lack vowels altogether (N).

Of particular interest are two new properties associated with predictable vowels: association with consonant release (30g) and long consonant strings in the lexicon (30h). Under earlier treatments (e.g. Levin 1987), association with consonant release was a typical property of non-lexical (excrescent) vowels, but the same vowels were expected to be invisible to stress and other phonological patterns. Long strings of consonants in the lexicon have, as far as we know, not been generally associated with the existence of any predictable vowel type in the literature.

Is it possible that other cases of predictable vowels with sources in unstressed vowel reduction may have a similar profile? Languages classified by Hall (2006) as having intrusive vowels include Imdlawn Tashlhiyt Berber, Tiberian Hebrew, Mokilese, Piro and Upper Chehalis (Coast Salishan). Since all of these languages have known histories of vowel reduction, a careful review of predictable vowel phonology may reveal that their properties are not entirely explained by the articulatory model. This appears to be the case for Tashlhiyt Berber (Dell & Elmedlaoui 1985, 1996a, b, Coleman 1999, 2001). In this Berber language, as in Kalam, words may consist entirely of consonants at the lexical level. And, as in Kalam, predictable vowels occur on the surface, and may be stressed. Though Hall classifies these as ‘intrusive’ vowels, due to their quality, distribution and variability, the fact that Berber intrusive vowels can be stressed seems to argue for their phonological visibility. It is not surprising then, that in the literature cited above, there is some debate as to the synchronic status of these vowels.

Other predictable vowels with profiles similar to those in Kalam are found in other Papuan languages of New Guinea. The analysis of short high-to-mid central surface vowels has been problematic in many languages of the Sepik area, including the Wosera dialect of Abelam (Ndu family), as sketched by Laycock (1965), Alamblak, as analysed by Bruce (1984: 61–70), and Yimas (Lower Sepik family), as treated by Foley (1991: 44–50). In all of these languages, vowelless words and strings of consonants are found in the lexicon. In Wosera, the mixed status of the central vowel is analysed by Laycock (1965: 44) in terms of two kinds of schwa: a phonemic schwa and a ‘linking schwa’. The linking schwa occurs word-internally between heteromorphemic consonants in VC1-C2V, similar to the Kalam patterns in (20), except where C1 and C2 are identical or homorganic. Here, as in Kalam, the presence of the linking vowel is tied to release, since it is stated explicitly that the first of a sequence of homorganic consonants is not released. In Bruce’s (1984: 62–63) analysis of Alamblak, underlying central high/mid vowels are also distinguished from predictable vowels of the same quality, since there is a range of consonant clusters where the presence vs. absence of a transition vowel appears to be lexically determined. While predictable vowels are invisible for subparts of the stress-assignment algorithm and for a phonological process of low vowel dissimilation, the same vowels can be stressed as a last resort, showing simultaneous phonological visibility and invisibility. Similar sound patterns are found in Yimas, where Foley (1991: 46–49) also proposes underlying and epenthetic high central vowels. Epenthesis of [ɨ] in Yimas occurs between all underlying clusters which are impermissible surface clusters in the language. Unlike Wosera and Kalam, the predictable vowel is never inserted at a syllable boundary. While Yimas [ɨ] comes closest to a canonical epenthetic vowel, it also shows mixed visibility with respect to stress. In the primary and secondary stress rules stated by Foley (1991: 75), input includes only underlying vowels. However, as in Alamblak, the same predictable vowels can be stressed as a last resort. In Yimas, a surface phonetic constraint requires that one of the first two vowels carry primary stress: if both are predictable [ɨ], then primary stress falls on the first of these (Foley 1991: 77).

The recognition of remnant vowels as a distinct predictable vowel type may also assist with analysis of newly discovered predictable vowels. A recent finding of this kind involves a non-Papuan language of Papua New Guinea. The Selau dialect of Halia, an Oceanic Austronesian language of the northern tip of Bougainville, is described by Blust (2003) as having predictable schwa in words which, lexically, have no vowels. Historically, most of these forms result from high vowel reduction and loss. Representative Selau forms are shown in (31), with Proto-Oceanic reconstructions.


(31) Selau predictable vowels
                                                  Proto-Oceanic
  a.  /ss/    [s:ə]     ‘breast’                  *susu
  b.  /mr/    [mər]     ‘back, behind’            *muri
  c.  /nt/    [əntə]    ‘egg’                     (cf. Halia nata)
  d.  /ptn/   [ptən]    ‘coconut husk’            *putun
  e.  /lmt/   [ləmtə]   ‘moss, algae’             *lumut

Notice that, as in Kalam, some predictable vowels appear to reflect historically reduced vowels, while others do not. Blust (2003: 149) is explicit on this point: final high vowels were lost in Selau; where a word-final schwa is pronounced, as in (31a, c, e), it is ‘little more than a transitional vowel permitting speakers to pronounce what would otherwise be a disallowed final consonant cluster’. Supporting the view that Selau schwas are intrusive vowels within Hall’s (2006) typology is the fact that such vowels are not fixed in position. Words like /lma/ ‘hand’ (< Proto-Oceanic *lima) can be pronounced as [ləma] or [əlma]: ‘such free variation suggests that the schwa in these forms is little more than an automatic facilitating vowel which enables speakers of the language to pronounce the underlying consonant clusters in lma and lsa respectively’ (Blust 2003: 148). If schwa is intrusive, then, under Hall’s classification, it is expected to be phonologically invisible. However, this is not the case. Although Blust’s (2003) description, based on his own fieldnotes, does not show stress, Capell’s (1963) 100-item Selau wordlist does. Selau stress is usually on the penult, and in penultimate position, schwa is stressed: [apsənEn] ‘her breast’, [onnta] ‘egg’ (where Capell uses long ‘n’ instead of schwa), [ləmpləmou] ‘wind’ (cf. Proto-Oceanic *timu(R) ‘wind bringing light rain’; Ross et al. 2003: 131). Again, we appear to have an instance of a predictable vowel which is neither ‘intrusive’ nor ‘epenthetic’. Rather, due to regular vowel reduction and loss, and the inversion (and extension) of this historical process, predictable vowels in Selau are remnant vowels. Selau remnant vowels facilitate the pronunciation of consonant sequences at the phonetic level, but, at the same time, are phonologically stress-bearing.35

35 Blust (2003: 152) highlights several unresolved problems raised by the Selau data. The most challenging, he suggests, is ‘to find a reason why Selau, apparently alone among the more than 1000 Austronesian languages, has evolved a canonical shape which permits vowelless words’. But is Selau alone in having this property? Mapos Buang (Hooley 1970, 2006), another Oceanic language spoken in the central part of the Snake River Valley in Morobe Province, Papua New Guinea, appears to have a very similar sound pattern.

Broader implications of this study relate to relationships between contemporary sound patterns and sound change. Blevins (2004, 2005, 2008) has argued that many aspects of sound patterns reflect recurrent patterns of sound change. In two specific areas, typologies have been expanded in significant ways. In the study of geminate inventories, Blevins (2004, 2005) shows a close match between pathways of geminate evolution and geminate inventory composition: small geminate inventories are typically the result of local restricted consonant assimilations, while full inventories of geminates are common results of post-tonic lengthening and vowel loss between identical consonants. Where earlier studies of geminate inventories attempted to account for composition in terms of natural classes (sonorants, obstruents, etc.), the strongest predictor of geminate inventory composition appears to be pathway of geminate evolution.

Closer in content to the vowel–zero alternations examined here are typologies of consonant epenthesis. Blevins (2008) proposes a major three-way division: (i) consonant epenthesis arising from phonetically transitional intervocalic glides, or glide percepts, (ii) consonant epenthesis from phonologisation of laryngeal boundary marking and (iii) consonant epenthesis from inversion of historical coda loss. In addition to these three major types, subsequent fortition of glides arising from (i) can result in synchronic obstruent epentheses. Each type has a set of expected segmental and distributional properties which distinguish it from the others. Consonant epentheses with sources in V-to-V transitions typically give rise to segments /j w/, and are found intervocalically, but not word-initially or finally. Consonant epentheses based on phonologisation of laryngeal boundary markers generally show themselves as /h ʔ/, and are most common word-initially and word-finally. Consonant epentheses which have origins in weak coda loss show alternations in derived environments, where consonants are those most subject to coda weakening.

In the body of this paper, we have focused on the description and analysis of predictable vowels in Kalam, and implications of this analysis for a typology of predictable vowels.
We have shown that a simple two-way division between intrusive phonologically invisible vowels and epenthetic phonologically visible vowels is too restrictive, and that ‘remnant’ vowels of the Kalam type should also be included. Further, we have shown that the origins of Kalam predictable vowels in historical vowel reduction and loss account for the mixed synchronic properties they exhibit, and the long strings of consonants posited in lexical forms. Given other pathways of vowel evolution, other types of predictable vowels are expected. In (32) we list at least six distinct identifiable pathways by which predictable vowels can arise from natural phonetic processes and reanalysis (i.e. historical rule inversion), and suggest typological classifications based on their known properties.36

36 Here, we limit ourselves to predictable vowels in C# or CC contexts. For a thorough treatment of the evolution of vowels in VC contexts, see Operstein (2007).


(32) General pathways of predictable vowel evolution: an expanded typology
     General pathway                                   Predictable vowel type
  a. C-C with elongated transition                     intrusive (Hall 2006)
  b. C-C with perceived interconsonantal vowel         intrusive (Hall 2006)
  c. C° with release phonologised as vowel             paragogic (Blevins & Garrett 1998)
  d. unstressed vowel reduction                        remnant (Kobon; see above)
  e. unstressed vowel deletion + rule inversion        remnant (Kalam; see above)
  f. unstressed vowel syncope + rule inversion         epenthetic (Yokuts; note 4)
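As an illustrative aside, the classification in (32) can be held in a small lookup table. The following Python sketch only restates the six rows of (32) and groups the pathway labels by the vowel type they lead to; the names PATHWAYS and pathways_by_type are hypothetical and are not part of the original study.

```python
from collections import defaultdict

# Rows of (32): label -> (general pathway, predictable vowel type).
# The wording follows the table; the data structure itself is an assumption.
PATHWAYS = {
    "a": ("C-C with elongated transition", "intrusive"),
    "b": ("C-C with perceived interconsonantal vowel", "intrusive"),
    "c": ("C with release phonologised as vowel", "paragogic"),
    "d": ("unstressed vowel reduction", "remnant"),
    "e": ("unstressed vowel deletion + rule inversion", "remnant"),
    "f": ("unstressed vowel syncope + rule inversion", "epenthetic"),
}


def pathways_by_type(pathways):
    """Group pathway labels by the predictable vowel type they give rise to."""
    grouped = defaultdict(list)
    for label, (_description, vowel_type) in pathways.items():
        grouped[vowel_type].append(label)
    return dict(grouped)


if __name__ == "__main__":
    for vowel_type, labels in pathways_by_type(PATHWAYS).items():
        print(vowel_type, labels)
```

Grouping by type makes the main point of the table visible at a glance: more than one historical pathway can feed the same synchronic vowel type, and the two-way split into intrusive and epenthetic vowels covers only some of the rows.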

A full treatment of these alternative pathways and the predictable vowel types associated with them begs for future study. For now, we offer brief comments based on available case studies. Pathways (32a) and (32b) are discussed at length in Hall (2006) and lead to the synchronic patterns she classifies as intrusive vowels. Pathway (32c) leads to the evolution of word-final paragogic vowels, whose quality and distribution distinguish them from intrusive vowels. Unlike intrusive vowels which occur between consonants, paragogic vowels are common word-finally. Further, unlike intrusive vowels (5b), paragogic vowels are often copies of the preceding vowel, independent of the quality of the intervening consonant. Pathway (32d) would seem to classify the synchronic stage of Kobon as described above: reduced vowels are found only in their historical positions, while (32e) describes the extension of reduced vowels via reanalysis: a stage of reduced vowel–zero alternations gives rise to a synchronic rule of predictable vowel insertion which is extended to environments where no historical vowels were present. This is the pattern found for Kalam. Finally, (32f) shows the pathway of a more restrictive type of historical vowel loss: medial vowel syncope. When vowel–zero alternations arising from this historical process are interpreted as synchronic vowel insertions, vowels appear to function as syllable repairs.

A notable property of the typology in (32) is that it is agnostic with respect to whether vowels are ‘phonological’ or ‘phonetic’. A phonetic process begins as a gradient variable aspect of the speech signal and evolves into a categorical invariant pattern. At the early stages, the process will have features we associate with phonetic effects, at some point, effects associated with phonologisation, and later, the pattern may simply be left as a fossil record in the lexicon. Hall (2006: 422) acknowledges that this is the case for vowel intrusion: ‘like other phonetic processes, it may become phonologised. A vowel sound that originated as intrusive may be reanalysed over time as a segmental vowel’. We suggest that this is true for all of the phonetically based processes giving rise to predictable vowels. Since any predictable vowel arising from the pathways in (32) may be phonologised, phonological visibility is not a useful heuristic for establishing vowel type. Pathways (32a–c) will typically give rise to phonologically invisible vowels at their origins. However, since all predictable vowels can undergo phonologisation, the visibility of vowels in the phonology may tell us little about other typological characteristics. Recall from the discussion above that in both Alamblak and Yimas certain parts of the stress-assignment algorithm must ignore predictable vowels, although, at what under rule-ordering accounts would be ‘later’ levels of the derivation, the same vowels can be stressed.

Under the reconceptualisation of predictable vowel typology in (32), a wider range of predictable vowel types than the two proposed by Hall (2006) is expected, with a range of distributional properties, quality patterns and visibility parameters. Articulatory and perceptual expansion of the speech stream are not the only sources of predictable vowels; reduction and loss play a role as well, along with systematic reanalysis of alternating segments. As the Kalam data shows, articulatory reduction, deletion and reanalysis can lead to a pattern of predictable vowel distribution with its own identifiable signature. As more languages with this signature are discovered, and other patterns of predictable vowels are carefully explored, the proposed typology in (32) can be more thoroughly evaluated.

REFERENCES

Archangeli, Diana (1991). Syllabification and prosodic templates in Yawelmani. NLLT 9. 231–283.
Atoyebi, Joseph (in progress). A reference grammar of Oko. PhD dissertation, Max Planck Institute for Evolutionary Anthropology, Leipzig.
Barnes, Jonathan (2006). Strength and weakness at the interface: positional neutralization in phonetics and phonology. Berlin & New York: Mouton de Gruyter.
Biggs, Bruce (1963). A non-phonemic central vowel type in Karam, a ‘Pygmy’ language of the Schrader Mountains, Central New Guinea. Anthropological Linguistics 5:4. 13–17.
Blevins, Juliette (1995). The syllable in phonological theory. In John A. Goldsmith (ed.) The handbook of phonological theory. Cambridge, Mass. & Oxford: Blackwell. 206–244.
Blevins, Juliette (2004). Evolutionary Phonology: the emergence of sound patterns. Cambridge: Cambridge University Press.
Blevins, Juliette (2005). The typology of geminate inventories: historical explanations for recurrent sound patterns. Proceedings of the Seoul Linguistics Forum 2005. Seoul: Language Education Institute, Seoul National University. 121–137.
Blevins, Juliette (2006). A theoretical synopsis of Evolutionary Phonology. Theoretical Linguistics 32. 117–166.
Blevins, Juliette (2008). Consonant epenthesis: natural and unnatural histories. In Jeff Good (ed.) Linguistic universals and language change. Oxford: Oxford University Press. 79–107.
Blevins, Juliette & Andrew Garrett (1998). The origins of consonant–vowel metathesis. Lg 74. 508–556.
Blust, Robert (2003). Vowelless words in Selau. In John Lynch (ed.) Issues in Austronesian historical phonology. Canberra: Australian National University. 143–152.


Blust, Robert (2007). Disyllabic attractors and anti-antigemination in Austronesian sound change. Phonology 24. 1–36.
Broselow, Ellen (1992). Parametric variation in Arabic dialect phonology. In Ellen Broselow, Mushira Eid & John McCarthy (eds.) Perspectives on Arabic linguistics IV. Amsterdam & Philadelphia: Benjamins. 7–45.
Browman, Catherine P. & Louis Goldstein (1986). Towards an articulatory phonology. Phonology Yearbook 3. 219–252.
Browman, Catherine P. & Louis Goldstein (1992). ‘Targetless’ schwa: an articulatory analysis. In Gerard J. Docherty & D. Robert Ladd (eds.) Papers in laboratory phonology II: gesture, segment, prosody. Cambridge: Cambridge University Press. 26–56.
Bruce, Les (1984). The Alamblak language of Papua New Guinea (East Sepik). Canberra: Australian National University.
Bybee, Joan (2001). Phonology and language use. Cambridge: Cambridge University Press.
Capell, Arthur (1963). Survey word list (standard) Bougainville survey extra: Halia/Selau dialect (Austronesian), Tipot Bougainville I, North Solomons, Bougainville. Available (December 2009) at http://www.paradisec.org.au/fieldnotes/NSBHAL.htm.
Coberly, Mary (2002). Towards a proto-Kalamic phonology and lexicon. Ms, Plains Language Center, Centennial, Wyoming.
Coleman, John S. (1999). The nature of vocoids associated with syllabic consonants in Tashlhiyt Berber. In John J. Ohala, Yoko Hasegawa, Manjari Ohala, Daniel Granville & Ashlee C. Bailey (eds.) Proceedings of the 14th International Congress of Phonetic Sciences. Berkeley: Department of Linguistics, University of California, Berkeley. 735–738.
Coleman, John S. (2001). The phonetics and phonology of Tashlhiyt Berber syllabic consonants. Transactions of the Philological Society 99. 29–64.
Comrie, Bernard (1991). On Haruai vowels. In Andrew Pawley (ed.) Man and a half: essays in Pacific anthropology and ethnobiology in honour of Ralph Bulmer. Auckland: Polynesian Society. 393–397.
Crosswhite, Katherine M. (2004). Vowel reduction. In Bruce Hayes, Robert Kirchner & Donca Steriade (eds.) Phonetically based phonology. Cambridge: Cambridge University Press. 191–231.
Crowley, Terry (1998). An Erromangan (Sye) grammar. Honolulu: University of Hawai’i Press.
Davidson, Lisa & Maureen Stone (2003). Epenthesis versus gestural mistiming in consonant cluster production: an ultrasound study. WCCFL 22. 165–178.
Davies, John (1980). Kobon phonology. Canberra: Australian National University.
Davies, John (1981). Kobon. Amsterdam: North-Holland.
Davies, John (1985). Kobon dictionary. Ms, St John’s College, Cambridge.
Dell, François & Mohamed Elmedlaoui (1985). Syllabic consonants and syllabification in Imdlawn Tashlhiyt Berber. Journal of African Languages and Linguistics 7. 105–130.
Dell, François & Mohamed Elmedlaoui (1996a). Nonsyllabic transitional vocoids in Imdlawn Tashlhiyt Berber. In Jacques Durand & Bernard Laks (eds.) Current trends in phonology: models and methods. Salford: ESRI. 217–244.
Dell, François & Mohamed Elmedlaoui (1996b). On consonant releases in Imdlawn Tashlhiyt Berber. Linguistics 34. 357–395.
Dench, Alan (1991). Panyjima. In R. M. W. Dixon & Barry J. Blake (eds.) The handbook of Australian languages. Vol. 4. Oxford: Oxford University Press. 124–243.
Dunn, Ernest F. (1968). An introduction to Bini. East Lansing: African Studies Center, Michigan State University.


Elugbe, Ben Ohiọmamhẹ (1989). Comparative Edoid: phonology and lexicon. Port Harcourt: University of Port Harcourt Press.
Foley, William A. (1986). The Papuan languages of New Guinea. Cambridge: Cambridge University Press.
Foley, William A. (1991). The Yimas language of New Guinea. Stanford: Stanford University Press.
Gahl, Susanne & Alan C. L. Yu (eds.) (2006). Exemplar-based models in linguistics. Special issue. The Linguistic Review 23. 213–379.
Goddard, Cliff (1992). Pitjantjatjarra/Yankunytjatjara to English dictionary. Alice Springs: Institute for Aboriginal Development.
Guerssel, Mohamed (1977). Constraints on phonological rules. Linguistic Analysis 3. 267–305.
Hall, Nancy (2006). Cross-linguistic patterns of vowel intrusion. Phonology 23. 387–429.
Harms, Robert T. (1976). The segmentalization of Finnish ‘nonrules’. Texas Linguistic Forum 5. 73–88.
Harris, John (2005). Vowel reduction as information loss. In Philip Carr, Jacques Durand & Colin J. Ewen (eds.) Headhood, elements, specification and contrastivity. Amsterdam & Philadelphia: Benjamins. 119–132.
Hayes, Bruce (1995). Metrical stress theory: principles and case studies. Chicago: University of Chicago Press.
Henderson, John & Veronica Dobson (1994). Eastern and Central Arrernte to English Dictionary. Alice Springs: IAD Press.
Hooley, Bruce A. (1970). Mapos Buang: territory of New Guinea. PhD dissertation, University of Pennsylvania.
Hooley, Bruce A. (2006). Mapos Buang dictionary. Ukarumpa: Summer Institute of Linguistics, Papua New Guinea. Available (December 2009) at http://www.sil.org/pacific/png/abstract.asp?id=49641.
Itô, Junko (1989). A prosodic theory of epenthesis. NLLT 7. 217–259.
Kenstowicz, Michael & Charles Kisseberth (1979). Generative phonology: description and theory. New York: Academic Press.
Kenstowicz, Michael & Charles Pyle (1973). On the phonological integrity of geminate clusters. In Michael Kenstowicz & Charles W. Kisseberth (eds.) Issues in phonological theory: proceedings of the Urbana Conference on Phonology. The Hague & Paris: Mouton. 27–43.
Kiparsky, Paul (2003). Syllables and moras in Arabic. In Caroline Féry & Ruben van de Vijver (eds.) The syllable in Optimality Theory. Cambridge: Cambridge University Press. 147–182.
Langdon, Margaret (1970). A grammar of Diegueño: the Mesa Grande dialect. Berkeley: University of California Press.
Laycock, D. C. (1965). The Ndu language family (Sepik District, New Guinea). Canberra: Australian National University.
Levin, Juliette (1987). Between epenthetic and excrescent vowels (or what happens after redundancy rules). WCCFL 6. 187–201.
Lichtenberk, Frantisek (1983). A grammar of Manam. Honolulu: University of Hawaii Press.
Majnep, Ian Saem & Ralph Bulmer (1983). Some food plants in our Kalam forests, Papua New Guinea. Working Paper 63, Department of Anthropology, University of Auckland.
Majnep, Ian Saem & Ralph Bulmer (1990). Kalam hunting traditions. Vols 1–6. Department of Anthropology Working Papers, University of Auckland.
Matteson, Esther (1965). The Piro (Arawakan) language. Berkeley: University of California Press.
Newman, Stanley (1944). Yokuts language of California. New York: Viking Fund.


Operstein, Natalie (2007). Prevocalization: evidence for a new model of intrasegmental consonant structure. PhD dissertation, University of California, Los Angeles.
Pawley, Andrew (1966). The structure of Kalam: a grammar of a New Guinea highlands language. PhD dissertation, University of Auckland.
Pawley, Andrew (1992). Kalam Pandanus language: an old New Guinea experiment in language engineering. In Tom Dutton, Malcolm Ross & Darrell Tryon (eds.) The language game: papers in memory of Donald C. Laycock. Canberra: Australian National University. 313–334.
Pawley, Andrew (2001). The Proto Trans New Guinea obstruents: arguments from top-down reconstructions. In Andrew Pawley, Malcolm Ross & Darrell Tryon (eds.) The boy from Bundaberg: studies in Melanesian linguistics in honour of Tom Dutton. Canberra: Australian National University. 261–300.
Pawley, Andrew (2008). Some Trans New Guinea Phylum cognate sets. Computer file, Australian National University.
Pawley, Andrew & Ralph Bulmer (2003). A dictionary of Kalam with ethnographic notes. Printout and computer file, Australian National University.
Pawley, Andrew & Meredith Osmond (1998). The Madang group of Papuan languages: cognate sets and sound correspondences. Ms, Australian National University.
Pearce, Mary (2004). Kera foot structure. Ms, University College London.
Rose, Sharon (2000). Epenthesis positioning and syllable contact in Chaha. Phonology 17. 397–425.
Ross, Malcolm, Andrew Pawley & Meredith Osmond (eds.) (2003). The lexicon of Proto Oceanic: the culture and environment of ancestral Oceanic society. Vol. 2: The physical environment. Canberra: Australian National University.
Scholz, Lyle (1976). Revised Kalam orthography. Ms, SIL, Ukarumpa.
Scholz, Lyle (1995). Organized phonology data: Kalam [Etip dialect] (Karam) language. Ms, SIL, Ukarumpa.
Selkirk, Elisabeth (1981). Epenthesis and degenerate syllables in Cairene Arabic. In Hagit Borer & Yosef Aoun (eds.) Theoretical issues in the grammar of Semitic languages. Cambridge, Mass.: MIT. 111–140.
Smeets, Ineke (2008). A grammar of Mapuche. Berlin & New York: Mouton de Gruyter.
Vennemann, Theo (1988). Preference laws for syllable structure and the explanation of sound change: with special reference to German, Germanic, Italian, and Latin. Berlin: Mouton de Gruyter.
Warner, Natasha, Allard Jongman, Anne Cutler & Doris Mücke (2001). The phonological status of Dutch epenthetic schwa. Phonology 18. 387–420.

Phonology 27 (2010) 45–76. © Cambridge University Press 2010 doi:10.1017/S0952675710000035

Prosodic fusion and minimality in Kabardian*

Matthew Gordon
Ayla Applebaum
University of California, Santa Barbara

The Northwest Caucasian language Kabardian displays a typologically unusual process of word formation, whereby two lexical roots fuse to form a single prosodic word whose phonological behaviour is parallel to prosodic words containing a single root. It is shown that this process of fusion, which is subject to a number of phonological and morphosyntactic restrictions, reflects a typologically unusual response to a cross-linguistically common minimal word requirement banning monomoraic prosodic words. Rather than employing segmental lengthening or insertion to ensure that minimality is satisfied, Kabardian chooses to violate the one-to-one mapping between grammatical and prosodic words. A further complication is the scalar nature of minimality in Kabardian: while the impossibility of fusion in certain prosodic and morphosyntactic contexts allows monomoraic prosodic words to surface, a more stringent minimality restriction ensures that all prosodic words have at least one mora.

1 Introduction

Many languages impose minimality restrictions on the size of prosodic words (McCarthy & Prince 1995). For example, the smallest prosodic word in Chickasaw (Munro & Willmond 1994) is CVV. Thus monosyllabic words of the shape CVV (1a) are found in Chickasaw, as are disyllabic words (1b). Monosyllables of the form CVC (1c) and CV (1d) are prohibited, however.

* The authors wish to thank three anonymous reviewers and the associate editor for their numerous insightful comments and suggestions on earlier versions of this paper. Thanks also to John Colarusso, Bruce Hayes, Pam Munro, Jaye Padgett, Colin Wilson, Kie Zuraw and audiences at UCLA and UCSC for their feedback on this research. A great debt of gratitude is owed to the many speakers without whose generosity and linguistic expertise this study would have been impossible. This work was supported by NSF Grant BCS0553771 and a UCSB Academic Senate Grant to the ﬁrst author and an ELDP grant from the Hans Rausing Foundation to the second author.


(1) a. ja:   ‘s/he cries’        b. koni  ‘skunk’     c. *kon     d. *ko
       ma:   ‘s/he mows it’         hapi  ‘salt’         *tap        *ta
       la:   ‘it is ploughed’       iti   ‘mouth’        *nim        *ni

This paper explores word-minimality effects in Kabardian, a Northwest Caucasian language spoken primarily in the Kabardino-Balkar republic of Russia and in Turkey. Kabardian is of interest from a minimality perspective, due to the scalar nature of its minimal word requirement. The prosodically worst-formed words consist simply of a single consonant, i.e. C. Slightly better are CV words, while optimal words are at least bimoraic, i.e. CVC, CVV or CVCV. The resulting hierarchy of well-formedness is Cμμ > Cμ > C. This minimality scale is evinced by two alternations that conspire to improve the standing of words within the hierarchy. First, the prosodic fusion of CV roots to an immediately preceding root under certain prosodic and morphosyntactic conditions indicates that CVC and larger (i.e. CVV or disyllabic) prosodic words are preferable to CV words. Thus, the roots /ow@nP/ ‘house’ and /oS’P/ ‘new’ merge to form a single prosodic word [w@onPS’P] ‘new house’, in order to beef up the CV root /oS’P/. An epenthetic schwa is inserted between the roots in the fused form if the first root ends in a consonant: /S@d+S’P/ ‘donkey+new’ → [S@od@S’P] ‘new donkey’. The second minimality-induced alternation shows that a CV prosodic word is preferable to one consisting of only a consonant. Roots which can be analysed as underlyingly consisting only of a consonant, and which surface as such in fused forms, are realised with a final schwa, i.e. as CV, when conditions on prosodic fusion are not satisfied. For example, the noun meaning ‘horse’ has two realisations: [S@] in isolation or when it occurs in a phrasal context that does not allow for fusion, and [S] in a position where fusion is allowed, e.g. [z@S] ‘one horse’ (= /z@+S/ ‘one+horse’). The phenomenon of minimality in Kabardian is of relevance to several phonological issues.
First, the scalar nature of Kabardian minimality necessitates an analysis appealing to a more complex interaction between prosodic constraints than those driving the cross-linguistically more common binary minimality conditions. In particular, the Kabardian data argue for a fine-grained series of foot well-formedness constraints. A constraint requiring that feet be binary drives prosodic fusion, while a higher-ranked constraint requiring feet to be minimally monomoraic ensures that all words surface with at least one vowel. Crucially, foot-binarity is more stringently enforced for feet carrying phrasal stress than those carrying word-level stress. Furthermore, the phenomenon of epenthesis that accompanies fusion provides evidence for a distinction between a relatively highly ranked anti-epenthesis constraint holding within roots and a lower-ranked generic anti-epenthesis constraint.
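The three-way scale can be made concrete by counting moras over a CV skeleton. The following is only an illustrative sketch, not the constraint-based analysis developed in the paper: it assumes that every vowel slot and every coda consonant contributes one mora (which is consistent with CVC, CVV and CVCV all counting as bimoraic), and the names `mora_count` and `minimality_rank` are hypothetical.

```python
def mora_count(skeleton: str) -> int:
    """Count moras in a CV skeleton such as "CVC" or "CVCV".

    Assumption for this sketch: every V slot is one mora, and a C that
    follows a vowel without itself preceding a vowel (a coda) is one mora.
    Onset consonants contribute nothing.
    """
    moras = 0
    seen_vowel = False
    for i, slot in enumerate(skeleton):
        if slot == "V":
            moras += 1
            seen_vowel = True
        elif slot == "C":
            next_slot = skeleton[i + 1] if i + 1 < len(skeleton) else ""
            if seen_vowel and next_slot != "V":
                moras += 1  # coda consonant
        else:
            raise ValueError(f"unexpected slot {slot!r}")
    return moras


def minimality_rank(skeleton: str) -> int:
    """Return 0 for a bare consonant, 1 for a monomoraic word, 2 otherwise."""
    return min(mora_count(skeleton), 2)


# Print the rank of each word shape mentioned in the text.
for shape in ["C", "CV", "CCV", "CVC", "CVV", "CVCV"]:
    print(shape, minimality_rank(shape))
```

A higher rank corresponds to a better-formed prosodic word on the scale; the cap at 2 reflects the fact that the text draws no further distinction among words that are bimoraic or larger.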

The structure of the paper is as follows. §2 presents background information on the Kabardian phoneme inventory. §3 examines the structure of prosodic words and phenomena that are diagnostic for them. §4 introduces the process of prosodic fusion, while §5 presents an optimality-theoretic analysis of fusion. The process of vowel epenthesis in fusion and non-fusion contexts is examined in §6. §7 situates the minimality effects in Kabardian within the broader typology of minimal word requirements. Finally, §8 summarises the results of the paper.

2 Background on Kabardian

Kabardian is a Northwest Caucasian language spoken by approximately 647,000 people (Gordon 2005). While the majority of speakers of Kabardian reside in the Kabardino-Balkar republic of Russia, nearly one-third of the speakers live in Turkey, owing to a mass migration from Russia in the 19th century. The present paper focuses on the variety of Kabardian spoken in Turkey. The data on prosodic fusion discussed in this paper are primarily based on fieldwork conducted with twelve speakers during three trips to Turkey between 2005 and 2007. The basic phenomenon of fusion in Kabardian is described in several published descriptions of the language that are based on Kabardian as spoken in Russia, including Turchaninov & Tsagov (1940), Jakovlev (1948), Kardanov & Bichoev (1955), Abitov et al. (1957), Kuipers (1960) and Colarusso (1992, 2006). Forms presented in the paper are from fieldnotes, with the exception of data accompanied by a reference to published materials. In the latter case, cited forms were checked with native speakers. In addition to offering a theoretical account of prosodic fusion, this paper builds on these published works by describing additional complications in the pattern and relating fusion to prosodic constituency in Kabardian.

2.1 Consonants

Like other languages of the Caucasus, Kabardian is characterised by a large consonant inventory and a small vowel system. The consonants of Turkish Kabardian are given in (2).¹

1 Speakers from Turkey have a single set of palato-alveolar fricatives, unlike speakers of literary Kabardian from Russia, who have both the palato-alveolars /S Z/ and the alveopalatals /0 0’ 1/. In addition, the palatal fricative /ç/ employed by the speakers examined in this paper corresponds to a more posterior /x/ for many speakers of Kabardian from Russia. Speakers of Kabardian from Russia generally also have a voiced palato-alveolar affricate /J/ instead of the more conservative (Kuipers 1960: 21) voiced palatalised velar stop /gj/ used by speakers from Turkey.


(2) The consonant phonemes of Turkish Kabardian
    Places of articulation: labial, denti-alveolar, palato-alveolar, palatal/palatalised velar, velar, uvular, pharyngeal, laryngeal
    stop        p b t d kj gj kw q qw ? ?w
                p’ t’ kj’ kw’ q’ qw’
    affricate   ts dz
                ts’
    fricative   f v s z S Z ç ; xw Gw X ¶ ) h
                f’ [w] S’ [gw] Xw ¶w
    nasal       m n
    lateral     ¡ l
                ¡’
    tap         P
    glide       j

2.2 Vowels

There is considerable controversy about the number of vowel phonemes in Kabardian. Most sources (e.g. Jakovlev 1930, 1948, Turchaninov & Tsagov 1940, Catford 1942, 1984, Abitov et al. 1957, Bagov et al. 1970) assume two short central vowels differing in height /@ P/, as well as a longer and qualitatively lower vowel /a:/.² The long /a:/ is joined on the surface by four other long vowels [i: e: o: u:], which are typically regarded as deriving from underlying sequences of short vowel+glide: [i:]=/@j/, [e:]=/Pj/, [o:]=/Pw/, [u:]=/@w/. The short vowels also have many different surface variants, depending on surrounding consonants, e.g. rounded allophones adjacent to labialised consonants, retracted allophones next to uvulars and pharyngeals, etc. (see Colarusso 1992, 2006 for description and Choi 1991 for phonetic data).

2 Kuipers (1960) is exceptional in assuming that all instances of [@] are the result of epenthesis and that [P a:] are allophones. Duration measurements by Choi (1991) indicate that the lowest vowel is best analysed as a long vowel, since it is nearly twice as long as the next lowest vowel quality /P/. Some instances of the long low vowel occur in morphophonemic alternation with the slightly higher short vowel /P/, a fact which has led Colarusso (1992) to posit only two underlying vowels and derive the long /a:/ by rule. However, as Colarusso (1992, 2006) points out, there are several instances in which the occurrence of the long /a:/ is not predictable.

3 The word in Kabardian

Words in Kabardian consist of a root and optional prefixes and suffixes. The combination of root+bound affixes will henceforth be termed the ‘grammatical word’, in contrast to the ‘prosodic word’, which excludes certain suffixes (see §4.2 on the relationship between affixes and the prosodic word). Diagnostics for the prosodic word are considered in §3.2.

3.1 Distribution of schwa

A prosodic word may end either in a consonant or a vowel.³ Any of the surface long vowels (underlyingly short vowel+glide sequences) may occur in word-final position. However, while short /P/ may occur in final position of both monosyllabic and polysyllabic words (3a), final schwa is restricted to monosyllabic prosodic words (3b). The restriction of schwa to monosyllables, a distribution noted by Kuipers (1960) and Colarusso (1992, 2006), will play a central role in the analysis of prosodic fusion.

(3) a. p’Æ          ‘bed, place’        b. ¡@    ‘blood’
       ?Æ           ‘hand, arm’            t’@   ‘ram’
       SÆ           ‘milk, bullet’         S@    ‘horse, three’
       ‘?ÆnÆ        ‘table’                ¡’@   ‘man’
       ‘w@nÆ        ‘house’                bl@   ‘seven’
       ‘da:mÆ       ‘wing’                 f’@   ‘good’
       Xa:P’z@nÆ    ‘good’

The ban on ﬁnal schwa in polysyllabic words leads to allomorphic variation in the shape of monosyllabic roots, which end in schwa in isolation, but lack ﬁnal schwa when preﬁxed (4).

(4) s-o:-Sç   1erg-hab-eat      ‘I eat it (hab)’     cf. Sç@   ‘eat!’
    s-o:-tç   1erg-hab-write    ‘I write it (hab)’       tç@   ‘write!’
    j@-¡      3poss-blood       ‘his, her blood’         ¡@    ‘blood’
    ja:-S’    3pl poss-earth    ‘their earth’            S’@   ‘earth’
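The surface generalisation behind (4), that final schwa survives only in monosyllabic prosodic words, can be restated procedurally. The sketch below is a restatement of the descriptive pattern only and should not be read as the authors' analysis, which appeals to surface well-formedness constraints. The function name `realise_final_schwa` is hypothetical, syllable boundaries are supplied by hand, and schwa is written `@` purely to match the way it is rendered in the examples here (the symbol is a parameter).

```python
def realise_final_schwa(syllables: list[str], schwa: str = "@") -> list[str]:
    """Drop a word-final schwa unless the prosodic word is monosyllabic.

    The onset stranded by the loss of schwa is attached to the preceding
    syllable as a coda. Any other final vowel is left untouched.
    """
    if len(syllables) > 1 and syllables[-1].endswith(schwa):
        stranded = syllables[-1][: -len(schwa)]
        return syllables[:-2] + [syllables[-2] + stranded]
    return list(syllables)


# Bare root: the monosyllable keeps its schwa.
print(realise_final_schwa(["¡@"]))
# Prefixed root: the word is now disyllabic, so the final schwa is lost.
print(realise_final_schwa(["j@", "¡@"]))
```

Because the function only inspects the final syllable, a final vowel other than schwa is preserved in words of any length, in line with the contrast between (3a) and (3b).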

Alternations between final schwa and zero underlie the long-standing controversy surrounding the status of schwa in Kabardian. While most analyses (e.g. Jakovlev 1930, 1948, Turchaninov & Tsagov 1940, Catford 1942, 1984, Abitov et al. 1957, Bagov et al. 1970) treat the unaffixed form as underlying and thus assume that schwa is present lexically, Kuipers (1960) and Peterson (2007) treat the schwa-less allomorph as basic, and derive the schwa found in the bare root by epenthesis. In the representations given in this paper, the schwa will arbitrarily be treated as underlying, for the sake of expository clarity. As will become apparent, however, this decision is not crucial for the proposed analysis, which appeals to surface well-formedness constraints rather than input–output faithfulness to account for the alternation between root-final schwa and zero. The failure of final schwa to surface is a pervasive pattern in Kabardian that is also observed in the cases of prosodic fusion that are the focus of the paper. Prosodic fusion and the behaviour of schwa are discussed in detail in §4. First, however, §3.2 presents several diagnostics for the prosodic word besides the ban on final schwa.

3 Most prosodic words begin with a consonant. The only words beginning with a vowel are a few words that begin with the long vowel /a:/.

3.2 Diagnostics for the prosodic word

Before examining the process of prosodic fusion that is the focus of the paper, we consider evidence for the prosodic word, which plays a crucial role in characterising fusion. §§3.2.1–3 present diagnostics for the prosodic word that can be applied to words resulting from prosodic fusion as well as to words in which fusion does not take place.

3.2.1 Stress in Kabardian. The prosodic word is the domain of stress assignment in Kabardian. Stress is phonologically predictable (Abitov et al. 1957, Colarusso 1992, 2006), falling on a final heavy syllable (CVV, CVC), otherwise on the penult (5).

(5) a. sÆ’b@n      ‘soap’        b. ‘pa:sÆ       ‘early’
       tÆp’SÆkj    ‘plate’          ‘sa:bÆ       ‘dust’
       sa:’bi:     ‘baby’           m@?Æ’P@sÆ    ‘apple’
       na:‘nu:     ‘kid’            XÆr’z@nÆ     ‘good’
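The stress rule illustrated in (5) is simple enough to state as a short function. The sketch below works over hand-syllabified CV skeletons rather than transcriptions; the names `is_heavy` and `stressed_syllable` are hypothetical, and the treatment of monosyllables (stress on the only syllable) is an assumption that the rule as stated leaves implicit.

```python
def is_heavy(syllable: str) -> bool:
    """A syllable is heavy if its rhyme has at least two slots (CVV or CVC)."""
    rhyme = syllable[syllable.index("V"):]
    return len(rhyme) >= 2


def stressed_syllable(syllables: list[str]) -> int:
    """Return the index of the stressed syllable.

    Stress falls on the final syllable if it is heavy, otherwise on the
    penult. A monosyllable is stressed on its only syllable (assumption).
    """
    if not syllables:
        raise ValueError("a prosodic word needs at least one syllable")
    last = len(syllables) - 1
    if last == 0 or is_heavy(syllables[last]):
        return last
    return last - 1


# Skeletons for four of the forms in (5): 'soap', 'baby', 'early', 'apple'.
for word in (["CV", "CVC"], ["CVV", "CVV"], ["CVV", "CV"], ["CV", "CV", "CV", "CV"]):
    print(".".join(word), stressed_syllable(word))
```

Indices are zero-based, so a result of 1 for a disyllable means final stress and 0 means penultimate stress.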

Kabardian also has phrase-level prominence, which will turn out to be a useful diagnostic for determining when two roots within the same phrase do not belong to the same prosodic word (§4.1). The distinction between phrasal stress and word-level stress is diagnosed through a combination of pitch, duration and intensity, and also by the pattern of pitch-accent placement. The pitch accent (H*) within a phrase falls on the primary stressed syllable of the phrase (Applebaum & Gordon 2007). In a phrase consisting of multiple prosodic words, both words have a stressed syllable, and the stressed syllable of the rightmost word characteristically carries the phrasal pitch accent. For example, the first syllable of both words in the phrase [of@z"da:/P] ‘beautiful woman’ has word-level stress, with the stress on the second word carrying the phrasal pitch accent (6).

(6)      H*
    ‘f@z"da:çÆ

3.2.2 Vowel colouring. The phenomenon of vowel colouring provides further evidence for the prosodic word. The quality of vowels is substantially inﬂuenced by neighbouring consonants in Kabardian, with a stronger eﬀect exerted by a following consonant than by a preceding consonant and a stronger inﬂuence on short vowels than on long vowels (see Catford 1942, 1984, Choi 1991, Colarusso 1992, 2006 for details). Vowels are rounded next to rounded consonants, while the place of articulation of consonants conditions the backness and height of neighbouring vowels.

Figure 1 Unaspirated voiceless stop (a) and voiced stop (b) realisations of phonemic voiced stops in the words [ta:mP] ‘wing’ and [fa:dP] ‘drink’. [Waveforms; segment labels: (a) t a: Æ m, (b) f a: Æ d.]

While the effects of vowel colouring are either absent or only observed at rapid speech rates (depending on the speaker) across prosodic words, vowel colouring applies consistently and saliently when the trigger and target are in separate grammatical words that have fused. Examples of coarticulatory rounding, the perceptually most salient of the colouring effects, are given in (7).

(7) /ts’@kw’/ → [ts’ukw’]  ‘small’        /dÆ¶w/ → [dO¶w]  ‘thief’
    /gw@dz/   → [gwudz]    ‘wheat’        /qwÆ/  → [qwO]   ‘pig’

The failure of vowel colouring to apply across prosodic words is illustrated in (8).

(8) ’S@"xwa:bÆ         *Su‘xwa:bÆ          ‘warm horse’
    m@?Æ’P@sÆ"xwa:     *m@?Æ’P@sO‘xwa:     ‘ripe apple’
    ‘w@nÆ"xwa:bÆ       *w@nO‘xwa:bÆ        ‘warm house’

3.2.3 Stop voicing. A ﬁnal diagnostic for the prosodic word is provided by the phonetic realisation of the stop series traditionally classiﬁed as phonemic voiced stops. These ‘voiced ’ stops are usually realised as voiced stops in intervocalic and ﬁnal position of a prosodic word, but as voiceless unaspirated stops in initial position (9).

(9) initial                  intervocalic               final
    [p]a:dzÆ   ‘fly’         sa:[b]Æ    ‘dust’          S’@[b]    ‘back’
    [tz]a:nÆ   ‘naked’       si:[dz]Æ   ‘my tooth’      gw@[dz]   ‘wheat’
    [t]a:mÆ    ‘wing’        fa:[d]Æ    ‘drink’         S@[d]     ‘donkey’
    [kj]a:nÆ   ‘shirt’       ¶u[gj]Æ    ‘mirror’        bÆ[gj]    ‘spider’

Figure 1 depicts waveforms illustrating the realisation of phonemic voiced stops as phonetically voiceless unaspirated word-initially (a) and as voiced stops intervocalically (b).

4 Prosodic fusion

Thus far, all the prosodic words considered have contained a single root. The process of prosodic fusion creates prosodic words consisting of multiple roots. Prosodic fusion applies most pervasively within noun phrases (see §4.1.1 for more on morphosyntactic constraints on fusion), and involves the combination of a head noun with another root within the same noun phrase. Such cases include noun+adjective sequences (10a), compounds consisting of two nouns (10b) and sequences of numeral+noun (10c). Adjectives and numerals appear after the noun they modify, except for /z@/ ‘one’, which occurs before the noun. The numeral suffix [-i:] (/@j/) intervenes between a noun and a following number. Note that all the morphemes (with the exception of the numeral suffix [-i:], which is a bound morpheme) in their free-standing form are given in (10) to the left of the arrow.

(10) a. qwÆ+f’@         pig+good           → qwÆf’        ‘good pig’
        qw’Æ+Z@         son+old            → qwÆZ’        ‘old son’
        w@nÆ+f’@        house+good         → w@nÆf’       ‘good house’
     b. ZÆ+p’q’@        jaw+bone           → ZÆp’q’       ‘chin’
        nÆ+gw@          eye+zone           → nÆgw         ‘face’
        nÆ+ps@          eye+water          → nÆps         ‘tear’
     c. z@+S@           one+horse          → z@S          ‘one horse’
        bzu:+i:+bl@     bird+num+seven     → bzu:i:bl     ‘seven birds’
        bo:+i:+p’¡’@    stable+num+four    → bo:i:p’¡’    ‘four stables’

As the examples in (10) show, schwa at the end of the second root fails to surface, parallel to the situation observed in polysyllabic prosodic words containing a single prefixed root (see §3.1). Parallel to prosodic words consisting of a single root, final /P/ at the end of a noun phrase consisting of more than one root is characteristically not deleted (11).⁴

(11) gw@+SXwÆ   heart+big     → gw@SXwÆ   ‘daring’        cf. no final [@] in [gwi:p’¡’] ‘four hearts’
     S@+fÆ      horse+skin    → S@fÆ      ‘horse skin’        no final [@] in [S@f’] ‘good horse’
     w@nÆ+S’Æ   house+new     → w@nÆS’Æ   ‘new house’         no final [@] in [w@nÆZ] ‘old house’

4 Although our consultants limit final vowel deletion to schwa, some speakers delete final short /P/ as well (John Colarusso, personal communication).

Figure 2 Pitch patterns in (a) the prosodic phrase [q’a:bProd@:"ts’ukw’] ‘the little Kabardian’ and (b) the prosodic word [m@b@s@om@f] ‘this good host’. [Pitch traces; vertical axis: frequency (Hz), with ticks at 150 and 225.]

The loss of final schwa in the forms in (10) and the preservation of final /P/ in (11) follow if one assumes that the roots in both sets of forms have fused to form a single prosodic word. The other diagnostics for prosodic word status presented in §3.2 confirm this analysis. In all forms displaying fusion, stress follows the predictable Kabardian pattern (§3.2.1), according to which the final syllable is stressed if heavy, otherwise the penult is stressed (12).

(12) a. w@nÆ+f’@        → w@’nÆf’       ‘good house’
        bzu:+i:+bl@     → bzu:’i:bl     ‘seven birds’
        bo:+i:+p’¡’@    → bo:’i:p’¡’    ‘four stables’
     b. gw@+SXwÆ        → ‘gw@SXwÆ      ‘daring’
        S@+fÆ           → ‘S@fÆ         ‘horse skin’
        w@nÆ+S’Æ        → w@’nÆS’Æ      ‘new house’

If the prosodic word corresponded to the grammatical word, we would expect each grammatical word to have a stress. Furthermore, we would also expect stress to fall on the syllable preceding the one it actually falls on in cases involving fusion of a word to a preceding polysyllabic word, i.e. *[ow@nPf’] instead of [w@onPf’] and *[o/PmP_’] instead of [/PomP_’]. Figure 2 compares a pitch trace (taken from the speech of a female speaker) illustrating the stress pattern characteristic of a prosodic word resulting from fusion (b) with the typical pitch pattern found in a phrase consisting of two prosodic words (a). In the phrase consisting of the prosodic words /q’a:bProd@j/ ‘Kabardian’ and /ts’@kw’/ ‘little’, the highest pitch peak, associated with the phrasal H* pitch accent (§3.2.1), falls on the second word, with a clear secondary peak representing word-level stress falling on the final syllable of the first word. In the single prosodic word resulting from fusion of the prefixed noun /m@b@s@m/ ‘this host’ and the adjective /f’@/ ‘good’, there is a single pitch peak on the final syllable. Note that the schwa occurring between the two roots is due to a regular process of epenthesis occurring in fused forms that would otherwise have a consonant cluster at the boundary between the two roots (see §6 for discussion of epenthesis).

Vowel colouring (§3.2.2) also occurs across grammatical words that have fused. For example, the fused form [SuXw] ‘stallion’ (= /S@+Xw@/ ‘horse+male’) (Colarusso 1992: 47) is realised with a rounded vowel, owing to the labialised consonant in the second root. The rounding in this form can be compared with the lack of rounding in noun phrases in which fusion has not taken place, [oS@"xwa:bP], *[Suoxwa:bP] ‘warm horse’, [m@?PoP@sP"xwa:], *[m@?PoP@sOoxwa:] ‘ripe apple’.

Finally, in forms involving fusion, an intervocalic voiced stop or stop phase of an affricate is obligatorily realised as a voiced stop, parallel to the realisation of voiced stops in prosodic words consisting of a single root (§3.2.3), e.g. /S@+tsP/ ‘horse+tooth’ → [oS@dzP]. In contrast, in phrases comprised of multiple prosodic words, intervocalic voicing of stops across word boundaries is limited to fast speech rates. Figure 3 depicts a waveform illustrating voicing in a stop that comes to stand in intervocalic position due to prosodic fusion of the root /S@d/ ‘donkey’ with the adjective /Z@/ ‘old’.

Figure 3 Intervocalic voiced stop in the fused form [S@d@Z] ‘old donkey’. [Waveform; segment labels: S @ d @ Z.]

In summary, phonologically fused roots pattern as prosodic words, parallel to grammatical words consisting of a single root. Diagnostics that indicate the prosodic word status of fused forms include deletion of final schwa, stress assignment, vowel colouring and the distribution of voicing in stops.

4.1 Restrictions on fusion

Prosodic fusion is subject to certain restrictions, both morphosyntactic and phonological, which are discussed in §§4.1.1 and 4.1.2 respectively.

4.1.1 Morphosyntactic restrictions on fusion.
Fusion is limited to noun and adjective phrases and can only involve constituents (with one exception to be discussed later). These conditions limit fusion to sequences of noun+adjective (13a), noun+numeral (13b), numeral+noun (13c), noun+noun compounds (13d) and adverb+adjective sequences (13e). They preclude fusion between nouns and other syntactic classes, even if they are constituents and satisfy phonological conditions for fusion, for example, a CV noun and a verb belonging to the same verb phrase, e.g. /S@S/@/ ‘eat a horse!’.

(13) a. w@’nÆS’Æ ‘new house’        [NP [N w@n] [AdjP [Adj S’Æ]]]
     b. bzu:’i:p’¡’ ‘four birds’    [NP [N bzu:] [NumP⁵ i:p’¡@]]
     c. z@S ‘one horse’             [NP [NumP z@] [N S@]]
     d. ’S@fÆ ‘horse skin’          [NP [N S@] [N fÆ]]
     e. na:’X@f’ ‘better’⁶          [AdjP [AdvP na:X] [Adj f’@]]

The constituency requirement also precludes fusion between two adjectives to the exclusion of a noun (14a), or between a noun and a preceding modiﬁer such as a numeral or a possessive preﬁx to the exclusion of a modiﬁer following the noun (14b).

(14) a. S@+Xw@+f’@ → *‘S@"Xwuf’
        horse+male+good ‘good stallion’
        [Tree diagram; node labels: NP, N′, N, AdjP; terminals: S@, Xw@, f’@]
     b. z@+S@+f’@ → *‘z@S"f’@
        one+horse+good ‘one good horse’
        [Tree diagram; node labels: NP, NumP, N′, N, AdjP; terminals: z@, S@, f’@]

5 We take no position on the internal structure of postnominal numeral phrases.

6 The schwa intervening between the adverb and the adjective in this form reflects a regular process of epenthesis applying in fused forms in which the first root ends in a consonant (see §6 for discussion of epenthesis).

It does, however, allow for the possibility of recursive fusion involving a noun+two adjectives (15a), a noun+adjective sequence following a numeral (15b), a compound followed by one or more adjectives (15c) or other combinations of words eligible for fusion within the noun phrase. Examples (15a) and (15b) are the grammatical versions of their ungrammatical counterparts in (14a) and (14b) respectively.

(15) a. S@+Xw@+f’@ → S‘uXwuf’
     b. z@+S@+f’@ → z@‘S@f’
     c. S@+fÆ+Z@+f’@+i:+p’¡’@ → S@fÆZ@‘f’i:p’¡’@⁷
        horse+skin+old+good+num+four ‘four good old horse skins’
        [Tree diagram; node labels: NP, N′, N, AdjP, NumP; terminals: S@, fÆ, Z@, f’@, i:p’¡’@]

The attested form in (15b) differs minimally from its ungrammatical counterpart in (14b) in having its second schwa before the final consonant rather than after it, i.e. [z@oS@f’] vs. *[oz@S"f’@]. The grammatical form in (15a) differs from its unattested counterpart in (14a) in displaying rounding of the vowel preceding the initial [Xw] of the second root, i.e. [SuoXwuf’] vs. *[oS@"Xwuf’]. In addition, the unattested forms in (14a) and (14b) have a stress on each of the prosodic words, with the stress on the second word being stronger (§3.2.1). In contrast, the grammatical forms in (15) have a single stress.

There is one circumstance in which fusion can apply between words that are not constituents. If an adjective that is too large to fuse to a preceding noun (16a) or an already fused form is followed by a numeral (16b), the numeral fuses to the preceding element.

7 A short vowel preceding the numeral marker /-i:/ is deleted.


(16) a. S@+da:çÆ+i:+p’¡’@ → ‘S@dÆ"çi:p’¡’, *S@‘da:çi:"p’¡’@
        horse+beautiful+num+four ‘four beautiful horses’
        [Tree diagram; node labels: NP, N′, N, AdjP, NumP; terminals: S@, da:çÆ, i:p’¡’@]
     b. S@+f’@+da:çÆ+i:+p’¡’@ → ‘S@f’dÆ"çi:p’¡’, *S@f’‘da:çi:"p’¡’@
        horse+good+beautiful+num+four ‘four good beautiful horses’
        [Tree diagram; node labels: NP, N′, N, AdjP, NumP; terminals: S@, f’@, da:çÆ, i:p’¡’@]

4.1.2 Phonological restrictions on fusion. Fusion is also subject to a phonological restriction : the second element must consist of a single light syllable, i.e. (C)CV. This light syllable may have either /P/ as its nucleus or may be a monosyllabic root that ends in schwa in its bare form. The number of onset consonants in the second root is irrelevant. Fusion does not take place if the second root is either polysyllabic or consists of a single heavy syllable, i.e. (C)CVV or (C)CVC. In such cases, both roots are stressed, with a stronger (phrasal) stress falling on the second element (17).

(17) ¡’@+Za:n     man+bright         → ‘¡’@"Za:n      ‘bright man’
     S@+S’a:lÆ    horse+young        → ‘S@‘S’a:lÆ     ‘young horse’
     ZÆm+be:      cattle+rich        → ‘ZÆm"be:       ‘rich cattle’
     f@z+da:çÆ    woman+beautiful    → ‘f@z"da:çÆ     ‘beautiful woman’

Further evidence for the lack of fusion where the second root is larger than CV comes from vowel colouring. As we have seen (§3.2.2), vowel colouring is observed in noun phrases involving fusion: /S@+Xw@/ → [SuXw] ‘stallion’, /nP+gw@/ → [nogw] ‘face’. Vowel colouring is not observed, however, when there is no fusion (18).


(18) w@nÆ+xwa:bÆ    house+warm     → ‘w@nÆ"xwa:bÆ    ‘warm house’
     ¡’@+¶w@P       man+skinny     → ‘¡’@"¶wuP       ‘skinny man’

Furthermore, stops at the beginning of a morphological word that has not undergone fusion are realised (except at fast speech rates) as voiceless unaspirated, rather than as voiced, as one would expect if fusion had taken place (§3.2.3): /q’a:lP+da:/P/ ‘city+beautiful’ → [oq’a:lP"ta:/P] (cf. /S@d+S’P/ ‘donkey+new’ → [S@od@S’P]). A final piece of evidence for a lack of fusion when the second root is polysyllabic or heavy comes from the non-application of schwa epenthesis, which inserts a schwa after a consonant-final root that fuses to a following root. This process is discussed in §6.

4.2 Extraprosodicity and fusion

Thus far we have only considered cases in which the right edge of the prosodic word is aligned with the right edge of the grammatical word. In fact, all of the examples presented thus far have ended in a bare root. Kabardian makes extensive use of suffixes in both nouns and verbs. Stress patterns suggest a division between nominal and verbal suffixes in terms of their relationship to the prosodic word. Colarusso (1992) observes that nominal suffixes and those that can attach to either nouns or verbs fail to attract stress, suggesting that they fall outside the prosodic word. On the other hand, suffixes that are strictly verbal fall within the domain of stress, indicating that they lie within the prosodic word. The forms in (19a) contain suffixes that potentially attract stress if phonological conditions are met, while those in (19b) contain suffixes that fall outside the domain of stress. Note that the declarative suffix /-s/, which can attach to both nouns and verbs, fails to attract stress; this means that the verb forms in (19a) containing this suffix have final stress because of the long vowel in the preceding (strictly verbal) suffix and not because of the declarative suffix. It should also be noted that the deletion of the root-final schwa before the past tense suffix in the third form in (19a) is due to a regular phonological rule deleting a short vowel before a long vowel. (Note that prosodic words are enclosed in : ;W in the examples that follow.)

(19) a. w@-lÆ1Æ-a:-s → :w@lÆ’1a:;Ws ‘you worked’
        2abs-work-pst-decl
        d@-Sç@-nu:-s → :d@Sç@‘nu:;Ws ‘we ate’
        1pl.abs-eat-fut-decl
        j@-¡:Gw@-a:-Ga: → :j@¡:Gwa:‘Ga:;W ‘had she seen it?’
        3erg-see-past-plup
        j@-s-w@kj’@-f@-n-s → :j@sw@kj’@‘f@n;W ‘I will be able to kill him’
        3abs-1erg-kill-pot-fut-decl
        w@-s-t@-Z@-nu:-s → :w@st@Z@‘nu:;Ws ‘I will give you back’
        2abs-1erg-give-back-fut-decl
     b. q’a:lÆ-kj’Æ      city-instr        → :‘q’a:lÆ;Wkj’Æ      ‘city (instr)’
        ba:dzÆ-s         fly-decl          → :‘ba:dzÆ;Ws         ‘it is a fly’
        s@-tç@-s         1abs-write-decl   → :‘s@tç@;Ws          ‘I write’
        va:q’Æ-t         shoe-pst          → :‘va:q’Æ;Wt         ‘it was a shoe’
        da:mÆ-hÆ-m       wing-pl-erg       → :‘da:mÆ;WhÆm        ‘wing (erg pl)’
        S’a:qwÆ-?@m      bread-neg         → :‘S’a:qwÆ;W?@m      ‘it’s not bread’
        m@SÆ-hÆ-kj’Æ     bear-pl-instr     → :‘m@SÆ;WhÆkj’Æ      ‘bears (instr)’

In the forms in (19b), addition of a suffix to a CVCV root fails to pull stress to the right of its original location in the unsuffixed form. The fact that the suffixes in (19b) fail to receive any stress suggests that they are outside the prosodic word and do not initiate a new prosodic word. In keeping with the rejection of stress by the nominal and hybrid nominal/verbal suffixes in (19b), these suffixes also fail to ensure that roots containing a final schwa in isolation have this schwa when suffixed, as the forms in (20) show. The failure of schwa to surface in these examples falls out from the general ban on prosodic word-final schwa in polysyllabic (but not monosyllabic) words under the assumption that nominal and nominal/verbal suffixes fall outside the prosodic word.⁸

(20) ¡’@+f’@-kj’Æ → :‘¡’@f’;Wkj’Æ ‘with a good man’ (*‘¡’@‘f’@kj’Æ)
     man+good-instr
     f@z+Z@-s → :f@‘z@Z;Ws ‘it’s an old woman’ (*f@‘z@Z@s)
     woman+old-decl
     z@+S@-t → :z@S;Wt ‘it was one horse’ (*z@‘S@t)
     one+horse-pst
     nÆ+ps@-hÆ-m → :‘nÆps;WhÆm ‘tears (erg)’ (*nÆ‘ps@hÆm)
     eye+water-pl-erg
     gj@d+Z@-?@m → :gj@‘d@Z;W?@m ‘it’s not an old chicken’ (*gj@‘d@Z@?@m)
     chicken+old-neg

Further evidence for the extraprosodic status of nominal suffixes is provided by the behaviour of laryngeal features in obstruent clusters. In most positions, Kabardian does not allow obstruent clusters in which the two members have different laryngeal features. Thus ejectives can only be adjacent to ejectives,⁹ voiced obstruents can only occur next to voiced obstruents and voiceless obstruents can only be adjacent to voiceless obstruents. In most cases, this restriction holds as a static constraint on roots. However, consonant-final verbal prefixes have different allomorphs, depending on the initial consonant of the root (21).

8 Suffixes beginning with a sonorant, however, are preceded by a schwa following a root-final consonant. The schwa in this case, however, is not attributable to fusion, but rather reflects a general process of epenthesis applying between a consonant and a following sonorant (Kuipers 1960, Colarusso 1992, 2006), e.g. /S@d+m/ ‘donkey+ERG’ → [oS@d@m], /f@z+P/ ‘woman+ABS’ → [of@z@P].

9 If the first ejective is an unreleased stop, as commonly is the case in casual speech, the ejective feature is not audible on the first stop.


(21) voiced       :zda:;Ws          ‘I sewed it’
                  :do:tç;W          ‘we write it (habit)’
                  :vda:;Ws          ‘you (pl) sewed it’
     voiceless    :so:Sç;W          ‘I eat it (habit)’
                  :t¡Æ‘Gwa:;Ws      ‘we saw him’
                  :f¡ÆGwa:;Ws       ‘you (pl) saw him’
     ejective     :s’p’a:;Ws        ‘I educated him’
                  :t’p’a:;Ws        ‘we educated him’
                  :f’p’a:;Ws        ‘you (pl) educated him’

Obstruent clusters containing a suffixal consonant fail to obey the voicing-agreement restriction (22), indicating that the suffix lies outside the prosodic word.

(22) m@z-t        forest-pst        → :m@z;Wt         ‘it was a forest’
     S@d-s        donkey-pres       → :S@d;Ws         ‘it’s a donkey’
     S’@b-?@m     back-neg          → :‘S’@b;W?@m     ‘it’s not a back’
     S@d-hÆ-m     donkey-pl-erg     → :‘S@d;WhÆm      ‘donkeys (erg)’
     f@z-kj’Æ     woman-instr       → :‘f@z;Wkj’Æ     ‘woman (instr)’

Since verbal suffixes fall within the prosodic word, we would expect root-final schwa to be preserved before a consonant-initial verbal suffix, as schwa is only deleted in final position of the prosodic word. This prediction is borne out by the data. Consonant-initial verbal suffixes may follow verb roots that end in a schwa in isolation, in which case the schwa also surfaces before the suffix.10

(23)
d@-Sç@-nu:-s             →  [d@Sç@‘nu:]ωs          ‘we ate’
1pl.abs-eat-fut-decl
j@-s-¡a:Gw@-f@-n-s       →  [j@s¡a:Gw@‘f@n]ωs      ‘I will be able to see him’
3abs-1erg-see-pot-fut-decl
j@-t@-Z@-nu:-s           →  [j@t@Z@‘nu:]ωs         ‘he will give it back’
3erg-give-back-fut.i-decl
s@-p-¡a:Gw@-Z@-a:-s      →  [s@p¡a:Gw@‘Za:]ωs      ‘you finally saw me’
1abs-2erg-see-fin-pst-decl

4.3 Fusion: a summary

In summary, several phonological diagnostics support the generalisation that Kabardian observes a ban on schwa in final position of a prosodic word. Prosodic words may consist of a single root or may contain multiple roots within noun or adjective phrases that have undergone prosodic fusion. In order to be eligible for fusion, certain morphological and phonological conditions must be met. First, only constituents within noun (or adjectival) phrases are eligible to join together to form a single prosodic word. Furthermore, the second root must consist of a single light (CV) syllable in order to be fused to the preceding root. Apparent cases of final schwa unexpectedly failing to surface before suffixes turn out upon closer inspection to be attributable to the extraprosodic status of these suffixes.

10 The source of the rightmost schwa in the first three forms is ambiguous. It could be due either to the fact that it is sheltered from the right edge of the prosodic word by the following suffix or to a general rule of schwa epenthesis (not bounded by the prosodic word) targeting clusters whose second member is a sonorant (see note 8). Other schwas in the forms, however, are unambiguously not attributable to the general rule of pre-sonorant epenthesis.

5 An OT analysis of prosodic fusion

We propose that prosodic fusion in Kabardian is driven by a minimality condition requiring that prosodic words be at least as large as CVC. In moraic terms, the minimality constraint thus amounts to a requirement that feet be binary, under the assumption, consistent with the stress facts (§3.2.1), that codas are moraic in Kabardian. This foot-binarity restriction, in conjunction with other prosodic constraints, ensures that subminimal words, i.e. monosyllables containing neither a long vowel nor a coda consonant, prosodically attach to the immediately preceding word within noun and adjectival phrases if the constituency requirement on fusion is satisfied. We further suggest that the surfacing of schwa in open monosyllables that would otherwise lack a vowel and are ineligible for fusion for morphosyntactic reasons is due to an overarching ban on words containing no vowels; such a hypothetical word would not only fail to constitute a bimoraic foot but would also lack even a single mora. For this reason, schwa must surface in open monosyllables constituting a prosodic word that would otherwise not have a vowel. Crucially, whether the schwa in such words is assumed to be underlying or not does not influence the outcome of the two minimality conditions, which result from markedness constraints rather than input–output faithfulness.

An optimality-theoretic analysis of Kabardian minimality effects was developed and checked using OTSoft (Hayes et al. 2003). Several constraints are at work to yield the minimality effects in Kabardian. First, we must simultaneously account for fusion where the second element is (C)CV, but block fusion where it is prosodically larger. A constraint based on FTBIN (Prince 1980, Broselow 1982, Prince & Smolensky 1993, Hayes 1995, McCarthy & Prince 1996), the constraint requiring feet to be binary at either the moraic or syllabic level, produces the desired effect when ranked above GRAMWD=PRWD (Prince & Smolensky 1993). The relevant FTBIN constraint, however, must refer to the main, i.e. primary stressed, foot of the phrase, since only the size of the second word, the phrasally stressed one, is relevant for determining whether fusion takes place or not: compare /w@nP+S’P/ → [w@("nPS’P)] ‘new house’, with fusion, and [(oqwP)("da/P)] ‘beautiful pig’, without fusion. The relevant constraint to produce this asymmetry is FTBIN("φ); it is defined in (24).11

(24) FtBin("φ)
     The primary stressed foot of the prosodic phrase is binary at either the moraic or the syllabic level.

The ranking of FTBIN("φ) above GRAMWD=PRWD is exemplified in (25). Prosodic words are enclosed in [ ]ω and prosodic phrases in [ ]φ.

(25)
/S@+S’Æ/                      | FtBin("φ) | GramWd=PrWd
☞ a. [[("S@S’Æ)]ω]φ           |           | *
   b. [[(‘S@)]ω[("S’Æ)]ω]φ    | *!        |

The first candidate violates GRAMWD=PRWD, since the two grammatical words fail to constitute independent prosodic words, as required by the constraint. The second candidate commits a fatal violation of higher-ranked FTBIN("φ), however, because its phrasally stressed foot is monomoraic. If the second word is larger than (C)CV and can thus be parsed into a canonical foot, fusion is blocked by GRAMWD=PRWD. GRAMWD=PRWD thus outranks the generic FTBIN constraint banning monosyllabic feet.

(26)
/¡’@+¶w@P/                     | GramWd=PrWd | FtBin
☞ a. [[(‘¡’@)]ω[("¶wuP)]ω]φ    |             | *
   b. [[¡’u(‘¶wuP)]ω]φ         | *!          |

The second candidate is parsed as a single prosodic word in which the first grammatical word is part of the same prosodic word as the second grammatical word, thereby violating GRAMWD=PRWD.12

FTBIN("φ) is violated in phrases consisting of a word with a single (C)CV syllable. This demonstrates that all prosodic words must be parsed into prosodic phrases, in keeping with the general principle of the prosodic hierarchy (Selkirk 1984, Nespor & Vogel 1986, Hayes 1989) requiring that lower constituents in the hierarchy belong to higher constituents. The constraint capturing this requirement is PARSE(PrWd).

11 As the associate editor points out, the division of FTBIN into separate constraints referring to different levels of prominence parallels Hayes’ (1995: 87) distinction between strong and weak prohibitions on degenerate feet (see also Coetzee 2004 for the distinction couched within OT). Interestingly, though, the relationship between stress level and the strength of the binarity requirement differs between Hayes’ account and the present one. Whereas Hayes shows that languages distinguishing between strong and weak bans on degenerate feet enforce binarity more stringently in syllables receiving secondary word-level stress than in those with primary word-level stress, Kabardian enforces binarity more strictly in phrase-level stressed syllables than in syllables receiving word-level primary stress.

12 A third candidate, in which the first (monomoraic) root is not parsed into any prosodic word (and thus does not display rounding assimilation), [_’@(oHwuP)], would violate a higher-ranked constraint PARSE-w, which is only violated by suffixes falling outside the prosodic word (§4.2).

(27) Parse(PrWd)
     Prosodic words belong to prosodic phrases.

The ranking of PARSE(PrWd) over FTBIN("φ) is shown in (28). The losing candidate is parsed into a prosodic word, but not a prosodic phrase, as indicated by the lack of phrasal stress.

(28)
/¡’@/                 | Parse(PrWd) | FtBin("φ)
☞ a. [[("¡’@)]ω]φ     |             | *
   b. [(‘¡’@)]ω       | *!          |
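The evaluation logic behind tableaux (25), (26) and (28) can be sketched as a lexicographic comparison of violation profiles. The Python sketch below is illustrative only and is not the OTSoft procedure used for the analysis. The function name `evaluate`, the candidate labels, and the ASCII label `FtBinPhrase` (standing in for the phrasal-stress version of FTBIN defined in (24)) are assumptions introduced here; the violation marks are those shown in the three tableaux.

```python
from typing import Dict, List, Tuple

Violations = Dict[str, int]

# One linear order consistent with the rankings argued for in (25), (26), (28).
# "FtBinPhrase" is a hypothetical ASCII label for the constraint in (24).
RANKING: List[str] = ["Parse(PrWd)", "FtBinPhrase", "GramWd=PrWd", "FtBin"]


def evaluate(ranking: List[str], candidates: Dict[str, Violations]) -> str:
    """Return the candidate with the lexicographically smallest violation
    profile when constraints are read off in ranking order."""
    def profile(name: str) -> Tuple[int, ...]:
        marks = candidates[name]
        return tuple(marks.get(constraint, 0) for constraint in ranking)
    return min(candidates, key=profile)


# Violation marks copied from tableaux (25), (26) and (28).
tableau_25 = {"fused": {"GramWd=PrWd": 1}, "unfused": {"FtBinPhrase": 1}}
tableau_26 = {"unfused": {"FtBin": 1}, "fused": {"GramWd=PrWd": 1}}
tableau_28 = {"in_phrase": {"FtBinPhrase": 1}, "no_phrase": {"Parse(PrWd)": 1}}

if __name__ == "__main__":
    for label, tableau in [("(25)", tableau_25), ("(26)", tableau_26), ("(28)", tableau_28)]:
        print(label, evaluate(RANKING, tableau))
```

Under this ranking the fused candidate is selected only in (25), where the second word is (C)CV; the same ordering blocks fusion in (26) and keeps the monomoraic word inside a prosodic phrase in (28).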

Thus far, we have accounted for cases of fusion but have not tackled the allomorphy involving final schwa. First, let us consider allomorphs lacking final schwa. The constraint responsible for the absence of final schwa in Kabardian reflects a cross-linguistically common type of restriction observed, for example, in Yupik (Reed et al. 1977), Chukchi (Kenstowicz 1994), Moroccan Arabic (Dell & Elmedlaoui 2002) and Javanese (Horne 1974). In these other languages, the ban on final schwa is bounded by the word. In Kabardian, this restriction must be narrowed to phrase-final position, for reasons that are discussed in §6. The constraint against phrase-final schwa is formulated in (29).

(29) *Final-@
     /@/ does not occur at the right edge of a prosodic phrase.

One strategy for honouring the constraint against final schwa is to change schwa to another vowel quality in final position, as in Yupik (Reed et al. 1977), in which schwa in non-final position alternates with the low vowel [a] in final position. Rather than changing the quality of word-final schwa, Kabardian instead opts to delete it. This means that IDENT-IO is ranked above MAX-IO, as illustrated in (30).

(30)
/qwÆ+f’@/               | *Final-@ | Ident-IO | Max-IO
☞ a. [[("qwÆf’)]ω]φ     |          |          | *
   b. [[("qwÆf’@)]ω]φ   | *!       |          |
   c. [[("qwÆf’Æ)]ω]φ   |          | *!       |

*FINAL-@ is violated in phrases consisting of a monosyllabic root ending in schwa. This is due to the overriding requirements (not formulated here) that each foot have a head, i.e. at least one syllable, and that each syllable have a head, i.e. a nucleus (see also Peterson’s 2007 account of Kabardian, which follows Kuipers 1960 in assuming that schwa is not present underlyingly). It may be noted that the distribution of schwa in both monosyllables and polysyllables is correctly produced whether schwa is assumed to be underlying, as in most analyses of Kabardian, or not, as in Kuipers’ (1960) approach. In the case of polysyllabic prosodic words, a highly ranked *FINAL-@ will ensure that final schwa fails to surface. In monosyllabic words, the foot- and syllable-headedness constraints ensure that schwa surfaces in final position.

The fact that a light phrase-final root fuses to a preceding root indicates that Kabardian elects not to beef up the second root through root-internal augmentation processes that would obviate the need for fusion. Thus the vowel in the second root fails to lengthen, indicating that DEP-IO(μ) is ranked above GRAMWD=PRWD. Nor does the vowel lengthen in a CV root constituting an independent phrase, demonstrating that DEP-IO(μ) outranks FTBIN("φ). Both of these rankings are shown in (31).

(31) a.
/w@nÆ+S’Æ/                       | Dep-IO(μ) | FtBin("φ) | GramWd=PrWd
☞ i.  [[w@("nÆS’Æ)]ω]φ           |           |           | *
   ii. [[(‘w@nÆ)]ω[("S’a:)]ω]φ   | *!        |           |

     b.
/fÆ/                             | Dep-IO(μ) | FtBin("φ) | GramWd=PrWd
☞ i.  [[("fÆ)]ω]φ                |           | *         |
   ii. [[("fa:)]ω]φ              | *!        |           |

A ﬁnal possibility to exclude is the addition of an epenthetic consonant without a mora, in order to shield a schwa from the right edge of the word. The ranking of DEP-IO(C) over *FINAL-@ eﬀectively eliminates this option (32).

(32)
/S@/                  | Dep-IO(C) | *Final-@
☞ a. [[("S@)]ω]φ      |           | *
   b. [[("S@t)]ω]φ    | *!        |

6 Epenthesis and prosodic fusion

A further complication arising in fusion contexts is that a schwa is inserted between the two roots undergoing fusion if the first root ends in a consonant (Colarusso 1992).

(33)
f@z+SXwÆ     woman+mature     →  f@‘z@SXwÆ     ‘mature woman’
gj@d+fÆ      chicken+skin     →  gj@‘d@fÆ      ‘chicken skin’
S@d+S’Æ      donkey+new       →  S@‘d@S’Æ      ‘new donkey’

Final schwa deletion and epenthesis can co-occur in the same fused forms (34).

(34)
f@z+f’@           woman+good     →  f@‘z@f’           ‘good woman’
w@s+Z@            snow+old       →  w@‘s@Z            ‘old snow’
wÆPÆd+p’kj’@      song+frame     →  wÆPÆ‘d@p’kj’      ‘melody’
m@l+ps@           ice+water      →  m@‘l@ps           ‘melt water’
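The interaction of inter-root epenthesis and final schwa deletion in (33) and (34) can be illustrated with a small string-rewriting sketch. It is a simplification under stated assumptions: forms are written in this text's transcription (with `@` for schwa), stress and vowel colouring are not modelled, and the two roots are assumed to have already met the morphosyntactic and weight conditions on fusion. The names `VOWEL_ENDINGS`, `ends_in_vowel` and `fuse` are hypothetical and introduced only for this illustration.

```python
# Final vowel spellings in this text's transcription: schwa, the short
# non-schwa vowel, and the long vowel (an assumption of this sketch).
VOWEL_ENDINGS = ("@", "Æ", "a:")


def ends_in_vowel(form: str) -> bool:
    """True if the transcribed form ends in one of the vowel spellings."""
    return form.endswith(VOWEL_ENDINGS)


def fuse(first_root: str, second_root: str) -> str:
    """Join two roots into a single prosodic word (stress not modelled).

    1. Insert schwa between the roots if the first root ends in a consonant.
    2. Delete a schwa left at the right edge of the fused form.
    """
    linker = "" if ends_in_vowel(first_root) else "@"
    fused = first_root + linker + second_root
    if fused.endswith("@"):
        fused = fused[:-1]
    return fused


if __name__ == "__main__":
    for first, second in [("gj@d", "fÆ"), ("f@z", "f’@"), ("w@nÆ", "S’Æ")]:
        print(first, "+", second, "->", fuse(first, second))
```

Because the two steps are independent, a form such as `f@z` + `f’@` undergoes both, while a vowel-final first root such as `w@nÆ` undergoes neither.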

The fused forms displaying epenthesis are clearly single prosodic words, parallel to those not involving epenthesis. Stress patterns, vowel colouring and stop voicing all indicate the prosodic word status of these forms. The epenthetic schwa is stressed if it stands in a penult preceding a light final syllable: [gj@od@fP] ‘chicken skin’. The epenthetic vowel assimilates to a following consonant: [gj@oduXw] ‘male chicken’. A stop preceding an epenthetic schwa is voiced: [S@od@S’P] ‘new donkey’ (cf. Fig. 3 above). Schwa epenthesis also provides a diagnostic for the lack of fusion when either morphosyntactic or phonological conditions on fusion are not met. For example, there is no epenthetic schwa between the roots comprising the phrase [oZPm"be:] ‘rich cattle’, because the second root, /obe:/ ‘rich’, is heavy.

As it turns out, epenthesis reflects one strategy to satisfy a more wide-reaching constraint against coda consonants. This constraint, *CODA, bans coda consonants within the prosodic word (35).

(35) *Coda
     No coda consonants within the prosodic word.

Coda consonants within the prosodic word are limited to four contexts. First, they are found root-finally, either followed by a suffix (36a) or not (36b). Second, they are found root-medially (36c). Finally, they arise when the ergative prefixes, which consist of a single consonant, come into contact with a following root (36d).

(36) a. [w@s]ωt            ‘it was snow’
        [‘f@z]ωkj’Æ        ‘woman (instr)’
        [‘S@d]ω?@m         ‘it’s not a donkey’
     b. m@z                ‘forest’
        gj@d               ‘chicken’
        m@l                ‘ice’
     c. tÆp.‘SÆkj          ‘plate’
        )Ænt.‘Xw@ps        ‘soup’
        Xa:P.‘z@nÆ         ‘good’
     d. q’Æ-f-tç@-a:-s     →  q’Æf.‘tça:s       ‘you (pl) started to write it’
        hor-2pl.erg-write-pst-decl
        q’@-dÆ-p-t@-a:-s   →  q’@dÆp.‘ta:s      ‘you loaned the book to us’
        hor-1pl.dat-2erg-give-pst-decl
        q’Æ-s-tç@-a:-s     →  q’Æs.‘tça:s       ‘I wrote it’
        hor-1erg-write-pst-decl

Although verbal suffixes fall within the prosodic word, the structure of verb roots and verbal suffixes precludes examination of the applicability of the anti-coda constraint in suffixed verbs. Verb roots end in a vowel and there are no verbal suffixes, either in isolation or in combination with one another, that create closed syllables that are not in final position of the prosodic word.

The failure of epenthesis to apply following root-final consonants falls out from the ranking of the constraint against final schwa above the anti-coda constraint (37). (Prosodic bracketing is omitted from (37) and subsequent tableaux where candidates do not differ in their constituency.)

(37)
/m@z/          | *Final-@ | *Coda
☞ a. m@z       |          | *
   b. m@z@     | *!       |

Nor do consonants delete in order to honour the anti-coda constraint, indicating that MAX-IO is ranked above *CODA (38).

(38)
/sÆbÆp/          | Max-IO | *Coda
☞ a. sÆ‘bÆp      |        | *
   b. ‘sÆbÆ      | *!     |

The absence of root-medial epenthesis ﬁnds a natural explanation in terms of the tendency for faithfulness to be stronger in roots than in aﬃxes (McCarthy & Prince 1995). In the case of Kabardian, a contiguity constraint referring to the root (cf. Kenstowicz 1994 for a similar analysis of Chukchi) ensures that contiguous segments in the input remain contiguous on the surface. This constraint, CONTIGUITY-IORt (CONTIG-IO), militates against the insertion of epenthetic material within the root.

(39) Contig-IO
     No intrusion or deletion of segments between segments belonging to the root that are contiguous in the input (McCarthy & Prince 1995).

CONTIG-IO is ranked above *CODA, as indicated by the failure of epenthesis to apply within roots (40).

(40)
/tÆpSÆkj/           | Contig-IO | *Coda
☞ a. tÆp‘SÆkj       |           | **
   b. tÆp@‘SÆkj     | *!        |

In the case of the ergative prefix, inserting an epenthetic vowel is not an attractive option to eliminate the coda consonant, since the ergative prefixes are contrasted with the absolutive prefixes on the basis of the occurrence or non-occurrence of schwa. The ergative prefixes consist of simply a consonant, while the absolutive prefixes corresponding in person and number consist of the same consonant+schwa. This difference yields transitive vs. intransitive minimal pairs differing in whether they have a schwa or not, as noted by Catford (1984): /st/a:s/ ‘I wrote (TRANS)’ vs. /s@ot/a:s/ ‘I wrote (INTRANS)’. The blocking of epenthesis following the ergative prefix thus reflects an overriding morphological anti-homophony constraint (not formulated here, but see Crosswhite 1999, Kenstowicz 2002, Albright 2003, Gessner & Hansson 2004, Ichimura 2006 and Kubowicz 2007 for anti-homophony constraints in OT). Apart from the ergative prefixes, all prefixes in Kabardian (the 17 cited in Abitov et al. 1957) have the shape CV, reflecting the general dispreference for codas outside of the root.

In light of the general avoidance of clusters within prosodic words except for the contexts just discussed, the insertion of schwa between roots in prosodically fused forms may be viewed as a case of the emergence of the unmarked (McCarthy & Prince 1994). *CODA is ranked above DEP-IO(μ), thereby accounting for the inter-root epenthesis observed in fused forms.13

(41)
/gj@d+fÆ/           | *Coda | Dep-IO(μ)
☞ a. gj@‘d@fÆ       |       | *
   b. ‘gj@dfÆ       | *!    |

Another failed candidate opts to delete one of the consonants comprising the cluster instead of inserting a vowel to break up the cluster. The fact that epenthesis rather than deletion is employed to avoid a coda indicates that MAX-IO is ranked above DEP-IO(μ).

(42)
/gj@d+fÆ/           | Max-IO | Dep-IO(μ)
☞ a. gj@‘d@fÆ       |        | *
   b. ‘gj@fÆ        | *!     |

*CODA is also ranked above GRAMWD=PRWD, as forms undergoing epenthesis also undergo fusion (43).

(43)
/gj@d+fÆ/                       | *Coda | GramWd=PrWd
☞ a. [[gj@("d@fÆ)]ω]φ           |       | *
   b. [[(‘gj@d)]ω[("fÆ)]ω]φ     | *!    |

At first glance, it might seem as if FTBIN("φ) would successfully rule out the losing candidate in the above tableau. However, FTBIN("φ) is ranked below DEP-IO(μ), as evidenced by the failure of vowels to lengthen in monosyllabic phrasally stressed feet, i.e. [[("S@)]ω]φ, not *[[("S@:)]ω]φ. It thus must be a constraint ranked above DEP-IO(μ), namely *CODA, that is responsible for the downfall of the losing candidate in (43).

*CODA is ranked below certain markedness constraints. The fact that the deletion of final schwa in polysyllabic forms creates a coda consonant means that the ban on final schwa is ranked above *CODA. Furthermore, the failure of final schwa to change to a different vowel indicates that *CODA is ranked below IDENT-IO. Both of these rankings are illustrated in (44).

13 In order to account for the fact that the epenthetic vowel in Kabardian is schwa rather than /P/, we follow Gouskova’s (2003) analysis of schwa in Lillooet in assuming a series of constraints that ban epenthesis of different vowel qualities. These constraints, termed RECOVER constraints by Gouskova, are universally ranked, such that constraints banning more sonorous vowel qualities are ranked above those prohibiting less sonorous vowels. Being the least sonorous vowel, schwa is thus the ideal epenthetic vowel.

(44)
/qwÆ+f’@/           | Ident-IO | *Final-@ | *Coda
☞ a. qwÆf’          |          |          | *
   b. qwÆ‘f’@       |          | *!       |
   c. qwÆ‘f’Æ       | *!       |          |

Consideration of an additional challenger to the winner in (44) reveals a further crucial ranking that only emerges after the constraint rankings required to produce the correct epenthesis patterns are integrated into the analysis of forms involving schwa deletion without epenthesis. This candidate, [[(oqwP)]ω[("f’@)]ω]φ, in which the schwa in the second root is preserved and the two roots are parsed as separate prosodic words, would appear to be eliminated by the ranking of FTBIN("φ) over GRAMWD=PRWD. However, the winning candidate violates two constraints ranked above FTBIN("φ) which the failed candidate does not violate, MAX-IO and *CODA, thereby precluding the possibility that FTBIN("φ) is the constraint that blocks [[(oqwP)]ω[("f’@)]ω]φ.

We now consider why MAX-IO and *CODA are ranked above FTBIN("φ). The ranking of *CODA, and thus MAX-IO, above FTBIN("φ) is the result of a series of transitivity relations, as follows. We have already seen that MAX-IO outranks *CODA (see (38)). *CODA is ranked above DEP-IO(μ) (see (41)), which, in turn, is superordinate to FTBIN("φ), as evidenced by the failure of vowels to lengthen in phrasally stressed CV feet, i.e. [[("S@)]ω]φ, not *[[("S@:)]ω]φ. This means that the constraint that knocks out the non-fused challenger [[(oqwP)]ω[("f’@)]ω]φ must be ranked above both MAX-IO and *CODA. The correct constraint is *FINAL-@, which is violated by the challenger but not the winner, as shown in (45).

(45)
/qwÆ+f’@/                       | *Final-@ | Max-IO | *Coda | FtBin("φ) | GramWd=PrWd
☞ a. [[("qwÆf’)]ω]φ             |          | *      | *     |           | *
   b. [[(‘qwÆ)]ω[("f’@)]ω]φ     | *!       |        |       | *         |

The ranking of *FINAL-@ over GRAMWD=PRWD has implications for the characterisation of the prosodic domain referenced by *FINAL-@, which bans phrase-final schwa (see Flack 2009 on markedness constraints bounded by different prosodic domains). If *FINAL-@ were bounded by the word rather than the phrase, there would be no way to ensure that a final schwa in a word followed by a word larger than CV within the same phrase is not deleted. For example, the phrase [[(o_’@)]ω[("HwuP)]ω]φ ‘skinny man’ would be incorrectly predicted to undergo fusion and surface as *[[("_’HwuP)]ω]φ, given the ranking of *FINAL-@ over GRAMWD=PRWD. Once *FINAL-@ is constrained to refer to phrase-final position, the schwa at the end of the first word in [[(o_’@)]ω[("HwuP)]ω]φ is allowed to surface, and fusion is correctly blocked. The final constraint rankings are summarised in (46).

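(46) [Ranking diagram. Constraints shown: Ident-IO, Dep(C), *Final-@, Max-IO, Contig-IO, *Coda, Dep(μ), Parse(PrWd), FtBin("φ), GramWd=PrWd, FtBin. The connecting lines are not recoverable; the pairwise rankings argued for in the text are listed below.]

Ident-IO ≫ Max-IO               (30)
Ident-IO ≫ *Coda                (44)
Dep(C) ≫ *Final-@               (32)
*Final-@ ≫ Max-IO               (45)
*Final-@ ≫ *Coda                (37), (44)
Max-IO ≫ *Coda                  (38)
Contig-IO ≫ *Coda               (40)
Max-IO ≫ Dep(μ)                 (42)
*Coda ≫ Dep(μ)                  (41)
*Coda ≫ GramWd=PrWd             (43)
Dep(μ) ≫ FtBin("φ)              (31b)
Dep(μ) ≫ GramWd=PrWd            (31a)
Parse(PrWd) ≫ FtBin("φ)         (28)
FtBin("φ) ≫ GramWd=PrWd         (25)
GramWd=PrWd ≫ FtBin             (26)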
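The transitivity reasoning used to arrive at (46) can be checked mechanically. The sketch below is a hedged illustration and is not part of the original analysis: it encodes the pairwise rankings argued for in tableaux (25) to (45) as a directed graph, tests dominance by reachability, and uses Python's standard `graphlib` module to confirm that the pairs admit at least one consistent linear order. The labels `Dep-IO(mu)` and `FtBinPhrase` are ASCII stand-ins assumed here for the mora-insertion constraint and the phrasal-stress version of FTBIN; the function names are likewise hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (dominant, dominated) pairs, each taken from a ranking argued for in the text.
RANKINGS = [
    ("Ident-IO", "Max-IO"), ("Ident-IO", "*Coda"),
    ("Dep-IO(C)", "*Final-@"),
    ("*Final-@", "Max-IO"), ("*Final-@", "*Coda"),
    ("Max-IO", "*Coda"), ("Contig-IO", "*Coda"),
    ("Max-IO", "Dep-IO(mu)"), ("*Coda", "Dep-IO(mu)"),
    ("*Coda", "GramWd=PrWd"),
    ("Dep-IO(mu)", "FtBinPhrase"), ("Dep-IO(mu)", "GramWd=PrWd"),
    ("Parse(PrWd)", "FtBinPhrase"),
    ("FtBinPhrase", "GramWd=PrWd"),
    ("GramWd=PrWd", "FtBin"),
]


def dominates(higher, lower, pairs=RANKINGS):
    """True if `higher` outranks `lower` directly or by transitivity."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
    stack, seen = [higher], set()
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt == lower:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def one_linear_order(pairs=RANKINGS):
    """Return one total order compatible with every pair.

    graphlib raises CycleError if the pairs are mutually inconsistent."""
    sorter = TopologicalSorter()
    for a, b in pairs:
        sorter.add(b, a)  # a is a predecessor of b, so a is ordered first
    return list(sorter.static_order())


if __name__ == "__main__":
    print(dominates("Max-IO", "FtBinPhrase"))
    print(one_linear_order())
```

The reachability test reproduces the chain described in the text, in which MAX-IO outranks the phrasal FTBIN constraint only through the intermediate rankings over *CODA and mora insertion.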

7 Kabardian and the typology of minimality effects

Kabardian minimality effects differ in two ways from those observed in other languages. First, the minimality requirement in Kabardian is scalar. Words consisting only of consonants are completely banned, while monomoraic words are avoided wherever there is the possibility of fusion to a preceding word. This process of fusion ensures that words consist minimally of a single heavy syllable where the necessary prosodic and morphosyntactic conditions for fusion are present. The scalar nature of minimality in Kabardian has been attributed here to the combination of a foot-binarity constraint operating over phrasally stressed feet and a higher-ranked foot-monomoraicity constraint requiring that feet contain at least one mora. This latter constraint is likely universally inviolable, as a violation of it would entail the existence of a word consisting of a single non-syllabic consonant. Although there are languages in which certain words may consist of only consonants, e.g. Berber (Dell & Elmedlaoui 1985, 2002) and Bella Coola (Bagemihl 1991), words in these languages invariably contain a consonant that functions as a syllable nucleus, thereby honouring the requirement on foot monomoraicity.

7.1 Kabardian and the typology of minimality

Languages differ in their minimal word requirements. For example, whereas the smallest prosodic word in Chickasaw (Munro & Willmond 1994) is CVV, in Mongolian (Hangin 1986) both CVV and CVC monosyllables are well-formed (as are disyllables). In Lardil (Hale 1973, Wilkinson 1988), on the other hand, words are minimally disyllabic. When these patterns are conflated across languages, the minimality hierarchy in (47) emerges (Garrett 1999), in which the existence of prosodic words of a given shape implies the occurrence of prosodic words to the left within the hierarchy, assuming that independent restrictions, e.g. a ban on closed syllables, the absence of long vowels, etc., do not preclude any of the permitted templates.

(47) Word-minimality hierarchy (Garrett 1999)
     larger                          smaller
     CVCV       CVV       CVC       CV

Minimal word requirements may either exist as static restrictions on the lexicon, as in Chickasaw and Mongolian, or may induce phonological processes that conspire to ensure that words meet the minimality requirement. For example, glottal stop is inserted to ensure that underlying CV words meet the CVC-minimality requirement in Cupeño (Crowhurst 1994), as in (48).

(48) Glottal stop epenthesis in Cupeño (Crowhurst 1994)
     /Ci/      Ci?      ‘gather’
     /hu/      hu?      ‘fart’
     /kwa/     kwa?     ‘eat’

The strategy taken to bolster words to meet the binarity requirement in Kabardian is typologically unusual compared to that observed in Cupeño. In most languages, the requirement that a word consist minimally of a bimoraic foot is satisfied either through syntagmatic constraints on the lexicon or through active word-internal phonological processes. Three types of processes for satisfying minimality at the word level are observed depending on the stringency of the minimal word requirement. One strategy exemplified by Cupeño is epenthesis of a consonant if both CVV and CVC are minimal words. In some languages, a vowel may be lengthened in monosyllables in order to satisfy a minimality requirement. Thus, in Northern Sámi (Nielsen 1926), short vowels in monosyllabic function words lengthen if they stand as independent prosodic words rather than prosodically adjoining to an adjacent word. In other languages imposing a more stringent disyllabic minimality requirement, vowel epenthesis is employed to repair words that would otherwise be monosyllabic. For example, the disyllabic minimality restriction holding of verbs in Minto (Hargus & Tuttle 1997) is satisfied by insertion of a pleonastic schwa in otherwise unprefixed monosyllabic verbs (49).

(49) Vowel epenthesis in Minto (Hargus & Tuttle 1997)
     @Ñ@x      ‘he/she is crying’     cf. d@næÑ@x         ‘the man is crying’
     @bæÑ      ‘it’s cooking’             ¡uk’æbæÑ        ‘fish is cooking’
     @Ca»      ‘it’s melting’             »@»k’UxCa»      ‘bear fat is melting’

In other languages with a disyllabic minimality requirement, an otherwise regular process of vowel deletion may be suspended if it would create a subminimal monosyllabic word. For example, a process of final vowel deletion (Hale 1973, Wilkinson 1988) is blocked in Lardil in disyllables, so that they do not become monosyllabic.

All of these strategies for satisfying minimality requirements ensure that a grammatical word retains its prosodic independence. In constraint-ranking terms, these strategies all entail a highly ranked foot-binarity constraint as well as a highly ranked GRAMWD=PRWD, which ensures that the mapping between grammatical words and prosodic words is one-to-one. The foot-binarity constraint can be operative at either the moraic or syllabic level, depending on whether the language allows words consisting of a single heavy syllable or requires that words be minimally two syllables. These languages differ in the faithfulness constraint that is violated by the repair operation employed to satisfy minimality. In all languages that do not employ fusion to satisfy minimality, DEP-IO(μ) is ranked low and GRAMWD=PRWD is ranked high. These languages differ in the ranking of three other constraints: a constraint against long vowels, *LONGV, a constraint against vowel insertion, DEP-IO(V), and a constraint against consonant insertion, DEP-IO(C). In languages that insert an epenthetic consonant, e.g. Cupeño, DEP-IO(C) is low-ranked. In languages adding an epenthetic vowel, e.g. Minto, DEP-IO(V) joins DEP-IO(μ) at the bottom tier of the constraint hierarchy. Languages such as Northern Sámi, which employ vowel lengthening to honour minimality, rank *LONGV and DEP-IO(μ) low in the constraint hierarchy.14 Finally, the Kabardian response, employing fusion as a means of satisfying minimality, results from ranking GRAMWD=PRWD below the other four pertinent constraints. The various language-specific strategies for satisfying minimality and the rankings they instantiate are summarised in Table I.

14 We are not aware of any languages that employ consonant lengthening to satisfy word-minimality, though this is a logical possibility. In fact, many languages beef up stressed light syllables through consonant lengthening (Hayes 1995).

pattern              | language       | ranking
vowel lengthening    | Northern Sámi  | GramWd=PrWd, Dep-IO(V), Dep-IO(C) ≫ *LongV, Dep-IO(μ)
vowel insertion      | Minto          | GramWd=PrWd, Dep-IO(C), *LongV ≫ Dep-IO(μ), Dep-IO(V)
consonant insertion  | Cupeño         | GramWd=PrWd, Dep-IO(V), *LongV ≫ Dep-IO(μ), Dep-IO(C)
prosodic fusion      | Kabardian      | Dep-IO(μ), Dep-IO(V), *LongV, Dep-IO(C) ≫ GramWd=PrWd

Table I Typology of repair strategies for satisfying prosodic minimality.
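The rankings in Table I can be read as a small factorial typology, and the following sketch illustrates how each ranking selects its repair. It is a simplified illustration under assumptions that go beyond the text: foot binarity is taken to be undominated and satisfied by every repair, so the faithful subminimal candidate is omitted; each repair is assigned only the violations that the discussion above associates with it; and constraints within a stratum are treated as unranked by summing their marks. `Dep-IO(mu)` is an ASCII stand-in for the mora-insertion constraint, and the names `REPAIRS`, `TABLE_I` and `preferred_repair` are hypothetical.

```python
# Constraints violated by each repair of a subminimal CV word (assumed here).
REPAIRS = {
    "vowel lengthening": {"*LongV", "Dep-IO(mu)"},
    "vowel insertion": {"Dep-IO(V)", "Dep-IO(mu)"},
    "consonant insertion": {"Dep-IO(C)", "Dep-IO(mu)"},
    "prosodic fusion": {"GramWd=PrWd"},
}

# Table I as two strata per language: high-ranked set, then low-ranked set.
TABLE_I = {
    "Northern Sámi": [{"GramWd=PrWd", "Dep-IO(V)", "Dep-IO(C)"},
                      {"*LongV", "Dep-IO(mu)"}],
    "Minto": [{"GramWd=PrWd", "Dep-IO(C)", "*LongV"},
              {"Dep-IO(mu)", "Dep-IO(V)"}],
    "Cupeño": [{"GramWd=PrWd", "Dep-IO(V)", "*LongV"},
               {"Dep-IO(mu)", "Dep-IO(C)"}],
    "Kabardian": [{"Dep-IO(mu)", "Dep-IO(V)", "*LongV", "Dep-IO(C)"},
                  {"GramWd=PrWd"}],
}


def preferred_repair(strata):
    """Pick the repair with the fewest violations, stratum by stratum."""
    def profile(repair):
        return tuple(len(REPAIRS[repair] & stratum) for stratum in strata)
    return min(REPAIRS, key=profile)


if __name__ == "__main__":
    for language, strata in TABLE_I.items():
        print(language, "->", preferred_repair(strata))
```

In each language only one repair avoids every high-ranked constraint, so the outcome does not depend on how the constraints inside a stratum are ordered.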

7.2 Kabardian prosodic fusion in relation to other word-formation processes

Interestingly, the Kabardian minimality-driven process of fusion superficially resembles some other word-formation processes, notably cliticisation and compounding, though it differs from these phenomena in certain important respects. We briefly consider these other processes and their relation to fusion here.

7.2.1 Fusion as cliticisation. Kabardian prosodic fusion is similar in certain respects to cliticisation, a process involving attachment of a function word, e.g. a pronoun, to a content word that functions as a prosodic host. For example, pronouns in Spanish cliticise to verbs: dá me lo ‘give me it’ (give+me+it) (see Miller & Monachesi 2003 for an overview of clitics in Spanish and other Romance languages). Similarly, the latching of unstressed prepositions, articles and pronouns onto open class lexical items in English may be viewed as another type of cliticisation: on cars, an axe, read it! Clitic attachment is a cross-linguistically common phenomenon that shares with Kabardian fusion its grouping together of multiple grammatical words into a single prosodic word.

Cliticisation characteristically differs, however, from fusion in certain respects. First, the fused forms in Kabardian display stress patterns that conform to the regular word-level stress rules holding of non-fused forms. This means that roots surfacing as a single consonant that has undergone fusion to a preceding root fall in a stressed syllable, e.g. [w@onP-Z] ‘new house’. Typically, though not without exception (see Klavans 1985), clitics are unstressed. For example, monosyllabic function words in English often prosodically attach to an adjacent content word (Zwicky 1970, Selkirk 1984, 1995, Kaisse 1985).

Thus, the article an in the sentence He saw an !ant is unstressed, as evidenced by its reduced vowel [@], even though a comparable disyllabic noun ending in a C+coronal cluster (Hammond 1999) would be expected to have penultimate stress, e.g. !forest, !legend. Similarly, in the Spanish form vénde me lo ‘sell it to me’ (sell+me+it) containing two postverbal clitics, stress falls on the first syllable of the verb root, the pre-antepenultimate syllable of the entire clitic group, even though stress in Spanish is typically restricted to one of the final three syllables of a word (Harris 1983).

A further feature of Kabardian fusion that differs from the prototypical case of cliticisation is the fact that the words that undergo fusion in Kabardian are full-fledged content words. In contrast, cliticised elements are characteristically function words that do not occur prosodically independent of their hosts. There is, however, at least one other language that displays cliticisation of content words as in Kabardian. In Macedonian (Franks 1989),15 there are a few morphosyntactic constructions in which a content word adjoins to a preceding word to form a single expanded domain of stress, which in other circumstances is bounded by the word. Stress falls on the antepenultimate syllable of words with at least three syllables and on the first syllable of shorter words. Multi-word constructions that constitute a single stress domain include modifier+noun sequences, numerous preposition+noun sequences and negation/interrogative+optional clitic clusters+verb sequences. For example, in the adjective+noun construction su!vo grozje ‘raisins’ (dry+grapes) and the preposition+noun combination pre!ku glava ‘over (one’s) head’ (Franks 1989: 555), stress skips over the second word entirely, and instead falls on the antepenultimate syllable of the phrase.
The rejection of stress by the second element in phrases like su!vo grozje (*!suvo !grozje) and pre!ku glava (*!preku !glava) bears close resemblance to Kabardian fused forms, which also end in an unstressed root. Another point of similarity between the two processes is the restriction of both phenomena to particular morphosyntactic contexts.16

15 Thanks to the associate editor for bringing the Macedonian data to our attention.

16 Unfortunately, it is not clear from the literature on Macedonian that we have been able to access whether there are phonological diagnostics of the prosodic word other than stress that could be used to diagnose whether the multi-word stress domains of Macedonian behave like single prosodic words in other respects.

7.2.2 Fusion as compounding. One other process with which Kabardian fusion shares certain properties is compounding. Like fusion and unlike cliticisation, compounding often involves the combination of two open class lexical items, e.g. blackboard, steamship, football. The similarity between the two processes, however, ends there. Unlike fusion, compound words may display phonological properties that are anomalous for non-compounds. Thus, compounds in many, though not all, languages stress both members of the compound, as in English, which usually places a stronger stress on the left element of the compound and a secondary stress on the right member, e.g. !black$board, !steam$ship, !foot$ball. Furthermore, it is common for phonotactic restrictions holding of non-compounds to be violated in compounds. For example, the /kb/ cluster in blackboard and the /tb/ cluster in football are unattested in monomorphemic words in English. In contrast, fused forms in Kabardian are phonologically indistinguishable from non-fused forms on the surface. The only difference between fused and non-fused forms is the epenthesis of schwa between the two roots comprising fused forms, a process that is unattested in non-fused forms. Perhaps the most salient difference, however, between Kabardian prosodic fusion and compounding is the motivation behind the two phenomena: whereas compounding is morphologically driven, prosodic fusion serves a clearly phonological goal in avoiding subminimal words.

In summary, prosodic fusion in Kabardian shares certain properties with the word-formation processes of cliticisation and compounding. However, it differs from prototypical instantiations of both these processes in certain respects. The phenomenon to which it appears to bear closest resemblance is the formation of multi-word stress domains in Macedonian, although it is unclear whether the extended stress domains of Macedonian are prosodically identical to single words with respect to properties other than stress.

8 Conclusions

The process of prosodic fusion in Kabardian expands the typology of strategies employed to satisfy prosodic minimality, but in a way predicted by the mechanism of constraint re-ranking intrinsic to OT. By demoting the requirement that grammatical words map to prosodic words in one-to-one fashion below faithfulness constraints banning insertion of moras and segments, subminimal words are free to combine in order to honour a constraint on foot-binarity. Interestingly, the relevant binarity constraint in Kabardian is specific to phrasal stress, which accounts for the unidirectional nature of fusion whereby a subminimal word can fuse to a word to its left but not to its right.

REFERENCES

Abitov, M. L., B. X. Balkarov, J. D. Desheriev, G. B. Rogava, X. U. El’berdov, B. M. Kardanov & T. X. Kuasheva (1957). Grammatika kabardino-cherkesskogo literaturnogo jazyka. Moscow : Izdatel’stvo Akademii Nauk. Albright, Adam (2003). A quantitative study of Spanish paradigm gaps. WCCFL 22. 1–14. Applebaum, Ayla & Matthew Gordon (2007). Intonation in Turkish Kabardian. In Jürgen Trouvain & William J. Barry (eds.) Proceedings of the 16th International Congress of Phonetic Sciences. Dudweiler, Saarbrücken: Pirrot. 1045–1048. Bagemihl, Bruce (1991). Syllable structure in Bella Coola. LI 22. 589–646.


Bagov, P. M., B. X. Balkarov, T. X. Kuasheva, M. A. Kumaxov & G. B. Rogava (eds.) (1970). Grammatika kabardino-cherkesskogo literaturnogo jazyka. Vol. 1: Fonetika i morfologija. Moscow : Nauka. Beckman, Jill, Laura Walsh Dickey & Suzanne Urbanczyk (eds.) (1995). Papers in Optimality Theory. Amherst : GLSA. Broselow, Ellen (1982). On predicting the interaction of stress and epenthesis. Glossa 16. 115–132. Catford, J. C. (1942). The Kabardian language. Le Maître Phonétique (3rd series) 78. 15–18. Catford, J. C. (1984). Instrumental data and linguistic phonetics. In Jo-Ann W. Higgs & Robin Thelwall (eds.) Topics in linguistic phonetics, in honour of E. T. Uldall. Coleraine : New University of Ulster. 23–48. Choi, John D. (1991). An acoustic study of Kabardian vowels. Journal of the International Phonetic Association 21. 4–12. Colarusso, John (1992). A grammar of the Kabardian language. Calgary: University of Calgary Press. Colarusso, John (2006). Kabardian (East Circassian). Munich : Lincom. Crosswhite, Katherine (1999). Intra-paradigmatic homophony avoidance in two dialects of Slavic. UCLA Working Papers in Linguistics 1: Papers in Phonology 2. 48–67. Crowhurst, Megan J. (1994). Foot extrametricality and template mapping in Cupeño. NLLT 12. 177–201. Dell, François & Mohamed Elmedlaoui (1985). Syllabic consonants and syllabification in Imdlawn Tashlhiyt Berber. Journal of African Languages and Linguistics 7. 105–130. Dell, François & Mohamed Elmedlaoui (2002). Syllables in Tashlhiyt Berber and in Moroccan Arabic. Dordrecht: Kluwer. Flack, Kathryn (2009). Constraints on onsets and codas of words and phrases. Phonology 26. 269–302. Franks, Steven (1989). The monosyllabic head effect. NLLT 7. 551–563. Garrett, Edward (1999). Minimal words aren’t minimal feet. UCLA Working Papers in Linguistics 1: Papers in Phonology 2. 68–105. Gessner, Suzanne & Gunnar Ólafur Hansson (2004). Anti-homophony effects in Dakelh (Carrier) valence morphology. BLS 30. 91–103.
Gordon, Raymond G., Jr. (ed.) (2005). Ethnologue : languages of the world. 15th edn. Dallas : SIL International. http://www.ethnologue.com. Gouskova, Maria (2003). Deriving economy: syncope in Optimality Theory. PhD dissertation, University of Massachusetts, Amherst. Hale, Kenneth (1973). Deep–surface canonical disparities in relation to analysis and change : an Australian example. In Thomas Sebeok (ed.) Current trends in linguistics. Vol. 11. The Hague: Mouton. 401–458. Hammond, Michael (1999). The phonology of English : a prosodic optimality-theoretic approach. Oxford : Oxford University Press. Hangin, Gombojab (1986). A modern Mongolian–English dictionary. Bloomington : Research Center for Inner Asian Studies, Indiana University. Hargus, Sharon & Siri G. Tuttle (1997). Augmentation as aﬃxation in Athabaskan languages. Phonology 14. 177–220. Harris, James W. (1983). Syllable structure and stress in Spanish : a nonlinear analysis. Cambridge, Mass. : MIT Press. Hayes, Bruce (1989). The prosodic hierarchy in meter. In Paul Kiparsky & Gilbert Youmans (eds.) Rhythm and meter. San Diego: Academic Press. 201–260. Hayes, Bruce (1995). Metrical stress theory : principles and case studies. Chicago: University of Chicago Press. Hayes, Bruce, Bruce Tesar & Kie Zuraw (2003). OTSoft 2.31. Software package. http://www.linguistics.ucla.edu/people/hayes/otsoft/.


Horne, Elinor Clark (1974). Javanese–English dictionary. New Haven : Yale University Press. Ichimura, Larry (2006). Anti-homophony blocking and its productivity in transparadigmatic relations. PhD dissertation, Boston University. Jakovlev, N. F. (1930). Kurze Übersicht über die Tscherkessischen (Adygheischen) Dialekte und Sprachen. Caucasica 6. 1–19. Jakovlev, N. F. (1948). Grammatika literaturnogo kabardino-cherkesskogo jazyka. Moscow : Izdatel’stvo Akademii Nauk. Kaisse, Ellen M. (1985). Connected speech : the interaction of syntax and phonology. New York: Academic Press. Kardanov, B. M. & A. T. Bichoev (1955). Russko–kabardinsko-cherkesskij slovar’. Moscow : Gosydarstvennoe Izdatel’stvo Inostrannyx i Natsional’nyx Slovarej. Kenstowicz, Michael (1994). Syllabification in Chukchee : a constraints-based analysis. In Alice Davison, Nicole Maier, Glaucia Silva & Wan Su Yan (eds.) Proceedings of the Formal Linguistics Society of Mid-America 4. Iowa City : Department of Linguistics, University of Iowa. 160–181. Kenstowicz, Michael (2002). Paradigmatic uniformity and contrast. MIT Working Papers in Linguistics 42. 141–163. Klavans, Judith L. (1985). The independence of syntax and phonology in cliticization. Lg 61. 95–120. Kuipers, Aert H. (1960). Phoneme and morpheme in Kabardian (Eastern Adyghe). The Hague: Mouton. Łubowicz, Anna (2007). Paradigmatic contrast in Polish. Journal of Slavic Linguistics 15. 229–262. McCarthy, John J. & Alan Prince (1994). The emergence of the unmarked: optimality in prosodic morphology. NELS 24. 333–379. McCarthy, John J. & Alan Prince (1995). Faithfulness and reduplicative identity. In Beckman et al. (1995). 249–384. McCarthy, John J. & Alan Prince (1996). Prosodic morphology 1986. Ms, University of Massachusetts, Amherst & Brandeis University. Miller, Philip & Paola Monachesi (2003). Les pronoms clitiques dans les langues romanes. In Danièle Godard (ed.) Les langues romanes : problèmes de la phrase simple. Paris: CNRS. 67–123.
Munro, Pamela & Catherine Willmond (1994). Chickasaw : an analytical dictionary. Norman & London : University of Oklahoma Press. Nespor, Marina & Irene Vogel (1986). Prosodic phonology. Dordrecht : Foris. Nielsen, Konrad (1926). Lærebok i Lappisk. 3 vols. Oslo : Brøggers. Peterson, Tyler (2007). Minimality and syllabification in Kabardian. CLS 39:1. 215–235. Prince, Alan (1980). A metrical theory for Estonian quantity. LI 11. 511–562. Prince, Alan & Paul Smolensky (1993). Optimality Theory : constraint interaction in generative grammar. Ms, Rutgers University & University of Colorado, Boulder. Published 2004, Malden, Mass. & Oxford: Blackwell. Reed, Irene, Osahito Miyaoka, Steven Jacobson, Paschal Afcan & Michael Krauss (1977). Yup’ik Eskimo grammar. Fairbanks : Alaska Native Language Center. Selkirk, Elisabeth O. (1984). Phonology and syntax : the relation between sound and structure. Cambridge, Mass : MIT Press. Selkirk, Elisabeth O. (1995). The prosodic structure of function words. In Beckman et al. (1995). 439–469. Turchaninov, G. & M. Tsagov (1940). Grammatika kabardinskogo jazyka. Moscow : Izdatel’stvo Akademii Nauk. Wilkinson, Karina (1988). Prosodic structure and Lardil phonology. LI 19. 325–334. Zwicky, Arnold M. (1970). Auxiliary reduction in English. LI 1. 323–336.

Phonology 27 (2010) 77–117. © Cambridge University Press 2010 doi:10.1017/S0952675710000047

Harmonic Grammar with linear programming: from linear systems to linguistic typology*

Christopher Potts
Stanford University

Joe Pater, Karen Jesney, Rajesh Bhatt
University of Massachusetts, Amherst

Michael Becker
Harvard University

Harmonic Grammar is a model of linguistic constraint interaction in which well-formedness is calculated in terms of the sum of weighted constraint violations. We show how linear programming algorithms can be used to determine whether there is a weighting for a set of constraints that fits a set of linguistic data. The associated software package OT-Help provides a practical tool for studying large and complex linguistic systems in the Harmonic Grammar framework and comparing the results with those of OT. We first describe the translation from harmonic grammars to systems solvable by linear programming algorithms. We then develop a Harmonic Grammar analysis of ATR harmony in Lango that is, we argue, superior to the existing OT and rule-based treatments. We further highlight the usefulness of OT-Help, and the analytic power of Harmonic Grammar, with a set of studies of the predictions Harmonic Grammar makes for phonological typology.

* Our thanks to Ash Asudeh, Tim Beechey, Maitine Bergonioux, Paul Boersma, John Colby, Kathryn Flack, Edward Flemming, Bob Frank, John Goldsmith, Maria Gouskova, Bruce Hayes, René Kager, Shigeto Kawahara, John Kingston, John McCarthy, Andrew McKenzie, Ramgopal Mettu, Alan Prince, Kathryn Pruitt, Jason Riggle, Jim Smith and Paul Smolensky, and other participants in conferences and courses where this material was presented. This material is based upon work supported by the National Science Foundation under Grant BCS-0813829 to Pater. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

1 Introduction

We examine a model of grammar that is identical to the standard version of Optimality Theory (OT; Prince & Smolensky 2004), except that the optimal input–output mapping is defined in terms of weighted rather than ranked constraints, as in Harmonic Grammar (HG; Legendre et al. 1990a, b; see Smolensky & Legendre 2006 and Pater 2009b for overviews of subsequent research). We introduce a method for translating learning problems in this version of HG into linear models that can be solved using standard algorithms from linear programming. The implementation of this method facilitates the use of HG for linguistic research. The linear programming model returns either a set of weights that correctly prefers all of the intended optimal candidates over their competitors or a verdict of ‘infeasible’ when no weighting of the given constraints prefers the indicated optima. Thus we provide for HG the equivalent of what the Recursive Constraint Demotion algorithm (Tesar & Smolensky 1998b) provides for OT: an algorithm that returns an analysis for a given data set with a given constraint set, and that also detects when no such analysis exists. In addition, we present OT-Help (Becker & Pater 2007, Becker et al. 2007), a graphically based program that can take learning data formatted according to the standards defined for the software package OTSoft (Hayes et al. 2003) and solve them using our linear programming approach (and with Recursive Constraint Demotion).1 The public availability of OT-Help will help research on weighted constraint interaction to build on results already obtained in the OT framework.

We start by discussing the model of HG we adopt and its relationship to its better-known sibling OT (§2). §3 states the central learning problem of the paper. We then describe our procedure for turning HG learning problems into linear programming models (§4). §5 develops an HG analysis of an intricate pattern of ATR harmony in Lango.
The analysis depends crucially on the kind of cumulative constraint interaction that HG allows, but that is impossible in standard OT. We argue that the HG approach is superior to Archangeli & Pulleyblank (1994)’s rule-based analysis and Smolensky (2006)’s constraint-conjunction approach. Finally, §6 is a discussion of typology in HG, with special emphasis on using large computational simulations to explore how OT and HG differ. That discussion deepens our comparison with OT, and it highlights the usefulness of using efficient linear programming algorithms to solve linguistic systems. We show that comparisons between OT and HG depend on the contents of the constraint sets employed in each framework, and that the greater power of HG can in some cases lead, perhaps surprisingly, to more restrictive typological predictions.

2 Overview of Harmonic Grammar

In an optimisation-based theory of grammar, a set of constraints chooses the optimal structures from a set of CANDIDATE structures. In this paper, candidates are pairs ⟨In, Out⟩, consisting of an input structure In and an output structure Out. In HG, optimality is defined in terms of a harmony function that associates each candidate with the weighted sum of its violations for the given constraint set. The weighted sum takes each constraint’s violation count and multiplies it by that constraint’s weight, and sums the results.

1 In addition, the popular open-source software package Praat (Boersma & Weenink 2009) now offers an HG solver designed using the method we introduce here.

(1) Definition 1 (harmony function)
Let $C = \{C_1, \ldots, C_n\}$ be a set of constraints, and let $W$ be a total function from $C$ into positive real numbers. Then the harmony of a candidate $A$ is given by:

$$\mathcal{H}_{C,W}(A) = \sum_{i=1}^{n} W(C_i) \cdot C_i(A)$$

We insist on only positive weights. While there is no technical problem with allowing a mix of negative and positive weights into HG, the consequences for linguistic analysis would be serious. For example, a negative weight could turn a penalty (violation count) into a benefit. For additional discussion of this issue, see Prince (2003), Boersma & Pater (2008: §3.5) and Pater (2009b: §2.1).

The constraints themselves are functions from candidates into integers. We interpret C(A)=−4 to mean that candidate A incurs four violations of constraint C. We also allow positive values: C(A)=4 thus means that A satisfies constraint C four times. In this paper, we use only constraint violations (negative numbers), but the approach we present is not limited in this way.

The optimal candidates have the highest harmony scores in their candidate sets. Since we represent violations with negative integers, and weights are positive, an optimum will have the negative score closest to zero, which can be thought of as the smallest penalty. As in OT, this competition is limited to candidates that share a single input structure. In anticipation of the discussion in §4, we make this more precise by first defining the notion of a TABLEAU, the basic domain over which competitions are defined.
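As an illustration of Definition 1 and of the positive-weight restriction, the weighted sum can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name `harmony`, the list-based encoding of weights and violation scores, and the example numbers are hypothetical and are not taken from OT-Help.

```python
from typing import Sequence


def harmony(weights: Sequence[float], violations: Sequence[int]) -> float:
    """Weighted sum of constraint scores, in the spirit of Definition 1.

    Violations are encoded as non-positive integers (-4 means four
    violations). Weights must be strictly positive, mirroring the
    restriction discussed in the text.
    """
    if len(weights) != len(violations):
        raise ValueError("need exactly one weight per constraint")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")
    return sum(w * v for w, v in zip(weights, violations))


# Hypothetical candidate: one violation of C1 (weight 2), three of C2 (weight 1).
example = harmony([2.0, 1.0], [-1, -3])  # 2*(-1) + 1*(-3)
```

Rejecting non-positive weights at this point is a design choice of the sketch. It keeps a violation from ever raising a candidate's harmony, which is the situation the text warns against.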

(2) Definition 2 (tableaux)
A tableau is a structure $(A_{\mathit{In}}, C)$, where $A_{\mathit{In}}$ is a (possibly infinite) set of candidates sharing the input In, and $C$ is a (finite) constraint set.

We can then define optimality in terms of individual tableaux: the optimum is a candidate that has greater harmony than any of the other members of its candidate set.

(3) Definition 3 (optimality)
Let $T = (A_{\mathit{In}}, C)$ be a tableau, and let $W$ be a weighting function for $C$. A candidate $A = (\mathit{In}, \mathit{Out}) \in A_{\mathit{In}}$ is optimal iff $\mathcal{H}_{C,W}(A) > \mathcal{H}_{C,W}(A')$ for every $A' \in (A_{\mathit{In}} - \{A\})$.
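The strict inequality in Definition 3 can be made concrete with a small Python sketch for finite tableaux. The helper names (`harmony`, `optimum`), the dictionary encoding of a tableau and the two-candidate example are assumptions made for illustration. The point shown is that a tie for highest harmony leaves the tableau without an optimum.

```python
from typing import Dict, Optional, Sequence


def harmony(weights: Sequence[float], violations: Sequence[int]) -> float:
    # Weighted sum of (non-positive) violation scores.
    return sum(w * v for w, v in zip(weights, violations))


def optimum(tableau: Dict[str, Sequence[int]],
            weights: Sequence[float]) -> Optional[str]:
    """Return the unique candidate with strictly highest harmony, else None.

    `tableau` maps each output form to its violation scores; all
    candidates are assumed to share one input.
    """
    scores = {out: harmony(weights, viols) for out, viols in tableau.items()}
    best = max(scores.values())
    winners = [out for out, score in scores.items() if score == best]
    # A tie means that no candidate beats every competitor strictly.
    return winners[0] if len(winners) == 1 else None


# Hypothetical two-candidate tableau over constraints (C1, C2).
toy = {"out1": [0, -1], "out2": [-1, 0]}
unique = optimum(toy, [2.0, 1.0])  # out1 scores -1, out2 scores -2
tied = optimum(toy, [1.0, 1.0])    # both candidates score -1
```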

The use of a strict inequality rules out ties for optimality, and brings our HG model closer to the standard version of OT, whose totally ordered constraint set also typically selects a unique optimum (if the constraint set is large enough). Languages with tied optima are not of particular interest, since the resulting variation is unlikely to match actual language variation (see the discussion below of existing theories of stochastic OT, which either render ties vanishingly improbable, as in Noisy HG, or completely eliminate the notion of an optimum, defining instead a probability distribution over candidates, as in Maximum Entropy grammar).

Goldsmith (1991: 259) proposes to model phonological interactions using weighted constraints; he describes an account in which constraint violations can involve variable costs, which encode relative strength and determine relative well-formedness. Goldsmith (1990: §6.5)’s discussion of violability and cost accumulation contains clear antecedents of these ideas; see also Goldsmith (1993, 1999). Prince & Smolensky (2004: 236) also discuss a version of OT that uses weighted sums to define optimality. Our formulation follows that of Keller (2000, 2006) and Legendre et al. (2006), though it differs from Legendre et al.’s in demanding that an optimum in a candidate set be unique, which is enforced by using a strict inequality (the harmony of an optimum is greater than that of its competitors). This is a simplifying assumption that allows for easier comparison with the typological predictions of OT.

Example (4) is a typical representation of a tableau for HG. The single shared input is given in the upper left, with candidate outputs below it and their violation scores given in tabular format.
The representation is like those used for OT, but without ranking being signiﬁed by the left-to-right order ; it also adds a weighting vector in the topmost row and the harmony scores for each candidate in the rightmost column.

(4) A weighted constraint tableau

    weight             2     1
    Input              C1    C2    ℋ
    ☞ a. Outputa        0    −1    −1
      b. Outputb       −1     0    −2

By (3), Outputa is chosen as the optimal output for Input. Optimal candidates are marked with the pointing finger.

We emphasise that our version of HG, as characterised by (3), is, like OT, an optimisation system. Our HG grammars do not impose a single numerical cut-off on well-formedness, but instead choose the best outcome for each input. This point is vital to understanding how the systems work, but it is easily overlooked. We therefore pause to illustrate this with a brief example modelled on one discussed by Prince & Smolensky (1997: 1606; for additional discussion, see Pater 2009b). We assume that it is typologically implausible that we will find a natural language in which a single coda is tolerated in a word but a second coda is deleted. Such a language would map the input /ban/ faithfully to [ban], but would map input /bantan/ to [ba.tan] or [ban.ta]. Such patterns are unattested, arguably for fundamental reasons about how natural languages work, so we would like our theory to rule them out.

In OT, it can be shown that this pattern would require contradictory rankings: NOCODA would have to outrank, and be outranked by, MAX, which is impossible. HG delivers exactly the same result. To make deletion of one of two potential codas optimal, as in (5a), NOCODA must have a weight greater than MAX. To make preservation of a single potential coda optimal, as in (5b), MAX must have a greater weight than NOCODA. (We use specific weights to illustrate how the calculations work.)

(5) a.
    weight              2         1
    /bantan/            NoCoda    Max    ℋ
       i.  ban.tan      −2         0     −4
    ☞  ii. ba.tan       −1        −1     −3

    b.
    weight              1         2
    /ban/               NoCoda    Max    ℋ
    ☞  i.  ban          −1         0     −1
       ii. ba            0        −1     −2

The contradictory weighting conditions for (5a) and (5b) can be represented more generally, as in (6a) and (6b) respectively. These statements are the HG analogues of the contradictory pair of ranking statements we would require in OT.

(6) a. W(NoCoda)>W(Max)

b. W(NoCoda)<W(Max)
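The contradiction between (6a) and (6b) follows algebraically. In (5a), [ba.tan] beats [ban.tan] only if −W(NoCoda) − W(Max) > −2·W(NoCoda), that is, only if W(NoCoda) > W(Max). In (5b), [ban] beats [ba] only if −W(NoCoda) > −W(Max), that is, only if W(Max) > W(NoCoda). The Python sketch below simply spot-checks this over a finite grid of positive weights, using the violation profiles of (5). The function names and the grid are assumptions made for illustration, and a grid search is not a proof.

```python
from itertools import product


def harmony(weights, violations):
    # Weighted sum of (non-positive) violation scores.
    return sum(w * v for w, v in zip(weights, violations))


def strictly_wins(tableau, winner, weights):
    # True iff `winner` has strictly greater harmony than every competitor.
    target = harmony(weights, tableau[winner])
    return all(target > harmony(weights, viols)
               for out, viols in tableau.items() if out != winner)


# Violation profiles (NoCoda, Max), restricted to the candidates shown in (5).
BANTAN = {"ban.tan": (-2, 0), "ba.tan": (-1, -1)}
BAN = {"ban": (-1, 0), "ba": (0, -1)}

# Illustrative grid of positive weights: 0.1, 0.2, ..., 5.0.
grid = [k / 10 for k in range(1, 51)]

# Weightings under which one coda is kept but a second coda is deleted.
both = [(n, m) for n, m in product(grid, grid)
        if strictly_wins(BANTAN, "ba.tan", (n, m))
        and strictly_wins(BAN, "ban", (n, m))]
```

On this grid `both` comes out empty, as the algebra requires: each mapping is individually available, under ⟨2, 1⟩ and ⟨1, 2⟩ respectively, but no single weighting delivers the two together.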

What’s more, if we include complete candidate sets for the evaluation, then assigning greater weight to NOCODA selects the mapping /bantan/ → [ba.ta] as optimal, whereas assigning greater weight to MAX selects the mapping /bantan/ → [ban.tan] as optimal, just like the two possible total rankings of these constraints in OT.

Importantly, if we were to impose a numerical cut-off on well-formedness, then we could rule out /bantan/ → [bantan] but allow /batan/ → [batan] (with, for example, W(NOCODA)=2, and the cut-off above 2 and below 4). However, our version of HG does not impose numerical cut-offs (for versions of HG that do generate this sort of pattern, see Jäger 2007 and Albright et al. 2008).

As this example illustrates, the grammatical apparatus is designed to model entire languages, not just individual mappings from input to optimal output. Thus, we deal primarily with TABLEAU SETS, which are sets of tableaux that share a single set of constraints but have different inputs.

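(7) Definition 4 (tableau sets)
A tableau set is a pair $(\mathbf{T}, C)$ in which $\mathbf{T}$ is a set of tableaux such that if $T = (A_{\mathit{In}}, C') \in \mathbf{T}$ and $T' = (A'_{\mathit{In}'}, C'') \in \mathbf{T}$ and $T \neq T'$, then $\mathit{In} \neq \mathit{In}'$ and $C = C' = C''$.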

Given a tableau set $(\mathbf{T}, C)$, a weighting function W determines a language by selecting the optimal candidate, if there is one, from each tableau $T \in \mathbf{T}$. Since our (3) uses a strict inequality, some tableaux could theoretically lack optimal candidates. We note again that by using a strict inequality, our definition involves a simplification. Versions of the theory that are designed to handle variation between optima across instances of evaluation do not make this simplifying move (see e.g. Boersma & Pater 2008).

HG is of interest not only because it provides a novel framework for linguistic analysis, but also because its linear model is computationally attractive. HG was originally proposed in the context of a connectionist framework. OT ranking has so far resisted such an implementation (Prince & Smolensky 2004: 236, Legendre et al. 2006: 347). Beyond connectionism, HG can draw on the well-developed models for learning and processing with linear systems in general, and the basic apparatus can be used in a number of different ways. For example, a currently popular elaboration of the core HG framework we explore here is the probabilistic model of grammar proposed by Johnson (2002), and subsequently applied to phonology by Goldwater & Johnson (2003), Wilson (2006), Jäger (2007) and Hayes et al. (2008). In this log-linear model of grammar, usually referred to as Maximum Entropy (MaxEnt) grammar, the probability of a candidate is proportional to the exponential of its harmony, calculated as in (1) above. By assigning a probability distribution to candidates, a MaxEnt grammar can deal with the ‘free variation’ that is captured in OT as probabilistic variation in the ranking of constraints (see Coetzee & Pater, in press for an overview of OT and HG models of variation).
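The MaxEnt relation just described, probability proportional to the exponential of harmony, can be sketched as follows. This is a minimal illustration and not the implementation used in any of the cited papers. The function name `maxent_probabilities` is hypothetical, and the harmony values are those of tableau (5a) under the weights shown there.

```python
import math


def maxent_probabilities(harmonies):
    """Map candidate -> probability, with p(candidate) proportional to exp(harmony).

    Subtracting the maximum harmony before exponentiating leaves the
    ratios unchanged and only guards against numerical underflow.
    """
    top = max(harmonies.values())
    expd = {out: math.exp(h - top) for out, h in harmonies.items()}
    total = sum(expd.values())
    return {out: e / total for out, e in expd.items()}


# Harmonies from (5a): ban.tan scores -4, ba.tan scores -3.
probs = maxent_probabilities({"ban.tan": -4.0, "ba.tan": -3.0})
```

Under this assumption the categorical winner [ba.tan] receives most of the probability mass and the loser keeps a non-zero share. Scaling all the weights up pushes the distribution towards the 0/1 pattern of the categorical model.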
As the above-cited papers emphasise, MaxEnt grammar is particularly appealing in that it has associated provably convergent learning algorithms, unlike extant approaches to variation in OT. Stochastic versions of HG like MaxEnt grammar and Boersma & Pater (2008)’s Noisy HG can subsume our categorical model as a special case in which all candidates have probabilities approaching 1 or 0. What, then, is the interest of our more restrictive approach? We see it as a useful idealisation that facilitates analysis of individual languages and typological study. Even when the analyst’s ultimate aim is to provide a stochastic HG account of a case involving variation, it can be useful to first develop a categorical analysis of a subset of the data.

Despite its attractive properties, HG has been little used in analysing the patterns of constraint interaction seen in the grammars of the world’s languages. One possible reason for the relative neglect is that researchers may have assumed that HG would clearly produce unwanted typological results (Prince & Smolensky 2004: 233). In related work, Pater (2009b) argues that this assumption should be re-examined. By studying a categorical version of HG that differs so minimally from OT, it becomes possible to gain a clearer understanding of the difference between a theory of grammar that has ranked constraints and one that has weighting. Both the Lango ATR-harmony analysis of §5 and the typological investigations of §6 focus on uncovering these differences.

Another likely reason that HG is relatively understudied is that it can be difficult to calculate by hand a weighting for a set of constraints that will correctly prefer the observed output forms over their competitors. Furthermore, in doing linguistic analysis, we are often interested in showing that a particular set of outputs can never coexist in a single language, that is, in showing that a theory is sufficiently restrictive. Establishing that none of the infinitely many possible weightings of a set of constraints can pick a set of outputs as optimal may seem to be an insurmountable challenge. These problems are the motivation for our development of a translation from HG learning to linear programming solving, and for the implementation of this procedure in OT-Help.

3 Our Harmonic Grammar learning problem

The learning problem that we address, in this paper and with OT-Help, is defined in (8).

(8) Let $(\mathbf{T}, C)$ be a tableau set, and assume that each tableau $T = (A_{\mathit{In}}, C) \in \mathbf{T}$ is finite and contains exactly one designated intended winning candidate $o \in A_{\mathit{In}}$. Let $O$ be the set of all such intended winners. Is there a weighting of the constraints in $C$ that defines all and only the forms in $O$ as optimal (definition 3)? If so, what is an example of such a weighting?

This is analogous to a well-studied problem in OT (Tesar & Smolensky 1998b, Prince 2002, Prince & Smolensky 2004): given a set of grammatical forms and a set of constraints C, is there a ranking of the constraints in C that determines all and only the grammatical forms to be optimal? Our approach to the problem is categorical: for a given subset A of the full set of candidates, the algorithm either finds a harmonic grammar that specifies all and only the members of A as optimal, or else it answers that there is no such harmonic grammar. This is by no means the only approach one could take to (8) and related questions. As we mentioned earlier, there are many useful perspectives, including those that allow for approximate learning, those that model learning in noise and so forth. Our particular approach to answering (8) is ideally suited to examining typological predictions of the sort discussed in §6 below.

We do not in this paper address the issue of candidate generation, focusing instead on the twin problems of evaluation and typological prediction. Like the OT implementations in OTSoft (Hayes et al. 2003) and Praat (Boersma & Weenink 2009), and the HG implementations in Praat and those discussed in the MaxEnt literature cited above, we take the candidates as given. This introduces the risk of spurious conclusions based on non-representative candidate sets, but we see no satisfactory way around the problem at present. Riggle (2004a, b) shows how to generate the full set of candidates which are optimal under some ranking of the constraints, but that holds only if, among other things, all the constraints have finite-state implementations. The result carries over to HG. However, it would be a mistake for us to prematurely limit HG to this set of constraints in this way. This is not a limitation that the OT community has made, and we know of no reason to assume that HG makes this more pressing than it is in other approaches.

Although HG does not impose finiteness limitations on its candidate sets, (8) restricts attention to the finite case, in recognition of the fact that OT-Help can deal only with finite systems. There are linear programming methods for dealing with situations in which, in present terms, the candidate set is infinite but the constraint set is finite; López & Still (2007) is an overview.2 However, exploring such algorithms is outside the bounds of this paper. In addition, we suspect that the proper approach here is not to explicitly allow infinite candidate sets, but rather to take a constructive approach to exploring the space of potential winners, as in Harmonic Serialism (McCarthy 2007, 2009, Pater, to appear).3

It is typically fairly easy to answer question (8) for small systems like (9). Here we follow Tesar & Smolensky (1998b) in referring to the desired optimal form as the ‘winner’, and the desired suboptimal candidates as the ‘losers’.

(9)
    weight
    Input           C1    C2    C3    ℋ
    a. Winner       −4     0    −4
    b. Loser         0    −2     0

For (9), we can swiftly reason to the conclusion that a weighting ⟨1, 4.1, 1⟩ suffices. That is, we can use a weighting function W such that W(C1)=1, W(C2)=4.1 and W(C3)=1. This gives ⟨Input, Winner⟩ a total weighted violation count of −8 and ⟨Input, Loser⟩ a total weighted violation count of −8.2. And it is easy to see that many other weightings work as well. But it quickly becomes challenging to reason in this way. Hand calculations are prohibitively time-consuming even for modestly sized systems. This is where linear programming methods become so valuable. They can answer question (8) quickly for even very large and complex systems. We turn now to the task of showing how to apply such algorithms to these data.
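The hand calculation for (9) can be checked mechanically. The sketch below, with hypothetical names `harmony` and `margin`, recomputes the two weighted totals and the winner's margin over the loser. For this tableau the winner is optimal exactly when W(C2) > 2·(W(C1) + W(C3)), which is why 4.1 is enough and 4 is not.

```python
# Violation profiles over (C1, C2, C3), copied from tableau (9).
WINNER = (-4, 0, -4)
LOSER = (0, -2, 0)


def harmony(weights, violations):
    # Weighted sum of (non-positive) violation scores.
    return sum(w * v for w, v in zip(weights, violations))


def margin(weights):
    # Positive iff the winner has strictly greater harmony than the loser.
    return harmony(weights, WINNER) - harmony(weights, LOSER)


m_text = margin((1, 4.1, 1))  # -8 versus -8.2: the winner is ahead by about 0.2
m_tie = margin((1, 4, 1))     # -8 versus -8: a tie, so no optimum
m_other = margin((1, 5, 1))   # -8 versus -10: another workable weighting
```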

2 We thank an anonymous reviewer for bringing this work to our attention.
3 The next version of OT-Help will implement Harmonic Serialism, in which generation and evaluation are combined.

4 From linguistic data to linear systems

In this section, we build a bridge from linguistics into the domain of linear programming algorithms. In doing this, we make powerful and efficient tools available to the linguist wishing to grapple with large, complex data sets. Our description closely follows the algorithm we employ in OT-Help, which incorporates the stand-alone HG solver HaLP (Potts et al. 2007), which has a web interface that allows users to upload their own data files and which displays its results in HTML. Our discussion proceeds by way of the simple tableau set in (10).

(10)
    Input1          C1    C2
    a. Winner1       0    −2
    b. Loser1       −6     0

    Input2          C1    C2
    a. Winner2      −1     0
    b. Loser2        0    −1

In OT, these two winner–loser pairs are inconsistent, since Winner1 requires C1 ≫ C2, and Winner2 requires C2 ≫ C1. Our primary task is to determine whether the same is true in HG, or whether there is a weighting vector ⟨w1, w2⟩ that selects ⟨Input1, Winner1⟩ and ⟨Input2, Winner2⟩ as optimal.

4.1 Equations in the linear system

We first convert the weighting conditions into linear inequalities. For each winner–loser pair, we want an inequality that guarantees that the winner has greater harmony than the loser, as in (3). Weighting conditions like those in (6) are useful for getting a handle on the problem analytically. For the tableau depicted on the left in (10), the weighting condition is (11a), and for the tableau on the right in (10), the weighting condition is (11b).

(11) a. (0·W(C1)) + (−2·W(C2)) > (−6·W(C1)) + (0·W(C2))
     b. (−1·W(C1)) + (0·W(C2)) > (0·W(C1)) + (−1·W(C2))

For the numerical optimisations to follow, we make extensive use of the following notation.

(12) a. 0w1 + −2w2 > −6w1 + 0w2  ⇒  6w1 + −2w2 > 0
     b. −1w1 + 0w2 > 0w1 + −1w2  ⇒  −1w1 + 1w2 > 0

The wi variables are the weights assigned by the weighting function W to these constraints. Inequality (12a) expresses the requirement that the Winner1 output is favoured by the weighting over the Loser1 output, and (12b) expresses the requirement that the Winner2 output is favoured by the weighting over the Loser2 output. These inequalities are the HG equivalents of OT’s Elementary Ranking Conditions (Prince 2002). They can be directly calculated from a winner–loser pair by subtracting the loser’s score on each constraint from that of the winner.
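The subtraction step is mechanical, as the following sketch suggests. It assumes violation scores stored as tuples of non-positive integers; the function name `weighting_condition` and the dictionary layout are illustrative choices, not OT-Help's internal representation.

```python
def weighting_condition(winner, loser):
    """Coefficients c_i of the inequality sum(c_i * w_i) > 0 for one winner-loser pair.

    Each coefficient is the winner's score minus the loser's score on that constraint.
    """
    return [w - l for w, l in zip(winner, loser)]

# Tableau set (10): violation scores on (C1, C2), stored as (winner, loser).
pairs = {
    "Input1": ((0, -2), (-6, 0)),
    "Input2": ((-1, 0), (0, -1)),
}

rows = {name: weighting_condition(w, l) for name, (w, l) in pairs.items()}
print(rows)
```

The two coefficient rows correspond to the right-hand forms of (12a) and (12b).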

Given a tableau set (T, C), we translate each winner–loser pair in each tableau in T into an inequality statement like the above. A weighting answers the learning problem in (8) for (T, C) if and only if it satisfies all of these inequality statements simultaneously.

4.2 The objective function

All and only the vectors ⟨w1, w2⟩ satisfying the inequalities in (12) are solutions to the learning problem (8) for (10). The vectors ⟨1, 2⟩ and ⟨2, 3⟩ suffice, as do an infinite number of others. The structure of linear programming problems gives us an analytically useful way of selecting from the infinitude of possible solutions to a problem like this. The crucial notion is that of an OBJECTIVE FUNCTION. Throughout this paper, we work with very simple objective functions: just those that seek to minimise the sum of all the weights. Thus, for the two-constraint tableau set (10), the objective function is (13).

(13) minimise 1w1 + 1w2

More generally, if there are n constraints, we seek to minimise the sum of all the weights wi for 1 ≤ i ≤ n, subject to the full set of inequalities for the system. However, we have now run into a problem: our optimisation problem is undefined (Chvátal 1983: 43). The vector ⟨1, 2⟩ is not a minimal feasible solution, and neither are ⟨1, 1.5⟩, ⟨1, 1.1⟩, ⟨1, 1.0001⟩, etc. Each is better than the previous one according to (13); there is no minimal solution. Thus we can never satisfy (13); whatever solution we find can always be improved upon.

The problem can be traced to our use of strict inequalities. In stating the problem this way, we are effectively stating a problem of the form ‘find the smallest x such that x > 0’, which is also ill-defined. It won’t do to simply change ‘>’ to ‘≥’, because that would insist only that the winner be at least as good as the losers, whereas our version of HG demands that the winner be strictly better. Thus, to address this problem, we solve for a special constant α. It can be arbitrarily small, as long as it is above 0. It allows us to have regular (non-strict) inequalities without compromising our goal of having the winner win (not tie). This is equivalent to adding the amount α to the weighted sum of the loser’s constraint violations. The value of α defines a margin of separation: the smallest harmony difference between an optimum and its nearest competitor. (Such margins of separation are important for the Perceptron convergence proof; see Boersma & Pater 2008 for an application to HG.)

Our use of the margin of separation α renders certain systems infeasible that would otherwise be feasible. These are the systems in which a winner can at best tie its losing competitors. We want these systems to be infeasible, because we want the winners to be strictly better. But one might wonder whether certain choices of α could rule out systems that we want to judge feasible. For instance, what happens if α is set to be very large? Could this incorrectly rule out a feasible analysis? The answer is no. We assume that there is no maximal weighting for any constraint, and none of our systems contain the conditions that would impose such a ceiling for particular cases. Thus, assume that the chosen constant is α, and assume also that there is a weighting W for which one of the inequality statements sums to a constant d that is smaller than α. Then we simply find a linear rescaling of W that respects our choice of α rather than d. This rescaling could result in infeasibility only if there were a maximal value for some weight. But we assume that there are no such maxima.

4.3 Blocking zero weights

The next question we address is whether to allow 0 weights. A weighting of 0 is equivalent to cancelling out violation marks. To prevent such cancellation, we can impose additional conditions, over and above those given to us directly by the weighting conditions: for each constraint Ci, we can add the inequality wi ≥ b, for some positive constant b. Once again, because we impose no maxima, excluding this subregion does not yield spurious verdicts of infeasibility.

It is worth exploring briefly what happens if we remove the extra non-0 restrictions (if we set the minimal weight b to 0). In such systems, some constraint violations can be cancelled out when weighted, via multiplication by 0. This cancellation occurs when a given constraint is inactive for the data in question, i.e. when it is not required in order to achieve the intended result. For example, our current model returns ⟨1, 1, 1⟩ as a feasible solution for the small system in (14) (assuming that we set the margin of separation α to 1 and the minimal weight b to 1).

(14)
weight          1     1     1
Input          C1    C2    C3     H
a. Winner       0    −1     0    −1
b. Loser       −1     0    −1    −2

In this solution, C1 and C3 GANG UP on C2: with this weighting, neither suffices by itself to beat the loser, but their combined weighted scores achieve the result. However, if we do not ensure that all weights are at least b, then the minimal solutions for these data are ⟨1, 0, 0⟩ and ⟨0, 0, 1⟩, with either of C1 or C3 decisive and the other two constraints inactive. As in this example, imposing a greater than 0 minimum on weights tends to result in solutions that make use of gang effects, while choosing a 0 minimum tends to find solutions that make use of a smaller number of constraints. Exploring the differences between these solutions (as is possible in OT-Help) may help an analyst better understand the nature of the constraint interactions in a system.
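The contrast between the gang-effect solution and the zero-weight solutions for (14) can be checked numerically. This is only a sketch: the function name `margin` is illustrative, and the vector ⟨1, 1, 0⟩ is added here purely to show what happens when one member of the gang is switched off while C2 keeps its weight.

```python
# Violation profiles from tableau (14), on constraints (C1, C2, C3).
WINNER = (0, -1, 0)
LOSER = (-1, 0, -1)

def margin(weights):
    """Harmony of the winner minus harmony of the loser under the given weights."""
    h_winner = sum(v * w for v, w in zip(WINNER, weights))
    h_loser = sum(v * w for v, w in zip(LOSER, weights))
    return h_winner - h_loser

# <1,1,1>: gang effect; <1,0,0> and <0,0,1>: a single decisive constraint;
# <1,1,0>: C1 alone against an equally weighted C2 yields only a tie.
for weights in [(1, 1, 1), (1, 0, 0), (0, 0, 1), (1, 1, 0)]:
    print(weights, margin(weights))
```

A margin of at least α = 1 means the winner is selected; a margin of 0 is a tie, which the system treats as a failure.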

[Graph not reproduced: the boundary lines of the four inequalities below, with the feasible region shaded and the point (1, 2) marked.]

minimise     1w1 + 1w2
subject to   6w1 − 2w2 ≥ 1
             −1w1 + 1w2 ≥ 1
             1w1 ≥ 1
             1w2 ≥ 1

Figure 1 Translation and graph of (10), with the feasible region shaded.
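For a two-constraint system such as the one in Fig. 1, the minimal solution can be found without a simplex implementation, simply by intersecting every pair of boundary lines and keeping the best feasible intersection point. The sketch below takes that brute-force route; it is a stand-in for the simplex algorithm, not a description of it, and the name `solve_2d` and the `(a, b, c)` row encoding (meaning a·w1 + b·w2 ≥ c) are assumptions made here. It also assumes that the objective is bounded below on the feasible region, which holds whenever every weight has a lower bound.

```python
from fractions import Fraction
from itertools import combinations

def solve_2d(rows, objective=(1, 1)):
    """Minimise objective . (w1, w2) subject to a*w1 + b*w2 >= c for each (a, b, c).

    Brute-force vertex enumeration; returns None if no vertex is feasible.
    """
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(rows, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines have no single intersection point
        # Cramer's rule for the intersection of the two boundary lines.
        w1 = Fraction(c1 * b2 - c2 * b1, det)
        w2 = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * w1 + b * w2 >= c for a, b, c in rows):
            value = objective[0] * w1 + objective[1] * w2
            if best is None or value < best[0]:
                best = (value, (w1, w2))
    return None if best is None else best[1]

# The system of Fig. 1: two weighting conditions plus the two minimal-weight rows.
FIG1 = [(6, -2, 1), (-1, 1, 1), (1, 0, 1), (0, 1, 1)]
solution = solve_2d(FIG1)
print(solution)
```

Exact fractions are used so that vertices with non-integer coordinates are compared without rounding error. The approach does not scale beyond toy systems, which is where a proper linear programming algorithm becomes necessary.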

4.4 The final form of the system

The linear system derived from (10) using the above procedure is given in Fig. 1, along with a geometric representation. To provide a concrete solution and a visualisation, we’ve set the margin of separation α to 1 and the minimal weight b to 1.⁴ The optimal weighting here is w1 = 1 and w2 = 2. The current version of OT-Help accepts OTSoft files as input, converts them into tableau sets, translates them using the above procedure and then solves them with the simplex algorithm, the oldest and perhaps most widely deployed linear programming algorithm (Dantzig 1982, Chvátal 1983, Bazaraa et al. 2005).

⁴ This is the default for OT-Help. An advantage of this is that it often returns integer-valued weights, which are helpful for studying and comparing systems.

4.5 Further remarks on the translation

Before putting these technical concepts to work solving linguistic problems, we would like to pause briefly to use graphical depictions like the one in Fig. 1 to clarify and further explore some of the decisions we made in translating from tableau sets to linear systems. Because each linguistic constraint corresponds to a dimension, we are limited to two-constraint systems when visualising, but the technique can nonetheless be illuminating.

[Graph not reproduced: the boundary lines of the four inequalities below, with arrows marking the regions picked out by the two main inequalities.]

minimise     1w1 + 1w2
subject to   −1w1 + 1w2 ≥ 1
             1w1 − 1w2 ≥ 1
             1w1 ≥ 1
             1w2 ≥ 1

Figure 2 The linear view of tableau set (15). The intersection of all the areas picked out by the inequalities is empty, which is just to say that no grammar picks out the set of specified winners.

4.5.1 Infeasibility detection. The graphical perspective immediately makes it clear why some linguistic systems are predicted to be impossible: they have empty feasible regions. Our simple NOCODA/MAX example from §2 provides a good case study. Our goal there was to show that HG, like OT, predicts that it is impossible for a single language to allow /ban/ to surface faithfully as [ban], but for it to penalise just one of the codas in /bantan/, thereby allowing something like [ba.tan] to surface. Here is a tableau set seeking to specify such a language.

(15)
/bantan/          NoCoda   Max          /ban/         NoCoda   Max
a.   ban.tan        −2       0          a. ☞ ban        −1       0
b. ☞ ba.tan         −1      −1          b.   ba          0      −1
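The emptiness of the feasible region for (15) can also be seen algebraically. As a short sketch, write w1 for the weight of NOCODA and w2 for the weight of MAX, subtract the loser's scores from the winner's in each tableau, and require a margin of α > 0:

```latex
\begin{align*}
\text{/bantan/:}\quad & \bigl(-1-(-2)\bigr)\,w_1 + (-1-0)\,w_2 \ge \alpha
  &&\Longrightarrow\quad w_1 - w_2 \ge \alpha \\
\text{/ban/:}\quad & (-1-0)\,w_1 + \bigl(0-(-1)\bigr)\,w_2 \ge \alpha
  &&\Longrightarrow\quad -w_1 + w_2 \ge \alpha \\
\text{sum:}\quad & 0 \ge 2\alpha
\end{align*}
```

Adding the two conditions cancels both weights and leaves 0 ≥ 2α, which no positive α satisfies, so no weighting meets both conditions, whatever minimal weight b is chosen.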

In Fig. 2, we have transformed this tableau set into a linear system and plotted it. The arrows indicate which region the two main inequalities pick out. There is no area common to both of them, which is just to say that the feasible region is empty.

4.5.2 Margins of separation. We asserted in §4.2 that the precise value of α does not matter for addressing the fundamental learning problem (8). Figure 3 helps bring out why this is so. This figure differs minimally from the one in Fig. 1, in that the value of α here is 3 rather than 1. This narrows the bottom of the feasible region, and, in turn, changes the minimal solution, from ⟨1, 2⟩ to ⟨2¼, 5¼⟩, but the important structure of the system is unchanged.

[Graph not reproduced: the boundary lines of the inequalities below, together with the α = 1 boundaries 6w1 − 2w2 ≥ 1 and −1w1 + 1w2 ≥ 1 for comparison, with the point (2¼, 5¼) marked.]

minimise     1w1 + 1w2
subject to   6w1 − 2w2 ≥ 3
             −1w1 + 1w2 ≥ 3
             1w1 ≥ 1
             1w2 ≥ 1

Figure 3 The system in Fig. 1, but with the value of α set to 3, rather than 1. The feasible region has narrowed at the bottom, and the solution is different, but the basic structure remains the same.

One’s choice of the margin of separation α can have consequences for how the solution generalises to unseen data, that is, to tableaux that are not included in the learning data. Suppose, for example, that we evaluate the candidates in the following new tableau, using the weights found with each of the two values of α above.

(16)
Input3            C1    C2
a. Output3a        0    −4
b. Output3b       −9     0
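Since harmony is just a weighted sum of the violation scores, the behaviour of the two weightings on (16) can be recomputed directly. In this sketch the two weight vectors are the minimal solutions reported for α = 1 and α = 3; the names `harmony` and `favoured` are illustrative.

```python
from fractions import Fraction

# Violation profiles from tableau (16), on constraints (C1, C2).
CANDIDATES = {"Output3a": (0, -4), "Output3b": (-9, 0)}

def harmony(violations, weights):
    """Weighted sum of the (non-positive) violation scores of one candidate."""
    return sum(v * w for v, w in zip(violations, weights))

def favoured(weights):
    """Name of the candidate with the greater harmony under the given weights."""
    return max(CANDIDATES, key=lambda name: harmony(CANDIDATES[name], weights))

SOLUTIONS = {
    1: (1, 2),                              # minimal solution with alpha = 1
    3: (Fraction(9, 4), Fraction(21, 4)),   # minimal solution with alpha = 3
}
for alpha, weights in SOLUTIONS.items():
    scores = {name: harmony(v, weights) for name, v in CANDIDATES.items()}
    print(alpha, scores, favoured(weights))
```

The outcome turns on the ratio w2/w1: four violations of C2 outweigh nine violations of C1 only when that ratio exceeds 9/4.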

With α = 3, the optimal weighting vector is ⟨2¼, 5¼⟩, which assigns Output3a a harmony of −21 and Output3b a harmony of −20¼, and so favours Output3b. With α = 1, the optimal weighting vector is ⟨1, 2⟩, which assigns Output3a a harmony of −8 and Output3b a harmony of −9, and so favours Output3a.

4.5.3 Stopping short of optimisation. In discussing the objective function (§4.2), we emphasised finding minimal solutions. While knowing which is the minimal solution can be illuminating, it goes beyond the learning question (8), which simply asks whether there is a feasible solution at all. Our approach can be simplified slightly to address a version of this more basic question, with a resulting gain in efficiency. To see this, we need to say a bit more about how the simplex algorithm works.⁵

The simplex algorithm begins by setting all the weights to 0 and then pivoting around the edge of the feasible region until it hits the optimal solution according to the objective function. Figure 4 illustrates for one of the basic two-variable systems discussed by Cormen et al. (2001: 773). The arrows show one direction that the simplex might take; which direction it travels depends on low-level implementation decisions.

⁵ We stay at a relatively informal level here, since full descriptions of the simplex algorithm invariably run to dozens of pages and involve making a variety of specific assumptions about data structures. Chvátal (1983) presents a variety of different formulations, Cormen et al. (2001: 29) give an accessible algebraic implementation in pseudocode and Bazaraa et al. (2005) is an advanced textbook devoted to the simplex algorithm as well as its newer, theoretically more efficient alternatives.

[Graph not reproduced: the feasible region of the system below, with the points (0, 1), (2, 0), (3, 4) and (2, 6) labelled and arrows tracing a path around its edge from the origin.]

minimise     −1w1 + 1w2
subject to   −4w1 + 1w2 ≥ −8
             −2w1 − 1w2 ≥ −10
             5w1 − 2w2 ≥ −2
             all wi ≥ 0

Figure 4 The simplex algorithm begins at the all-0s solution (the origin), and then pivots around the edge of the feasible region until it finds the vector that does best by the objective function.

For this problem, the all-0s solution is inside the feasible region, so it provides a starting point. However, for all the systems arrived at via the conversion method of §4, setting all the weights to 0 results in an infeasible solution. For this reason, our solver always goes through two PHASES. In phase one, it constructs from the initial system an AUXILIARY SYSTEM for which the all-0s solution is feasible and uses this system to move into the feasible region of the initial problem (ending phase one). In Fig. 1, this auxiliary program takes us from the origin of the graph to the point ⟨1, 2½⟩, which is a feasible solution. The phase two optimisation then brings us down to ⟨1, 2⟩, which minimises the objective function.

The auxiliary program also provides us with a means for detecting infeasibility. One of the central pieces of this auxiliary program is a new artificial variable, w0. After we have solved the auxiliary program, we check the value of this variable. If its value is 0, then we can safely remove it and, after a few additional adjustments, we have a feasible solution to the original problem. If its value is not 0, however, then it is crucial to our finding a solution in the first place, thereby indicating that the initial problem has no solutions. This is the source of the verdict of ‘infeasible’, the linguist’s cue that the grammar cannot deliver the desired set of optimal candidates.

Thus the question of whether there is a feasible weighting is answered during phase one of the simplex, with phase two devoted to potential improvements with regard to the objective function. If such improvements are not of interest, then we can stop at the end of phase one.
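One common textbook way of setting up such an auxiliary program is sketched below. It is offered only as an illustration of the role of the artificial variable w0; the exact construction used in OT-Help, and the pivoting details, may differ. Here the a_{ji} are the coefficients of the j-th inequality of the original system and c_j is its right-hand side (α for a weighting condition, b for a minimal-weight condition):

```latex
\begin{align*}
\text{original system:}\quad
  & \textstyle\sum_{i=1}^{n} a_{ji}\,w_i \ge c_j \quad (j = 1,\dots,m), \qquad w_i \ge 0 \\[4pt]
\text{auxiliary program:}\quad
  & \text{minimise } w_0 \\
  & \text{subject to } \textstyle\sum_{i=1}^{n} a_{ji}\,w_i + w_0 \ge c_j \quad (j = 1,\dots,m),
    \qquad w_i \ge 0,\; w_0 \ge 0
\end{align*}
```

The auxiliary program always has a feasible point (set every w_i to 0 and w_0 to the largest c_j), and its minimum is w_0 = 0 exactly when the original system has a solution, which is why a non-zero optimum for w_0 signals infeasibility.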

5 Lango ATR harmony in HG

We now turn to linguistic analysis using HG, and our linear programming method as implemented in OT-Help. A key argument for OT’s violable constraints is their ability to reduce complex language-specific patterns to more general, plausibly universal principles. For example, Prince & Smolensky (2004: §4) show that a complex pattern of stress in the dialect of Hindi described by Kelkar (1968) can be reduced to the interaction of three general constraints. This reduction depends on constraint violability: two of the three constraints are violated when they conflict with a higher-ranked constraint. In this section, we show that the same sort of argument can be made for replacing OT’s ranked constraints with weighted ones.

Our demonstration takes the form of a case study: ATR harmony in Lango, as described in Bavin Woock & Noonan (1979), from which all the data below are taken. Our analysis is based on generalisations originally uncovered by Bavin Woock & Noonan, and draws heavily on the analyses of Archangeli & Pulleyblank (1994) and Smolensky (2006).⁶ Smolensky’s use of local constraint conjunction drew our attention to the possibility of a treatment in terms of weighted constraints. In §5.2, we argue that the HG analysis improves on the earlier ones: its central principles are more general, and its typological predictions are more restrictive.

Although the constraints in our analysis are simple, their interaction is complex; a correct weighting must simultaneously meet a host of conditions. Finding such a weighting involves extensive calculation. This analysis thus also further illustrates the utility of OT-Help for conducting linguistic analysis in HG.

⁶ Other descriptions of Lango include Okello (1975) and Noonan (1992). We follow Archangeli & Pulleyblank (1994)’s characterisation of Bavin Woock & Noonan’s description so as to facilitate a comparison of our analysis with previous ones. However, it is worth noting a few relevant issues in the data that should be investigated in future research. Okello (1975: 16ff) explicitly denies that right-to-left harmony is limited to high vowel triggers, provides examples of two suffixes with mid vowels that trigger harmony and claims that the failure of a mid vowel to trigger is morphologically determined. Harmony seems to be, in general, more pervasive in the dialect she describes: it is iterative and affects prefixes (cf. Bavin Woock & Noonan 1979, Noonan 1992). Both Okello and Noonan describe the blocking pattern of intervocalic consonants differently from Archangeli & Pulleyblank and Bavin Woock & Noonan, claiming that suffix-initial consonants, rather than clusters, block. Finally, both Okello and Noonan describe the harmony as strictly ATR spreading. The examples of RTR harmony cited by Archangeli & Pulleyblank occur only with a single suffix, the infinitive. Bavin Woock & Noonan also cite several examples of morphological conditioning of infinitival suffix selection with RTR roots. Since the RTR harmony data are particularly unclear, we focus only on ATR harmony.

5.1 Cumulative constraint interaction in Lango

Lango has a ten-vowel system, with five ATR vowels [i e u o ə] and five corresponding RTR vowels [ɪ ɛ ʊ ɔ a]. The following examples of ATR spreading show that it targets RTR vowels in both suffixes (17a–d) and roots (17e–h), in other words, that ATR spreads left-to-right and right-to-left. We have omitted tone from all transcriptions.

(17) a. /wot+ɛ/     [wode]     ‘son (3 sg)’
     b. /ŋut+ɛ/     [ŋute]     ‘neck (3 sg)’
     c. /wot+a/     [wodə]     ‘son (1 sg)’
     d. /buk+na/    [bukkə]    ‘book (1 sg)’
     e. /atɪn+ni/   [atinni]   ‘child (2 sg)’
     f. /dɛk+ni/    [dekki]    ‘stew (2 sg)’
     g. /lʊt+wu/    [lutwu]    ‘stick (2 pl)’
     h. /lɛ+wu/     [lewu]     ‘axe (2 pl)’

These examples also show that ATR spreads from high vowel triggers (17b, d–h) as well as from mid vowels (17a, c), and from both front (17e, f) and back vowels (17a–d, g, h). The examples also show that it crosses consonant clusters (17d–g) and singletons (17a–c, h). Finally, they show that it targets high vowels (17e, g), mid vowels (17a, b, f, h) and low vowels (17c, d).

For each of these options for trigger, directionality, intervening consonant and target, there is a preference, which is instantiated in the absence of spreading when that preference is not met. The preferences are listed in (18), along with examples of the failure to spread under dispreferred conditions, as well as references to the minimally different examples in (17) in which ATR spreading does occur in the preferred environment.

(18) Conditions favouring ATR-spreading in Lango
     a. High vowel trigger
        i.  R-L spreading only when the trigger is high
            /nɛn+Co/ [nɛnno] *[nenno] ‘to see’ cf. (17e–h)
        ii. L-R spreading across a cluster only when the trigger is high
            /gwok+na/ [gwokka] *[gwokkə] ‘dog (1 sg)’ cf. (17c)
     b. L-R directionality⁷
        i.  Mid vowel triggers spread only L-R
            /lɪm+Co/ [lɪmmo] *[limmo] ‘to visit’ cf. (17a, c)
        ii. Spreading from a back trigger across a cluster to a non-high target only L-R
            /dɛk+wu/ [dɛkwu] *[dekwu] ‘stew (2 pl)’ cf. (17d)
     c. Intervening singleton
        i.  L-R spreading from mid vowels occurs only across a singleton
            /gwok+na/ [gwokka] *[gwokkə] ‘dog (1 sg)’ cf. (17a, c)
        ii. R-L spreading from a back trigger to a non-high target only across a singleton
            /dɛk+wu/ [dɛkwu] *[dekwu] ‘stew (2 pl)’ cf. (17h)
     d. High target
        R-L spreading from a back trigger across a cluster only to high vowels⁸
        /dɛk+wu/ [dɛkwu] *[dekwu] ‘stew (2 pl)’ cf. (17g)
     e. Front trigger
        R-L spreading across a cluster to a mid target only from a front trigger
        /dɛk+wu/ [dɛkwu] *[dekwu] ‘stew (2 pl)’ cf. (17f)

⁷ The greater strength of L-R spreading also seems to be instantiated in the fact that it iterates and thus targets vowels non-adjacent to the original trigger, while R-L spreading iterates only optionally (Bavin Woock & Noonan 1979, Poser 1982, Noonan 1992, Kaplan 2008). Like Archangeli & Pulleyblank (1994) and Smolensky (2006), we abstract from the iterativity-directionality connection here, though see Jurgec (2009) for a treatment of iterativity in vowel harmony that appears compatible with our analysis.

We would like an account of the harmony pattern that encodes each of these preferences with a single constraint. No such account currently exists in either OT or in rule-based approaches, as we discuss in §5.2. We now show that such an account is available under the assumption that constraints are weighted.

We follow Smolensky (2006) in ascribing the Lango trigger and directionality preferences to constraints on the heads of feature domains, though our implementation differs somewhat in the details. Headed domain structures for ATR are illustrated in (19b) and (19d), in which the ATR feature domain spans both vowels. In (19b) the head is on the rightmost vowel, and in (19d) the head is leftmost. Unlike Smolensky (2006), we assume that a feature domain is minimally binary: a relation between a head and at least one dependent. In the disharmonic sequences in (19a) and (19c), the ATR feature is linked to a single vowel, and there is no head–dependent relation. The assumption that the ATR vowels in (19a) and (19c) are not domain heads is crucial to our definition of the constraints on triggers below. In these representations, a vowel unspecified for ATR is RTR; the use of underspecification here is purely for convenience.

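(19) ATR structures [association lines not reproduced]
     a. pɛti   (ATR linked to the second vowel only; no domain)
     b. peti   (ATR domain spanning both vowels; head rightmost)
     c. petɪ   (ATR linked to the first vowel only; no domain)
     d. peti   (ATR domain spanning both vowels; head leftmost)

⁸ Noonan (1992) notes that, for some speakers, mid vowels do assimilate to following high back vowels across a cluster. This pattern can be straightforwardly accommodated by a different weighting of our constraints, for example, one just like that in Table I, but with the weights of both HEAD[front] and ATR[high] decreased to 1.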

We assume that it is definitional of the head of the domain that it is faithful to its underlying specification: a head of an ATR domain is underlyingly ATR. For spreading to occur, there must be a constraint that disprefers representations like those in (19a) and (19c) relative to (19b) and (19d) respectively. We adopt a single constraint that penalises both (19a) and (19c): SPREAD[ATR] (see Wilson 2003, Smolensky 2006, Jurgec 2009 and McCarthy 2009 for alternative formulations of a spreading constraint).

(20) Spread[ATR]
     For any prosodic domain x containing a vowel specified as ATR, assign a violation mark to each vowel in x that is not linked to an ATR feature.

Since ATR harmony applies between roots and suffixes in Lango, the domain x in (20) must include them and exclude prefixes. The transformation of an underlying representation like (19a) into a surface representation like (19b) is an instance of R-L spreading, which is dispreferred in Lango. The representation in (19b) violates the constraint in (21).⁹

(21) Head-L
     Assign a violation mark to every head that is not leftmost in its domain.

For underlying (19a), HEAD-L and SPREAD[ATR] conflict: SPREAD[ATR] prefers spreading, as in (19b), while HEAD-L prefers the faithful surface representation (19a). The transformation of an underlying representation like (19c) into a surface representation like (19d) is an instance of spreading from a mid trigger, which is also dispreferred in Lango. This violates the constraint in (22), which also conflicts with SPREAD[ATR].

(22) Head[high]
     Assign a violation mark to every head that is not high.

⁹ Baković (2000) and Hyman (2002) claim that preferences for L-R harmony are always morphologically conditioned. A more typologically responsible analysis might replace HEAD-L with a constraint demanding that heads be root vowels, since R-L harmony in Lango always targets root vowels. Some support for this analysis comes from the dialect of Lango described by Okello (1975), in which prefixes undergo harmony, but do not trigger it. We use HEAD-L for ease of comparison with Archangeli & Pulleyblank (1994) and Smolensky (2006).

Similarly, front triggers are preferred by HEAD[front].

(23) Head[front]
     Assign a violation mark to every head that is not front.

As for the constraint preferring spreading across singleton consonants, we follow Archangeli & Pulleyblank (1994) in invoking a locality constraint.

(24) Local-C
     Assign a violation mark to every cluster intervening between a head and a dependent.

And finally, as the constraint penalising spreading to a non-high target, we follow Archangeli & Pulleyblank (1994) and Smolensky (2006) in using a co-occurrence constraint.

(25) ATR[high]
     Assign a violation mark to every ATR vowel that is not high.

With this large set of markedness constraints that can conflict with the pro-spreading constraint SPREAD[ATR], faithfulness constraints are not necessary to characterise the patterns of blocking and spreading we have examined, and so we use only markedness constraints in the analysis we present here. A complete analysis would also include the faithfulness constraints violated by spreading (e.g. IDENT[ATR]) and faithfulness constraints that penalise alternative means of satisfying SPREAD[ATR] (e.g. MAX for segment deletion). We exclude these for reasons of space only.

Like Smolensky (2006), we consider as inputs all bisyllabic sequences containing one ATR and one RTR vowel. The potential trigger ATR vowel is either high front [i], high back [u] or mid [e]. The potential target RTR vowel is either high [ɪ] or mid [ɛ]. We illustrate the analysis with just this subset of the vowels to make the presentation as clear as possible; some of the exact combinations are not attested in (17) and (18) or in Bavin Woock & Noonan (1979) (e.g. the potential mid trigger is in fact [o] in (17) and (18)). For each ATR/RTR pair, we consider sequences with both orderings of the vowels, and for each of these, we consider inputs with intervening singletons and clusters. For each of these inputs, we consider two candidates: the faithful one, and one in which the input RTR vowel surfaces as ATR. The unfaithful candidates are assumed to have the structure illustrated in (19b, d), where the underlying RTR vowel is parsed as the dependent in the ATR domain.

In Table I, we provide a subset of the inputs, chosen for reasons we discuss below, along with the two candidates. The optimal form is labelled the winner, and the suboptimal candidate is labelled the loser (Prince 2002). A ‘W’ in a constraint column indicates that the constraint favours the winner, and an ‘L’ indicates that the constraint favours the loser. All of the constraints assign maximally one violation, so a constraint that favours the winner is violated once by the loser, and a constraint that favours the loser is violated once by the winner. The SPREAD[ATR] constraint assigns a W when the optimal form has undergone spreading, and an L when the optimal form does not. All of the other constraints assign Ls in some cases of spreading, and Ws in some cases when the candidate with spreading is suboptimal.

                               Spread   Head     Head-L   Local-C   Head      ATR
                               [ATR]    [high]                      [front]   [high]
weight                         11       8        4        4         2         2        margin
       input   W ~ L
T1     iCɛ     iCe ~ iCɛ       W                                              L        9
T2     uCɛ     uCe ~ uCɛ       W                                    L         L        7
T3     eCɛ     eCe ~ eCɛ       W        L                                     L        1
T4     ɛCi     eCi ~ ɛCi       W                 L                            L        5
T5     ɛCu     eCu ~ ɛCu       W                 L                  L         L        3
T6     ɪCe     ɪCe ~ iCe       L        W        W                                     1
T7     iCCɛ    iCCe ~ iCCɛ     W                          L                   L        5
T8     uCCɛ    uCCe ~ uCCɛ     W                          L         L         L        3
T9     eCCɪ    eCCɪ ~ eCCi     L        W                 W                            1
T10    ɛCCi    eCCi ~ ɛCCi     W                 L        L                   L        1
T11    ɛCCu    ɛCCu ~ eCCu     L                 W        W         W         W        1
T12    ɪCCu    iCCu ~ ɪCCu     W                 L        L         L                  1

Table I Informative winner–loser pairs for Lango vowel harmony, with constraint weights and margins of separation.
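The margins in Table I follow from the weights and the W/L marks alone, so the table can be checked mechanically. The sketch below encodes each row as a six-character string, one character per constraint in the order of the table's columns, with `.` standing for an empty cell; this encoding and the names used are illustrative and are not OT-Help's file format. It also checks which constraints prefer no loser, since Recursive Constraint Demotion can only begin by ranking such a constraint.

```python
# Weights from Table I, in column order.
WEIGHTS = {"Spread[ATR]": 11, "Head[high]": 8, "Head-L": 4,
           "Local-C": 4, "Head[front]": 2, "ATR[high]": 2}
CONSTRAINTS = list(WEIGHTS)

# One string per winner-loser pair: W, L or '.' (empty cell) per constraint.
ROWS = {
    "T1": "W....L", "T2": "W...LL", "T3": "WL...L", "T4": "W.L..L",
    "T5": "W.L.LL", "T6": "LWW...", "T7": "W..L.L", "T8": "W..LLL",
    "T9": "LW.W..", "T10": "W.LL.L", "T11": "L.WWWW", "T12": "W.LLL.",
}

def margin(marks):
    """Summed weights of winner-preferring constraints minus loser-preferring ones."""
    total = 0
    for name, mark in zip(CONSTRAINTS, marks):
        if mark == "W":
            total += WEIGHTS[name]
        elif mark == "L":
            total -= WEIGHTS[name]
    return total

margins = {pair: margin(marks) for pair, marks in ROWS.items()}

# Constraints that prefer no loser anywhere; an empty list means RCD stalls at once.
rankable_first = [name for i, name in enumerate(CONSTRAINTS)
                  if all(marks[i] != "L" for marks in ROWS.values())]

print(margins)
print(rankable_first)
```

Because every constraint assigns at most one violation here, each W or L corresponds to a violation difference of exactly 1, which is what licenses adding and subtracting the bare weights.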

There is no OT ranking of these constraints that will correctly make all of the winners optimal. None of the constraints prefers only winners, and so Recursive Constraint Demotion will immediately stall. The topmost row shows the weights found by submitting these winner–loser pairs to the implementation of our linear programming-based solver in OT-Help. The rightmost column shows the resulting margin of separation between the optimum and its competitor, i.e. the difference between the harmony scores of the winner and the loser. Since, in this case, the constraints assign a maximum of one violation, the difference between the violation score of a winner and a loser on a given constraint is at most 1. Therefore, the margin of separation is simply the sum of the weights of the constraints that prefer the winner minus the sum of the weights that prefer the loser. The fact that these numbers are always positive shows that winners are correctly optimal under this weighting.¹⁰

¹⁰ A display of this type is available in OT-Help as the ‘comparative view’. In lieu of Ws and Ls, the HG comparative view uses positive and negative integers respectively.

The first six winner–loser pairs contrast L-R spreading and R-L spreading across an intervening singleton. The first three are input configurations that can yield L-R spreading, since the ATR vowel is on the left. Spreading is always optimal, even with a target mid vowel, which violates ATR[high] when it harmonises. We have left out inputs with potential target high vowels, since with this constraint set, if spreading targets a mid vowel, it is guaranteed to target a high vowel in the same context. ATR[high] penalises spreading to mid vowels, and there is no constraint that specifically penalises spreading to high vowels.

The next three inputs (T4–6) are ones that can yield R-L spreading, since the ATR vowel occurs in the second syllable. Spreading in fact occurs with high triggers (T4–5), but not mid ones (T6). To illustrate the case in which spreading fails to occur, we include only an input with a potential high target /ɪ/, since, if a high vowel in a certain environment fails to undergo spreading, a mid vowel is guaranteed to fail as well. The blocking of spreading in T6 is due to the joint effects of HEAD[high] and HEAD-L: the sum of their weights is greater than the weight of SPREAD[ATR]. An analysis in terms of such a gang effect is necessary because neither HEAD[high] alone (as in T3) nor HEAD-L alone (as in T4 and T5) is sufficient to override spreading. This is thus one source of difficulty for an OT analysis with these constraints: if either HEAD[high] or HEAD-L were placed above SPREAD[ATR] to account for T6, the wrong outcome would be produced for one of T3–5.

Inputs T7–9 provide the conditions for L-R spreading across a cluster. Spreading is blocked with a mid trigger (T9), in contrast to L-R spreading across a singleton (T3). Again, we include only the input with the potential high target to illustrate blocking, since spreading to a mid target violates a proper superset of the constraints. Blocking here is due to the combined effects of HEAD[high] and LOCAL-C, whose summed weights exceed that of SPREAD[ATR]. That LOCAL-C alone does not override SPREAD[ATR] is shown in T7–8. Again, since cumulative interaction is needed to get the correct outcome with this constraint set, OT ranking is not sufficiently powerful to deal with this set of winner–loser pairs.

Finally, inputs T10–12 illustrate the least preferred context for spreading: when the ATR vowel is on the right, and a cluster intervenes. Here, and in no other context, spreading is blocked if the trigger is back and the target is mid. This outcome is shown in T11, which can be compared with T2, 5 and 8, in which spreading does occur in other contexts. This is a gang effect between four constraints, HEAD-L, LOCAL-C, HEAD[front] and ATR[high], whose summed weight exceeds that of SPREAD[ATR]. That no set of three of these constraints is sufficiently potent to overcome SPREAD[ATR] is illustrated by inputs T5, 8, 10 and 12, whose optimal outputs have spreading that violates one of the four possible three-membered sets of these constraints. We do not include potential mid triggers in the set of inputs, since R-L spreading already fails to occur across a singleton (T6), and spreading across a cluster also violates LOCAL-C.

In sum, the cumulative effect of any of the following three sets of constraints overcomes the demands of SPREAD[ATR].

(26) a. Head[high], Head-L
        No R-L spreading from mid vowels.
     b. Head[high], Local-C
        No spreading from mid vowels across a cluster.
     c. Head-L, Local-C, Head[front], ATR[high]
        No R-L spreading from back vowels across a cluster to a mid vowel target.

No other set of constraints that does not include all of the members of one of the sets in (26) is sufficiently powerful to override SPREAD[ATR]: spreading occurs in all other contexts. A correct constraint weighting must simultaneously meet the conditions that the sum of the weights of each of the sets of constraints in (26) exceeds the weight of SPREAD[ATR], and that the sum of the weights of each of these other sets of constraints is lower than the weight of SPREAD[ATR]. OT-Help allows such a weighting to be found easily.
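The two-sided condition on weights stated above can be checked mechanically. The following Python sketch is only an illustration: the weights are hypothetical values chosen to satisfy the conditions, not the weights found for Lango, and the function names are ours. It enumerates every subset of the five constraints that spreading can violate and confirms that a subset outweighs SPREAD[ATR] exactly when it contains one of the sets in (26). Not every subset need correspond to an actual candidate; the enumeration is simply a conservative check.

```python
from itertools import combinations

# Hypothetical weights for illustration only (not the Lango weights).
WEIGHTS = {
    "Head[high]": 12,
    "Head-L": 8,
    "Local-C": 8,
    "Head[front]": 2,
    "ATR[high]": 2,
}
SPREAD = 19  # hypothetical weight of Spread[ATR]

# The three blocking sets listed in (26).
GANGS = [
    {"Head[high]", "Head-L"},
    {"Head[high]", "Local-C"},
    {"Head-L", "Local-C", "Head[front]", "ATR[high]"},
]

def blocks_spreading(violated, weights, spread):
    """True if the summed weight of the violated constraints exceeds Spread[ATR]."""
    return sum(weights[c] for c in violated) > spread

def weighting_matches_26(weights=WEIGHTS, spread=SPREAD):
    """Every subset must block spreading iff it contains a set from (26)."""
    names = list(weights)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            subset = set(subset)
            expected = any(gang <= subset for gang in GANGS)
            if blocks_spreading(subset, weights, spread) != expected:
                return False
    return True

print(weighting_matches_26())
```

Lowering the hypothetical weight of Head[high] to 8, for instance, makes the check fail, because Head[high] and Head-L together would no longer outweigh Spread[ATR].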

5.2 Comparison with alternatives

If the constraints in the previous section were considered either inviolable, as in theories outside of HG and OT, or rankable, as in OT, they would be insufficient for analysis of the Lango paradigm. In this section, we consider extant analyses constructed under each of these assumptions about the activity of constraints. We show that they suffer in terms of both generality and restrictiveness.

In their parametric rule-based analysis, Archangeli & Pulleyblank (1994) posit five rules of ATR spreading. Each rule specifies directionality and optional trigger, target and locality conditions. These are schematised in Table II. Cells left blank indicate that the rule applies with all triggers, targets or intervening consonants. The conditions are inviolable constraints on the application of the rules. Because of their inviolability, they must be limited to apply only to particular rules: none of them are true generalisations about ATR spreading in the language as a whole. Even though the directionality, trigger and locality preferences do not state completely true generalisations, they have

direction   trigger       target   locality
L-R                                VCV
L-R         high
R-L         high                   VCV
R-L         high          high
R-L         high, front

Table II The rules of Archangeli & Pulleyblank (1994), each of which speciﬁes directionality and optional trigger, target and locality conditions. Cells left blank indicate that the rule applies with all triggers, targets or intervening consonants.

broad scope in the ATR system of Lango, and must therefore be encoded as constraints on multiple rules. Thus inviolability entails the fragmentation of each generalisation across separate formal statements. By encoding the conditions as parametric options for rules, Archangeli & Pulleyblank succeed in relating them at some level, but, in the actual statement of the conditions on spreading in Lango, there is a clear loss of generality in comparison with our weighted constraint reanalysis.11 We can further note that there exists no proposal for how a learner sets such parameters for spreading rules (see Dresher & Kaye 1990 on metrical parameters). Correct weights for our constraints can be found not only with linear programming’s simplex algorithm, but also with the Perceptron update rule (Pater 2008; see also Boersma & Pater 2008) and a host of other methods developed for neural modelling and machine learning.

11 In one respect, Archangeli & Pulleyblank (1994) and Smolensky (2006) aim to generalise further than we do: to derive high vowel trigger restrictions in ATR harmony from the unmarkedness of ATR on high vowels. Pater (2009a) questions this move, pointing out that some harmony systems spread preferentially from marked vowels. John McCarthy (personal communication) notes that the strength of high triggers likely results from the greater advancement of the tongue root in high vowels. We formally encode this irreducible phonetic fact as the HEAD[high] constraint.

Along with this loss of generality, there is a loss of restrictiveness.12 In Archangeli & Pulleyblank’s parametric rule system, any set of rules with any combination of conditions can coexist in a language. Davis (1995) and McCarthy (1997) discuss this aspect of the theory with respect to disjoint target conditions on two RTR-spreading rules; here we consider the further possibilities introduced by trigger and locality conditions.

12 The large space of possibilities afforded by the parametric theory is the impetus behind the development of Archangeli & Pulleyblank’s own OT analysis of Lango, whose notion of ‘trade-offs’ may be seen as a sort of a precedent to our HG treatment.

One notable aspect of the Lango system is that L-R spreading is ‘stronger’ in all respects: there is no environment in which R-L spreading applies more freely with respect to any of the conditions. This ‘uniform strength’ property is predicted by the HG analysis, but not by the one using parametric rules. As Davis and McCarthy show, the latter theory allows one rule to apply more freely with respect to one condition, and another rule to apply more freely with respect to another condition. For example, with the following parameter settings, L-R spreading targets only high vowels, while R-L spreading has only high vowels as triggers. The set of triggers is unrestricted for L-R spreading, whereas the set of targets is unrestricted for R-L spreading.

direction   trigger   target
L-R                   high
R-L         high

Table III Parameter setting in which L-R spreading targets only high vowels, while R-L spreading has only high vowels as triggers.

To see that this system is impossible in HG, we can consider the required weighting conditions. Along with HEAD-L, violated by R-L spreading, we include in our constraint set HEAD-R, which penalises L-R spreading. The weighting conditions are illustrated in Table IV, using the comparative format.

input   W ~ L         Spread[ATR]   Head[high]   Head-L   Head-R   ATR[high]
e…I     e…i ~ e…I     W             L                     L
i…E     i…E ~ i…e     L                                   W        W
E…i     e…i ~ E…i     W                          L                 L
I…e     I…e ~ i…e     L             W            W

Table IV Inconsistent weighting conditions for a hypothetical pattern.
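The inconsistency of these weighting conditions can be verified without a solver. The sketch below is an illustration under our own encoding (+1 for W, -1 for L, 0 for a blank cell), with the rows as we read them off Table IV. Because every column sums to zero across the four rows, the four winner-minus-loser margins sum to zero under any weighting, so they cannot all be positive. A small grid search over integer weights is included as a sanity check.

```python
from itertools import product

CONSTRAINTS = ["Spread[ATR]", "Head[high]", "Head-L", "Head-R", "ATR[high]"]

# Rows of Table IV in comparative form: +1 = W, -1 = L, 0 = blank cell.
ROWS = {
    "e…i ~ e…I": [+1, -1, 0, -1, 0],
    "i…E ~ i…e": [-1, 0, 0, +1, +1],
    "e…i ~ E…i": [+1, 0, -1, 0, -1],
    "I…e ~ i…e": [-1, +1, +1, 0, 0],
}

def margin(row, weights):
    """Weighted advantage of the winner over the loser in one row."""
    return sum(w * v for w, v in zip(weights, row))

def column_sums(rows):
    """Sum each constraint's column over all rows."""
    return [sum(col) for col in zip(*rows)]

def all_winners_optimal(weights):
    """True if every winner strictly beats its loser under these weights."""
    return all(margin(row, weights) > 0 for row in ROWS.values())

sums = column_sums(ROWS.values())
print(dict(zip(CONSTRAINTS, sums)))

# Sanity check: no weighting with integer weights 1..5 satisfies all four rows.
feasible = [w for w in product(range(1, 6), repeat=5) if all_winners_optimal(w)]
print(feasible)
```

The zero column sums play the role of a certificate of infeasibility, which is the verdict the text reports for this problem.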

L-R spreading is illustrated in the top two rows: ATR can spread from a mid vowel, violating HEAD[high], but not to a mid vowel, which would violate ATR[high]. R-L spreading, on the other hand, can violate ATR[high], as in the third row, but not HEAD[high], as in the last one. Recall that for the winners to be correctly optimal, in each row the sum of the weights of the constraints assigning Ws must be greater than the sum of the weights of the constraints assigning Ls. The resulting inequalities are in fact inconsistent. When this problem is submitted to OT-Help, it returns a verdict of infeasible.

By imposing other combinations of conditions on parameterised rules, there is a range of systems that one can create in which R-L spreading is stronger in one respect, and L-R is stronger in another. None of these can be generated by weightings of our constraints, since they always require inconsistent weighting conditions like those illustrated in Table IV.

The general inability of HG to generate a system of this type can be understood as follows.13 If there is a condition on spreading that applies in one direction but not another, then the sum of the weights of the constraints violated by spreading in the banned direction must be greater than the sum of the weights of the constraints violated by spreading in the allowed direction (since only the former can exceed the weight of the constraint(s) motivating spreading, like our SPREAD). By assumption, the constraints violated under any target, trigger or locality condition are the same for both directions of spreading. Therefore, this requirement reduces to the statement that the weight of the constraint(s) violated specifically by spreading in the banned direction (e.g. HEAD-R) must be greater than in the permitted one (e.g. HEAD-L). From this it should be clear why imposing a second condition on spreading that holds only in the opposite direction would result in inconsistency amongst the weighting conditions.

13 This restriction is a generalisation of the subset criterion on targets in bidirectional spreading in OT that McCarthy (1997) attributes to personal communication from Alan Prince.

Smolensky’s (2006) analysis of Lango in terms of conjoined constraints pursues a similar strategy to that of Archangeli & Pulleyblank (1994). Since OT does not allow the pattern to be analysed in terms of fully general constraints, Smolensky uses constraint conjunction to formulate complex constraints in terms of more basic formal primitives, much in the same way that Archangeli & Pulleyblank use parameterisation of rules. Again, we find the same basic constraints instantiated multiple times in the analysis, this time across conjoined constraints. To facilitate comparison with our analysis, we show this using the basic constraints from §5.1, rather than Smolensky’s own. To get spreading from mid vowel triggers L-R, but not R-L, we conjoin HEAD[high] and HEAD-L. For spreading across clusters only from high vowels, we conjoin HEAD[high] and LOCAL-C. Each of these conjoined constraints is violated when both of the basic constraints are violated.

In Table V, we show how the conjoined constraints can resolve two of the sources of inconsistency in the failed OT analysis, using our constraint set from §5.1. In this table, the left-to-right ordering of the constraints provides a correct ranking (the dashed lines separate constraints whose ranking is indeterminate). The first two rows show the conjoined constraint analysis of spreading from mid vowels only L-R, and the second two show the analysis of spreading across clusters from only high vowels.

input   W ~ L          Head[high]&Head-L   Head[high]&Local-C   Spread[ATR]   Head[high]   Head-L   Local-C
eCI     eCi ~ eCI                                               W             L
ICe     ICe ~ iCe      W                                        L             W            W
iCCI    iCCi ~ iCCI                                             W                                   L
eCCI    eCCI ~ eCCi                        W                    L             W                     W

Table V The use of local conjunction to resolve inconsistency in the OT analysis of Lango.

Here HEAD[high] appears in three constraints, much as the high trigger condition is imposed on multiple rules in Table II. Thus the conjoined constraint analysis also succeeds only at the cost of a loss of generality relative to the weighted constraint analysis. And, as with the parametric theory, there is no learning algorithm for constraint conjunction (Smolensky 2006: 139). Furthermore, it shares with the parametric analysis the same loss of restrictiveness identified above. To show this, we provide in Table VI a local conjunction analysis of the hypothetical pattern in which only L-R spreading is triggered by mid vowels (due to conjoined HEAD[high]&HEAD-L), and only R-L spreading targets mid vowels (due to conjoined ATR[high]&HEAD-R).

input   W ~ L         Head[high]&Head-L   ATR[high]&Head-R   Spread[ATR]   Head[high]   Head-L   Head-R   ATR[high]
e…I     e…i ~ e…I                                            W             L                     L
i…E     i…E ~ i…e                         W                  L                                   W        W
E…i     e…i ~ E…i                                            W                          L                 L
I…e     I…e ~ i…e     W                                      L             W            W

Table VI The use of local conjunction to resolve inconsistency in the analysis of a hypothetical language.

For other cases in which local constraint conjunction in OT generates patterns not produced by the unconjoined versions of the basic constraints in HG, see Legendre et al. (2006) and Pater (to appear).

The comparison of the typological predictions of the three analyses highlights an important general point about comparisons between theories of constraint interaction, which might be easy to overlook. One might be tempted to favour a less powerful theory of constraint interaction on the grounds that it will offer a more restrictive theory of linguistic typology. However, the predictions of a theory of constraint interaction also depend on the contents of the constraint set. Insofar as a more powerful theory of constraint interaction allows attested patterns to be analysed with a more restricted constraint set, the resulting typological predictions are likely to be in some ways more restrictive. This is just as true of comparisons between HG and OT as it is of comparisons between ranked and inviolable constraints.

We offer the Lango case study as a concrete illustration of this general point. We are not asserting that it is a decisive argument in favour of HG over OT. We offer it instead in the hope that it will inspire further use of HG in linguistic analysis. There are a number of unresolved empirical issues surrounding Lango vowel harmony (see note 5) and the related typology. In recent work, McCarthy (2009) surveys the known cases in which bidirectional harmony has stronger restrictions on spreading in one direction than another, and concludes that all are doubtful for one reason or another. McCarthy’s critical survey is in fact driven by the inability of his proposed constraint set to produce such patterns when they interact through OT ranking. Further cross-linguistic work driven by the current positive HG results may well yield a different outcome. Not only is further empirical study required to choose between HG and OT, but much further theoretical work is also needed to determine the ways in which HG and OT constraint sets can differ in analyses of existing languages, and the ways in which the resulting theories differ in their predictions. As we show in the following sections, OT-Help is invaluable not only in conducting analyses of individual languages in HG, but also in determining the predictions that constraint sets make in HG and OT.

6 Harmonic Grammar typology

OT provides a successful framework for the study of linguistic typology, and this has been a key component of its success. A central question is what kind of typological predictions HG makes, especially since these predictions have been claimed to be unsupported (Prince & Smolensky 1997, 2004, Legendre et al. 2006; cf. Pater 2009b). The present section begins to explore this question via a number of computational simulations designed to highlight points of convergence and divergence between the two frameworks. OT-Help is essential here. It allows us to explore enormous typological spaces efficiently and to compare the resulting predictions of both OT and HG. All the data files used in these simulations are downloadable (December 2009) from http://web.linguist.umass.edu/~OTHelp/data/hg2lp/. Readers can immediately repeat our simulations using OT-Help. (A user’s manual is available as Becker & Pater 2007.)

6.1 Typology calculation

In OT, a language is a set of optimal forms picked by some ranking of the constraints, and the predicted typology is the set of all the sets of optima picked by any ranking of the constraints. OTSoft (Hayes et al. 2003) determines the predicted typology by submitting sets of optima to the Recursive Constraint Demotion algorithm (RCDA) (Tesar & Smolensky 1998a), which either finds a ranking or indicates that none exists. OT-Help implements the RCDA as well as our linear programming approach, so we can use it to conduct typological comparisons between the two theories.

OTSoft builds up the typology by using an iterative procedure that adds a single tableau at a time to the RCDA’s dataset. When a tableau is added to the dataset, the sets of optima that are sent to the RCDA are created by adding each of the new tableau’s candidates to each of the sets of feasible optima that have already been found for any previously analysed tableaux. The RCDA then determines which of these new potential sets of optima are feasible under the constraint set. This procedure iterates until all of the tableaux have been added to the dataset. This is a much more efficient method of finding the feasible combinations of optima than enumerating all of the possible sets of optima and testing them all. OT-Help uses this procedure for both HG and OT.

6.2 The typology of positional restrictions

In the analysis of Lango, we pointed out that one can compare the typological predictions of HG and OT only with respect to the constraint sets that each framework requires to analyse some set of attested phenomena. In that discussion, we compared HG to OT with local constraint conjunction, showing that the less restricted constraint sets permitted by local conjunction yielded less restrictive predictions for typology. Here, we compare HG and OT using non-conjoined constraints, showing again that the greater power of HG can allow for a more restrictive theory. Our example of positional restrictions is drawn from Jesney (to appear), to which the reader is directed for a more detailed discussion; our aim here is only to show how the example illustrates this general point.

Research in OT makes use of two types of constraint to analyse what seems to be a single phenomenon: the restriction of phonological structures to particular prosodic positions. These two types of constraint – positional markedness (e.g. Itô et al. 1995, Zoll 1996, 1998, Walker 2001, 2005) and positional faithfulness (e.g. Casali 1996, Beckman 1997, 1998, Lombardi 1999) – capture many of the same phenomena in OT, but neither is sufficiently powerful on its own to account for the full set of attested positional restrictions. In HG, however, positional markedness constraints are able to capture a wider range of patterns, making positional faithfulness unnecessary for these cases.

Positional markedness constraints directly restrict marked structures to the ‘licensing’ position. Given voicing as the marked feature, for example, the constraint in (27a) disprefers any surface instance of [+voice] that appears unassociated with an onset segment, and the constraint in (27b) disprefers any surface instance of [+voice] that appears unassociated with the initial syllable.


(27) a. VoiceOnset
        Assign a violation mark to every voiced obstruent that is not in onset position.
     b. Voice-σ1
        Assign a violation mark to every voiced obstruent that is not in the word-initial syllable.

To illustrate the differences between HG and OT, we consider a language which allows both of the contexts identified in the constraints above – i.e. onsets and word-initial syllables – to license the marked [+voice] feature. In such a language, /badnabad/ would surface as [bad.na.bat], with devoicing only in the coda in a non-initial syllable. Table VII shows how this language can be analysed in HG with our two markedness constraints and a single non-positional faithfulness constraint.

                               2            2          3
W ~ L                          VoiceOnset   Voice-σ1   Ident[voice]
[bad.na.bat] ~ [bad.na.pat]                 L          W              1
[bad.na.bat] ~ [bat.na.bat]    L                       W              1
[bad.na.bat] ~ [bad.na.bad]    W            W          L              1

Table VII A successful HG analysis of a language in which both onsets and word-initial syllables license [+voice].

As in the Lango example, the winner and loser always differ by a maximum of one violation, so we can indicate a preference for each with ‘W’ and ‘L’, instead of indicating the degree of preference numerically. The first row compares the desired optimum to an alternative that devoices all obstruents in non-initial syllables. The loser does better on VOICE-σ1, at the expense of IDENT[voice]. The second row compares the winner to a loser that devoices all codas, which improves on VOICEONSET, again at the expense of IDENT[voice]. These two comparisons require each of the markedness constraints to have values lower than that of the faithfulness constraint. The last row compares the winner to the fully faithful candidate, which incurs violations of both markedness constraints. This comparison requires the sum of the weights of the markedness constraints to exceed that of the faithfulness constraint. The input /bad.na.bad/ will thus surface as [bad.na.bat], provided that the individual weights of the markedness constraints are insufficient to overcome the weight of IDENT[voice], but the summed weights of the markedness constraints together are. Table VII shows a successful HG analysis. In each row, the sum of the weights of the constraints preferring the winner is greater by 1 than the sum of the weights preferring the loser.
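The HG evaluation just described can be reproduced in a few lines. The sketch below is an illustration rather than OT-Help's implementation: the violation-counting function, the restriction to CV(C) syllables and the three-member voiced obstruent inventory are our simplifying assumptions. It uses the weights 2, 2 and 3 from Table VII, computes harmony as the negative weighted sum of violations, and picks the most harmonic candidate for /bad.na.bad/.

```python
VOICED = set("bdg")  # voiced obstruents in this toy inventory (assumption)
WEIGHTS = {"VoiceOnset": 2, "Voice-σ1": 2, "Ident[voice]": 3}  # as in Table VII

def violations(inp, out):
    """Count violations for syllabified CV(C) forms such as 'bad.na.bat'."""
    v = {"VoiceOnset": 0, "Voice-σ1": 0, "Ident[voice]": 0}
    syllables = zip(inp.split("."), out.split("."))
    for s, (syl_in, syl_out) in enumerate(syllables):
        for pos, (seg_in, seg_out) in enumerate(zip(syl_in, syl_out)):
            if seg_out in VOICED:
                if pos != 0:   # not syllable-initial, so not in onset position
                    v["VoiceOnset"] += 1
                if s != 0:     # not in the word-initial syllable
                    v["Voice-σ1"] += 1
            if (seg_in in VOICED) != (seg_out in VOICED):
                v["Ident[voice]"] += 1  # voicing changed between input and output
    return v

def harmony(inp, out, weights=WEIGHTS):
    """Harmony is the negative weighted sum of violations."""
    viol = violations(inp, out)
    return -sum(weights[c] * viol[c] for c in weights)

CANDIDATES = ["bad.na.bad", "bad.na.bat", "bad.na.pat",
              "bat.na.bat", "bat.na.pat", "pat.na.pat"]

scores = {c: harmony("bad.na.bad", c) for c in CANDIDATES}
best = max(scores, key=scores.get)
print(best, scores)
```

Under these weights the intended winner has harmony -7, and each of the three losers compared with it in Table VII has harmony -8, which corresponds to the margin of 1 in every row.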

There is no OT ranking that will make the winner correctly optimal in Table VII; no constraint assigns only Ws, and so Recursive Constraint Demotion fails. Analysing this type of pattern in OT requires positional faithfulness constraints like those defined in (28).

(28) a. Ident[voice]-Ons
        Assign a violation mark to every output segment in onset position whose input correspondent differs in voicing specification.
     b. Ident[voice]-σ1
        Assign a violation mark to every output segment in the initial syllable whose input correspondent differs in voicing specification.

The OT analysis with positional faithfulness constraints is shown in Table VIII. Here, we include general *VOICE and IDENT[voice] constraints, along with the positional faithfulness constraints defined above. The left-to-right ordering of the constraints is a correct ranking (the relative ordering of the two positional faithfulness constraints is not crucial).

W ~ L                          Ident[voice]-Ons   Ident[voice]-σ1   *Voice   Ident[voice]
[bad.na.bat] ~ [bad.na.pat]    W                                    L        W
[bad.na.bat] ~ [bat.na.bat]                       W                 L        W
[bad.na.bat] ~ [bad.na.bad]                                         W        L

Table VIII A successful OT analysis using positional faithfulness to license [+voice] in both onsets and word-initial syllables.
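The contrast between Tables VII and VIII can be replayed with a compact version of Recursive Constraint Demotion. The following sketch is a minimal illustration, not OT-Help's or OTSoft's code; the data encoding (one dictionary of W/L marks per winner–loser pair, with blank cells omitted) is our own. At each step it installs the constraints that prefer no remaining loser, then discards the pairs those constraints account for.

```python
def rcd(rows, constraints):
    """Recursive Constraint Demotion on comparative rows.

    Each row maps a constraint name to 'W' or 'L'; blank cells are omitted.
    Returns a list of strata (highest-ranked first), or None if no ranking
    makes every winner optimal.
    """
    rows = list(rows)
    remaining = list(constraints)
    strata = []
    while rows:
        # Constraints that assign no L to any still-unexplained pair.
        stratum = [c for c in remaining if all(r.get(c) != "L" for r in rows)]
        if not stratum:
            return None
        strata.append(stratum)
        # Pairs with a W from the new stratum are now accounted for.
        rows = [r for r in rows if not any(r.get(c) == "W" for c in stratum)]
        remaining = [c for c in remaining if c not in stratum]
    if remaining:
        strata.append(remaining)
    return strata

TABLE_VII = [
    {"Voice-σ1": "L", "Ident[voice]": "W"},
    {"VoiceOnset": "L", "Ident[voice]": "W"},
    {"VoiceOnset": "W", "Voice-σ1": "W", "Ident[voice]": "L"},
]
TABLE_VIII = [
    {"Ident[voice]-Ons": "W", "*Voice": "L", "Ident[voice]": "W"},
    {"Ident[voice]-σ1": "W", "*Voice": "L", "Ident[voice]": "W"},
    {"*Voice": "W", "Ident[voice]": "L"},
]

hg_only = rcd(TABLE_VII, ["VoiceOnset", "Voice-σ1", "Ident[voice]"])
with_pos_faith = rcd(TABLE_VIII, ["Ident[voice]-Ons", "Ident[voice]-σ1",
                                  "*Voice", "Ident[voice]"])
print(hg_only)
print(with_pos_faith)
```

For Table VII the procedure stops at once, since every constraint assigns an L to some pair. For Table VIII it returns the two positional faithfulness constraints in the top stratum, then *Voice, then Ident[voice], matching the left-to-right order of the table.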

While positional faithfulness constraints are required in OT to capture this pattern of licensing in onset and initial syllables, there are other domains where positional faithfulness constraints pose problems. A version of OT with positional faithfulness makes incorrect predictions regarding the realisation of ‘floating features’ and other derived structures, for example, wrongly preferring them to target weak positions (Ito & Mester 2003, Zoll 1998). To see this, we consider an input with a voice feature introduced by a second morpheme (/VCE+katnakat/). The desired optimum in this sort of case would realise the feature in a strong position where it is generally licensed – e.g. [gatnakat], with voicing surfacing on the initial onset. This is the outcome predicted by positional markedness, but not by positional faithfulness, as Table IX shows. Positional faithfulness constraints prefer that floating marked features be realised in contexts that are not normally licensers, like the non-initial coda in the loser [kat.na.kad].

W ~ L                          Ident[voice]-Ons   Ident[voice]-σ1   VoiceOnset   Voice-σ1
[gat.na.kat] ~ [kat.na.kad]    L                  L                 W            W

Table IX A situation in which positional markedness constraints are required in OT.

Cases like these, where positional faithfulness and positional markedness each account for a subset of the attested phenomena, have led to a version of OT that includes both types of constraint. A simple continuation of the examples above illustrates the typological consequences. We submitted tableaux for each of the inputs /badnabad/ and /VCE+katnakat/ to OT-Help. For HG, we included only the positional markedness constraints, along with *VOICE and IDENT[voice], while for OT we also included the positional faithfulness constraints. The results are given in Table X. The potentially optimal outputs for /badnabad/ are shown in the first column, and the potentially optimal outputs for /VCE+katnakat/ are shown in the top row. Cells are labelled with the name of the theory that makes the row and column outputs jointly optimal.

               [gat.na.kat]   [kad.na.kat]   [kat.na.gat]   [kat.na.kad]
[bad.na.bad]   HG&OT          OT             OT             OT
[bad.na.bat]   HG             OT             OT             OT
[bad.na.pat]   HG&OT          OT             OT             OT
[bat.na.bat]   HG&OT          OT             OT             OT
[bat.na.pat]   HG&OT          OT             OT
[pat.na.pat]   HG&OT          OT             OT             OT

Table X Typological predictions for HG with only positional markedness constraints and OT with both positional markedness and positional faithfulness constraints.
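The OT side of such a typology can be approximated with the iterative procedure of §6.1, using Recursive Constraint Demotion as a yes/no oracle. The sketch below is an illustration under several assumptions of ours: violation profiles are computed by a toy function over CV(C) syllables, the floating [+voice] feature is modelled as a voicing change to the input /kat.na.kat/, and only the candidates named in Table X are considered. The HG side would need a linear programming solver in place of the oracle and is not shown.

```python
VOICED = set("bdg")
CONSTRAINTS = ["Ident[voice]-Ons", "Ident[voice]-σ1", "VoiceOnset",
               "Voice-σ1", "*Voice", "Ident[voice]"]

def profile(inp, out):
    """Violation vector, ordered as CONSTRAINTS, for syllabified forms."""
    id_ons = id_s1 = v_ons = v_s1 = star = ident = 0
    for s, (syl_in, syl_out) in enumerate(zip(inp.split("."), out.split("."))):
        for pos, (a, b) in enumerate(zip(syl_in, syl_out)):
            if b in VOICED:
                star += 1
                v_ons += pos != 0      # voiced obstruent outside the onset
                v_s1 += s != 0         # voiced obstruent outside syllable 1
            if (a in VOICED) != (b in VOICED):
                ident += 1
                id_ons += pos == 0     # voicing change in an onset
                id_s1 += s == 0        # voicing change in syllable 1
    return (id_ons, id_s1, v_ons, v_s1, star, ident)

def rankable(rows):
    """RCD as an oracle; rows are loser-minus-winner violation vectors."""
    rows, active = list(rows), set(range(len(CONSTRAINTS)))
    while rows:
        top = {c for c in active if all(r[c] >= 0 for r in rows)}
        if not top:
            return False
        rows = [r for r in rows if not any(r[c] > 0 for c in top)]
        active -= top
    return True

def rows_for(tableau, winner):
    """Comparative rows for one winner against every other candidate."""
    w = tableau[winner]
    return [tuple(l - x for l, x in zip(loser, w))
            for cand, loser in tableau.items() if cand != winner]

def typology(tableaux):
    """Add one tableau at a time, keeping only rankable sets of optima."""
    languages = [()]
    for tableau in tableaux:
        extended = []
        for lang in languages:
            for cand in tableau:
                choice = lang + (cand,)
                rows = [r for t, w in zip(tableaux, choice)
                        for r in rows_for(t, w)]
                if rankable(rows):
                    extended.append(choice)
        languages = extended
    return languages

T1 = {o: profile("bad.na.bad", o) for o in
      ["bad.na.bad", "bad.na.bat", "bad.na.pat",
       "bat.na.bat", "bat.na.pat", "pat.na.pat"]}
T2 = {o: profile("kat.na.kat", o) for o in
      ["gat.na.kat", "kad.na.kat", "kat.na.gat", "kat.na.kad"]}

ot_languages = set(typology([T1, T2]))
print(sorted(ot_languages))
```

Under these assumptions the pairing of [bat.na.bat] with [kat.na.kad] is rankable, whereas [bad.na.bat] cannot be paired with [gat.na.kat] under any ranking, in line with the discussion of Table X.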

The HG results with positional markedness seem to match what is generally found typologically. The full typology may not be found for obstruent voicing, but it is found across the larger set of cases that includes positional restrictions and floating feature behaviour for other structures (see Jesney, to appear for documentation). OT with both positional faithfulness and positional markedness predicts that floating features can dock on any of the four positions defined by the two parameters initial vs. non-initial syllable and onset vs. non-onset. Thus, all of the docking sites for /VCE+katnakat/ can be made optimal, indicated in Table X by the label OT in all columns. In addition, there is practically no predicted relation between the positions in which a feature is generally permitted and where floating feature docking will occur. For example, this version of OT can generate a language in which voicing is generally restricted to onsets (/badnabad/, [bat.na.bat]), but in which a floating [+voice] feature docks onto either a final coda (/VCE+katnakat/, [kat.na.kad]) or a medial one (/VCE+katnakat/, [kad.na.kat]).

Further research is required to determine whether a version of HG without positional faithfulness constraints can indeed deal with the full range of phenomena attributed to these constraints in OT. These initial results suggest that the pursuit of such a theory may yield a resolution to a long-standing problem in OT. Furthermore, since there is not a subset relation in the types of languages generated by the two theories of constraints and constraint interaction illustrated in Table X, this example illustrates the general point that a fleshed-out theory of some set of attested phenomena in HG will likely be in some ways both less restrictive and more restrictive than an OT one.

6.3 Gradient Alignment and Lapse constraints

We now turn to an example concerning the typological spaces determined by two different classes of constraint that have been used for stress typology in OT. McCarthy & Prince (1993) propose an account of stress placement in terms of Alignment constraints, which demand coincidence of edges of prosodic categories. Gradient Alignment constraints are ones whose degree of violation depends on the distance between the category edges: roughly, if x should be at, say, the leftmost edge of a certain domain and it surfaces n segments (syllables) from that edge, then x incurs n violations for the candidate containing it. Kager (2005) proposes an alternative account of stress placement in OT that replaces gradient Alignment constraints with a set of Lapse constraints, which penalise adjacent unstressed syllables in various environments, assigning one mark per violation, as with normal markedness constraints.

To examine the typological predictions of the two accounts, Kager constructed OTSoft files (Hayes et al. 2003) with a set of candidate parsings for words from two to nine syllables in length. Separate files contained the appropriate violation marks for each constraint set. For each of these, there were separate files for trochaic (left-headed) feet and for iambic (right-headed) feet (here we discuss only the trochaic results). Using OTSoft, Kager found that the gradient Alignment constraint set generated 35 languages, while the one with Lapse constraints generated 25.

We used OT-Help to replicate Kager’s experiment, using both OT and HG. The results for the two constraint sets discussed above, derived from OTSoft files prepared by Kager, are shown in Table XI. We provide the number of languages that each combination of constraints and mode of interaction predicts, out of a total of 685,292,000 possible combinations of optima.

                     OT    HG
Gradient alignment   35    911
Lapse                25    85

Table XI Number of predicted languages.

For both constraint sets, HG generates all the languages that OT does. HG also generates a significant number of languages that OT does not. A primary source of this dramatic increase is the manner in which gradient Alignment constraints assign violation marks. To illustrate, we show four potential parses of a six-syllable word, and the violations they incur on two constraints. Foot edges are indicated by parentheses, and prosodic word edges by square brackets. ALIGN-L(Ft, Wd) demands that the left edge of every foot be aligned with the left edge of the word and is violated by each syllable intervening between these two edges. PARSE-σ is violated by every syllable that fails to be parsed into a foot.

(29)                                 Align-L(Ft,Wd)   Parse-σ
     a. [(ta.ta)(ta.ta)(ta.ta)]      2+4=6            0
     b. [(ta.ta)(ta.ta)ta.ta]        2                2
     c. [(ta.ta)ta.ta.ta.ta]         0                4
     d. [ta.ta.ta.ta.ta.ta]          0                6

ALIGN-L(Ft, Wd) and PARSE-σ conflict in that every foot added after the leftmost one satisfies PARSE-σ at the cost of violating ALIGN-L(Ft, Wd). This cost increases as feet are added: the second foot from the left adds two violations, the third one adds four and so on. This increasing cost interacts with weighting to produce a rich typology. With an appropriate weighting (e.g. a weight of 2 for ALIGN-L(Ft, Wd) and a weight of 3 for PARSE-σ), a second foot will be added to avoid violating PARSE-σ, but not a third one: (29b) emerges as optimal. This outcome would be impossible in HG, as it is in OT, if each non-leftmost foot added the same number of violations of ALIGN-L(Ft, Wd) (or whatever constraint replaces it).14

14 See McCarthy (2003) for extensive arguments for the replacement of gradient Alignment in OT.

The HG typology with Lapse constraints is much closer to that of OT, but it still yields more than a threefold increase in predicted languages. We believe that it would be a mistake to take this sort of result to argue definitively for OT. First, it was arrived at using a constraint set designed for OT. As we have shown in §§5 and 6.2, weighted interaction allows for different constraints than those used in OT, and these possibilities must be further explored to better understand the theory and how it differs from OT. Second, the result also depends on a particular mode of evaluation: here, the entire representation is evaluated once and only once by the entire set of constraints. As Pater (2009b, to appear) shows, changing assumptions about mode of evaluation yields positive results for HG typology, in addition to those that McCarthy (2006, 2007, 2009) demonstrates for OT (see also Pruitt 2008 on stress in Serial OT).
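The interaction between gradient Alignment and weighting in (29) can be checked numerically. In the sketch below, which is an illustration with names of our own choosing, the violation counts are those of (29); the 'uniform' profile is a hypothetical variant in which each non-leftmost foot costs exactly one violation. With the counts in (29), candidate (b) is the unique optimum whenever the weight of Parse-σ lies strictly between one and two times the weight of Align-L(Ft,Wd); at the boundary, for instance with weights 1 and 2, candidates (a) and (b) tie.

```python
from itertools import product

# Violation counts from (29): (Align-L(Ft,Wd), Parse-σ).
GRADIENT = {"a": (6, 0), "b": (2, 2), "c": (0, 4), "d": (0, 6)}

# Hypothetical variant: each non-leftmost foot adds one Align violation.
UNIFORM = {"a": (2, 0), "b": (1, 2), "c": (0, 4), "d": (0, 6)}

def optima(profiles, w_align, w_parse):
    """Return the candidates with the lowest weighted violation total."""
    cost = {c: w_align * a + w_parse * p for c, (a, p) in profiles.items()}
    best = min(cost.values())
    return sorted(c for c, v in cost.items() if v == best)

print(optima(GRADIENT, 2, 3))   # unique optimum
print(optima(GRADIENT, 1, 2))   # boundary case: a tie

# With uniform counts, no weighting on a small integer grid makes (b)
# the unique winner.
b_wins_uniform = [(wa, wp) for wa, wp in product(range(1, 21), repeat=2)
                  if optima(UNIFORM, wa, wp) == ["b"]]
print(b_wins_uniform)
```

The empty result for the uniform profile reflects the linear trade-off: (b) would have to beat (a), which requires the Align weight to exceed twice the Parse weight, and also beat (c), which requires the opposite.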

6.4 A typological correspondence between OT and HG

The previous simulation highlights the fact that OT and HG can produce quite different typological predictions. However, as we emphasised in the introduction, the two frameworks do not invariably diverge. The present section describes a simulation involving a fairly complex set of constraints for which OT and HG deliver identical typological predictions. The result is especially striking in light of the fact that some of the constraints are gradient Alignment constraints of the sort that produced a large difference in the previous section. The simulation involves the following set of constraints.

(30) a. Trochee
        Assign a violation to every right-headed foot.
     b. Iamb
        Assign a violation to every left-headed foot.
     c. Align(Ft)-L
        For every foot, assign a violation for every syllable separating it from the left edge of the word.
     d. Align(Ft)-R
        For every foot, assign a violation for every syllable separating it from the right edge of the word.
     e. Align(Hd)-L
        Assign a violation for every syllable separating the main stressed syllable from the left edge of the word.
     f. Align(Hd)-R
        Assign a violation for every syllable separating the main stressed syllable from the right edge of the word.

The candidate set for the simulation consisted of all logically possible parses of words of two to five syllables in length into left- and right-headed bisyllabic feet, with main stress on either one of the feet in the four- and five-syllable words. The parses are all exhaustive, up to the limits imposed by the binary minimum; there is no more than one unparsed syllable per word.
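The size of this candidate space can be reconstructed by enumeration. The sketch below rests on an assumption of ours, namely that feet within a single word may differ in headedness; the string notation for parses (1 for main stress, 2 for secondary stress, 0 for unstressed) is also ours. Under that assumption the numbers of parses for two- to five-syllable words multiply out to 1536, the figure the article reports for all logically possible combinations of optima.

```python
from itertools import product
from math import prod

def parses(n):
    """All exhaustive parses of an n-syllable word into bisyllabic feet.

    Each foot is a trochee 'T' or an iamb 'I', one foot bears main stress,
    and an odd-length word has one unparsed syllable before, between or
    after the feet.
    """
    feet, leftover = divmod(n, 2)
    slots = range(feet + 1) if leftover else [None]
    result = []
    for slot in slots:
        for heads in product("TI", repeat=feet):
            for main in range(feet):
                units = []
                for i, head in enumerate(heads):
                    stress = "1" if i == main else "2"
                    units.append(f"({stress}0)" if head == "T" else f"(0{stress})")
                if slot is not None:
                    units.insert(slot, "0")  # the single unparsed syllable
                result.append("".join(units))
    return result

counts = [len(parses(n)) for n in range(2, 6)]
total = prod(counts)
print(counts, total)
```

One optimum is chosen per word length, so the number of logically possible combinations of optima is the product of the four candidate counts.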

Here is a summary of the results of this simulation.

(31) Number of predicted languages with the constraint set in (30)
     a. All logically possible combinations of optima: 1536
     b. OT: 18
     c. HG: 18

Not only are the counts the same, but the languages themselves are the same (OT-Help does these calculations and comparisons automatically). An interesting aspect of this result is that the constraint set contains the gradient Alignment constraints ALIGN(Ft) and ALIGN(Hd), which, as we saw in §6.3, can lead to significant differences in the predictions of OT and HG. Crucially, however, the constraint set contains neither PARSE-σ nor WEIGHT-TO-STRESS. Because it lacks PARSE-σ, the trade-off in violations between it and ALIGN(Ft) illustrated in (29) does not exist in the current set of violation profiles. Because it lacks WEIGHT-TO-STRESS, a trade-off with ALIGN(Hd), discussed by Legendre et al. (2006) and Pater (2009b), is also absent. We do not take this as evidence for the elimination of WEIGHT-TO-STRESS and PARSE-σ from metrical theory. Rather, it serves to further illustrate the crucial point that it is the trade-offs between violations of constraints, rather than the way that any one constraint assigns violations, that lead to differences between HG and OT. Like the NOCODA/MAX example in the introduction, this is because the version of HG we are considering is an optimisation system.

6.5 Summary

The typological investigations above, which mix qualitative analysis of specific cases with large-scale quantitative assessment, point up the complexity of the relationship between OT and HG. There are constraint sets for which the two frameworks are aligned in their typological predictions, and there are constraint sets for which they diverge wildly. The examples show that certain constraint combinations can have apparent ill-effects in one framework even as they produce desirable patterns in the other. These findings are just small pieces in the larger puzzle of how the two approaches relate to one another.
We think the connection with linear programming, and the computational tools that go with it, can facilitate rapid progress in putting the rest of the pieces together.
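The point that trade-offs between violations, rather than individual constraints, separate the two frameworks can be illustrated with a minimal sketch. The two tableaux, the constraint names C1 and C2 and the weights below are hypothetical and chosen only for illustration; OT evaluation is modelled as lexicographic comparison under strict domination, HG evaluation as a weighted sum of violations.

```python
def ot_winner(candidates, ranking):
    """OT optimum: compare violation counts constraint by constraint,
    from the highest-ranked constraint down (strict domination)."""
    return min(candidates, key=lambda name: [candidates[name][c] for c in ranking])


def hg_winner(candidates, weights):
    """HG optimum: the candidate with the smallest weighted sum of violations."""
    return min(
        candidates,
        key=lambda name: sum(weights[c] * v for c, v in candidates[name].items()),
    )


# Hypothetical tableaux. In `tradeoff`, avoiding one violation of C1 costs
# two violations of C2; in `no_tradeoff`, it costs only one.
tradeoff = {"a": {"C1": 1, "C2": 0}, "b": {"C1": 0, "C2": 2}}
no_tradeoff = {"a": {"C1": 1, "C2": 0}, "b": {"C1": 0, "C2": 1}}

ranking = ["C1", "C2"]         # C1 dominates C2
weights = {"C1": 3, "C2": 2}   # C1 outweighs C2, mirroring the ranking

results = {
    "tradeoff": (ot_winner(tradeoff, ranking), hg_winner(tradeoff, weights)),
    "no_tradeoff": (ot_winner(no_tradeoff, ranking), hg_winner(no_tradeoff, weights)),
}
```

With these values the two evaluators agree on `no_tradeoff` and part ways on `tradeoff`, where two violations of the lower-weighted constraint together outweigh one violation of the higher-weighted one.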

7 Conclusion

We have shown that Harmonic Grammar learning problems translate into linear systems that are solvable using linear programming algorithms. This is an important mathematical connection, and it has a practical component as well: our software package OT-Help facilitates comparison between weighting and other constraint-based approaches. This implementation, freely available and requiring no specialised user expertise, gets us over the intrinsic practical obstacles to exploring weighting systems. We can then focus attention on the linguistic usefulness of HG and related approaches, as we have done with our in-depth analysis of Lango ATR harmony (§5) and our typological investigations (§6). The formal results of this paper are best summarised by drawing an explicit connection with the fundamental theorem of linear programming (Cormen et al. 2001: 816).

(32) Theorem 1 (the fundamental theorem of linear programming)
     If L is a linear system, then there are just three possibilities:
     a. L has an optimal solution with a finite objective function.
     b. L is unbounded (in which case we can return a solution, though the notion of optimal is undefined).
     c. L is infeasible (no solution satisfies all its conditions).

Our method applies this theorem to understanding HG. The UNBOUNDED outcome is not directly relevant; we always solve minimisation problems, and our systems are structured so that there is always a well-defined minimum. The INFEASIBLE verdict is essential. It tells us that the current grammar cannot deliver the set of optimal candidates we have specified. This might be a signal that the analysis must change, or it might prove that a predicted typological gap in fact exists for the current constraint set. And if we are presented with an optimal solution, then we know our grammar delivers the specified set of forms as optimal. Moreover, we can then analyse the solution to learn about the relations among our constraints. We obtain these results efficiently; though the worst-case running time for the simplex algorithm is exponential, it is extremely efficient in practice, often besting its theoretically more efficient competitors (Chvátal 1983: 4, Cormen et al. 2001: 820–821). What's more, we have opened the way to applying new algorithms to the problem, with an eye towards achieving an optimal fit between the structure of linguistic systems and the nature of the computational analysis. Our approach works for the full range of harmonic grammars as we define them in §2, including very large and complex ones. We therefore see the translation of HG systems into linear systems solvable using linear programming methods as providing a valuable tool for the serious exploration of constraint weighting in linguistics.
We also see great promise in the approach for developing theories of learning, for determining the nature of the constraint set and for gaining a deeper mathematical and algorithmic understanding of the theory's main building blocks.
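The feasible and infeasible outcomes can be made concrete with a toy sketch. It is not the simplex method and not the OT-Help or HaLP code: a bounded search over small integer weights stands in for a linear programming solver, and the winner-loser pairs, the margin of 1, the weight bound and the choice to try the smallest total weight first are all assumptions made for illustration. Each pair contributes one linear inequality requiring the winner's weighted violations to fall below the loser's by at least the margin.

```python
from itertools import product


def margins(pairs, weights):
    """For each (winner, loser) pair of violation vectors, return the amount by
    which the loser's weighted violations exceed the winner's."""
    return [
        sum(w * (lose - win) for w, win, lose in zip(weights, winner, loser))
        for winner, loser in pairs
    ]


def find_weights(pairs, n_constraints, max_weight=5, margin=1):
    """Search non-negative integer weights up to max_weight, smallest total
    first, for a vector under which every winner beats its loser by at least
    `margin`. Returns None if no vector in the grid qualifies."""
    grid = sorted(product(range(max_weight + 1), repeat=n_constraints), key=sum)
    for weights in grid:
        if all(m >= margin for m in margins(pairs, weights)):
            return weights
    return None


# Hypothetical data over two constraints (C1, C2); each pair is (winner, loser).
# One C1 violation is worse than one C2 violation, but better than three.
feasible = [((0, 1), (1, 0)), ((1, 0), (0, 3))]
# Contradictory demands: C1 must outweigh C2 and C2 must outweigh C1.
infeasible = [((0, 1), (1, 0)), ((1, 0), (0, 1))]

solution = find_weights(feasible, 2)
no_solution = find_weights(infeasible, 2)
```

A bounded grid search of this kind can only report that no weighting exists within its grid. The infeasible verdict of a genuine linear programming solver is stronger, since it covers all weightings.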

REFERENCES

Albright, Adam, Giorgio Magri & Jennifer Michaels (2008). Modeling doubly marked lags with a split additive model. In Harvey Chan, Heather Jacob & Enkeleida Kapia (eds.) Proceedings of the 32nd Annual Boston University Conference on Language Development. Somerville: Cascadilla. 36–47.
Archangeli, Diana & Douglas Pulleyblank (1994). Grounded phonology. Cambridge, Mass.: MIT Press.
Baković, Eric (2000). Harmony, dominance, and control. PhD dissertation, Rutgers University.
Bavin Woock, Edith & Michael Noonan (1979). Vowel harmony in Lango. CLS 15. 20–29.
Bazaraa, Mokhtar S., John J. Jarvis & Hanif D. Sherali (2005). Linear programming and network flows. 3rd edn. Hoboken, NJ: Wiley.
Becker, Michael & Joe Pater (2007). OT-Help user guide. University of Massachusetts Occasional Papers in Linguistics 36. 1–12.
Becker, Michael, Joe Pater & Christopher Potts (2007). OT-Help 1.2. Software available at http://web.linguist.umass.edu/~OTHelp/.
Beckman, Jill N. (1997). Positional faithfulness, positional neutralisation and Shona vowel harmony. Phonology 14. 1–46.
Beckman, Jill N. (1998). Positional faithfulness. PhD dissertation, University of Massachusetts, Amherst.
Boersma, Paul & Joe Pater (2008). Convergence properties of a gradual learning algorithm for Harmonic Grammar. Available as ROA-970 from the Rutgers Optimality Archive.
Boersma, Paul & David Weenink (2009). Praat: doing phonetics by computer (version 5.1.12). http://www.praat.org/.
Casali, Roderic F. (1996). Resolving hiatus. PhD dissertation, University of California, Los Angeles.
Chvátal, Vašek (1983). Linear programming. New York: Freeman.
Coetzee, Andries & Joe Pater (in press). The place of variation in phonological theory. In John A. Goldsmith, Jason Riggle & Alan Yu (eds.) The handbook of phonological theory. 2nd edn. Oxford: Blackwell.
Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest & Clifford Stein (2001). Introduction to algorithms. 2nd edn. Cambridge, Mass.: MIT Press.
Dantzig, George B. (1982). Reminiscences about the origins of linear programming. Operations Research Letters 1. 43–48.
Davis, Stuart (1995). Emphasis spread in Arabic and Grounded Phonology. LI 26. 465–498.
Dresher, B. Elan & Jonathan D. Kaye (1990). A computational learning model for metrical phonology. Cognition 34. 137–195.
Goldsmith, John A. (1990). Autosegmental and metrical phonology. Oxford & Cambridge, Mass.: Blackwell.
Goldsmith, John A. (1991). Phonology as an intelligent system. In Donna Jo Napoli & Judy Anne Kegl (eds.) Bridges between psychology and linguistics: a Swarthmore Festschrift for Lila Gleitman. Hillsdale: Erlbaum. 247–268.
Goldsmith, John A. (1993). Introduction. In John A. Goldsmith (ed.) The last phonological rule: reflections on constraints and derivations. Chicago: University of Chicago Press. 1–20.
Goldsmith, John A. (1999). Introduction. In John A. Goldsmith (ed.) Phonological theory: the essential readings. Malden, Mass. & Oxford: Blackwell. 1–16.
Goldwater, Sharon & Mark Johnson (2003). Learning OT constraint rankings using a Maximum Entropy model. In Jennifer Spenader, Anders Eriksson & Östen Dahl (eds.) Proceedings of the Stockholm Workshop on Variation within Optimality Theory. Stockholm: Stockholm University. 111–120.
Hayes, Bruce, Bruce Tesar & Kie Zuraw (2003). OTSoft 2.1. http://www.linguistics.ucla.edu/people/hayes/otsoft/.
Hayes, Bruce, Kie Zuraw, Péter Siptár & Zsuzsa Cziráky Londe (2008). Natural and unnatural constraints in Hungarian vowel harmony. Ms, University of California, Los Angeles.

Hyman, Larry M. (2002). Is there a right-to-left bias in vowel harmony? Ms, University of California, Berkeley.
Ito, Junko & Armin Mester (2003). Japanese morphophonemics: markedness and word structure. Cambridge, Mass.: MIT Press.
Itô, Junko, Armin Mester & Jaye Padgett (1995). Licensing and underspecification in Optimality Theory. LI 26. 571–613.
Jäger, Gerhard (2007). Maximum entropy models and Stochastic Optimality Theory. In Annie Zaenen, Jane Simpson, Tracy Holloway King, Jane Grimshaw, Joan Maling & Chris Manning (eds.) Architectures, rules, and preferences: variations on themes by Joan W. Bresnan. Stanford: CSLI. 467–479.
Jesney, Karen (to appear). Licensing in multiple contexts: an argument for Harmonic Grammar. CLS 45.
Johnson, Mark (2002). Optimality-theoretic Lexical Functional Grammar. In Paola Merlo & Suzanne Stevenson (eds.) The lexical basis of sentence processing: formal, computational and experimental issues. Amsterdam & Philadelphia: Benjamins. 59–73.
Jurgec, Peter (2009). Autosegmental spreading is a binary relation. Ms, University of Tromsø.
Kager, René (2005). Rhythmic licensing: an extended typology. In Proceedings of the 3rd International Conference on Phonology. Seoul: The Phonology–Morphology Circle of Korea. 5–31.
Kaplan, Aaron F. (2008). Noniterativity is an emergent property of grammar. PhD dissertation, University of California, Santa Cruz.
Kelkar, Ashok R. (1968). Studies in Hindi-Urdu. Vol. 1: Introduction and word phonology. Poona: Deccan College.
Keller, Frank (2000). Gradience in grammar: experimental and computational aspects of degrees of grammaticality. PhD dissertation, University of Edinburgh.
Keller, Frank (2006). Linear optimality theory as a model of gradience in grammar. In Gisbert Fanselow, Caroline Féry, Ralf Vogel & Matthias Schlesewsky (eds.) Gradience in grammar: generative perspectives. Oxford: Oxford University Press. 270–287.
Legendre, Géraldine, Yoshiro Miyata & Paul Smolensky (1990a). Harmonic Grammar: a formal multi-level connectionist theory of linguistic well-formedness: theoretical foundations. In Proceedings of the 12th Annual Conference of the Cognitive Science Society. Hillsdale: Erlbaum. 388–395.
Legendre, Géraldine, Yoshiro Miyata & Paul Smolensky (1990b). Harmonic Grammar: a formal multi-level connectionist theory of linguistic well-formedness: an application. In Proceedings of the 12th Annual Conference of the Cognitive Science Society. Hillsdale: Erlbaum. 884–891.
Legendre, Géraldine, Antonella Sorace & Paul Smolensky (2006). The Optimality Theory–Harmonic Grammar connection. In Smolensky & Legendre (2006: vol. 2). 339–402.
Lombardi, Linda (1999). Positional faithfulness and voicing assimilation in Optimality Theory. NLLT 17. 267–302.
López, Marco & Georg Still (2007). Semi-infinite programming. European Journal of Operational Research 180. 491–518.
McCarthy, John J. (1997). Process-specific constraints in optimality theory. LI 28. 231–251.
McCarthy, John J. (2003). OT constraints are categorical. Phonology 20. 75–138.
McCarthy, John J. (2006). Restraint of analysis. In Eric Baković, Junko Ito & John J. McCarthy (eds.) Wondering at the natural fecundity of things: essays in honor of Alan Prince. Santa Cruz: Linguistics Research Center. 195–219.
McCarthy, John J. (2007). Hidden generalizations: phonological opacity in Optimality Theory. London: Equinox.

McCarthy, John J. (2009). Harmony in Harmonic Serialism. Ms, University of Massachusetts, Amherst. Available as ROA-1009 from the Rutgers Optimality Archive.
McCarthy, John J. & Alan Prince (1993). Generalized alignment. Yearbook of Morphology 1993. 79–153.
Noonan, Michael (1992). A grammar of Lango. Berlin & New York: Mouton de Gruyter.
Okello, Jenny (1975). Some phonological and morphological processes in Lango. PhD dissertation, Indiana University.
Pater, Joe (2008). Gradual learning and convergence. LI 39. 334–345.
Pater, Joe (2009a). Review of Smolensky & Legendre (2006). Phonology 26. 217–226.
Pater, Joe (2009b). Weighted constraints in generative linguistics. Cognitive Science 33. 999–1035.
Pater, Joe (to appear). Serial Harmonic Grammar and Berber syllabification. In Toni Borowsky, Shigeto Kawahara, Takahito Shinya & Mariko Sugahara (eds.) Prosody matters: essays in honor of Elisabeth O. Selkirk. London: Equinox.
Poser, William J. (1982). Phonological representation and action-at-a-distance. In Harry van der Hulst & Norval Smith (eds.) The structure of phonological representations. Part 2. Dordrecht: Foris. 121–158.
Potts, Christopher, Michael Becker, Rajesh Bhatt & Joe Pater (2007). HaLP: Harmonic Grammar with linear programming. Version 2. Software available at http://web.linguist.umass.edu/~halp/.
Prince, Alan (2002). Entailed ranking arguments. Ms, Rutgers University. Available as ROA-500 from the Rutgers Optimality Archive.
Prince, Alan (2003). Anything goes. In Takeru Honma, Masao Okazaki, Toshiyuki Tabata & Shin-ichi Tanaka (eds.) A new century of phonology and phonological theory: a Festschrift for Professor Shosuke Haraguchi on the occasion of his sixtieth birthday. Tokyo: Kaitakusha. 66–90.
Prince, Alan & Paul Smolensky (1997). Optimality: from neural networks to Universal Grammar. Science 275. 1604–1610.
Prince, Alan & Paul Smolensky (2004). Optimality Theory: constraint interaction in generative grammar. Malden, Mass. & Oxford: Blackwell.
Pruitt, Kathryn (2008). Iterative foot optimization and locality in rhythmic word stress. Ms, University of Massachusetts, Amherst.
Riggle, Jason (2004a). Generation, recognition, and learning in finite-state Optimality Theory. PhD dissertation, University of California, Los Angeles.
Riggle, Jason (2004b). Generation, recognition and ranking with compiled OT grammars. Paper presented at the 78th Annual Meeting of the Linguistic Society of America, Boston.
Smolensky, Paul (2006). Optimality in phonology II: harmonic completeness, local constraint conjunction, and feature domain markedness. In Smolensky & Legendre (2006: vol. 2). 27–160.
Smolensky, Paul & Géraldine Legendre (eds.) (2006). The harmonic mind: from neural computation to optimality-theoretic grammar. 2 vols. Cambridge, Mass.: MIT Press.
Tesar, Bruce & Paul Smolensky (1998a). Learnability in Optimality Theory. LI 29. 229–268.
Tesar, Bruce & Paul Smolensky (1998b). Learning Optimality-Theoretic grammars. Lingua 106. 161–196.
Walker, Rachel (2001). Positional markedness in vowel harmony. In Caroline Féry, Antony Dubach Green & Ruben van de Vijver (eds.) Proceedings of HILP 5. Potsdam: University of Potsdam. 212–232.
Walker, Rachel (2005). Weak triggers in vowel harmony. NLLT 23. 917–989.
Wilson, Colin (2003). Analyzing unbounded spreading with constraints: marks, targets, and derivations. Ms, University of California, Los Angeles.

Wilson, Colin (2006). Learning phonology with substantive bias: an experimental and computational study of velar palatalization. Cognitive Science 30. 945–982.
Zoll, Cheryl (1996). Parsing below the segment in a constraint-based framework. PhD dissertation, University of California, Berkeley.
Zoll, Cheryl (1998). Positional asymmetries and licensing. Ms, MIT. Available as ROA-282 from the Rutgers Optimality Archive.

Phonology 27 (2010) 119–152. © Cambridge University Press 2010 doi:10.1017/S0952675710000059

A test case for the phonetics–phonology interface: gemination restrictions in Hungarian*

Anne Pycha
University of Pennsylvania

Despite differences in parsimony and philosophical orientation, physical and abstract theories of phonology often make similar empirical predictions. This study examines a case where they do not: gemination restrictions in Hungarian. While both types of theory correctly prohibit the lengthening of a consonant when flanked by another consonant, they make different predictions regarding both the relative duration changes within a target consonant and the applicability of restrictions to lengthening processes besides gemination. In two speech-production experiments, these predictions are evaluated by measuring stop and frication durations within affricates. Results show that relative duration changes occur, and that the restriction holds only for gemination, supporting an abstract theory. Yet results also indicate that gemination exhibits sensitivity to inherent durational differences between affricates, providing some support for a physical theory. Thus I argue that an adequate theory of phonology must include abstract constituents, alongside a limited, principled set of physical landmarks.

* I am grateful to Peter Dienes, who wrote the stimulus sentences for Experiment 1. Four anonymous reviewers and the editors of Phonology provided constructive criticism, which greatly improved the paper. Audiences provided useful feedback at the University of Pennsylvania, the University of California, Santa Cruz, the University of California, Berkeley and the Linguistic Society of America Annual Meeting in 2009. Ashlyn Moehle provided expert research assistance. At the University of California, Berkeley, the Department of Linguistics, Phi Beta Kappa and the Abigail Hodgen Publication Award provided crucial financial support. The contributors to Praat and R provided essential software tools. The Hungarian participants gave generously of their time. Thank you. Some of the results discussed in this paper were previously reported, in a different format, in Pycha (2007, 2009).

1 Introduction

There are many processes on either side of the phonetics–phonology interface which resemble one another. In both coarticulation and assimilation, for example, the qualities of one speech sound alter those of another sound. Of course, assimilation differs from coarticulation in that it has the potential to neutralise contrast, but the resemblance is otherwise striking.


Vowel reduction, closed syllable vowel shortening and postnasal voicing are just a handful of the many additional processes that also have counterparts on either side of the interface, differing only in their neutralisation potential (Flemming 2001; see also Ohala 1990, Blevins & Garrett 1998, Steriade 1999, 2001, Blevins 2004, Barnes 2006 and many others). These resemblances have led many researchers to argue that the most parsimonious theory of phonology is a unified theory, whereby phonological processes derive directly from phonetic ones. Once we truly understand the physical events of speech – that is, articulatory gestures and/or acoustic outcomes – which give rise to phonetic processes, the argument goes, we will also understand their phonological counterparts (Browman & Goldstein 1990, Flemming 2001, Steriade 2001, Gafos 2002). The unified theory presents a compelling case in part because many phonological processes are local: that is, they affect constituents which are adjacent to one another in time. For example, most cases of consonant assimilation involve one speech sound altering the quality of an adjacent sound, not a non-adjacent sound (e.g. Cho 1990). Any theory must capture this locality generalisation, and a theory based on the physical events of speech captures it for free, because such events occur sequentially in continuous time. Crucially, a given event cannot skip time: it can affect another event that occurs immediately before or after it, but no others. So, for example, if we analyse assimilation as a process by which one articulatory gesture affects another, we capture the locality generalisation without further stipulation, because a gesture can only affect immediately preceding or following gestures, not non-adjacent gestures. Despite their appeal, physical events are certainly not the only way to capture locality generalisations in phonology. Abstract constituents can do so also.
The theory of autosegmental phonology (Goldsmith 1976, Clements & Keyser 1983), for example, employs the abstract constituent of the segment. A segment divides the speech stream into discrete representations, such as C or V, which abstract away from inherent differences in their physical implementation. In the theory, features such as [place] associate to segments via association lines. So we can analyse assimilation as a process by which the features associated to one segment spread to another segment, subject to the constraint that association lines cannot cross. This constraint captures the locality generalisation rather elegantly, but unlike the physical theory, it does not do so for free. This is because no built-in characteristic prevents association lines from crossing; only a stipulation does. For a process like assimilation, then, one could argue that physical and abstract theories differ in terms of parsimony. A physical theory captures locality by virtue of its built-in characteristics, while an abstract theory captures it with a stipulation. The problem, however, is that the two kinds of theories do not necessarily differ in terms of predictions. As we have seen, both predict that assimilation should be overwhelmingly local.

As another example, both theories can predict that assimilation should target certain speech sounds over others. In physical theories, inherent physical differences among e.g. labial, alveolar and velar gestures make such predictions; in abstract theories, constraints on markedness between labial, alveolar and velar segments can make similar predictions (e.g. de Lacy 2006). Because the predictions of physical vs. abstract theories do not always differ, their relative merits are sometimes assessed on philosophical, rather than empirical grounds. In this paper, I use speech-production data to investigate physical vs. abstract theories for a particular case in which they make clearly different predictions: gemination restrictions. Geminates are long speech sounds that contrast with short ones, and many languages with geminates impose restrictions on where they can occur (on gemination, see Kenstowicz 1982, Hyman 1985, Hayes 1986a, b, McCarthy 1986, Schein & Steriade 1986, Inkelas & Cho 1993, Rose 2000, Ham 2001, Muller 2001). In Hungarian, the focus of the current study, the restrictions on gemination are of particular interest, because, like assimilation, they can be aptly formulated in either physical or abstract terms. As reported in the literature, the restriction is that a singleton consonant cannot change to a geminate when flanked by another consonant on either the left or the right (Vago 1980: 41–43, Dressler & Siptár 1989: 33–35, Nádasdy 1989, Kenesei et al. 1998: 448, Siptár & Törkenczy 2000: 286–293). For example, suffixes that normally trigger gemination of a root-final consonant, such as the instrumental case suffix, fail to do so just when another consonant is present on the left (Nádasdy 1989: 105).

(1) a. vassal    /vOS-CAl/  → [vOS:Ol]   ‘iron (instr)’
       csattal   /COt-CAl/  → [COt:Ol]   ‘buckle (instr)’
    b. verssel   /vErS-CAl/ → [vErSEl]   ‘poem (instr)’    *[vErS:El]
       akttal    /Okt-CAl/  → [OktOl]    ‘nude (instr)’    *[Okt:Ol]

(In this and subsequent examples, the presence of /C/ in the underlying representation indicates a timing slot that triggers gemination, while /A/ and /O/ indicate an underspecified vowel whose features are filled by harmony. /A/ is realised as [O] or [E]; /O/ as [O], [E] or [ø].)

Similar restrictions hold when another consonant is present on the right. Underlying geminates can occur word-finally before pause, but shorten obligatorily before another consonant: hall [hOl:] ‘he hears’, but hallva [hOlvO] ‘hearing’ (Nádasdy 1989: 104). The restriction in Hungarian, as we will see in subsequent sections, is a highly local one which makes no reference to abstract constituents such as syllables or words. It is an open question, however, whether the restriction makes reference to the abstract constituent of the segment. As formulated in published descriptions of Hungarian phonology (Vago 1980: 41–43, Dressler & Siptár 1989: 33–35, Nádasdy 1989, Kenesei et al. 1998: 448, Siptár & Törkenczy 2000: 286–293), the restriction does refer to segments, along the lines in (2).

(2) Abstract formulation
    A consonant (C) may become or remain geminate (CC) only when it is flanked by vowels (V) on both sides, or by a vowel (V) on the left and pause on the right.
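The abstract formulation in (2) amounts to a simple predicate over the segments flanking the target consonant. The sketch below is only an illustration: the coding of contexts as "V", "C" and "#" (pause) and the function name `may_be_geminate` are hypothetical, and the four contexts are read off the forms vassal, verssel, hall and hallva discussed above.

```python
def may_be_geminate(left, right):
    """Abstract formulation (2): a consonant may become or remain geminate only
    when flanked by vowels on both sides, or by a vowel on the left and pause
    on the right. Contexts are coded "V" (vowel), "C" (consonant), "#" (pause)."""
    return left == "V" and right in ("V", "#")


# (left, right) context of the target consonant in each form.
contexts = {
    "vassal": ("V", "V"),    # root-final consonant between two vowels
    "verssel": ("C", "V"),   # preceded by another consonant
    "hall": ("V", "#"),      # word-final before pause
    "hallva": ("V", "C"),    # followed by another consonant
}

verdicts = {word: may_be_geminate(*ctx) for word, ctx in contexts.items()}
```

The verdicts match the reported pattern: a geminate surfaces in vassal and hall, and is blocked in verssel and hallva.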

But it is also possible to formulate the restriction in physical terms, as in (3).

(3) Physical formulation
    Areas of narrow constriction may lengthen or remain long only when flanked by areas of wide constriction.

The physical formulation makes reference to areas of the speech stream according to how they are articulated, either with a narrow opening in the vocal tract (‘narrow constriction’, associated with consonants) or with a wide one (‘wide constriction’, associated with vowels) (for related ideas see Smith 1995, Kirchner 2000, Gafos 2002). There are plausible reasons to think that flanking constrictions could affect the implementation of long narrow constrictions, in which case the physical formulation offers a reasonably parsimonious account.¹ The physical formulation also unifies the concepts of vowel and pause, either of which can flank a geminate on the right (cf. hall [hOl:]). In the abstract formulation, each of these environments must be listed separately, but in the physical formulation, both can arguably be subsumed under the rubric of a ‘wide’ constriction.

¹ We can speculate as to the motivation for the physical restriction. In order to achieve a lengthened narrow constriction, the speaker must control his or her gestures so as to maximise the amount of time that the articulators hold the constriction, while minimising the amount of time it takes for the articulators to achieve the constriction and release it. The best configuration involves wide constrictions on both sides. A wide constriction (i.e. a vowel) on the left allows the speaker to anticipate the narrow constriction (consonant) and move the articulators toward the appropriate location even before the wide constriction (vowel) has finished. By contrast, a narrow constriction on the left would require the speaker to release this constriction before moving on to the next. Similarly, a wide constriction on the right allows the speaker to release the constriction without having to coordinate it with a subsequent narrow constriction, which could conceivably prolong it.

Parsimony aside, the formulations make different predictions. While both predict no change in overall duration of a target consonant (or area of narrow constriction), they differ crucially in the predictions they make for relative changes within the target consonant. Specifically, the abstract formulation predicts that relative changes within the target consonant can occur, while the physical formulation predicts that they cannot. We can see this most clearly by considering consonants that have complex internal structures, such as affricates. Affricates consist of two portions, a stop closure followed by frication (for phonetic analyses of affricates, see Repp et al. 1978, Dorman et al. 1980, Howell & Rosen 1983, Tarnóczy 1987, Miller-Ockhuizen & Zec 2002; for phonological analyses, see Hualde 1988, Lombardi 1990, Rubach 1994, Clements 1999). For example, the Hungarian word kincs [kinC] ‘treasure’ contains a word-final affricate preceded by a nasal. When a geminating suffix such as the instrumental is added to the word, the affricate becomes a target for gemination, but [n] restricts this process: /kinC-CAl/ → [kinCel] ‘treasure (instr)’, *[kint:Sel]. Under the abstract formulation of the restriction, there is a single C target under consideration, namely [C]. This C cannot geminate, because it is preceded by another C – in other words, the restriction holds on the timing tier but not the feature tier.

(4) Abstract formulation
    timing tier: restriction applies        C     C
    feature tier: no restriction applies    n     t S

Nothing, however, prevents a reorganisation of the relative duration of [t] and [S] within the C. Indeed, the representation C freely permits such a reorganisation precisely because it abstracts away from it. In other words, under the abstract formulation, reorganisation of the affricate can occur even when gemination cannot. For the physical restriction, on the other hand, the concept of a segment is not operative. For example, in a word such as kinccsel [kinCel], the affricate is not a segment, but a sequence of two different target articulations, an oral stop closure followed by frication. Each of these articulations has narrow constriction. In addition, each articulation is crucially flanked on the left by another articulation of narrow constriction – the stop closure is flanked by the nasal, while the frication is in turn flanked by the stop.

(5) Physical formulation (constrictions in temporal order)
    nasal stop constriction
    oral stop constriction: restriction applies
    frication constriction: restriction applies

Under the physical formulation, then, the stop closure and the frication are each independently restricted from lengthening in the temporal domain, because each is a narrow constriction preceded by a narrow constriction. As a consequence, no reorganisation of the relative durations of stop and frication is permitted, because any such reorganisation would violate the physical restriction at least once, if not twice. In sum, then, for a target affricate with stop closure and frication components, the abstract formulation permits changes in the ratio of stop closure to overall duration (T/TS) while the physical formulation predicts no change.


(6)            Prediction for T/TS ratio
    abstract   change permitted
    physical   no change permitted
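The T/TS measure in (6) is the share of an affricate's total duration taken up by its stop closure. As a minimal sketch, with a hypothetical function name and invented durations that are not taken from the experiments, the ratio can be computed as follows. The two tokens have the same total duration but divide it differently between closure and frication, which is the kind of outcome the abstract formulation permits and the physical formulation rules out.

```python
def closure_ratio(stop_ms, frication_ms):
    """Share of an affricate's total duration taken up by the stop closure (T/TS)."""
    return stop_ms / (stop_ms + frication_ms)


# Invented durations in milliseconds, for illustration only.
singleton = closure_ratio(40.0, 60.0)    # affricate in a singleton environment
restricted = closure_ratio(55.0, 45.0)   # affricate in a restricted gemination environment

# Both tokens last 100 ms in total; only the internal split differs.
ratio_changed = abs(restricted - singleton) > 1e-9
```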

The physical formulation also makes a further prediction that distinguishes it from the abstract one, which is that the restriction should apply to any type of lengthening, not just gemination. As is well established, diverse processes can increase the duration of some portion of the speech stream, including gemination, but also, as documented for English and various other languages, phrase-ﬁnal or phrase-initial position (Klatt 1976, Fougeron & Keating 1997, Byrd et al. 2000, Cho & Keating 2001, Byrd & Saltzman 2003, Cho 2005, 2006, Turk & Shattuck-Hufnagel 2007), stress (Summers 1987, Turk & Shattuck-Hufnagel 2000, 2007), focus (De Jong & Zawaydeh 2002), rate (Miller 1981), clear speech (Smiljanic & Bradlow 2007) and voicelessness (Summers 1987). Among these, gemination is typically considered special because it has the potential to neutralise contrast, whereas the other processes do not. An abstract theory of gemination models this special status using the C representation. Thus, a gemination rule takes the basic form CECC; other lengthening processes do not make reference to C representations and fall outside the domain of the theory. A physical theory of gemination, however, does not employ the notion of C at all. Without C, gemination ceases to be a special process distinct from other processes that increase duration. Furthermore, any restriction on gemination is physically based, and should therefore apply to other types of lengthening as well. That is, any narrow constriction should fail to increase its duration when it is preceded or followed by another narrow constriction, regardless of the lengthening process involved. This paper presents the results of two Hungarian speech-production studies that test the diﬀering predictions of abstract and physical formulations of the gemination restriction. 
As we have seen, these formulations differ chiefly in the predictions they make for relative changes within the target consonant, which are demonstrated most clearly by segments with complex internal structures, such as affricates. Therefore, the production studies reported here place affricates in target positions, and compare the ratio of stop closure to total duration (T/TS) in restricted gemination environments to that found in comparable singleton environments. In addition, the abstract and physical formulations differ in the predictions they make for gemination relative to other lengthening processes. Therefore, the production studies also compare gemination with another process that increases duration, phrase-final lengthening (for related work on Hungarian segmental duration, see Kassai 1979, 1982, Olaszy 1994, 2000, 2002, Hockey & Fagyal 1999, Gósy 2001 and the papers in Gósy 1991). The results of these studies demonstrate that gemination restrictions in Hungarian require the abstract constituent of the segment, and therefore cannot be adequately modelled with a purely physical formulation. They also demonstrate that the restriction does not apply to phrase-final lengthening, suggesting that gemination is a lengthening process distinct from others. At the same time, however, the results indicate that gemination in Hungarian exhibits some unexpected sensitivity to the inherent durations of segments, of the kind that abstract theories presumably abstract away from. This suggests that it is a compromise position which best captures the data. That is, the abstract representation of the segment, while still necessary in order to adequately describe phonological processes, can benefit from the addition of at least some internal temporal landmarks.

2 Restrictions on gemination in Hungarian

This section motivates the focus on gemination restrictions by describing them in more detail. In Hungarian, as in many other languages, geminates may be ‘true’ or ‘fake’. The restriction that concerns us applies without exception to true geminates, and it is straightforward to demonstrate that the restriction does not refer to relatively high-level constituents, such as syllables and words, but only (if at all) to relatively low-level constituents, such as segments. Interestingly, however, the restriction applies in a more graded fashion to fake geminates, which may surface when flanked by another consonant of relatively high sonority. Although the current study focuses on only one type of true geminate, consideration of the full range of geminates and their concomitant restrictions helps to place both the abstract and physical formulations in a broader context.

2.1 Sources of geminates: true and fake

In Hungarian, all singleton consonants have geminate counterparts (Kenesei et al. 1998: 425), and these may occur word-medially and word-finally, but not word-initially. In both attested positions, geminates are phonemically contrastive with singletons, as shown by the examples in (7) (Nádasdy 1989: 104).

(7) hall    [hOl:]   ‘he hears’     hal    [hOl]   ‘fish’
    kassza  [kOs:O]  ‘cash desk’    kasza  [kOsO]  ‘scythe’

In addition to phonemic geminates, Hungarian also has derived geminates. Derived geminates come from two sources, and correspondingly exhibit two different sets of behaviours in restricted environments. The first source of derived geminates is active phonological alternations, of which there are many. Some examples are given in (8) (Nádasdy 1989: 105, Kenesei et al. 1998: 440, Siptár & Törkenczy 2000: 193).


(8) a. Geminating sux: triggers gemination of a root-ﬁnal consonant /vOS-CAl/ £ [vOS:Ol] ‘iron (instr)’ /COt-CAl/ £ [COt:Ol] ‘buckle (instr)’ b. Sibilant–glide sequence: triggers total progressive assimilation /moS-j/ £ [moS:] ‘wash (imp indef 2sg)’ /moS-jO/ £ [moS:O] ‘wash (def 3sg)’ c. Coronal–sibilant sequence: yields a geminate a‰ricate /la:t-sik/ £ [la:t:sik] ‘seem’ /bOra:t-Sa:g/ £ [bOra:t:Sa:g] ‘friendship’ d. Coronal–glide sequence: yields a geminate palatal consonant /la:t-jO/ £ [la:c:O] ‘see (3sg indic def)’ Other active alternations can also create surface geminates if the conditions are right. Regressive voicing assimilation, for example, applies generally in CC clusters. If the two consonants already share other features, a geminate results. The same goes for optional regressive place assimilation between sibilants (Kenesei et al. 1998: 441, 444–446).

(9) a. Voice assimilation
       /kOlOp-bOn/    → [kOlOb:On]    ‘hat (iness)’
       /Ebe:d-tø:l/   → [Ebe:t:ø:l]   ‘lunch (abl)’
    b. Sibilant place assimilation (optional)
       /ma:S-sor/     → [ma:s:or]     ‘other (mul)’

In the literature, phonemic geminates and geminates derived from active phonological processes are generally considered to be ‘true’ geminates (see especially Kenstowicz 1982 and Hayes 1986b); essentially, this means that their behaviour is distinct from that of consonant clusters. The second source of derived geminates is the juxtaposition of identical singletons. These are referred to as ‘fake’ geminates, meaning that their behaviour is similar to that of consonant clusters. These are also attested in Hungarian, as shown in (10) (Kenesei et al. 1998: 196, Rounds 2001: 60, 103, 107). In this and subsequent examples, the hyphens indicate morpheme boundaries.

(10) Erzsébet-től   [ErZe:bEt:ø:l]   ‘Erzsébet (abl)’
     tisztít-tat    [tisti:t:Ot]     ‘clean (caus)’
     magyar-ra      [mOÖOr:O]        ‘Hungarian (subl)’
     van-nak        [vOn:Ok]         ‘be (3pl)’

2.2 Restrictions on true geminates

In Hungarian, true geminates are subject to strict restrictions: they may not occur when flanked on either the left or right side by another consonant. For phonemic geminates, this restriction triggers degemination: [hOl:], but [hOlvO] (Nádasdy 1989: 104). For derived geminates, it is an open question whether this restriction triggers degemination or prevents gemination from occurring in the first place, but the surface requirement for a singleton is the same in either scenario. The restriction on gemination is demonstrated in the following examples, where a singleton consonant that would undergo gemination in an unrestricted environment fails to do so because of the presence of a restricting consonant on the left (Vago 1980: 42, Nádasdy 1989: 105, Siptár & Törkenczy 2000: 293). Note that in many cases, the orthography continues to represent gemination by the doubling of consonant symbols, even in restricted environments.

(11) a. Geminating suffixes
        akttal    /Okt-CAl/     → [OktOl]    *[Okt:Ol]    ‘with a nude’
        verssel   /vErS-CAl/    → [vErSEl]   *[vErS:El]   ‘with a poem’
        ponttá    /pont-CA:/    → [ponta:]   *[pont:a:]   ‘into a point’
     b. Sibilant–glide assimilation
        rajzzon   /rOjz-jon/    → [rOjzon]   *[rOjz:on]   ‘it should swarm’
     c. Coronal–sibilant sequences
        öntsz     /ønt-s/       → [ønʦ]      *[ønt:s]     ‘you (sg) pour’²
     d. Coronal–glide sequences
        küldjük   /kyld-j-yk/   → [kylÖyk]   *[kylÖ:yk]   ‘we send it’
        kardja    /kOrd-jO/     → [kOrÖO]    *[kOrÖ:O]    ‘his sword’
     e. Voice assimilation
        hordtam   /hord-tOm/    → [hortOm]   *[hort:Om]   ‘carry (1sg past def)’

According to the literature, true gemination is also prevented by the presence of a restricting consonant on the right side, but concrete examples are scarce. We have already seen one example with underlying geminates, [hOl:], but [hOlvO]. Vago (1980: 42) gives another example with regressive sibilant place assimilation, /hu:s Sko:t/ → húsz skót [hu:Sko:t] ‘twenty Scotsmen’, *[hu:S:ko:t]. However, a reviewer observes that [hu:S:ko:t] is actually a possible surface form, because the process of sibilant place assimilation is optional, and that even when it does occur, degemination of the resulting form is also optional. This issue clearly needs more investigation, but does not affect the design or conclusions of the current study.

2 A reviewer notes that [ønʦ] is actually a rarely used variant of öntesz [øntEs] ‘you (SG) pour’, and suggests that the lack of similar published examples indicates that coronal-sibilant gemination almost never occurs in a restricted environment.

2.3 Locality of restrictions on true geminates

The restrictions on true geminates in Hungarian do not originate from independent restrictions on syllable or word structure, but are dependent upon the linear order of elements in a string. As an example, consider the root /Okt/ ‘nude (N)’, where the target is /t/ and the flanking consonant on its left is /k/. The sources on Hungarian phonology agree that word-internal geminates syllabify as sequences of coda+onset (Kenesei et al. 1998: 414). So if gemination of /t/ were to occur, triggered for example by the addition of the instrumental suffix, the resulting syllabification would be *[Okt.tOl] ‘with a nude’, with a complex coda at the end of the initial syllable. Yet there is no general prohibition in Hungarian against such forms. Sequences of CCC which do not contain a geminate are freely permitted across morpheme boundaries: kard-ból ‘from the sword’, vers-ről ‘about the poem’, elv-telen ‘without principles’ (Siptár & Törkenczy 2000: 101), paraszt-nak ‘peasant-DAT’ (Kenesei et al. 1998: 408), Budapest-re ‘Budapest-SUBLAT’ (Rounds 2001: 94). Furthermore, such sequences are syllabified as CC.C, i.e. with a complex coda at the end of the initial syllable, regardless of the relative sonority of the consonants. The following quotation makes this explicit:

    heteromorphemic VCCCV sequences can only yield a single consonant in onset position, even when a given cluster is permissible syllable-initially. Accordingly, Budapest-re ‘Budapest-SUB’ can only be syllabified as /bu.dO.pEZt.rE/, even though both /tr/ and /Ztr/ are licit syllable-initial clusters (Kenesei et al. 1998: 415).

Thus, the prohibition of forms like *[Okt.tOl] is specific to geminates, and cannot be explained by general constraints on CCC sequences or complex codas. In fact, the failure of gemination to occur in restricted environments has the effect of destroying the perfect correspondence between syllable boundaries and morpheme boundaries that would obtain in a hypothetical form such as *[Okt.tOl]; instead, the attested form [Ok.tOl] has a boundary that splits the root morpheme into two separate syllables.
Because neither syllable- nor word-structure constraints play a role, the only way to state the gemination restriction is in terms of immediately neighbouring segments (if we adopt the abstract formulation of the restriction) or in terms of physical events of speech (if we adopt the physical formulation).

2.4 Graded restrictions on fake geminates

Interestingly, the restriction as stated holds only for phonemic geminates and for geminates derived from active processes of lengthening and assimilation. For geminates derived from processes of ‘passive’ lengthening (that is, for fake geminates derived from the juxtaposition of two singletons), the restriction may be waived, a situation which we turn to next. The tightness of the restrictions placed on fake geminates varies with the sonority of the flanking consonant. When fake geminates are flanked on either the left or right by an obstruent, degemination occurs obligatorily (Nádasdy 1989: 105–106, Siptár & Törkenczy 2000: 291).


(12) a. koszt-tól       [kosto:l]        ‘from food’
        direkt-termő    [dirEktErmø:]    ‘a type of vine’
        lakj jól        [lOkjo:l]        ‘eat enough (2sg imp)’
     b. kis-stílű       [kiSti:ly]       ‘petty’
        olasz sztárok   [olOsta:rok]     ‘Italian stars’

When fake geminates are flanked by a nasal, however, degemination can optionally occur (Nádasdy 1989: 106, Siptár & Törkenczy 2000: 292).

(13) a. tank-ként      [tONk(:)e:nt]   ‘like a tank’
        comb-ból       [               ‘from thigh’
        csont-tányér                   ‘bone plate’
     b. ős-smink                       ‘proto-make-up’
        kész sznob                     ‘a perfect snob’

Finally, when fake geminates are flanked by a liquid or glide, degemination does not occur at all (Nádasdy 1989: 105–106, Siptár & Törkenczy 2000: 292).

(14) a. talp-pont       [tOlp:ont]         ‘foot-end’
        szerb bor       [sErb:or]          ‘Serbian wine’
        sztrájk-kor     [strajk:or]        ‘during the strike’
        sért talán      [Se:rt:Ola:n]      ‘offends perhaps’
     b. szép-próza      [se:p:rozO]        ‘fiction’
        két tragédia    [ke:t:rOge:diO]    ‘two tragedies’
        ügyes srác      [yÖES:ra:ʦ]        ‘smart boy’
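The graded pattern in (12) to (14), together with the categorical restriction on true geminates, can be summarised as a lookup from the sonority class of the flanking consonant to the reported surface outcome. The Python sketch below only restates the descriptive generalisation; the class labels, outcome labels and function name are hypothetical conveniences, not part of any published analysis.

```python
from typing import Optional

# Reported surface outcome for fake geminates, by class of flanking consonant.
FAKE_GEMINATE_OUTCOME = {
    "obstruent": "singleton",             # obligatory degemination, as in (12)
    "nasal": "singleton or geminate",     # optional degemination, as in (13)
    "liquid": "geminate",                 # no degemination, as in (14)
    "glide": "geminate",                  # no degemination, as in (14)
}


def surface_outcome(geminate_type: str, flank: Optional[str]) -> str:
    """Return the reported surface form of a geminate in a given context.

    geminate_type: "true" or "fake" (hypothetical labels).
    flank: class of the adjacent consonant, or None when the geminate
           is flanked only by vowels or a word edge.
    """
    if geminate_type not in ("true", "fake"):
        raise ValueError(f"unknown geminate type: {geminate_type!r}")
    if flank is None:
        return "geminate"
    if geminate_type == "true":
        # True geminates are excluded next to any flanking consonant.
        return "singleton"
    return FAKE_GEMINATE_OUTCOME[flank]


print(surface_outcome("fake", "obstruent"))  # the koszt-tól type
print(surface_outcome("fake", "liquid"))     # the talp-pont type
```

A categorical table of this kind deliberately ignores the possibility of intermediate degrees of constriction, which is exactly where a physical formulation might offer finer distinctions.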

The data on fake geminates, while not under direct consideration in the current studies, nevertheless shed some additional light on the appropriateness (or not) of the physical formulation of the gemination restriction. As a reviewer points out, true and fake geminates in Hungarian are not distinguishable on the surface. This suggests that the articulatory gestures which produce true and fake geminates are the same, in which case the restriction should apply in the same manner to both, but we have seen that this is not the case. It is possible, however, that articulatory research could reveal that the gestures which produce true and fake geminates do in fact diﬀer despite their indistinguishable acoustic outputs, in which case their diﬀering behaviour with regard to restrictions is not a problem (on the articulation of geminates, see Smith 1995 and Gafos 2002; on diﬀering articulatory strategies underlying the same surface outcome, see Fukaya & Byrd 2005). The physical formulation of the gemination restriction is also interesting with regard to the sonority facts. As stated in this article, the physical formulation refers to areas of narrow vs. wide constriction in the speech stream. But of course this is an oversimpliﬁcation, particularly for a theory based on the physical events of speech, which can unfold in continuous

space just as they do in continuous time. A physical theory would in fact capture degrees of constriction that are intermediate between narrow and wide. This offers a potentially insightful way to capture the difference between flanking obstruents, which prohibit fake geminates, and flanking liquids and glides, which permit them, as well as the areas of optionality in between.

A reviewer states that the facts of Hungarian are even more complex and interesting than suggested by published descriptions. The reviewer suggests that the restriction operates along a scale, whereby underlying geminates obey the restriction most categorically, geminates created by total assimilation and those created by assimilation of a single feature exhibit optionality in certain cases and fake geminates obey the restriction, as described above. These issues, while interesting, fall outside the scope of the current study. The key points for our concerns are (a) that the restriction applies without exception to true geminates, and (b) that it is highly local, requiring no reference to syllable or word structure.

3 Experiment 1

The primary goal of Experiment 1 is to examine the effect of a restricted gemination environment on the relative durations inside a target consonant and, in particular, on the relative durations of T and S within an affricate TS. The physical formulation of the gemination restriction predicts no change in the relative durations, while the abstract formulation permits changes in T duration. A secondary goal is to determine whether the restriction that holds of gemination also holds for another type of lengthening. The physical formulation predicts that the restriction should hold for essentially any process which increases duration, while the abstract formulation predicts that the restriction is special to gemination.

As mentioned earlier, there are a number of processes besides gemination which have the effect of increasing duration, and which could therefore be compared with it. In this study, we compare gemination with phrase-final lengthening. This is a well-studied phenomenon, previously attested in Hungarian (Hockey & Fagyal 1999, Pycha 2009) as well as many other languages, in which the areas of the speech stream preceding a phrase boundary increase in duration. As an example, consider the sentence When teenagers drive, quickly they get tickets. The [v] at the end of drive precedes a phrase boundary, and will therefore exhibit increased duration compared to when it does not precede a boundary, as in When teenagers drive quickly, they get tickets (sentences from Byrd & Saltzman 2003).

A crucial characteristic of phrase-final lengthening is that it relies upon the linear order of elements in time. That is, the relative position of a phrase boundary in time determines which segments (or gestures) lengthen, as well as the degree to which they lengthen. This is important

because the Hungarian gemination restriction, under either the abstract or physical formulation, is also crucially dependent upon the linear order of elements in time, regardless of whether these elements are modelled as segments or articulatory gestures. A lengthening process which also exhibits such a dependency would seem most likely to be affected by the restriction and therefore to make the best candidate for comparison. Lengthening triggered by inherent properties such as voicelessness (e.g. [t] has greater duration than [d]) does not meet this criterion, because there is no sense in which the lengthening trigger, namely voicelessness, occupies a position in time relative to the lengthened segment. The same can be said of lengthening triggered by stress and emphasis, which are properties of particular syllables, and of clear speech, which is a style of talking. Of course, previous work has demonstrated that phrasal lengthening, in addition to exhibiting sensitivity to linear order, also exhibits sensitivity to higher-order phrasal structures (Fougeron & Keating 1997) and possibly to stressed syllables as well (Turk & Shattuck-Hufnagel 2007; see also Turk & Shattuck-Hufnagel 2000). But this does not diminish the robust finding that, within the local area surrounding a phrase boundary, lengthening effects are closely tied to that boundary and dissipate rapidly with increasing distance from it (Byrd & Saltzman 2003, Byrd et al. 2005, Byrd et al. 2006, Byrd & Riggs 2008).

Experiment 1 was thus designed to compare singleton and geminate affricates in unrestricted vs. restricted positions, and furthermore to compare phrase-medial and phrase-final affricates, again in unrestricted vs. restricted positions.

Stimuli.
Stimuli, some of which overlap with those used in Pycha (2009), were constructed using four Hungarian roots that ended in affricates: teknőc [tEknø:ʦ] ‘tortoise’, kedvenc [kEdvEnʦ] ‘favourite’, becs [bEC] ‘honour’ and kincs [kinC] ‘treasure’. This set of four cross-cuts two factors, namely the environment of the target affricate (unrestricted vs. restricted) and its place of articulation (alveolar vs. postalveolar). In the unrestricted environments, the target affricates were flanked on the left by a vowel: [tEknø:ʦ] with an alveolar affricate and [bEC] with a postalveolar affricate. In restricted environments, the targets were flanked on the left by [n]: [kEdvEnʦ] with an alveolar affricate and [kinC] with a postalveolar affricate. Each root was embedded in four contexts which cross-cut two additional factors, namely the lengthening type (gemination vs. phrase-final lengthening) and the length context of the target consonant (short vs. long). For gemination, the short (singleton) context was created with the addition of the superessive suffix /-On/ and the long (geminate) context was created with the addition of the instrumental suffix /-CAl/. For phrase-final lengthening, the short (phrase-medial) context was created by embedding the stand-alone root in the middle of a sentence, while the long (phrase-final) context was created by embedding it at the end of a sentence.

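unrestricted environment, alveolar place
    gemination, short (singleton):         teknőcön [tEknø:ʦøn]
    gemination, long (geminate):           teknőccel [tEknø:t:sEl]
    phrase-final lengthening, short (medial): teknőc él [tEknø:ʦ e:l] ‘tortoise lives’
    phrase-final lengthening, long (final):   teknőc. Él [tEknø:ʦ] [e:l] ‘…tortoise. Lives…’

unrestricted environment, postalveolar place
    gemination, short (singleton):         becsen [bECEn]
    gemination, long (geminate):           beccsel [bEt:SEl]
    phrase-final lengthening, short (medial): becs ülni [bEC ylni] ‘honour to sit’
    phrase-final lengthening, long (final):   becs. Ülök [bEC] [yløk] ‘…honour. I sit…’

restricted environment, alveolar place
    gemination, short (singleton):         kedvencen [kEdvEnʦEn]
    gemination, long (geminate):           kedvenccel [kEdvEnʦEl] *[kEdvEnt:sEl]
    phrase-final lengthening, short (medial): kedvenc Ella [kEdvEnʦ EllO] ‘favourite Ella’
    phrase-final lengthening, long (final):   kedvenc. Ella [kEdvEnʦ] [EllO] ‘…favourite. Ella…’

restricted environment, postalveolar place
    gemination, short (singleton):         kincsen [kinCEn]
    gemination, long (geminate):           kinccsel [kinCEl] *[kint:SEl]
    phrase-final lengthening, short (medial): kincs úszik [kinC u:sik] ‘treasures are swimming’
    phrase-final lengthening, long (final):   kincs. Úszik [kinC] [u:sik] ‘…treasure. It swims…’

Table I Stimuli for Experiment 1.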
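The stimulus set crosses the four two-level factors described above. As a quick check on the size of the design, the following Python sketch enumerates the crossing; the variable names are illustrative, and the factor labels follow those used in the analysis.

```python
from itertools import product

# Four two-level factors of the Experiment 1 design.
FACTORS = {
    "Environment": ("unrestricted", "restricted"),
    "Place": ("alveolar", "postalveolar"),
    "LengthType": ("gemination", "phrase-final lengthening"),
    "Length": ("short", "long"),
}

# Every combination of one level per factor is one stimulus condition.
conditions = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]

print(len(conditions))  # 2 * 2 * 2 * 2 = 16
for condition in conditions[:2]:
    print(condition)
```

Each of the sixteen dictionaries corresponds to one cell of Table I.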

In sum, four factors were cross-cut to yield a total of sixteen stimuli: Environment (unrestricted vs. restricted) × Place (alveolar vs. postalveolar) × LengthType (gemination vs. phrase-final lengthening) × Length (short vs. long). These are displayed in Table I.

The gemination condition uses words with the superessive and instrumental case suffixes. The superessive, which has the possible surface forms -en, -ön, -on and -n, adds a meaning ‘on’ or ‘on top of’ (Kenesei et al. 1998: 235ff.). This suffix, like most suffixes of the Hungarian nominal paradigm, combines with the root without triggering gemination. The instrumental, which has the possible surface forms -el, -al, -vel and -val, adds a meaning ‘with’ (Kenesei et al. 1998: 210). This suffix conditions gemination of the root-final consonant: cf. vassal ‘iron-INSTR’, bajjal ‘trouble-INSTR’, ketreccel ‘cage-INSTR’ (Kenesei et al. 1998: 437). Gemination is represented in Hungarian orthography by ccs for the postalveolar affricate and cc for the alveolar affricate, even in restricted environments.

The phrase-final lengthening condition uses complete sentences. For simplicity, Table I displays only the key fragments of the sentences (i.e. fragments which show the comparison between phrase-medial and phrase-final positions). The complete stimuli are given in Appendix A. So for example, in the short condition, the root teknőc occurs in the middle of a sentence: Nagyon sok teknőc él ebben a tóban. ‘There are very many tortoises

living in this lake’. In the long condition, it occurs at the end of a sentence: Errefelé a leggyakrabban előforduló állat a teknőc. Él még itt krokodil is. ‘The most frequent animal here is the tortoise. Crocodiles also live here.’

Because one goal of the study was to investigate potentially subtle differences in T/TS ratio between singleton and geminate affricates in restricted environments, stimuli in the geminate condition were isolated words. The advantage of this design is that any difference between singletons and geminates can be attributed to gemination alone, and not to potentially interfering factors such as position within a phrase or utterance. The disadvantage is that a direct comparison between the gemination and phrase-final conditions (which used complete sentences) will be tenuous, because any differences could be attributable either to inherent differences in the processes themselves or to differences between affricates in isolated words on the one hand and complete sentences on the other. Since our primary intention is not to distinguish between these two processes, but rather to determine the appropriateness of abstract vs. physical formulations of a particular phonological process, the lack of direct analogues between the gemination and phrase-final lengthening should not a priori affect our conclusions adversely.

Procedure. A list of sentences was prepared, containing five repetitions of each target sentence (5 × 8 = 40), additional target sentences (not analysed here) which placed affricates in word-initial position (= 40) and filler sentences (= 28). Following the sentences was a list of words, which contained four repetitions of each of the eight target words (becsen, beccsel, teknőcön, teknőccel, kincsen, kinccsel, kedvencen, kedvenccel) (4 × 8 = 32) and fillers (= 17).
The order of the 108 sentences was randomised, although adjustments were then made to ensure that filler sentences, and not stimulus sentences, occupied the first and last item of every printed page. The order of the 49 words was similarly randomised. Subjects were asked to familiarise themselves with the sentences and words, and to read each one aloud for practice before recording began. During recording, which used a Marantz digital recorder and head-mounted microphone, subjects were asked to read the sentences and words at a natural pace. When they mispronounced a word or sentence, they were asked to repeat the stimulus item from the beginning. Ten subjects were recorded in a soundproof booth; the remaining four were recorded in a quiet room in their homes.

Subjects. Subjects were adult native speakers of Hungarian (n = 14), twelve of whom live in the Bay Area of California. The remaining two live in Hungary, but visited California during the study. They were paid for their participation. Eight were female, and six were male. Their ages ranged from 18 to approximately 50. The length of residence of those who lived in the United States ranged from two months to eleven years. They came from various locations in Hungary and Romania.

Duration measurements. The duration of each portion of each target affricate was measured using waveforms and spectrograms produced by Praat (Boersma & Weenink 2007), using the following procedure. The

factor                            F(1,13)
Length                            11.3, p<0.01
LengthType                        49.6, p<0.01
Environment                       394.3, p<0.01
Place                             10.9, p<0.01
Length:LengthType                 38.0, p<0.01
Length:Environment                27.2, p<0.01
Length:Place                      11.1, p<0.01
Length:LengthType:Environment     13.2, p<0.01
Length:LengthType:Place           6.4, p<0.05

Table II ANOVA results for T/TS outcome variable in Experiment 1.

closure (T) portion began when the preceding vowel displayed no more periodicity, and ended just before the release burst, if any. The frication portion (S) began at the onset of aperiodic energy, and ended at the cessation of aperiodic energy. In those cases where the stop portion of an affricate displayed a release burst, the burst was included in the following frication portion.

Analyses. Subjects produced four repetitions of each item in the gemination condition, and five repetitions of each item in the phrase-final lengthening condition. In order to maintain balanced numbers across conditions, as required by the statistical analysis, one repetition of each item in the phrase-final condition was discarded at random. One subject mispronounced three tokens during the session. These tokens were excluded from the dataset; again, in order to maintain balanced numbers across conditions, a fourth additional token for this subject was discarded. Another subject accidentally skipped one token during the session; the missing data was replaced with the mean for that cell. This yielded a total of 892 tokens for analysis (2 environments × 2 places of articulation × 2 length types × 2 lengths × 4 repetitions × 14 speakers = 896, minus four tokens which were discarded).

3.1 Results

Results, discussed in detail in the sections that follow, reveal two basic findings, as well as a third, unexpected finding. First, an analysis of the T/TS ratio shows that a small change in the internal duration structure does occur for affricates in restricted gemination environments. This change occurs in a particular direction, such that the relative amount of duration occupied by the closure portion of the affricate increases. Second, an analysis of total duration shows that the process of phrase-final lengthening is not subject to the same restriction as gemination. That is,

[Figure 1: two panels plotting the ratio of T/TS (vertical axis, 0 to 0.8) in short vs. long conditions (horizontal axis), for gemination and phrase-final lengthening. Panel (a): unrestricted vs. restricted environments. Panel (b): alveolar vs. postalveolar affricates.]

Figure 1 Mean ratio of closure duration to total duration (T/TS) in target affricates in Experiment 1, according to experimental condition.

target aﬀricates in phrase-ﬁnal positions exhibited signiﬁcant overall duration increases compared to their counterparts in phrase-medial positions, even in supposedly restricted positions. A third and unexpected ﬁnding is that alveolar and postalveolar aﬀricates possess diﬀerent T/TS ratios, and this diﬀerence is maintained under gemination. Analysis of the T/TS ratio. A repeated-measures Analysis of Variance (ANOVA), with subject as the error term and T/TS ratio of the target aﬀricate as the dependent variable, revealed the signiﬁcant eﬀects in Table II. Figure 1a shows a three-way interaction between Length, LengthType and Environment. For gemination, a change from short to long conditions increases the T/TS ratio by a relatively large amount in the unrestricted environment, from 0.39 to 0.54, but by a smaller amount in the restricted environment, from 0.27 to 0.30. For phrase-ﬁnal lengthening, a change from short to long has a negligible eﬀect in the unrestricted environment, but decreases the T/TS ratio a little in the restricted environment, from 0.21 to 0.18. Post hoc analysis of restricted environments reveals that the interaction between Length and LengthType is signiﬁcant here (F(1, 13)=6.1, p<0.05), indicating that ratios changed in signiﬁcantly diﬀerent ways for gemination vs. phrase-ﬁnal lengthening. However, the change from short to long in the restricted gemination condition is not signiﬁcant by itself, and neither is the change from short to long in the phrase-ﬁnal condition. There was also a three-way interaction between Length, LengthType and Place, which can be seen in Fig. 1b. For gemination, a change from short to long conditions aﬀects both places of articulation in a similar fashion, increasing the T/TS ratio from 0.30 to 0.38 for alveolar aﬀricates and from 0.36 to 0.47 for postalveolar aﬀricates. 
For phrase-ﬁnal lengthening, on the other hand, a change from short to long conditions has a diﬀerent eﬀect, and this eﬀect is diﬀerent for the two places of articulation. The T/TS ratio decreases for postalveolar aﬀricates, from 0.31 to 0.26, but increases slightly for alveolar aﬀricates, from 0.25 to 0.27. Post hoc analysis conﬁned to phrase-ﬁnal lengthening indicates that the interaction between Length and Place is signiﬁcant here (F(1, 13)=7.1, p<0.05).
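The direction and size of the ratio changes reported above can be read off by subtracting each short-condition mean from its long-condition counterpart. The Python sketch below does this for the means given in the text for Fig. 1a and Fig. 1b; the dictionary layout and names are illustrative only, and the unrestricted phrase-final cell is omitted because its means are not stated in the text.

```python
# Mean T/TS ratios reported in the text, as (short, long) pairs.
REPORTED_RATIOS = {
    ("gemination", "unrestricted"): (0.39, 0.54),
    ("gemination", "restricted"): (0.27, 0.30),
    ("phrase-final", "restricted"): (0.21, 0.18),
    ("gemination", "alveolar"): (0.30, 0.38),
    ("gemination", "postalveolar"): (0.36, 0.47),
    ("phrase-final", "alveolar"): (0.25, 0.27),
    ("phrase-final", "postalveolar"): (0.31, 0.26),
}

# Change in ratio from short to long; rounding removes floating-point noise.
ratio_change = {
    cell: round(long_mean - short_mean, 2)
    for cell, (short_mean, long_mean) in REPORTED_RATIOS.items()
}

for cell, change in ratio_change.items():
    direction = "increase" if change > 0 else "decrease"
    print(cell, change, direction)
```

Laid out this way, the restricted gemination cell and the restricted phrase-final cell show changes of equal size but opposite sign, which is the contrast picked up by the post hoc Length by LengthType interaction.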

[Figure 2: two panels plotting the total duration of TS (vertical axis, 50 to 250) in short vs. long conditions (horizontal axis), for gemination and phrase-final lengthening. Panel (a): unrestricted vs. restricted environments. Panel (b): alveolar vs. postalveolar affricates.]

Figure 2 Mean total duration (TS) of target affricates in Experiment 1, according to experimental condition.

In addition to the interaction eﬀects, there were main eﬀects of Length, LengthType, Environment and Place. For Length the T/TS ratio is greater overall in long conditions (0.35) than in short ones (0.31). For LengthType the ratio is greater overall in gemination conditions (0.38) than in phrase-ﬁnal conditions (0.27), for Environment the ratio is greater overall in unrestricted environments (0.41) than in restricted environments (0.24) and for Place the ratio is greater overall in postalveolar aﬀricates (0.35) than in alveolar aﬀricates (0.30). Analysis of total duration. A repeated-measures ANOVA, with subject as the error term and total duration of the target aﬀricate as the dependent variable, revealed a four-way interaction between Length, LengthType, Environment and Place (F(1, 13)=10.5, p<0.05), several two-way interactions and main eﬀects of Length, LengthType and Environment. Fig. 2a shows the total duration of aﬀricates in unrestricted vs. restricted conditions, analogous to the ratios plotted in Fig. 1a. For gemination, a change from short to long conditions increases total duration by a comparatively large amount in the unrestricted environment, from 149.3 to 223.6 ms, but by a much smaller amount in the restricted environment, from 135.5 to 149.1 ms. For phrase-ﬁnal lengthening, a change from short to long conditions increases total duration in a similar fashion in unrestricted environments, from 108.4 to 179.7 ms, and in restricted ones, from 85.0 to 150.6 ms. Post hoc analysis indicates that the duration increase for restricted gemination is signiﬁcant (F(1, 13)=24.6, p<0.01), as it is for phrase-ﬁnal lengthening (F(1, 13)=69.8, p<0.01). Fig. 2b shows the overall duration in alveolar vs. postalveolar conditions, analogous to the ratios plotted in Fig. 1b. 
For gemination, a change from short to long conditions increases total duration by a comparatively small amount in alveolar aﬀricates, from 143.2 to 177.1 ms, but by a larger amount in postalveolar aﬀricates, from 141.6 to 195.6 ms. For phrase-ﬁnal lengthening, a change from short to long conditions increases total duration by similar amounts in alveolar aﬀricates, from 101.7 to 163.7 ms, and in postalveolar aﬀricates, from 91.7 to 166.6 ms.
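The size of each lengthening effect follows directly from the means reported above. The short sketch below simply recomputes the short-to-long increases for the Environment comparison; the dictionary layout and function names are illustrative assumptions, and only the millisecond values come from the text.

```python
# Mean total durations (ms) reported in the text for Experiment 1,
# keyed by (lengthening type, environment) and stored as (short, long).
REPORTED_MEANS_MS = {
    ("gemination", "unrestricted"): (149.3, 223.6),
    ("gemination", "restricted"): (135.5, 149.1),
    ("phrase-final", "unrestricted"): (108.4, 179.7),
    ("phrase-final", "restricted"): (85.0, 150.6),
}


def duration_increase(short_ms, long_ms):
    """Return the absolute increase (ms) and the relative increase (%)."""
    diff = long_ms - short_ms
    return diff, 100.0 * diff / short_ms


def summarise(means):
    """Map each condition to its (absolute, relative) increase."""
    return {cond: duration_increase(*pair) for cond, pair in means.items()}


if __name__ == "__main__":
    for (ltype, env), (diff, pct) in summarise(REPORTED_MEANS_MS).items():
        print(f"{ltype:12s} {env:12s} +{diff:5.1f} ms ({pct:4.1f}%)")
```

On these figures the restricted gemination increase is 13.6 ms, against 74.3 ms in the unrestricted environment, whereas phrase-final lengthening adds more than 65 ms in both environments.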

3.2 Summary of Experiment 1

A primary goal of Experiment 1 was to determine whether the internal duration structure of affricates changes in restricted gemination environments in Hungarian. Results reveal that a small change does occur. Furthermore, this change occurs in a particular direction, such that the relative amount of duration occupied by the closure portion of the affricate increases. This increase differs significantly from the decrease in closure proportion that we see in a different kind of lengthening process, namely phrase-final lengthening. The attested changes in relative durations within the affricate provide support for the role of the segment, rather than individual articulatory gestures, in the formulation of the gemination restriction in Hungarian, and therefore suggest that the restriction cannot be formulated in physical terms.

This finding should be interpreted with caution, however. The increases in closure proportion triggered by restricted gemination are not significant on their own, but only in comparison to the decreases triggered by phrase-final lengthening. As we have discussed, gemination and phrase-final lengthening are not directly comparable in this experiment, and so this interaction effect could be due to real differences between the processes or to differences between stimuli using isolated words vs. sentences. In addition, somewhat surprisingly, the increases in closure proportion are accompanied by very small (13.6 ms) but significant increases in the overall duration of the affricate. This finding contradicts the predictions of both the physical and abstract formulations of the restriction, and suggests the need for eventual investigation into the full range of degemination effects in Hungarian.³

A secondary goal of Experiment 1 was to determine whether lengthening processes other than gemination obey the restriction. Results indicate that phrase-final lengthening does not. Target consonants in phrase-final positions exhibited significant overall duration increases compared to their counterparts in phrase-medial positions. This finding runs counter to the predictions of the physical formulation of the gemination restriction, whereby gemination is modelled as a series of physical events and is therefore essentially akin to other lengthening processes. Again, however, this result should be interpreted with caution. The sentences used in the phrase-final lengthening condition were diverse, and the number of syllables per utterance was not balanced across medial and final conditions. Because this factor was not controlled for, we cannot be sure whether the duration increases are due to phrasal position, number of syllables per utterance or a combination of these factors.

³ A reviewer suggests that the small increases in total duration observed here could be due to the fact that Hungarian orthography preserves doubled consonants even when its phonology does not. Thus, [kinCEl] is spelled kinccsel (not kincsel) and [kEdvEn=El] is spelled kedvenccel (not kedvencel). Previous speech-production work on Dutch indicates that orthographically doubled consonants lead to small increases in duration (Warner et al. 2004), and this may be the case in the current study as well.

An additional finding from Experiment 1, unexpected from our initial hypotheses, is that alveolar and postalveolar affricates possess inherently different internal duration structures: for postalveolar affricates, the closure occupies a greater proportion of the segment than it does for alveolar affricates. Interestingly, the process of gemination affects closure proportion in a similar fashion for both places of articulation and, as a consequence, the inherent difference between the two affricates is maintained in the long (geminate) condition. Phrase-final lengthening, on the other hand, does not change closure proportion in the same way for both places of articulation. Instead, it decreases the ratio significantly more for postalveolars and, as a consequence, the inherent difference between the two affricates is essentially neutralised.

This finding must also be qualified. While the number of tokens in Experiment 1 is reasonably large, the number of experimental items is small and contains heterogeneous items. The size and composition of this set was originally constrained by the requirements of an additional speech-production study not reported here, but the upshot is that for the current study any difference in the behaviour of postalveolar vs. alveolar affricates could be attributable not just to place, but also to number of syllables, vowel length or position relative to stress (which is always initial in Hungarian), or some combination of these factors. Given previous work on these issues, however, such differences are not expected to be large. For vowel length, in most instances Hungarian permits sequences of a long vowel followed by a geminate consonant (see Kenesei et al. 1998: 419), so there should be no restriction on the lengthening of a consonant after a long vowel, as in [tEknø:=]. For position relative to stress, it is usually the consonant in pre-stressed position that undergoes the most change in duration (Klatt 1976, Lavoie 2001).
Consonants in post-stressed position, such as the aﬀricate in [bEC], are not known to change markedly, suggesting they may be reasonably compared with consonants in nonpost-stressed position, such as the aﬀricate in [tEknø:=]. In sum, Experiment 1 oﬀers cautious support for the predictions of the abstract formulation of the Hungarian gemination restriction. The relative durations of target aﬀricates can change, as permitted by an abstract formulation that refers to segments and not physical events. Other lengthening types do not observe the restriction, as predicted by an abstract formulation that treats gemination as special. These ﬁndings are tempered by certain shortcomings in the design of Experiment 1, which are addressed in Experiment 2.

4 Experiment 2

As with Experiment 1, the primary goal of Experiment 2 is to examine the effect of a restricted gemination environment on the relative durations inside a target consonant. A secondary goal is to determine whether the restriction that holds of gemination also holds for phrase-final lengthening. In Experiment 2, however, the gemination and phrase-final conditions are more directly comparable than they were in Experiment 1. The stimulus items are also more numerous, and they are balanced for vowel length and syllable count.

Stimuli. Stimuli, some of which overlap with those used in Pycha (2007), were constructed using 28 Hungarian roots ending in affricates, selected from Papp (1969). The list of roots was balanced for vowel length and syllable count. 13 of the roots had a short vowel preceding the target affricate, and 15 had a long vowel preceding it. 13 of the roots were monosyllabic and 15 were bisyllabic. As in Experiment 1, the set of roots cross-cuts two factors: the environment of the target affricate (unrestricted vs. restricted) and its place of articulation (alveolar vs. postalveolar). In the unrestricted environments, the target affricates were flanked on the left by a vowel: e.g. lazac [lOzO=] ‘salmon’, with an alveolar affricate, and kulacs [kulOC] ‘gourd’, with a postalveolar affricate. In restricted environments, the targets were flanked on the left by [n]: ribanc [ribOn=] ‘harlot’, with an alveolar affricate, and agancs [OgOnC] ‘antler’, with a postalveolar affricate.

Each root was embedded in three contexts, one short and two long. The short (singleton and phrase-medial) context was created by adding the superessive suffix /-On/ to the root. The first long context (geminate and phrase-medial) was created by adding the instrumental suffix /-CAl/. The second long context (singleton and phrase-final) was created by leaving the root without suffixes, so that the individual word was also a complete phrase. In sum, three factors were cross-cut to yield a total of twelve contexts: Environment (unrestricted vs. restricted) × Place (alveolar vs. postalveolar) × Length (short vs. long geminate vs. long phrase-final). These are displayed in Table III.

environment    place          short (singleton       long          long
                              and phrase-medial)     (geminate)    (phrase-final)

unrestricted   alveolar       [lOzO                  [lOzOt:sOl]   [lOzO<]
               postalveolar                          [kulOt:SOl]   [kulOC]

restricted     alveolar       [ribOn                 [ribOn        [ribOn<]
               postalveolar                                        [OgOnC]

Table III Stimuli for Experiment 2.
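The crossing of factors behind Table III can be enumerated mechanically. The sketch below is illustrative only: the constant and function names are assumptions introduced here, while the figures of seven roots per context, one repetition and three speakers are those reported for this experiment.

```python
from itertools import product

# Factor levels as described for Experiment 2.
ENVIRONMENTS = ("unrestricted", "restricted")
PLACES = ("alveolar", "postalveolar")
LENGTHS = ("short", "long geminate", "long phrase-final")


def contexts():
    """All Environment x Place x Length combinations."""
    return list(product(ENVIRONMENTS, PLACES, LENGTHS))


def design_counts(roots_per_context=7, repetitions=1, speakers=3):
    """Return (number of contexts, stimulus items, analysed tokens)."""
    n_contexts = len(contexts())
    n_items = n_contexts * roots_per_context
    n_tokens = n_items * repetitions * speakers
    return n_contexts, n_items, n_tokens


if __name__ == "__main__":
    print(design_counts())
```

With the reported values this yields twelve contexts, 84 stimulus items and 252 tokens.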

For each context, there were seven roots, yielding 84 stimulus items. The complete list of roots is given in Appendix B.

Method. Each word was embedded in a quoted phrase within a carrier sentence Marika azt mondta hogy _ gyorsan [mOrikO Ost montO ho& _ &orSOn] ‘Marika said _ quickly’. Additional sentences with roots ending in simple stops and fricatives, not analysed here, were included in the list. The order of sentences was randomised, and fillers were interspersed throughout. Three native speakers of Hungarian (two female, one male) read each list. They were instructed to pronounce the sentences in a casual manner. Recording took place using a head-mounted microphone and Marantz digital recorder.

Duration measurements. The duration of each portion of each target affricate was measured using the same procedure as in Experiment 1.

Analyses. A total of 252 tokens were analysed (7 roots × 2 environments × 2 places of articulation × 3 length contexts × 1 repetition × 3 speakers).

4.1 Results

Results, discussed in detail in the sections that follow, are similar to those from Experiment 1. First, an analysis of T/TS ratio shows that a change in the internal duration structure does occur for affricates in restricted gemination environments, taking the form of an increase in the relative amount of duration occupied by the stop closure. Second, an analysis of total duration shows that the process of phrase-final lengthening is not subject to the same restriction as gemination. Target affricates in phrase-final positions exhibited significant overall duration increases compared to their counterparts in phrase-medial positions, even in supposedly restricted positions. Finally, the differences between alveolar and postalveolar affricates which were apparent in Experiment 1 exhibit the same trend in Experiment 2, although results do not reach significance.

4.1.1 Analysis of T/TS ratio. A repeated-measures ANOVA, with subject as the error term and T/TS ratio of the target affricate as the dependent variable, revealed the significant effects in Table IV.

factor         F(1,2)
Length         85.0, p<0.01
Environment    11177.0, p<0.01

Table IV ANOVA results for T/TS outcome variable in Experiment 2.
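The outcome variable in Table IV is the ratio of closure duration to total affricate duration. As a minimal sketch of that measure, the function below assumes that the total duration TS is the sum of a closure portion (T) and a frication portion (S); the function name and the example durations are hypothetical and are not measurements from the study.

```python
def t_ts_ratio(closure_ms, frication_ms):
    """Ratio of closure duration (T) to total duration (TS = T + S).

    Assumes the affricate consists of exactly a closure portion followed
    by a frication portion, both measured in milliseconds.
    """
    total = closure_ms + frication_ms
    if total <= 0:
        raise ValueError("total duration must be positive")
    return closure_ms / total


# Hypothetical singleton and geminate tokens (ms): the closure grows while
# the frication shrinks slightly, so the total changes far less than the ratio.
singleton = t_ts_ratio(28.0, 80.0)
geminate = t_ts_ratio(39.0, 78.0)

if __name__ == "__main__":
    print(f"singleton {singleton:.2f}, geminate {geminate:.2f}")
```

The example illustrates how an affricate's internal structure can shift towards the closure even when its overall duration increases only slightly.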

[Figure 3: plots not reproduced. Recoverable labels: y-axis, ratio of T/TS (scale 0–0.8); x-axis, short vs. long; panel (a) series, gemination and phrase-final lengthening in unrestricted and restricted environments; panel (b) series, alveolar vs. postalveolar.]

Figure 3 Mean ratio of closure duration to total duration (T/TS) in target affricates in Experiment 2, according to experimental condition.

Figure 3a summarises the results. There is a main effect of Length. The T/TS ratio is intermediate in the short condition, but increases in the long gemination condition, and decreases in the long phrase-final condition. Length does not interact with any other factor. For the restricted gemination condition on its own, a change from short to long condition produces an increase in T/TS ratio from 0.26 to 0.33, which post hoc analyses
confirm is significant (F(1, 2)=40.4, p<0.05). There is also a main effect of Environment, such that the T/TS ratio is overall greater in the unrestricted environment than in the restricted environment. Environment does not interact with any other factor.

Figure 3b displays results for Place with other factors, analogous to Fig. 1b for Experiment 1. Although the graph looks similar to that for Experiment 1, this interaction did not reach significance for Experiment 2. This may be due to lack of power in a repeated-measures ANOVA with only three subjects. Individual subject analyses show that two of three subjects in Experiment 2 do exhibit the same interaction that was robustly attested in Experiment 1 (Subject 1: F(1, 2)=5.4, p<0.01; Subject 2: not significant; Subject 3: F(1, 2)=3.9, p<0.05).

4.1.2 Total duration. A repeated-measures ANOVA, with subject as the error term and total duration of the target affricate as the dependent variable, revealed the significant effects in Table V.

factor               F(1,2)
Length               17.3, p<0.05
Environment          34.6, p<0.05
Length:Environment   9.5, p<0.05

Table V ANOVA results for total duration outcome variable in Experiment 2.
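The denominator degrees of freedom in Tables IV and V reflect the use of subject as the error term: with three speakers, a factor with two levels is tested on F(1, 2). As a hedged illustration of why so few subjects limit power, the sketch below computes this F for a two-level within-subject factor as the square of the paired t statistic, a standard equivalence. The function name and the per-subject means are hypothetical and do not reproduce the analyses reported here.

```python
from math import sqrt
from statistics import mean, stdev


def rm_f_two_levels(level_a, level_b):
    """F statistic and degrees of freedom for a two-level within-subject factor.

    Each list holds one mean per subject, in the same subject order.
    For two levels, the repeated-measures F equals the squared paired t.
    """
    if len(level_a) != len(level_b):
        raise ValueError("both levels need one value per subject")
    n = len(level_a)
    if n < 2:
        raise ValueError("at least two subjects are required")
    diffs = [b - a for a, b in zip(level_a, level_b)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance; F is undefined")
    t = mean(diffs) / (sd / sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical per-subject mean durations (ms) in two conditions.
SHORT_MS = [100.0, 110.0, 105.0]
LONG_MS = [112.0, 126.0, 119.0]
f_value, dfs = rm_f_two_levels(SHORT_MS, LONG_MS)

if __name__ == "__main__":
    print(f"F{dfs} = {f_value:.1f}")
```

Because the error term rests on only n - 1 = 2 degrees of freedom, an effect must be very consistent across the three speakers to reach significance.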

[Figure 4: plots not reproduced. Recoverable labels: y-axis, total duration of TS (scale 50–250); x-axis, short vs. long; panel (a) series, gemination and phrase-final lengthening in unrestricted and restricted environments; panel (b) series, alveolar vs. postalveolar.]

Figure 4 Mean total duration (TS) of target affricates in Experiment 2, according to experimental condition.

There is a significant interaction between Length and Environment, which can be seen in Fig. 4a. In unrestricted environments, durations increased by a relatively large amount from the short condition (131.4 ms) to the long geminate condition (198.9 ms) and the long phrase-final condition (181.4 ms). In restricted environments, durations increased by smaller amounts from the short condition (108.3 ms) to the long geminate condition (117.1 ms) and the long phrase-final condition (130.7 ms). For
the restricted gemination condition on its own, post hoc analyses reveal that the change in duration (from 108.3 to 117.1 ms) is not significant. Length is a main effect. Durations are smallest in the short condition (119.9 ms), and larger in the long geminate (158.0 ms) and long phrase-final conditions (156.1 ms). Environment is also a main effect. Durations are greater in unrestricted environments (170.6 ms) than in restricted environments (118.7 ms). Place did not reach significance as a main effect or interact with other factors. Fig. 4b is provided for purposes of comparison with Experiment 1.

4.2 Summary

As with Experiment 1, our primary goal in Experiment 2 was to determine whether the internal duration structure of affricates changes in restricted environments in Hungarian. Results reveal that a change does occur and that it occurs in a particular direction, such that the relative amount of duration occupied by the closure portion of the affricate increases. Unlike in Experiment 1, this increase reached significance on its own and, interestingly, did not differ significantly from the increase that occurred in unrestricted environments. Also, unlike in Experiment 1, this increase was not accompanied by significant increases in the overall duration of the affricate. The fact that these results for relative duration reached significance, even with the low power provided by three subjects, offers strong support for the role of the segment, rather than individual articulatory gestures, in the proper description of the gemination restriction in Hungarian. An abstract, rather than a physical, formulation of the restriction appears to be necessary.

A secondary goal of Experiment 2 was to determine whether lengthening processes besides gemination obeyed the restriction.
Results reveal that phrase-final lengthening does not obey it; despite the presence of a flanking consonant, target consonants in phrase-final positions exhibited significant duration increases compared to their counterparts in phrase-medial positions. This finding runs counter to the predictions of the physical formulation, whereby gemination is no different from other lengthening processes. Unlike in Experiment 1, the total number of syllables per utterance in phrase-medial vs. phrase-final positions in Experiment 2 differed by just one, suggesting that this effect can most likely be attributed to phrasal position, rather than syllable count.

For place of articulation, results from Experiment 2 showed the same pattern as those from Experiment 1, namely that gemination maintains inherent differences between different places of articulation, while phrase-final lengthening essentially neutralises them. In Experiment 2, these results were only a trend, not a significant pattern. But individual subject analyses for Experiment 2 indicate that two out of three subjects do reach significance for this interaction, suggesting that the results from Experiment 1 may generalise.

4.3 Summary of Experiments 1 and 2

The two experiments reported in this paper present reasonably similar portraits of lengthening in Hungarian, with three key findings. First, in both experiments, ratio increases occurred in gemination, even in restricted environments. Second, in both experiments, phrase-final lengthening avoided the restriction altogether, exhibiting increases in overall duration. Finally, in both experiments, sensitivity to place of articulation occurred in both gemination and phrase-final lengthening, although this was a significant result in one experiment and only a trend in the other.
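The main-effect means for total duration in Experiment 2 can be checked against the six Length by Environment cell means reported in 4.1.2. The sketch below assumes a balanced design, so that each main-effect mean is the unweighted average of its cells; the names and the rounding tolerance are illustrative choices, and only the millisecond values are taken from the text.

```python
from statistics import mean

# Cell means (ms) reported for total duration in Experiment 2.
CELL_MEANS_MS = {
    "unrestricted": {"short": 131.4, "long geminate": 198.9, "long phrase-final": 181.4},
    "restricted": {"short": 108.3, "long geminate": 117.1, "long phrase-final": 130.7},
}

# Main-effect means (ms) reported in the text.
REPORTED_ENVIRONMENT = {"unrestricted": 170.6, "restricted": 118.7}
REPORTED_LENGTH = {"short": 119.9, "long geminate": 158.0, "long phrase-final": 156.1}


def environment_means(cells):
    """Unweighted mean over length conditions for each environment."""
    return {env: mean(by_len.values()) for env, by_len in cells.items()}


def length_means(cells):
    """Unweighted mean over environments for each length condition."""
    lengths = list(next(iter(cells.values())))
    return {lab: mean(cells[env][lab] for env in cells) for lab in lengths}


def agrees(computed, reported, tol=0.06):
    """True if every computed mean matches the reported one up to rounding."""
    return all(abs(computed[k] - reported[k]) <= tol for k in reported)


if __name__ == "__main__":
    print(agrees(environment_means(CELL_MEANS_MS), REPORTED_ENVIRONMENT))
    print(agrees(length_means(CELL_MEANS_MS), REPORTED_LENGTH))
```

Under the balanced-design assumption, the reported main-effect means agree with the cell means to within rounding.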

5 Discussion

We began this study by asking whether phonological processes should be modelled directly on physical events of speech, and we pursued an answer by comparing the predictions of a physical theory based on gestures with those of an abstract theory based on segments. We focused on the gemination restriction in Hungarian because both the abstract and physical theories offered potentially apt formulations which nevertheless differed in their predictions, specifically in their predictions for the internal structure of segments. To recap, the abstract formulation predicts that the internal structure of e.g. an affricate can change, because the gemination restriction applies to segments, not individual articulatory gestures. By contrast, the physical formulation predicts that the internal structure of an affricate cannot change, because the gemination restriction applies to each articulatory gesture individually. The speech-production results reported here demonstrate that the physical formulation cannot be the correct one, because the internal structure of affricates does change in restricted gemination. That is, the restriction fails to apply individually to the closure and frication portions of an affricate, as shown by the increases in relative duration of the closure that were evident in restricted gemination contexts. In addition, the physical formulation predicts that the same restrictions should apply to all lengthening processes, not just to gemination. Again the results demonstrate that this cannot be correct, because the overall duration of affricates changes in phrase-final lengthening.⁴

Of course, the experiments reported here examined only one type of gemination in Hungarian, namely gemination triggered by the addition of a particular suffix, the instrumental. As we saw in §2, however, the language has many other types of geminates, both underlying and derived, and these geminates differ in the extent to which they obey the restriction. It is certainly possible, then, that other types of geminates produce different results than those reported here. Furthermore, the experiments compared gemination to only one other lengthening process, phrase-final lengthening, and only in a very specific context. While we have shown that phrase-final lengthening does not obey the abstract formulation of the gemination restriction, this does not mean that phrase-final lengthening obeys no abstract formulation – previous work indicates that it probably does (e.g. Fougeron & Keating 1997). Furthermore, other lengthening processes could conceivably exhibit sensitivity to the restriction in the same way that gemination does.

Within the confines of the current study, however, the evidence primarily supports an abstract theory based on segments. That is, the formulation of gemination restrictions in Hungarian seems to require segments, which divide the speech stream into units that act as a unified whole. This suggests that phonological processes cannot always be modelled on the physical events of speech. We also found evidence that gemination behaves differently from phrase-final lengthening. Again, this suggests the need for a real distinction between processes even when they exhibit resemblances to one another, with some processes characterised as abstract and others characterised as physical.
At the same time, however, some of the evidence we have examined suggests that an entirely abstract theory of phonological processes – that is, one which does away with inherent differences between segments altogether – is not appropriate either. As we have seen, the supposedly abstract process of gemination preserves inherent differences between Hungarian affricates at different places of articulation, rather than ignoring them. A theory based on the segment, which purposely abstracts away from such inherent differences, cannot capture this behaviour. Thus while the physical theory seems to be too physical to handle the facts, the abstract theory also seems to be too abstract. A potential compromise, as I will suggest, would be to maintain the abstract status of the segment while endowing it with a limited set of internal temporal landmarks.

⁴ The fact that gemination increases duration of the stop closure, and not of frication, offers another argument in favour of an abstract theory. In the current study, the morphological trigger for gemination is the instrumental suffix /-CAl/. This suffix attaches directly to the right of the root-final consonant, i.e. adjacent in time to the frication portion of a root-final affricate. Nevertheless, gemination essentially skips the frication in order to trigger an increase in the duration of the stop portion. The process is therefore non-local, and difficult (if not impossible) to model in a physical theory of phonology.

The finding that gemination preserves inherent differences between Hungarian affricates has two potential explanations: either these differences are inevitable consequences of articulatory implementation, or they are a fundamental part of the segment representation.

The first explanation is the one most commonly put forth to explain inherent differences between segments: these differences are the direct and inevitable consequence of the articulatory implementation of a particular segment. As a result, we will observe these differences whenever we compare the segments in positions that are otherwise identical. An often cited example is that the voiced velar stop [g] has shorter duration in intervocalic position than labial [b] or alveolar [d]. We can attribute this to articulatory implementation. To produce voicing, speakers must maintain a pressure differential across the glottis. In articulations with a closure at the back of the vocal tract, such as velar stops, the cavity formed above the glottis is so small that the pressure differential disappears rapidly and voicing can be maintained for only a short time. In articulations with closure towards the front of the vocal tract, such as labials and alveolars, the cavity formed is bigger, and voicing can be maintained for a longer period of time (Ohala 1997).

But an explanation along these lines seems unlikely for the current data. As we have seen, gemination does not just lengthen affricates, but changes their internal structure. Thus, whatever the articulatory strategy that Hungarian speakers use to produce singletons, which have a relatively small closure proportion, it seems unlikely to be replicated in the production of geminates, which have a much larger closure proportion. The pattern found across singleton and geminate contexts must therefore arise from another source.
Furthermore, we have seen evidence that speakers can and do obliterate inherent differences between affricates in certain cases. Specifically, in phrase-final position, alveolar and postalveolar affricates exhibited identical closure proportions. Even if an explanation based on articulatory implementation were tenable for the other word and phrasal positions examined in this study, then, it would not be tenable here.

The other potential explanation is that at least some inherent duration differences are actually a fundamental part of the segmental representation, rather than something to abstract away from. That is, even abstract segments need to include temporal landmarks. Of course, we have already seen evidence that a theory which includes a multitude of temporal landmarks – a fully physical theory – is not adequate for the data. The proposal that I would like to suggest here, following Steriade (1993, 1994), is therefore different and significantly more abstract: segments, specifically stops, are bipositional. That is, while stops do not consist of a series of temporal landmarks unfolding in a continuous time dimension, they do consist of at least two temporal landmarks, closure and release, which are ordered relative to one another. Closure and release, represented abstractly as CLO and REL, are both subsumed under an overarching segment constituent, but each nevertheless acts as an independent skeletal position to which features can associate. For the specific case of affricates, CLO associates to [t] features, while REL associates to [s] or [S] features. For the more general case of stops, CLO and REL can associate to a wider range of features, including glottalisation and nasality. Steriade offers evidence that such a representation can capture glottalisation patterns in Mazateco onsets (1993) and pre- and postnasalisation patterns in Bantu stops (1994).

While the details remain to be worked out, the inherent duration differences that we have seen in the current study could be potentially captured if CLO and REL associate not just to features, as proposed by Steriade, but also to subsegmental units of timing. For example, in Hungarian alveolar affricates, the CLO position would associate to a [t] as well as a single subsegmental timing unit, and REL would associate to an [s] as well as a single subsegmental timing unit. In Hungarian postalveolar affricates, on the other hand, the CLO position would associate to [t] and two timing units, and REL would associate to [S] and one timing unit. Crucially, any subsegmental timing units must remain distinct from the more familiar segmental timing units, such as C. We know this because the inherent durational differences between Hungarian affricates remain intact even under gemination, when an additional C is inserted.

Some previous research on gemination supports the idea that this ‘truly minimal’ set of temporal landmarks – i.e. the set of CLO and REL – is all that is needed. In speech-production studies of Hungarian and three other languages, Ham (2001, summarised in Cohn 2003) found that while singleton consonants exhibited robust overall duration differences based on their place of articulation (e.g. [p] vs. [t] vs. [k]) or their voicing (e.g. [t] vs. [d]), geminate consonants exhibited much more modest differences based upon these factors, although these differences were still evident.
This runs counter to the ﬁnding of the current study, in which inherent durational diﬀerences are not attenuated but maintained under gemination. But there could be a good reason why. In a bipositional segment, of the kind I am suggesting here, only durational diﬀerences between positions in the segment are represented, because only the subconstituents of CLO and REL can independently associate to timing units. Any other durational diﬀerences, such as those conditioned by place and voicing, are still abstracted away from in the representation, and we therefore do not expect them to be maintained under gemination. The bipositional proposal diﬀers in this respect from a physical theory of multiple landmarks in continuous time, which can represent all kinds of durational diﬀerences as articulatory implementations, and therefore predicts that durational diﬀerences should be maintained under gemination. One ﬁnal point about the results of the current study bears mentioning. It is noteworthy that the T/TS ratio in restricted gemination exhibited not just random change, but a consistent increase. On its own, an abstract formulation of restricted gemination merely permits change. That is, it predicts that the internal structure of an aﬀricate can change precisely because it is subsumed under the C representation. On this view, there is nothing special about aﬀricates in restricted gemination positions ; they should exhibit variation just like any other singleton aﬀricate.

Previous speech-production research (Pycha 2007, 2009), however, has shown that affricate gemination in Hungarian has two distinct correlates, or ‘signatures’: an overall duration increase, which we can refer to as the degree of lengthening, and an increase in T/TS ratio, which we can refer to as the type of lengthening. On this view, there is indeed something special about affricates placed in restricted gemination positions, because these affricates can potentially satisfy the demands of the restriction and of gemination at the same time. For example, forms such as /kinC-CAl/ can satisfy gemination by increasing the internal T/TS ratio. But they can also satisfy the restriction by failing to lengthen overall. The current study shows that for the most part this is exactly what happens, although small increases in overall durations are still evident. In other words, changes in affricate structure reflect not random variation, but the principled use of an alternative signature for gemination, namely an increase in T/TS ratio. The finding that different correlates of lengthening can occur largely independently of one another strengthens the argument for the existence of different types of processes, and suggests that an accurate characterisation of the phonetics–phonology interface requires focusing not on how cognate processes differ in degree, but how they differ in type.
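The bipositional proposal outlined in the discussion can be made concrete with a small data structure. The sketch below is one possible rendering under stated assumptions, not a worked-out formalism: the class and method names are hypothetical, timing units are modelled as plain integers, and the sketch captures only the preservation of the place contrast when a segmental C slot is added. It deliberately does not model the increase in T/TS ratio that accompanies gemination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """A subsegmental position (CLO or REL) with its features and timing."""
    label: str         # "CLO" or "REL"
    features: str      # e.g. "t", "s" or "S"
    timing_units: int  # subsegmental timing units (assumed integer-valued)


@dataclass(frozen=True)
class Affricate:
    """A bipositional segment: ordered CLO and REL under one segment node."""
    clo: Position
    rel: Position
    c_slots: int = 1   # segmental timing units (C); 2 after gemination

    def inherent_closure_share(self):
        """Share of subsegmental timing units associated with CLO."""
        total = self.clo.timing_units + self.rel.timing_units
        return self.clo.timing_units / total

    def geminate(self):
        """Insert one segmental C slot; subsegmental units are untouched."""
        return Affricate(self.clo, self.rel, self.c_slots + 1)


# Associations as proposed in the text for the two Hungarian affricates.
ALVEOLAR = Affricate(Position("CLO", "t", 1), Position("REL", "s", 1))
POSTALVEOLAR = Affricate(Position("CLO", "t", 2), Position("REL", "S", 1))

if __name__ == "__main__":
    for name, seg in (("alveolar", ALVEOLAR), ("postalveolar", POSTALVEOLAR)):
        print(name, seg.inherent_closure_share(), seg.geminate().c_slots)
```

Because the segmental and subsegmental tiers are kept separate, adding a C slot leaves the difference between the two places of articulation intact, which is the property the proposal is meant to capture.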

Appendix A: Phrase-final lengthening stimuli for Experiment 1

Unrestricted environment, alveolar

short (phrase-medial)
Nagyon sok teknőc él ebben a tóban.
[nOÖon Sok tEknø:< e:l Eb:En O to:bOn]
[very many tortoise lives in.this the in.lake]
‘There are very many tortoises living in this lake.’

long (phrase-final)
Errefelé a leggyakrabban előforduló állat a teknőc. Él még itt krokodil is.
[Er:EfEle: O lEgÖOkrOb:On Elø:fordulo: a:l:Ot O tEknø:<] [e:l me:g it: krokodil iS]
[here the most.frequently appearing animal the tortoise] [lives also here crocodile too]
‘The most frequent animal here is the tortoise. Crocodiles also live here.’

Unrestricted environment, postalveolar

short (phrase-medial)
Nem nagy becs ülni a sarokban.
[nEm nOÖ bEC ylni O SOrogbOn]
[not great honour to.sit the in.corner]
‘It is not a great honour to sit in the corner.’

long (phrase-final)
Ez igazán nagy becs. Ülök a székben és mindenki kiszolgál.
[Ez igOza:n nOÖ bEC] [yløk O se:gbEn e:S mindENki kisolga:l]
[this really great honour] [I.sit the in.chair and everyone waits.upon]
‘This is a really great honour. I sit in the chair and everyone waits upon me.’

Restricted environment, alveolar

short (phrase-medial)
Hölgyeim és Uraim! Mai műsorunkban két nagyszerű énekes, két nagy kedvenc Ella is fellép: Ella Fitzgerald és Ella Jones.
[hølÖEim e:S urOim] [mOi my:SoruNgbOn ke:t nOcsEry: e:nEkES ke:t nOc kEdvEn< El:O iS fEl:e:p El:O fid:ZErOld e:S El:O Jo:ns]
[my.ladies and my.gentlemen!] [today’s our.show two extraordinary singer, two great favourite Ella also appear: Ella Fitzgerald and Ella Jones]
‘Ladies and gentlemen! In today’s show, we have two extraordinary singers, two great favourite Ellas: Ella Fitzgerald and Ella Jones.’

long (phrase-final)
Mari itt a nagy kedvenc. Ella nem olyan népszerű.
[mOri it: O nOc kEdvEn<] [El:O nEm ojOn ne:psEry:]
[Mary here the great favourite] [Ella not so popular]
‘Mary is the favourite here. Ella is not so popular.’

Restricted environment, postalveolar

short (phrase-medial)
Három kincs úszik a vízen egy kis csónakban.
[ha:rom kinC u:sik O vizen Ec kiS Co:nOgbOn]
[three treasure swims the on.water a small in.boat]
‘Three treasures are swimming on the water in a small boat.’

long (phrase-final)
Ez egy különleges kincs. Úszik a vízen.
[Ez Ec kylønlEgEs kinC] [u:sik O vi:zEn]
[this a special treasure] [swims the on.water]
‘This is a special treasure. It swims on water.’

Appendix B: Roots used in the construction of stimuli for Experiment 2

Unrestricted environment, alveolar
dac [dO<] ‘defiance’
pác [pa:<] ‘pickle’
rác [ra:<] ‘Serb’
lazac [lOzO<] ‘salmon’
kupac [kupO<] ‘mound’
tornác [torna:<] ‘porch’
akác [Oka:<] ‘acacia’

Unrestricted environment, postalveolar
kacs [kOC] ‘fringe’
rács [ra:C] ‘grate (n)’
ács [a:C] ‘carpenter’
kulacs [kulOC] ‘gourd’
pamacs [pOmOC] ‘mop’
takács [tOka:C] ‘weaver’
tanács [tOna:C] ‘advice’

Restricted environment, alveolar
flanc [flOn<] ‘frill’
lánc [la:n<] ‘chain’
ránc [ra:n<] ‘crease’
ribanc [ribOn<] ‘harlot’
suhanc [SuhOn<] ‘youngster’
románc [roma:n<] ‘romance’
zománc [zoma:n<] ‘enamel’

Restricted environment, postalveolar
mancs [mOnC] ‘paw’
brancs [brOnC] ‘profession’
gáncs [ga:nC] ‘trip (v)’
háncs [ha:nC] ‘inner bark’
agancs [OgOnC] ‘antler’
parancs [pOrOnC] ‘command’
bogáncs [bOga:nC] ‘burr’

REFERENCES

Barnes, Jonathan (2006). Strength and weakness at the interface: positional neutralization in phonetics and phonology. Berlin & New York: Mouton de Gruyter.
Blevins, Juliette (2004). Evolutionary Phonology: the emergence of sound patterns. Cambridge: Cambridge University Press.
Blevins, Juliette & Andrew Garrett (1998). The origins of consonant–vowel metathesis. Lg 74. 508–556.
Boersma, Paul & David Weenink (2007). Praat: doing phonetics by computer (version 4.3.22). http://www.praat.org/.
Browman, Catherine P. & Louis Goldstein (1990). Tiers in articulatory phonology, with some implications for casual speech. In Kingston & Beckman (1990). 341–376.
Byrd, Dani, Abigail Kaun, Shrikanth Narayanan & Elliot Saltzman (2000). Phrasal signatures in articulation. In Michael B. Broe & Janet B. Pierrehumbert (eds.) Papers in laboratory phonology V: acquisition and the lexicon. Cambridge: Cambridge University Press. 70–87.
Byrd, Dani & Elliot Saltzman (2003). The elastic phrase: modeling the dynamics of boundary-adjacent lengthening. JPh 31. 149–180.
Byrd, Dani, Sungbok Lee, Daylen Riggs & Jason Adams (2005). Interacting effects of syllable and phrase position on consonant articulation. JASA 118. 3860–3873.
Byrd, Dani, Jelena Krivokapić & Sungbok Lee (2006). How far, how long: on the temporal scope of prosodic boundary effects. JASA 120. 1589–1599.
Byrd, Dani & Daylen Riggs (2008). Locality interactions with prominence in determining the scope of phrasal lengthening. Journal of the International Phonetic Association 38. 187–202.
Cho, Taehong (2005). Prosodic strengthening and featural enhancement: evidence from acoustic and articulatory realizations of /ɑ,i/ in English. JASA 117. 3867–3878.
Cho, Taehong (2006). Manifestation of prosodic structure in articulatory variation: evidence from lip kinematics in English. In Louis Goldstein, Douglas Whalen & Catherine T. Best (eds.) Papers in Laboratory Phonology 8. Berlin & New York: Mouton de Gruyter. 519–548.

Cho, Taehong & Patricia A. Keating (2001). Articulatory and acoustic studies on domain-initial strengthening in Korean. JPh 29. 155–190.
Cho, Young-mee Yu (1990). Parameters of consonantal assimilation. PhD dissertation, Stanford University.
Clements, G. N. (1999). Affricates as noncontoured stops. In Osamu Fujimura, Brian D. Joseph & Bohumil Palek (eds.) Proceedings of LP ’98: item order in language and speech. Prague: Karolinum. 271–299.
Clements, G. N. & Samuel J. Keyser (1983). CV phonology: a generative theory of the syllable. Cambridge, Mass.: MIT Press.
Cohn, Abigail C. (2003). Phonological structure and phonetic duration: the role of the mora. Working Papers of the Cornell Phonetics Laboratory 15. 69–100.
De Jong, Kenneth & Bushra Zawaydeh (2002). Comparing stress, lexical focus, and segmental focus: patterns of variation in Arabic vowel duration. JPh 30. 53–75.
de Lacy, Paul (2006). Markedness: reduction and preservation in phonology. Cambridge: Cambridge University Press.
Dorman, Michael F., Lawrence J. Raphael & David Isenberg (1980). Acoustic cues for a fricative-affricate contrast in word-final position. JPh 8. 397–405.
Dressler, Wolfgang U. & Péter Siptár (1989). Towards a natural phonology of Hungarian. Acta Linguistica Hungarica 39. 29–51.
Flemming, Edward (2001). Scalar and categorical phenomena in a unified model of phonetics and phonology. Phonology 18. 7–44.
Fougeron, Cécile & Patricia A. Keating (1997). Articulatory strengthening at edges of prosodic domains. JASA 101. 3728–3740.
Fukaya, Teruhiko & Dani Byrd (2005). An articulatory examination of word-final flapping at phrase edges and interiors. Journal of the International Phonetic Association 35. 45–58.
Gafos, Adamantios I. (2002). A grammar of gestural coordination. NLLT 20. 269–337.
Goldsmith, John (1976). Autosegmental phonology. PhD dissertation, MIT.
Gósy, Mária (ed.) (1991). Temporal factors in speech: a collection of papers. Budapest: Research Institute for Linguistics, Hungarian Academy of Sciences.
Gósy, Mária (2001). The VOT of the Hungarian voiceless plosives in words and in spontaneous speech. International Journal of Speech Technology 4. 75–85.
Ham, William H. (2001). Phonetic and phonological aspects of geminate timing. New York & London: Routledge.
Hayes, Bruce (1986a). Inalterability in CV phonology. Lg 62. 321–351.
Hayes, Bruce (1986b). Assimilation as spreading in Toba Batak. LI 17. 467–499.
Hockey, Beth Ann & Zsuzsanna Fagyal (1999). Phonemic length and pre-boundary lengthening: an experimental investigation on the use of durational cues in Hungarian. In Proceedings of the 19th International Congress of Phonetic Sciences, San Francisco. 313–316.
Howell, Peter & Stuart Rosen (1983). Production and perception of rise time in the voiceless affricate/fricative distinction. JASA 73. 976–984.
Hualde, José Ignacio (1988). Affricates are not contour segments. WCCFL 7. 143–157.
Hyman, Larry M. (1985). A theory of phonological weight. Dordrecht: Foris.
Inkelas, Sharon & Young-mee Yu Cho (1993). Inalterability as prespecification. Lg 69. 529–574.
Kassai, Ilona (1979). Időtartam és kvantitás a magyar nyelvben. [Duration and quantity in Hungarian.] Budapest: Akadémiai Kiadó.
Kassai, Ilona (1982). A magyar beszédhangok időtartamviszonyai. [Temporal relationships of Hungarian speech sounds.] In Kálmán Bolla (ed.) Fejezetek a magyar leíró hangtanból. [Chapters from Hungarian descriptive phonetics.] Budapest: Akadémiai Kiadó. 115–154.

Kenesei, István, Robert M. Vago & Anna Fenyvesi (1998). Hungarian. London & New York: Routledge.
Kenstowicz, Michael (1982). Gemination and spirantization in Tigrinya. Studies in the Linguistic Sciences 12. 103–122.
Kingston, John & Mary E. Beckman (eds.) (1990). Papers in laboratory phonology I: between the grammar and physics of speech. Cambridge: Cambridge University Press.
Kirchner, Robert (2000). Geminate inalterability and lenition. Lg 76. 509–545.
Klatt, Dennis H. (1976). Linguistic uses of segmental duration in English: acoustic and perceptual evidence. JASA 59. 1208–1221.
Lavoie, Lisa (2001). Consonant strength: phonological patterns and phonetic manifestations. New York: Garland.
Lombardi, Linda (1990). The nonlinear organization of the affricate. NLLT 8. 375–425.
McCarthy, John J. (1986). OCP effects: gemination and antigemination. LI 17. 207–263.
Miller, Joanne L. (1981). Effects of speaking rate on segmental distinctions. In Peter D. Eimas & Joanne L. Miller (eds.) Perspectives on the study of speech. Hillsdale: Erlbaum. 39–74.
Miller-Ockhuizen, Amanda & Draga Zec (2002). Durational differences in Serbian palatal affricates. In Proceedings of the 1st Pan-American/Iberian Meeting on Acoustics. Cancun, Mexico.
Muller, Jennifer S. (2001). The phonology and phonetics of word-initial geminates. PhD dissertation, Ohio State University.
Nádasdy, Ádám (1989). The exact domain of consonant degemination in Hungarian. In Tamás Szende (ed.) Proceedings of the Speech Research ’89 International Conference. Budapest: Linguistics Institute of the Hungarian Academy of Sciences. 104–107.
Ohala, John J. (1990). The phonetics and phonology of aspects of assimilation. In Kingston & Beckman (1990). 258–275.
Ohala, John J. (1997). Aerodynamics of phonology. Proceedings of the 4th Seoul International Conference on Linguistics [SICOL]. Seoul: Linguistic Society of Korea. 92–97.
Olaszy, Gábor (1994). Sound duration measurements in declarative sentences. Acta Linguistica Hungarica 42. 51–62.
Olaszy, Gábor (2000). The prosody structure of dialogue components in Hungarian. International Journal of Speech Technology 3. 165–176.
Olaszy, Gábor (2002). Predicting Hungarian sound durations for continuous speech. Acta Linguistica Hungarica 49. 321–345.
Papp, Ferenc (1969). Reverse-alphabetized dictionary of the Hungarian language. Budapest: Akadémiai Kiadó.
Pycha, Anne (2007). Phonetic vs. phonological lengthening in affricates. In Jürgen Trouvain & William J. Barry (eds.) Proceedings of the 16th International Congress of Phonetic Sciences. Saarbrücken: Saarland University. 1757–1760.
Pycha, Anne (2009). Lengthened affricates as a test case for the phonetics–phonology interface. Journal of the International Phonetic Association 39. 1–31.
Repp, Bruno H., Alvin M. Liberman, Thomas Eccardt & David Pesetsky (1978). Perceptual integration of acoustic cues for stop, fricative, and affricate manner. Journal of Experimental Psychology: Human Perception and Performance 4. 621–637.
Rose, Sharon (2000). Rethinking geminates, long-distance geminates, and the OCP. LI 31. 85–122.
Rounds, Carol (2001). Hungarian: an essential grammar. London & New York: Routledge.

Rubach, Jerzy (1994). Affricates as strident stops in Polish. LI 25. 119–143.
Schein, Barry & Donca Steriade (1986). On geminates. LI 17. 691–744.
Siptár, Péter & Miklós Törkenczy (2000). The phonology of Hungarian. Oxford: Oxford University Press.
Smiljanic, Rajka & Ann R. Bradlow (2007). Stability of temporal contrasts across speaking styles in English and Croatian. JPh 36. 91–113.
Smith, Caroline L. (1995). Prosodic patterns in the coordination of vowel and consonant gestures. In Bruce Connell & Amalia Arvaniti (eds.) Phonology and phonetic evidence: papers in laboratory phonology IV. Cambridge: Cambridge University Press. 205–222.
Steriade, Donca (1993). Closure, release, and nasal contours. In Marie K. Huffman & Rena A. Krakow (eds.) Nasals, nasalization, and the velum. Orlando: Academic Press. 401–470.
Steriade, Donca (1994). Complex onsets as single segments: the Mazateco pattern. In Jennifer Cole & Charles Kisseberth (eds.) Perspectives in phonology. Stanford: CSLI. 203–291.
Steriade, Donca (1999). Phonetics in phonology: the case of laryngeal neutralization. UCLA Working Papers in Linguistics 2: Papers in Phonology 3. 25–146.
Steriade, Donca (2001). Directional asymmetries in place assimilation: a perceptual account. In Elizabeth Hume & Keith Johnson (eds.) The role of speech perception in phonology. San Diego: Academic Press. 219–250.
Summers, W. Van (1987). Effects of stress and final-consonant voicing on vowel production: articulatory and acoustic analysis. JASA 82. 847–863.
Tarnóczy, Tamás (1987). The formation, analysis and perception of Hungarian affricates. In Robert Channon & Linda Shockey (eds.) In honor of Ilse Lehiste. Dordrecht: Foris. 255–270.
Turk, Alice E. & Stefanie Shattuck-Hufnagel (2000). Word-boundary-related duration patterns in English. JPh 28. 397–440.
Turk, Alice E. & Stefanie Shattuck-Hufnagel (2007). Multiple targets of phrase-final lengthening in American English words. JPh 35. 445–472.
Vago, Robert M. (1980). The sound pattern of Hungarian. Washington: Georgetown University Press.
Warner, Natasha, Allard Jongman, Joan Sereno & Rachèl Kemps (2004). Incomplete neutralization and other sub-phonemic durational differences in production and perception: evidence from Dutch. JPh 32. 251–276.

Phonology 27 (2010) 153–201. © Cambridge University Press 2010 doi:10.1017/S0952675710000060

Testing the role of phonetic knowledge in Mandarin tone sandhi*

Jie Zhang
University of Kansas

Yuwen Lai
National Chiao Tung University

Phonological patterns often have phonetic bases. But whether phonetic substance should be encoded in synchronic phonological grammar is controversial. We aim to test the synchronic relevance of phonetics by investigating native Mandarin speakers’ applications of two exceptionless tone sandhi processes to novel words: the contour reduction 213 → 21 / _ T (T ≠ 213), which has a clear phonetic motivation, and the perceptually neutralising 213 → 35 / _ 213, whose phonetic motivation is less clear. In two experiments, Mandarin subjects were asked to produce two individual monosyllables together as two different types of novel disyllabic words. Results show that speakers apply the 213 → 21 sandhi with greater accuracy than the 213 → 35 sandhi in novel words, indicating a synchronic bias against the phonetically less motivated pattern. We also show that lexical frequency is relevant to the application of the sandhis to novel words, but cannot account alone for the low sandhi accuracy of 213 → 35.
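The two sandhi patterns named in the abstract can be written out as a small rule system. The following is a minimal sketch, not the authors' implementation: the function name `apply_third_tone_sandhi` and the string encoding of tones are hypothetical, and the four-tone inventory (55, 35, 213, 51) is the usual Chao-number description of Mandarin citation tones, assumed here rather than taken from the text above.

```python
from typing import Tuple

# Mandarin citation tones in Chao tone numbers (assumed inventory).
TONES = {"55", "35", "213", "51"}


def apply_third_tone_sandhi(first: str, second: str) -> Tuple[str, str]:
    """Return the surface tones of a disyllable after third-tone sandhi.

    213 -> 35 / _ 213
    213 -> 21 / _ T   (T != 213)
    Only the first syllable is affected; the second is returned unchanged.
    """
    if first not in TONES or second not in TONES:
        raise ValueError(f"unknown tone: {first!r} or {second!r}")
    if first != "213":
        return first, second      # neither rule applies
    if second == "213":
        return "35", second       # perceptually neutralising sandhi
    return "21", second           # contour reduction


if __name__ == "__main__":
    for pair in [("213", "213"), ("213", "55"), ("55", "213")]:
        print(pair, "->", apply_third_tone_sandhi(*pair))
```

The two rules are mutually exclusive by construction, so their order inside the function does not matter; what the experiments probe is whether speakers apply the two branches equally reliably to novel words.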

* This work could not have been done without the help of many people. We are grateful to Paul Boersma and Mietta Lennes for helping us with Praat scripts, Juyin Chen, Mickey Waxman and Xiangdong Yang for helping us with statistics and Hongjun Wang, Jiangping Kong and Jianjing Kuang for hosting us at Beijing University during our data collection in 2007. For helpful comments on various versions of this work, we thank Allard Jongman, James Myers, three anonymous reviewers and an associate editor for Phonology, and audiences at the 2004 NYU Workshop on ‘Redefining elicitation’, the Department of Linguistics at the University of Hawaii, the Department of Psychology and the Child Language Program at the University of Kansas, and the 2005 Annual Meeting of the Linguistic Society of America. We owe a special debt to Hsin-I Hsieh, whose work on wug-testing Taiwanese tone sandhi inspired this research. All remaining errors are our own. This research was partly supported by research grants from the National Science Foundation (0750773) and the University of Kansas General Research Fund (2301760).


1 Introduction

1.1 The relevance of phonetics to phonological patterning

Phonological patterns are often influenced by phonetic factors. The influence is manifested in a number of ways, the most common of which is the prevalence in cross-linguistic typology of patterns that have articulatory or perceptual bases and the scarcity of those that do not. For example, velar palatalisation before high front vowels, postnasal voicing and regressive assimilation for major consonant places have clear phonetic motivations and are extremely well attested, while velar palatalisation before low back vowels, postnasal devoicing and progressive consonant place assimilation are nearly non-existent.

The typological asymmetry can also be manifested in terms of implicational statements. For example, in consonant place assimilation, if oral stops are targets of assimilation in a language, then ceteris paribus, nasal stops are also targets of assimilation (Mohanan 1993, Jun 1995, 2004). This is to be expected perceptually, as nasal stops have weaker transitional place cues and are thus more likely to lose their contrastive place than oral stops when articulatory economy is of concern (the Production Hypothesis; Jun 1995, 2004).

Evidence for the relevance of phonetics can also be found in the peripheral phonology of a language even when the phonetic effects are not directly evident in its core phonology. Such peripheral phonology may include the phonology of its established loanwords (Fleischhacker 2001, Kang 2003, Kenstowicz 2007) and the speakers’ judgements on poetic rhyming (Steriade & Zhang 2001). For example, Steriade & Zhang (2001) show that although postnasal voicing is not neutralising in Romanian, its phonetic effect is crucial in accounting for poets’ preference for /Vnt/~/Vnd/ as a semi-rhyme over /Vt/~/Vd/.

The parallels between the traditionally conceived categorical/phonological and gradient/phonetic patterns also indicate their close relation. Flemming (2001), for instance, outlines the similarity of patterning between phonological assimilation and phonetic coarticulation as well as a number of other processes present in both the traditional phonological and phonetic domains.

1.2 Where should phonetic explanations reside?

Although the existence of some form of relationship between phonological typology and phonetics is relatively uncontroversial, the precise way in which this relationship should be captured is a continuous point of contention among phonologists.

One possibility is to consider the phonetic basis to be part of the intrinsic mechanism of the synchronic phonological grammar. Many theories have been proposed within rule-based phonology to encode this relation, from the abbreviation conventions of SPE (Chomsky & Halle 1968) and the innateness of articulatorily based phonological processes in Natural Phonology (Stampe 1979) to the grounding conditions for universal constraints in Grounded Phonology (Archangeli & Pulleyblank 1994). Optimality Theory (Prince & Smolensky 1993) further invites phonetic explanations into synchronic phonology, due to its ability to state phonetic motivations explicitly in the system as markedness constraints (Hayes & Steriade 2004). Work by Boersma (1998), Steriade (1999, 2001, 2008), Kirchner (2000, 2001, 2004), Flemming (2001) and Zhang (2002, 2004), among others, has proposed constraints that directly encode phonetic properties and intrinsic rankings based on such properties in synchronic phonology to capture typological asymmetries.

There are various approaches as to how the phonetic substance gets to be encoded in the grammar. The strongest position is that the phonetically based constraints, intrinsic rankings, grounding conditions or other formal mechanisms are simply part of the design nature of the grammar on the level of the species (Universal Grammar (UG); Chomsky 1986), and they predict true, exceptionless universals of phonological typology. It is also possible that the design scheme of the grammar only includes an analytical bias that draws from a type of grammar-external phonetic knowledge (Kingston & Diehl 1994) and restricts the space of constraints and constraint rankings (weightings) to be learned by the speaker (Hayes 1999, Wilson 2006). This type of approach predicts strong universal tendencies in favour of phonetically motivated patterns, but allows ‘unnatural’ patterns to surface in grammars and be learned by speakers.

An alternative to the synchronic approach above is that the effect of phonetics on phonological typology takes place in the realm of diachronic sound change.
The typological asymmetries in phonology are then due to the different frequencies with which phonological patterns can arise through diachronic sound change, which is caused by phonetic factors such as misperception (e.g. Anderson 1981, Ohala 1981, 1990, 1993, 1997, Blevins & Garrett 1998, Buckley 1999, 2003, Hale & Reiss 2000, Hansson 2001, Hyman 2001, Blevins 2004, Yu 2004, Silverman 2006a, b).1 Among researchers working in this framework, there are different positions on the role of UG in synchronic phonology in general, from categorical rejection (Ohala 1981, 1990, 1993, 1996, Silverman 2006a) to selective permission (Blevins 2004) to utmost importance (Hale & Reiss 2000). But all proponents of this approach agree that Occam’s Razor dictates that if a diachronic explanation based on observable facts exists for typological asymmetries in phonological patterning, a UG-based synchronic explanation, which is itself hypothetical and unobservable, is not warranted (e.g. Hale & Reiss 2000: 158, Blevins 2004: 23, Hansson 2008: 882). For a comprehensive review on the diachronic explanations of sound patterns, see Hansson (2008).

1 There are disagreements as to whether the speaker plays any active role in sound change: for example, Ohala considers sound change to be listener-based and non-teleological, while Bybee’s (2001, 2006) usage-based model places great importance on the speaker’s production in the initiation of sound change; Blevins’ Evolutionary Phonology (Blevins 2004) also ascribes the speaker a more active role in sound change than Ohala’s model.

The synchrony vs. diachrony debate is very often centred around the strongest form of the phonetics-in-UG hypothesis. Earlier proponents of the synchronic approach in the OT framework were primarily concerned with establishing stringent implicational statements on phonological behaviour from typological data, discovering the phonetic rationales behind the implications, and proposing optimality-theoretic models from which the implicational statements fall out as predictions (e.g. Jun 1995, Steriade 1999, Kirchner 2001, Zhang 2002). Conversely, critics of the synchronic approach, beyond proposing explicit frameworks for the evolution of phonological systems, and for how perception, and possibly production, may have shaped the evolution, have made efforts to identify counterexamples to the phonetically based typological asymmetries and provide explanations for the emergence of such ‘unnatural’ patterns based on a chain of commonly attested diachronic sound changes (Hyman 2001, Yu 2004, Blevins 2006; see also Bach & Harms 1972, Anderson 1981). The debate between Blevins (2006) and Kiparsky (2006) on whether there exist true cases of coda voicing is a case in point. Regardless of the outcome of particular debates, this seems to be a losing battle for the synchronic approach in the long run, as the implicational statements gathered from cross-linguistic typology are necessarily inductive – provided that we have not looked at all languages, we cannot be certain that no counterexamples will ever emerge. Moreover, experimental studies showing that phonetically arbitrary patterns can in fact be readily learned by speakers (Dell et al. 2000, Onishi et al. 2002, Buckley 2003, Chambers et al. 2003, Seidl & Buckley 2005) also seem to provide additional arguments in the diachronic approach’s favour.
However, as we have mentioned earlier, a synchronic approach does not necessarily assume the strongest form of phonetics as a design feature. The analytical bias approach (e.g. Wilson 2006, Moreton 2008), for example, only favours the learning of particular patterns in the process of phonological acquisition, but does not in principle preclude the emergence or learning of other patterns. Pitched against the diachronic approach, the form of argument from either approach should come from experimental studies that show whether speakers indeed exhibit any learning biases in favour of phonetically motivated patterns, not whether phonetically unmotivated patterns can be learned at all.2

Relatedly, there is a crucial difference between phonological patterns observed in a language and the speaker’s internal knowledge of such patterns. Many recent works have shown that speakers may know both more and less than the lexical patterns of their language. For example, Zuraw (2007) demonstrated that Tagalog speakers possess knowledge of the splittability of word-initial consonant clusters that is absent in the lexicon but projectable from perceptual knowledge, and that they can apply the knowledge to infixation in stems with novel initial clusters; Zhang & Lai (2008) and Zhang et al. (2009a, b), pace earlier works on Taiwanese tone sandhi such as Hsieh (1970, 1975, 1976) and Wang (1993), show that the phonologically opaque ‘tone circle’ is largely unproductive in wug tests, despite its exceptionlessness in the language itself. This opens up a new area of inquiry for the synchrony vs. diachrony debate: provided that we are interested in the tacit knowledge of the speaker, then we need to look beyond the typological patterns to see which approach provides a better explanation for experimental results that shed light on the speakers’ internal knowledge. One likely fruitful comparison is to see whether there are productivity differences between two patterns that differ in the level of phonetic motivation, but are otherwise comparable.3

2 Hansson (2008: 882) argues that the substantive bias approach is not necessarily incompatible with the diachronic approach, as such biases can be considered as one of the potential sources of ‘external errors’ in language evolution. But we think that there is a fundamental difference between the two approaches in terms of the importance ascribed to phonetics in the grammar and the extent to which answers to asymmetries in both typological patterns and speakers’ internal phonological knowledge lie in the formal grammatical module.

3 Hansson (2008: 881) worries that it would be ‘all too easy to explain away apparent counterexamples … as being lexicalized, morphologized, or in some other way not belonging to the ‘real’ phonology of the language’. But one should insist that any claims about whether a pattern falls outside the ‘real’ phonology of a language be supported by experimental evidence. Moreover, the argument is more likely in a subtler form of whether there are any detectable differences between patterns, not whether a pattern is categorically in or out of ‘real’ phonology.

1.3 Experimental studies addressing the role of phonetics in learning

In this section we provide a brief review of the relevant experimental literature on the role of phonetics in different types of learning situations.

One possible line of investigation is to examine in a language with phonological patterns that differ in the degree of phonetic motivation whether the patterns with stronger phonetic motivations are acquired more quickly and with greater accuracy in language acquisition. The claim that phonetically motivated morphophonological processes are acquired earlier and with fewer errors has been made in the literature (e.g. MacWhinney 1978, Slobin 1985, Menn & Stoel-Gammon 1995). For example, Slobin compared the effortless acquisition of final devoicing by Turkish children and the error-ridden acquisition of stop–spirant alternations in Modern Hebrew by Israeli children and suggested that there is a hierarchy of acceptable alternation based on universal predispositions that favour assimilation and simplification in the articulatory output (1985: 1209).

Buckley (2002) points out that the role of such universal predispositions can only be established if the accessibility of the pattern, such as its distribution and regularity, is teased apart from the phonetic naturalness of the pattern. In demonstrating that many unnatural patterns are acquired readily, due to their high regularity and high frequency of occurrence, while many natural patterns are acquired with much difficulty due to their low accessibility, Buckley argues that accessibility, but not phonetic naturalness, determines the ease of learning. However, Buckley (2002) does not show that when accessibility is matched, a process lacking phonetic motivations can be acquired just as easily as a phonetically motivated process. The only such comparisons that can be found in Buckley (2002) are in Hungarian – the more natural backness harmony vs. the less natural /a/-lengthening, both of which are highly accessible, and the more natural rounding harmony vs. the less natural /e/-lengthening, both of which have low accessibility. MacWhinney’s (1978) original work, cited by Buckley, showed that backness harmony is acquired earlier than /a/-lengthening, but rounding harmony is acquired later than /e/-lengthening. Therefore, phonetic naturalness does seem to affect the order of acquisition, but it is unclear what the precise effect is. Moreover, given that these comparisons are only made under a crude control of ‘accessibility’, the results cannot be deemed conclusive.

Another approach is to test the learning of patterns with different degrees of phonetic motivations in an artificial language. The artificial grammar paradigm (see Reber 1967, 1989, Redington & Chater 1996) has been widely used to investigate the learnability of phonological patterns in both children and adults. The paradigm typically involves two stages – the exposure stage, in which the subject is presented with stimuli generated by an artificial grammar, and the testing stage, in which the subject is tested on their learning of the patterns in the artificial grammar, measured by their ability to distinguish legal vs. illegal test stimuli, reaction time or looking time in the head-turn paradigm for infant studies.
It is particularly suited for the comparison of the learning of different patterns, as the relevant patterns can be designed to have matched regularity, lexical frequency and transitional probability.

This line of research has been actively pursued, with conflicting results. Seidl & Buckley (2005) report two experiments that tested whether nine-month-old infants learn patterns with different degrees of phonetic motivation differently. The first experiment tested whether the infants preferred a phonetically grounded pattern in which only fricatives and affricates, but not stops, occur intervocalically, or an arbitrary pattern in which only fricatives and affricates, but not stops, occur word-initially. The second experiment tested the difference between two patterns: a grounded pattern in which a labial consonant is followed by a rounded vowel and a coronal consonant is followed by a front vowel, and an arbitrary pattern in which a labial consonant is followed by a high vowel and a coronal consonant is followed by a mid vowel. In both experiments, the infants learned both patterns fairly well and showed no learning bias towards the phonetically grounded pattern, suggesting that phonetic grounding does not play a role in the learning of synchronic phonological patterns. But in Experiment 1, all fricatives and affricates used were stridents, and as Kirchner (2001, 2004) shows, the precise articulatory control necessary for stridents in fact makes them less desirable in intervocalic position. Moreover, Thatte (2007: 7) points out that there exist phonological generalisations other than the ones that Seidl & Buckley intended in their stimuli, and the infants might have responded to these generalisations. Therefore, Seidl & Buckley’s claim that there is no learning bias towards phonetically grounded patterns is open to debate.

In Jusczyk et al. (2003), 4.5-month-old infants were presented with sets of three words, or ‘triads’, which consisted of two monosyllabic pseudo-words with the forms VC1 and C2V, followed by a disyllabic word in which either C1 or C2 assimilates in place to the adjacent consonant (an, bi, ambi; an, bi, andi). The C1 assimilation pattern is perceptually motivated and cross-linguistically extremely common, while the C2 assimilation pattern has no clear perceptual grounding and is cross-linguistically extremely rare. In a head-turn procedure, infants showed no difference in looking time between the triads with regressive and progressive assimilations, indicating the lack of a priori preference for phonetically motivated phonological patterns. However, Thatte’s (2007) study, which compared intervocalic voicing (pa, fi, pavi) and devoicing (pa, vi, pafi) using a similar methodology, showed that 4.5-month-old infants exhibited a preference for the phonetically motivated intervocalic voicing, while 10.5-month-old infants preferred the phonetically unmotivated intervocalic devoicing. Thatte argues that the 4.5-month-olds’ results support the view that infants have an innate preference for phonetically based patterns and tentatively interprets the 10.5-month-olds’ results as the combined effect of their overall lower boredom threshold and their becoming bored with the phonetically motivated pattern earlier.
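The two triad types in the Jusczyk et al. (2003) design differ only in which consonant gives up its place of articulation, and this can be made concrete by generating the disyllables mechanically. The following is a minimal sketch under stated assumptions: the place tables and the function names `regressive` and `progressive` are hypothetical, and the toy inventory covers only the nasals and voiced stops needed for the examples cited above.

```python
# Toy place-of-articulation table (illustrative, not a full inventory).
PLACE = {"m": "labial", "b": "labial", "n": "coronal", "d": "coronal"}
NASAL_AT = {"labial": "m", "coronal": "n"}   # nasal at each place
STOP_AT = {"labial": "b", "coronal": "d"}    # voiced stop at each place


def regressive(vc: str, cv: str) -> str:
    """C1 (a nasal) takes the place of the following C2: an + bi -> ambi."""
    c2_place = PLACE[cv[0]]
    return vc[:-1] + NASAL_AT[c2_place] + cv


def progressive(vc: str, cv: str) -> str:
    """C2 (a voiced stop) takes the place of the preceding C1: an + bi -> andi."""
    c1_place = PLACE[vc[-1]]
    return vc + STOP_AT[c1_place] + cv[1:]


if __name__ == "__main__":
    print(regressive("an", "bi"))   # perceptually motivated, common pattern
    print(progressive("an", "bi"))  # rare pattern without clear grounding
```

Both functions are equally simple to state, which is the point of the experimental question: formal simplicity alone does not distinguish the common regressive pattern from the rare progressive one.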
In addition to the conflicting results, as Seidl & Buckley (2005) point out, the A, B, AB triad procedure is quite novel in infant research, and the assumption that the infants take the AB string to be a concatenation of A and B may not be valid. Therefore, the extent to which the phonetic bases of phonological patterns are directly relevant to first language acquisition remains an open question.

Pycha et al. (2003) tested adult English speakers’ learning of three non-English patterns – ‘palatal vowel harmony’ (stem and suffix vowels agree in [back]), ‘palatal vowel disharmony’ (stem and suffix vowels disagree in [back]) and ‘palatal arbitrary’ (an arbitrary relation between stem and suffix vowels) – and found that although subjects exhibited better learning of the harmony and disharmony patterns than the arbitrary pattern, there was no difference between harmony and disharmony. Taking harmony to have a stronger phonetic motivation than disharmony, they concluded that phonetic naturalness is not relevant to the construction of the synchronic grammar. Wilson (2003), in two similar experiments with similar results, interpreted the results differently, however. Wilson argued that both assimilation and dissimilation can find motivations in phonetics and thus both have a privileged cognitive status in phonological grammar. Wilson (2006) showed that when speakers were presented with highly impoverished evidence of a new phonological pattern, they were able to extend the pattern to novel contexts predicted by a phonetically based

phonology and linguistic typology, but not to other contexts; for instance, speakers presented with velar palatalisation before mid vowels could extend the process before high vowels, but not vice versa. A phonology that encodes no substantive bias cannot predict these experimental observations.

Two experiments on the learning of natural vs. unnatural allophonic rules in an artificial language conducted by Peperkamp and collaborators (Peperkamp et al. 2006, Peperkamp & Dupoux 2007) returned conflicting results. In both experiments, French subjects were exposed to alternations that illustrate intervocalic voicing (e.g. [p t k] → [b d g] / V_V) or a random generalisation (e.g. [p g z] → [ʒ f t] / V_V). In the test phase, the subjects did not show a learning difference between the two types of alternations in a phrase–picture-matching task (Peperkamp & Dupoux 2007), but did show a strong bias in favour of intervocalic voicing in a picture-naming task (Peperkamp et al. 2006). Peperkamp and colleagues surmise that the different results might be due to a ceiling effect in the cognitively less demanding phrase–picture-matching task, and the difference between natural and unnatural alternations could lie in either the speed with which they are learned – natural alternations are learned faster – or the ease with which they can be used in processing once they have been learned – natural alternations can be used more easily, especially in cognitively demanding tasks. In either case, the account allows random alternations to be learned, but also admits that the phonetic nature of the alternation plays a role in its acquisition.

An additional issue with using the artificial language paradigm in adult research is that artificial learning at best approximates second language acquisition, whose mechanism is arguably very different from first language acquisition (Cook 1969, 1994, Dulay et al.
1982, Bley-Vroman 1988, Ellis 1994, among others), but the learning issue of interest here is the relevance of phonetics during the construction of native phonological grammars. Moreover, the artificial language paradigm often involves a heavy dose of explicit learning, while second language acquisition, like first language acquisition, often involves a significant amount of implicit learning. This increases the distance between artificial language learning and real language acquisition.

1.4 The current study

The current study complements the experimental work above by using a nonce-probe paradigm (‘wug’ test) (Berko 1958) with adult speakers. In a typical wug test, subjects are taught novel forms in their language and then asked to provide morphologically complex forms, using the novel forms as the base. This paradigm has been widely used to test the productivity of regular and irregular morphological rules (e.g. Bybee & Pardo 1981, Albright 2002, Albright & Hayes 2003, Pierrehumbert 2006) and morphophonological alternations (e.g. Hsieh 1970, 1975, 1976, Wang 1993, Zuraw 2000, 2007, Albright et al. 2001, Hayes & Londe 2006). Our

study wug-tests two patterns of tonal alternation (tone sandhi) that differ in the degree of phonetic motivation in Mandarin Chinese and compares the accuracies with which the sandhi patterns apply to nonce words. This approach is in line with the assumption that the phonological patterns observed in the language may not be identical to the speakers’ knowledge of the patterns, and provides us with a novel opportunity to test the role of phonetics in synchronic phonology. It uses real phonological patterns that exist in the subjects’ native language, which circumvents the learning-strategy problem with the artificial language paradigm. It also allows easier manipulations of confounding factors such as lexical frequency and thus minimises the control problem in studying phonological learning in a naturalistic setting.

1.5 Organisation of the paper

We discuss the details of the two tone sandhi patterns under investigation in Mandarin in §2. The methodology and results for the two experiments that compare the productivity of the two sandhi patterns are discussed in §§3 and 4. Theoretical implications of the results are further discussed in §5. §6 is the conclusion.

2 Tone sandhi in Mandarin Chinese and the general hypotheses

Mandarin Chinese is a prototypical tone language. The standard variety of Mandarin spoken in Mainland China, particularly Beijing, has four lexical tones – 55, 35, 213 and 51 – as shown in (1).4

(1) Mandarin tones
    Tone 1   ma55    ‘mother’
    Tone 2   ma35    ‘hemp’
    Tone 3   ma213   ‘horse’
    Tone 4   ma51    ‘to scold’

4 Tones are marked with Chao tone numbers (Chao 1948, 1968) here. ‘5’ indicates the highest pitch used in lexical tones, while ‘1’ indicates the lowest pitch. Contour tones are marked with two juxtaposed numbers. For example, 51 indicates a falling tone from the highest pitch to the lowest pitch. The variety of Mandarin spoken in Taiwan has a slightly different tonal inventory: Tone 3 is pronounced as 21, without the final rise, even in prosodic final position. This is not the variety of Mandarin studied here.

The pitch tracks of the four tones with the syllable [ma] pronounced in isolation by a male speaker, each averaged over five tokens, are given in Fig. 1. Although Tone 2 is usually transcribed as a high-rising tone 35, there is a small pitch dip at the beginning of the tone, creating a turning point, and research has shown that the perceptual difference between Tones 2 and 3 lies primarily in the timing and pitch height of the turning point (Shen & Lin 1991, Shen et al. 1993, Moore & Jongman 1997). Notice also that the different tones in Fig. 1 have different durational properties; in particular, Tone 3 has the longest duration. These observations will become important in the discussion of Mandarin tone sandhi and the experimental results.

Figure 1 Representative pitch tracks for the four tones in Mandarin. [Plot of F0 (Hz) against time (ms) for Tone 1 (55), Tone 2 (35), Tone 3 (213) and Tone 4 (51), with turning points marked.]

In tone languages, a tone may undergo alternations conditioned by adjacent tones or the prosodic and/or morphosyntactic position in which the tone occurs. This type of alternation is often referred to as tone sandhi (e.g. Chen 2000). Mandarin Chinese has two tone sandhi patterns, both of which involve Tone 3 (213). Specifically, 213 becomes 35 when followed by another 213 (the ‘third-tone sandhi’); but 213 becomes 21 when followed by any other tone (the ‘half-third sandhi’). These sandhis are exemplified in (2).

(2) Mandarin tone sandhi
a. 213 → 35 / _213
   xɑu213-tɕjou213 → xɑu35-tɕjou213    ‘good wine’
b. 213 → 21 / _{55, 35, 51}
   xɑu213-ʂu55     → xɑu21-ʂu55        ‘good book’
   xɑu213-ʐən35    → xɑu21-ʐən35       ‘good person’
   xɑu213-kʰan51   → xɑu21-kʰan51      ‘good looking’
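The two rules in (2) can be read as a small decision procedure over the tone of the second syllable. The following is a minimal Python sketch, not part of the study itself: the function name `apply_disyllabic_sandhi` is hypothetical, tones are assumed to be written as Chao-number strings, and only disyllables with full lexical tones are covered (neutral tones and the bracketing-sensitive behaviour of longer strings are deliberately left out).

```python
# Illustrative sketch only: the two Mandarin sandhi rules in (2), for disyllables.
# The function name and the string encoding of tones are assumptions.

FULL_TONES = {"55", "35", "213", "51"}  # Tones 1-4 in Chao numbers


def apply_disyllabic_sandhi(tone1: str, tone2: str) -> tuple[str, str]:
    """Return the surface tones of a disyllable given its lexical tones."""
    if tone1 not in FULL_TONES or tone2 not in FULL_TONES:
        raise ValueError("only the four full lexical tones are handled")
    if tone1 != "213":
        # Neither sandhi targets a first syllable that is not Tone 3.
        return tone1, tone2
    if tone2 == "213":
        # Third-tone sandhi: 213 -> 35 / _213
        return "35", tone2
    # Half-third sandhi: 213 -> 21 / _{55, 35, 51}
    return "21", tone2


if __name__ == "__main__":
    for second in ("55", "35", "213", "51"):
        print(("213", second), "->", apply_disyllabic_sandhi("213", second))
```

The sketch treats both sandhis as categorical, which matches the description of (2) but says nothing about the gradient phonetic accuracy that the experiments measure.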

The pitch tracks for the four examples in (2) pronounced in isolation by a male speaker, each averaged over five tokens, are given in Fig. 2. Both of these sandhi patterns are fully productive in Mandarin disyllabic words and phrases, and they are both ‘phonological’ in the traditional sense, in that they involve language-specific tone changes that cannot be predicted simply by tonal coarticulation. However, we consider the half-third sandhi to have a stronger phonetic basis than the third-tone sandhi. Our judgement is based on the following three factors.

Figure 2 Representative pitch tracks for the tone sandhis in Mandarin. [Plot of F0 (Hz) against time (ms) for the tonal combinations 3+1, 3+2, 3+3 and 3+4.]

First, in terms of the phonetic mechanism of the tone change, although both sandhis involve simplification of a complex contour in prosodic non-final position, which has articulatory and perceptual motivations (Zhang 2002, 2004), the third-tone sandhi also involves raising of the pitch, which cannot be accounted for by the phonetic motivation of reducing pitch contours on syllables with insufficient duration. The half-third sandhi, however, only involves truncation of the second half of the contour. The third-tone sandhi is also structure-preserving, at least in perception.5 Based on the closer phonetic relation between the base and sandhi pair in the half-third sandhi and the contrastive status of the sandhi tone 35 in the third-tone sandhi, we assume that the perceptual distance between the base and sandhi tones in the half-third sandhi is smaller than that in the third-tone sandhi. We take this as an argument for the stronger phonetic motivation for the half-third sandhi (cf. the P-map; Steriade 2001, 2008).

5 A detailed acoustic study by Peng (2000) shows that this sandhi is non-neutralising as far as production is concerned – the sandhi tone is lower in overall pitch than the lexical Tone 2. But this difference cannot be reliably perceived by native adult listeners. Therefore, /mai213 ma213/ ‘buy horse’ and /mai35 ma213/ ‘bury horse’ are in effect perceived as homophonous by native speakers.

Second, in the traditional Lexical Phonology sense, the third-tone sandhi has lexical characteristics – it is structure-preserving (in perception), and its application to a polysyllabic compound is dependent on syntactic bracketing; but the half-third sandhi is characteristic of a postlexical rule – it is allophonic and applies across the board. The syntactic dependency of the third-tone sandhi is illustrated in the examples in (3) and (4). The examples in (3a) and (b) show that for underlying third-tone sequences, the output tones differ depending on the syntactic branching structure: a right-branching sequence [213 [213 213]] has two possible output forms, 35 35 213 and 21 35 213, as in (3a), while a left-branching sequence [[213 213] 213] has only one output, 35 35 213, as in (3b) (from Duanmu 2000: 238). Examples (3c) and (d), on the other hand, illustrate that the application of the half-third sandhi is not influenced by the branching structure: an underlying 213 35 213 sequence, regardless of branching structure, is realised as 21 35 213.

(3) Left- vs. right-branching phrases
a. [213 [213 213]] → 35 35 213 or 21 35 213
   [mai [xɑu tɕjou]]    ‘buy good wine’
              buy    good   wine
   input      213    213    213
   output 1   35     35     213
   output 2   21     35     213
b. [[213 213] 213] → 35 35 213 only
   [[mai xɑu] tɕjou]    ‘finished buying wine’
              buy    done   wine
   input      213    213    213
   output     35     35     213
             *21     35     213
c. [213 [35 213]] → 21 35 213
   [ɕjɑu [xuŋ ma]]      ‘little red horse’
              little red    horse
   input      213    35     213
   output     21     35     213
d. [[213 35] 213] → 21 35 213
   [[ɕjɑu xuŋ] pʰɑu]    ‘Xiaohong runs’
              Xiao   hong   run
   input      213    35     213
   output     21     35     213

The examples in (4a) and (4b) show that prepositions have a special status, in that they permit the non-application of the third-tone sandhi. (4a) illustrates that in a [213 [[213 213] 213]] sequence, if the second syllable is a preposition such as [wɑŋ] ‘to’, there are three possible outputs: 35 35 35 213, 21 35 35 213 or 35 21 35 213. But if the second syllable is not a preposition, as in (4b), there are only two possible outputs: 35 35 35 213 or 21 35 35 213; *35 21 35 213, where the third-tone sandhi is blocked on the second syllable, is not a possible output (from Zhang 1997: 294–295). By contrast, (4c) and (4d) illustrate that a [55 [[213 51] 213]] sequence, regardless of whether the second syllable is a preposition, is realised as 55 21 51 213, demonstrating again the irrelevance of grammatical structure to the application of the half-third sandhi.6

(4) The special status of prepositions
a. [213 [[213prep 213] 213]] → 35 35 35 213, 21 35 35 213 or 35 21 35 213
   [ma [[wɑŋ pei] tsou]]    ‘The horse walks to the north.’
              horse  to     north  walk
   input      213    213    213    213
   output 1   35     35     35     213
   output 2   21     35     35     213
   output 3   35     21     35     213
b. [213 [[213non-prep 213] 213]] → 35 35 35 213 or 21 35 35 213
   [ma [[xən ʂɑu] xou]]     ‘Horses rarely roar.’
              horse  very   rarely roar
   input      213    213    213    213
   output 1   35     35     35     213
   output 2   21     35     35     213
             *35     21     35     213
c. [55 [[213prep 51] 213]] → 55 21 51 213
   [tʰa [[wɑŋ xou] tsou]]   ‘He walks backwards.’
              he     to     back   walk
   input      55     213    51     213
   output     55     21     51     213
d. [55 [[213non-prep 51] 213]] → 55 21 51 213
   [tʰa [[xən ai] kou]]     ‘He loves dogs very much.’
              he     very   love   dog
   input      55     213    51     213
   output     55     21     51     213

6 For more discussion on the application of the Mandarin third-tone sandhi, see Shih (1997), Zhang (1997), Duanmu (2000) and Lin (2007).

We should recognise that the third-tone sandhi is not truly lexical: it clearly applies across word boundaries ((3), (4)), its application in long strings is affected by speech rate ((3), (4)), and it is not structure-preserving in production under careful acoustic scrutiny (note 5). What is uncontroversial, however, is the clear difference between the two sandhis, in that the third-tone sandhi exhibits certain lexical characteristics, while the half-third sandhi does not. The close relation between the postlexical status of a phonological rule and its phonetic motivation is well established in the Lexical Phonology literature (e.g. Kiparsky 1982, 1985, Mohanan 1982), and we take it as another piece of evidence that the half-third sandhi has a stronger phonetic motivation than the third-tone sandhi.

The third reason is that the third-tone sandhi corresponds to a historical sandhi pattern in Chinese, namely shang → yang ping / _shang, where shang and yang ping refer to the historical tonal categories from which 213 and 35 respectively descended. This historical sandhi pattern dates back to at least the 16th century (Mei 1977). According to Mei’s reconstruction, the pitch values for shang and yang ping in 16th century Mandarin were low level (22) and low-rising (13) respectively. The present-day rendition of the sandhi in Mandarin is the result of historical tone changes that morphed shang into low-falling-rising and yang ping into high-rising. The Mandarin third-tone sandhi was therefore not originally motivated by the phonetic rationale of avoiding a complex pitch contour on a short duration. The same point is made by the variable synchronic realisations of the same historical sandhi in related Mandarin dialects (Court 1985). For instance, in Tianjin it is 13 → 45 / _13 (Yang et al. 1999), in Jinan 55 → 42 / _55 (Qian & Zhu 1998) and in Taiyuan 53 → 11 / _53 (Wen & Shen 1999). The half-third sandhi, on the other hand, does not have a similar historical origin; and due to the different tonal shapes of the historical shang tone in different present-day dialects, it does not have comparable synchronic realisations.

The differences between the third-tone sandhi and the half-third sandhi in their phonetic characteristics, morphosyntactic properties and historical origins all point to the possibility that the half-third sandhi has a stronger synchronic phonetic basis than the third-tone sandhi. It is important to note that we have not committed ourselves to an absolute cut-off point for what is phonetically based and what is not – to identify patterns that are useful for testing the synchronic relevance of phonetics, such a threshold is not necessary, nor do we believe that it exists. However, it is crucial to be able to make comparisons between patterns along the lines that we have considered for Mandarin in order to identify the relevant ones for the test.

Given the difference in phonetic grounding between the two sandhi patterns, the general question we pursue is whether Mandarin speakers exhibit different behaviours on the two sandhis in a wug test.
Specifically, we test whether there is a difference in productivity between the two sandhis. In line with the synchronic approach, we hypothesise that the sandhi with the stronger phonetic motivation – the half-third sandhi – will apply more productively in wug words than the third-tone sandhi. This greater productivity may be reflected in two ways. First, the half-third sandhi may apply to a greater percentage of the wug tokens than the third-tone sandhi. Second, there may be no difference in the rate of application, but incomplete application of the third-tone sandhi in wug words as compared to real words, while the half-third sandhi applies to the wug words the same way as it applies to the real words. In light of the earlier discussion on the difference between Tones 2 and 3, we specifically expect a lower and later turning point and a longer duration for the sandhi tone in wug words than in real words for the third-tone sandhi.7

7 We also hypothesised that the reaction times for the two sandhis would be significantly different, due to the potentially different types of processing for the two sandhis. But it was difficult to decouple the allophonic differences in rhyme duration among different tones from differences in reaction time, and this hypothesis was not borne out in the two experiments that we conducted. We do not report the reaction time results here. Interested readers can consult a previous version of this article, in which reaction time results are reported (available (December 2009) at http://www2.ku.edu/~ling/faculty/zhang.shtml).

Finally, we must acknowledge that the third-tone sandhi and the half-third sandhi do not have the same lexical frequency in Mandarin. Calculations based on a syllable-frequency corpus (Da 2004) containing 192,647,157 syllables indicate that the numbers of legal syllable types with Tones 1–4 are 258, 224, 254 and 318 respectively, while syllables with Tones 1–4 account for 16.7%, 18.4%, 14.8% and 42.5% of all syllables in the corpus.8 In other words, Tone 3 has the second-lowest type frequency and the lowest token frequency, which means that disyllabic words with the 3+3 tonal combination may have relatively low frequency. Moreover, the third-tone sandhi also has a limited environment as compared to the half-third sandhi. The environments in which the half-third sandhi applies, which include _55, _35 and _51, account for 75.9% of all sandhi environments by type-frequency counts (800 of the 1,054 syllable types with Tones 1–4). Therefore, we pay special attention in our study to whether the half-third sandhi behaves as a unified process before 55, 35 and 51. If so, it will present a challenge to the goal of the study, as any effect that conforms to our hypothesis may be due to the considerably higher lexical frequency of the half-third sandhi. If not, the frequency profile of the four tones in Mandarin will provide us with an opportunity to study the potential effect of lexical frequency on sandhi productivity and its interaction with the effect of phonetics. If lexical frequency influences sandhi productivity, we primarily expect a type-frequency effect (Bybee 1985, 2001, Baayen 1992, 1993, Ernestus & Baayen 2003, Pierrehumbert 2003, 2006, etc.), and thus a low productivity of the half-third sandhi for 3+2 sequences. But we also cannot exclude the possible effect of token frequency, which has been shown to be relevant to the productivity of Taiwanese tone sandhi in Zhang & Lai (2008) and Zhang et al. (2009a, b). If the frequency effect is mainly based on token frequency, we would expect 3+3 to have the lowest productivity.

8 The other 7.6% are syllables with a neutral tone.

3 Experiment 1

3.1 Methods

3.1.1 Stimuli. Following Hsieh’s (1970, 1975, 1976) experimental design for a Taiwanese wug test, we constructed five sets of disyllabic test words in Mandarin. The first set includes real words, denoted by AO-AO (where AO=actual occurring morpheme). This set serves as the control for the experiment and is the set with which results of wug words are compared. The other four sets are wug words: *AO-AO, where both syllables are actual occurring morphemes, but the disyllable is non-occurring; AO-AG (AG=accidental gap), where the first syllable occurs, but the second syllable is an accidental gap in the Mandarin syllabary; AG-AO, where the first syllable is an accidental gap and the second syllable actually occurs; and AG-AG, where both syllables are accidental gaps. The AGs were selected by the authors, who are both native speakers of Mandarin Chinese. In each AG, both the segmental composition and the tone of the syllable are legal in Mandarin, but the combination happens to be missing. For example, [pʰan] is a legal syllable, and occurs with tones 55 ‘to climb’, 35 ‘plate’ and 51 ‘to await eagerly’, but it accidentally does not occur with Tone 3 (213). Therefore, [pʰan213] is a possible AG.

For each set of words, we used four different tonal combinations: the first syllable always has Tone 3 (213), and the second syllable has one of the four Mandarin tones. Each tonal combination is therefore in the appropriate environment to undergo either the third-tone or the half-third sandhi. Eight words for each tonal combination were used, making a total of 160 test words (8 × 4 × 5). The AO-AO words were all high-frequency words, selected from Da’s (1998) Feng Hua Yuan character and digram frequency corpus. For the four wug sets, the digram frequencies are all zero, and we used the same first syllable to combine with the four different tones in the second syllable. For example, for AG-AG we used [pHi@N213 ew@n55], [pHi@N213 tHG35], [pHi@N213 tsGN213] and [pHi@N213 tea51], along with seven other such sets. In the recorded stimuli, the same token was combined with different second syllables. The identity of the first syllable allows for the comparison of the two types of sandhi that the first syllable may undergo. To avoid neighbourhood effects in wug words at least to some extent, we ensured that any disyllabic wug word was not a real word with any tonal combination, not just the one used for the disyllable.
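The factorial structure of the test stimuli (five word sets, four second-syllable tones, eight items per cell) can be enumerated mechanically. The sketch below is only an illustration of that 8 × 4 × 5 design: the names `WORD_SETS`, `SECOND_TONES` and `build_design` are hypothetical, and the item indices stand in for the actual words, which are listed in the Appendix rather than here.

```python
# Illustrative sketch of the stimulus design: 5 word sets x 4 tones x 8 items.
# All identifiers are assumptions; real stimuli are given in the Appendix.
from itertools import product

WORD_SETS = ["AO-AO", "*AO-AO", "AO-AG", "AG-AO", "AG-AG"]
SECOND_TONES = ["55", "35", "213", "51"]  # Tones 1-4 on the second syllable
ITEMS_PER_CELL = 8


def build_design():
    """Return one record per test word; the first syllable is always Tone 3."""
    design = []
    for word_set, tone2, item in product(
        WORD_SETS, SECOND_TONES, range(1, ITEMS_PER_CELL + 1)
    ):
        design.append(
            {
                "word_set": word_set,
                "tones": ("213", tone2),
                "item": item,
                # 3+3 is the third-tone environment; all others are half-third.
                "sandhi": "third-tone" if tone2 == "213" else "half-third",
                "real_word": word_set == "AO-AO",
            }
        )
    return design


if __name__ == "__main__":
    design = build_design()
    print(len(design), "test words")
    print(sum(d["sandhi"] == "third-tone" for d in design), "in the 3+3 environment")
```

Laying the design out this way makes the imbalance between environments explicit: three of the four tonal combinations feed the half-third sandhi, and only one feeds the third-tone sandhi.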
We specifically controlled for the tonal neighbours, because research on homophony judgement (Taft & Chen 1992, Cutler & Chen 1997), phoneme (toneme) monitoring (Ye & Connine 1999) and legal-phonotactic judgement (Myers 2002) has shown that phonemic tonal differences are perceptually less salient than segmental differences, which entails that tonal neighbours in a sense make closer neighbours.

Finally, to disguise the purpose of the experiment, we also used 160 disyllabic filler words. All filler syllables were real syllables in Mandarin; half of the disyllabic fillers were real words, and the other half were wug words. All test stimuli and fillers were read by the first author, a native speaker of Mandarin who grew up in Beijing. The Tone 3 syllables were all read with full third tones. The entire set of test stimuli, as well as additional information about the stimuli and fillers, is given in the Appendix.

3.1.2 Experimental set-up. The experiment was conducted with the software package SuperLab (Cedrus) in the Phonetics and Psycholinguistics Laboratory at the University of Kansas. There were 320 stimuli in total (160 test items+160 fillers). Each stimulus consisted of two monosyllabic utterances separated by an 800 ms interval. The stimuli were played through headphones worn by the subjects. For each stimulus, the subjects were asked to put the two syllables together and pronounce them as a disyllabic word in Mandarin. Their responses were collected by a Sony PCM-M1 DAT recorder through a 33-3018 Optimus dynamic microphone placed on the desk in front of them. The sampling rate for the DAT recorder was 44.1 kHz. The digital recording was then downsampled to 22 kHz onto a PC hard drive using Praat (Boersma & Weenink 2003). There was a 2000 ms interval between stimuli. If the subject did not respond within 2000 ms after the second syllable played, the next stimulus would begin. The stimuli were divided into two blocks of the same size (A and B) with matched stimulus types; there was a five-minute break between the blocks. Half of the subjects took block A first, and the other half took block B first. Within each block, the stimuli were automatically randomised by SuperLab.

Before the experiment began, the subjects heard a short introduction in Chinese through the headphones which explained the task both in prose and with examples; they simultaneously read it on a computer screen. There was then a practice session involving 14 words (two each of AO-AO, *AO-AO, AO-AG, AG-AO and AG-AG, two real-word fillers and two wug fillers). The experiment began after a verbal confirmation from the subjects that they were ready. The entire experiment took around 45 minutes.

3.1.3 Subjects. Twenty native speakers of Mandarin (12 male, 8 female), recruited at the University of Kansas, participated in the study. All speakers were from northern areas of Mainland China, and, in the opinion of the authors, spoke Standard Mandarin natively, without any noticeable accent. Except for one speaker who was 45 years old and had been in the United States for 20 years, all speakers ranged in age from 23 to 35, and had been in the United States for less than four years at the time of the experiment. Each subject was paid a nominal fee for participating in the study.

3.1.4 Data analyses.
All test tokens from the subjects were listened to by the two authors. A token was not used in the analysis if there was a large enough gap between the two syllables that they clearly did not form a disyllabic word. For the rest of the tokens, it was judged that both the third-tone sandhi and the half-third sandhi applied 100% of the time. Non-application of the sandhi processes should be easy to detect for native speakers, as they involve clear phonotactic violations (*213 non-finally). Therefore, the test for the productivity of the sandhis lies in the accuracy of their applications to the wug words.

To investigate the accuracy of sandhi application, we extracted the F0 of the rhyme in the first syllable of the subjects’ disyllabic response, using Praat. We then took an F0 measurement every 10% of the duration of the rhyme, giving eleven F0 measurements for each rhyme. For each tonal combination (3+1, 3+2, 3+3, 3+4), we did two comparisons. The first was between AO-AO and the rest of the word groups (*AO-AO, AO-AG, AG-AO, AG-AG); i.e. real disyllables vs. wug disyllables. The other was between the groups with a real first syllable (AO-AO, *AO-AO, AO-AG) and those with a wug first syllable (AG-AO, AG-AG); i.e. real σ1 vs. wug σ1. The rationale for the two comparisons is that lexical listing could be at the disyllabic word or monosyllabic morpheme level; doing both comparisons allows us to tease apart the two possibilities. Our hypothesis for these comparisons is that the difference in sandhi tones between real words and wugs should be greater for cases of third-tone sandhi than half-third sandhi, due to the stronger phonetic motivation for the latter. In particular, we expect incomplete application of the third-tone sandhi in wugs, i.e. Tone 3 in σ1 will resist the change to Tone 2. Again, given the acoustic characteristics of Tones 2 and 3 in Mandarin, the hypothesis translates into a lower and later turning point and a longer duration for the sandhi tone in wug words than in real words.

Among the twenty speakers, there were two speakers (one male and one female) whose F0 values could not be reliably measured by Praat, due to high degrees of creakiness in their voice. We discarded these speakers’ data in the F0 analysis.

Figure 3 illustrates how we compared two F0 curves. We conducted a two-way Huynh-Feldt repeated-measures ANOVA, which corrected for sphericity violations, with WordGroup and Point as independent variables. The WordGroup variable has two levels, WordGroup 1 and WordGroup 2, and a significant main effect would indicate that the two F0 curves representing the two word groups have different average pitches. The Point variable has eleven levels, representing the eleven points where F0 data are taken. A significant interaction between WordGroup and Point would indicate that the two curves have different shapes. This method of comparing two F0 curves is used by Peng (2000).

Figure 3 The comparison of two F0 curves. [Schematic plot of F0 (Hz) at measurement points 1–11 for WordGroup 1 and WordGroup 2.]

For σ1 in 3+3 combinations, we also measured the F0 drop and the duration from the beginning of the rhyme to the pitch turning point, as shown in Fig. 4. Comparisons between real and wug disyllables and between real and wug σ1 on these measurements were made using one-way repeated-measures ANOVAs. We expected the F0 drop to be greater and the TP duration to be longer for wug words than real words. Finally, we measured the σ1 rhyme duration for all the disyllabic combinations, and compared real and wug disyllables and real and wug σ1’s for each tonal combination, using one-way repeated-measures ANOVAs.
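The fractional degrees of freedom that appear in the ANOVA tables (e.g. F(3.187, 54.180) for Point in Table I) follow from the sphericity correction: the uncorrected degrees of freedom are multiplied by an estimated epsilon. The sketch below illustrates that arithmetic only. It assumes 11 levels of Point and the 18 speakers retained in the F0 analysis; the function name `corrected_df` is hypothetical, and the epsilon value is back-computed from the reported degrees of freedom rather than taken from the source.

```python
# Illustrative sketch: how a sphericity correction rescales ANOVA degrees of
# freedom. The epsilon used below is inferred from a reported value, not given
# in the article; the function name is an assumption.

def corrected_df(epsilon: float, n_levels: int, n_subjects: int):
    """Return (numerator df, denominator df) after multiplying by epsilon."""
    df_effect = n_levels - 1                     # 10 for the 11 F0 points
    df_error = (n_levels - 1) * (n_subjects - 1)  # 10 x 17 = 170
    return epsilon * df_effect, epsilon * df_error


if __name__ == "__main__":
    # Epsilon inferred from the reported numerator df: 3.187 / 10.
    epsilon = 3.187 / 10
    df1, df2 = corrected_df(epsilon, n_levels=11, n_subjects=18)
    print(round(df1, 3), round(df2, 3))
```

Because the WordGroup factor has only two levels, the WordGroup × Point interaction has the same uncorrected degrees of freedom (10 and 170) as Point, so the same rescaling applies to it. Small discrepancies in the third decimal place relative to the published values are expected, since epsilon is recovered here from rounded figures.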

Figure 4 A schematic of the measurements taken from the pitch curve of the rhyme in σ1 in 3+3 combinations. ‘ΔF0’ and ‘TP duration’ are the pitch drop and duration from the beginning of the rhyme to the turning point respectively. ‘Duration’ is the duration of the entire rhyme.
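The measurements described here (eleven F0 samples per rhyme, plus the pitch drop and the duration to the turning point) can be sketched in a few lines. This is a hedged illustration, not the authors' procedure: the source does not say how the turning point was located, so the sketch assumes it is the lowest of the eleven sampled F0 values, and it assumes linear interpolation between pitch-track frames. The function names `sample_f0` and `turning_point_measures` are hypothetical.

```python
# Illustrative sketch: 11-point F0 sampling and turning-point measures.
# Assumptions: linear interpolation; turning point = minimum sampled F0.
from bisect import bisect_left


def sample_f0(times, f0, n_points=11):
    """Interpolate an F0 track (Hz) at n_points equally spaced times."""
    start, end = times[0], times[-1]
    samples = []
    for i in range(n_points):
        t = start + (end - start) * i / (n_points - 1)
        j = bisect_left(times, t)
        if j == 0:
            samples.append(f0[0])
        elif j >= len(times):
            samples.append(f0[-1])
        else:
            t0, t1 = times[j - 1], times[j]
            weight = (t - t0) / (t1 - t0)
            samples.append(f0[j - 1] + weight * (f0[j] - f0[j - 1]))
    return samples


def turning_point_measures(samples, duration_ms):
    """Return the pitch drop and the time from rhyme onset to the F0 minimum."""
    tp_index = min(range(len(samples)), key=samples.__getitem__)
    return {
        "f0_drop": samples[0] - samples[tp_index],
        "tp_duration": duration_ms * tp_index / (len(samples) - 1),
        "duration": duration_ms,
    }


if __name__ == "__main__":
    # Synthetic falling-rising contour, times in ms and F0 in Hz.
    times = [0, 50, 100, 150, 200]
    f0 = [180, 170, 160, 175, 200]
    points = sample_f0(times, f0)
    print(points)
    print(turning_point_measures(points, duration_ms=times[-1] - times[0]))
```

Sampling at proportional rather than absolute time points is what allows contours from rhymes of different durations to be compared point by point, which is the basis for the WordGroup × Point comparison.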

Based on the synchronic approach, we expected to find a longer rhyme duration for the wug words in 3+3 combinations, but no difference between wug and real words in other combinations.

3.2 Results

3.2.1 F0 contour. In this section, we report the results of comparison on the F0 of the first syllable of the subjects’ response between real disyllables and wug disyllables, and between real-σ1 words and wug-σ1 words. The results from the half-third sandhi comparisons are given in Fig. 5. In this and all following figures, ‘n.s.’ indicates no significant difference and ‘*’, ‘**’ and ‘***’ indicate significant differences at the p<0.05, p<0.01 and p<0.001 levels respectively.

As we can see in Fig. 5, for Tones 1 and 4 the subjects’ performance on the half-third sandhi on wug words is generally identical to that on real words in terms of both the average F0 and the F0 contour shape. This is true for both the disyllabic and σ1 comparisons for Tone 1 and the σ1 comparisons for Tone 4. When σ2 has Tone 2, the shape of the F0 contour on σ1 is significantly different for real and wug words, for both comparisons. The statistical results for these comparisons are given in Table I. Figure 5 also shows that the F0 shape difference between real and wug words for 3+2 lies in the fact that the F0 shape for the wug words has a turning point at around 70% into the tone, while the F0 shape for the real words falls monotonically throughout the rhyme. This indicates that there may be incomplete application of the half-third sandhi in 3+2; hence its lower accuracy/productivity in this particular environment.9 We currently have no account for why there is a significant F0 shape difference between AO-AO and other word groups for 3+4.

9 The pitch rise at the end of the first syllable in 3+1 and 3+4 for real disyllable and real σ1 words is likely due to coarticulation with the high pitch onset of the following tone (Tone 1=55, Tone 4=51).

Figure 5 F0 curves of the first syllable for the half-third sandhi. (a) represents the real disyllable vs. wug disyllable comparisons for the first syllable in 3+1, 3+2 and 3+4; (b) represents the real σ1 vs. wug σ1 comparisons for the same tonal combinations. [Plots of F0 (Hz) at measurement points 1–11. Panel annotations: (a) 3+1: F0 average n.s., F0 shape n.s.; 3+2: F0 average n.s., F0 shape ***; 3+4: F0 average n.s., F0 shape *. (b) 3+1: F0 average n.s., F0 shape n.s.; 3+2: F0 average n.s., F0 shape ***; 3+4: F0 average n.s., F0 shape n.s.]

The results from the third-tone sandhi comparisons are given in Fig. 6. Two-way repeated-measures ANOVAs indicate that although the average F0 is the same for both comparisons, the F0 contour shape is significantly different between the real words and wug words for both comparisons. The ANOVA results are summarised in Table II. We can also see in Fig. 6 that for the curves representing wug words the turning points are both lower and later than their counterparts for the

curves representing real words, indicating that there may be incomplete application of the sandhi.

(a) real vs. wug disyllables
                       Tone 1                             Tone 2                             Tone 4
WdGr (F0 average)      F(1.000, 17.000)=0.005, p=0.945    F(1.000, 17.000)=0.805, p=0.382    F(1.000, 17.000)=0.000, p=1.000
Point                  F(3.187, 54.180)=125.614, p<0.001  F(2.119, 36.023)=168.840, p<0.001  F(2.663, 45.263)=133.073, p<0.001
WdGr×Point (F0 shape)  F(3.574, 60.750)=0.880, p=0.472    F(2.824, 48.012)=13.036, p<0.001   F(3.436, 58.409)=3.535, p=0.016

(b) real w1 vs. wug w1 words
                       Tone 1                             Tone 2                             Tone 4
WdGr (F0 average)      F(1.000, 17.000)=0.061, p=0.808    F(1.000, 17.000)=0.000, p=0.997    F(1.000, 17.000)=0.189, p=0.670
Point                  F(3.275, 55.680)=167.524, p<0.001  F(2.143, 36.439)=178.423, p<0.001  F(2.651, 45.059)=117.356, p<0.001
WdGr×Point (F0 shape)  F(2.545, 43.265)=2.178, p=0.113    F(3.150, 53.546)=9.072, p<0.001    F(2.942, 50.011)=2.265, p=0.093

Table I Two-way repeated-measures ANOVA results for the first syllable F0 curves in the half-third sandhi: (a) real vs. wug disyllables; (b) real w1 vs. wug w1 words.

[Figure 6 plots: F0 (Hz, 150–230) of the first syllable at 11 measurement points for 3+3. (a) real vs. wug disyllables (AO-AO vs. others): F0 average n.s., F0 shape ***. (b) real w1 vs. wug w1 words (AO vs. AG): F0 average n.s., F0 shape ***.]

Figure 6 F0 curves of the first syllable for the third-tone sandhi. (a) and (b) represent the real disyllable vs. wug disyllable and real w1 vs. wug w1 comparisons respectively.

To quantify these turning point differences in w1 of the 3+3 combination, we defined EF0 as the difference between the F0 of the beginning of the rhyme and the F0 turning point in the rhyme and TP duration as the duration from the beginning of the rhyme to the

turning point. Results of comparisons between real and wug disyllables and between real and wug w1’s on EF0 and TP duration for 3+3 are given in Figs 7 and 8 respectively. In these and following figures, error bars indicate one standard deviation. One-way repeated-measures ANOVAs with WordGroup as the independent factor indicate that for EF0, AO-AO is significantly different from other word groups (F(1.000, 17.000)=8.543, p<0.01), as is w1=AO from w1=AG (F(1.000, 17.000)=48.254, p<0.001); for TP duration, AO-AO is significantly different from other word groups (F(1.000, 17.000)=19.561, p<0.001), as is w1=AO from w1=AG (F(1.000, 17.000)=21.343, p<0.001). These results support our hypothesis: with a lower and later turning point, the sandhi tone on wug words is more similar to the original Tone 3 than that on real words, indicating incomplete application of the sandhi in wug words.

Tone 3
                       (a) real vs. wug disyllables       (b) real w1 vs. wug w1 words
WdGr (F0 average)      F(1.000, 17.000)=1.351, p=0.261    F(1.000, 17.000)=0.000, p=0.997
Point                  F(2.371, 40.312)=73.135, p<0.001   F(2.143, 36.439)=178.423, p<0.001
WdGr×Point (F0 shape)  F(2.414, 41.031)=9.537, p<0.001    F(3.150, 53.546)=9.072, p<0.001

Table II Two-way repeated-measures ANOVA results for the first syllable F0 curves in the third-tone sandhi: (a) real vs. wug disyllables; (b) real w1 vs. wug w1 words.

[Figure 7 plots: bar charts of EF0 (Hz, 0–40). (a) real disyllables vs. wug disyllables; (b) real w1 words vs. wug w1 words. Both differences are marked as significant.]

Figure 7 EF0 results for 3+3. (a) and (b) represent the real disyllable vs. wug disyllable and real w1 vs. wug w1 comparisons respectively.

3.2.2 Rhyme duration. The results for w1 rhyme duration for all the tonal combinations are given in Fig. 9, and the statistical results are summarised

in Table III. One-way repeated-measures ANOVAs with WordGroup as the independent factor show that there are no significant differences between AO-AO and other word groups for any of the tonal combinations. But for 3+3, the difference approaches significance at the p<0.05 level (F(1.000, 17.000)=4.218, p=0.056), and the difference is in the expected direction, i.e. wug>real. For AO vs. AG, 3+3 is the only combination in which the wug words have a significantly longer w1 rhyme duration than the real words (F(1.000, 17.000)=5.653, p<0.05). These results support our hypothesis: the durational property for the sandhi syllables is identical for real and wug words for the half-third sandhi, but for the third-tone sandhi, the sandhi-syllable rhyme duration in wug words is longer than in real words, again indicating incomplete application of the sandhi in wug words. These results are consistent with an approach that encodes phonetic biases in the grammar, but not with a frequency-only approach, as the latter predicts a greater durational difference between real and wug words for 3+2 than for 3+3, due to the former’s lower lexical frequency.

[Figure 8 plots: bar charts of TP duration (ms, 0–100). (a) real disyllables vs. wug disyllables, ***; (b) real w1 words vs. wug w1 words, ***.]

Figure 8 TP duration results for 3+3. (a) and (b) represent the real disyllable vs. wug disyllable and real w1 vs. wug w1 comparisons respectively.

[Figure 9 plots: bar charts of w1 rhyme duration (ms, 0–300) for 3+1, 3+2, 3+3 and 3+4. (a) AO-AO vs. others, no significant differences marked; (b) AO vs. AG, one difference marked *.]

Figure 9 Rhyme duration of w1 for all tonal combinations. (a) and (b) represent the real disyllable vs. wug disyllable and real w1 vs. wug w1 comparisons respectively.

                 (a) real vs. wug disyllables       (b) real w1 vs. wug w1 words
Tone 3+Tone 1    F(1.000, 17.000)=0.660, p=0.428    F(1.000, 17.000)=0.097, p=0.759
Tone 3+Tone 2    F(1.000, 17.000)=0.206, p=0.656    F(1.000, 17.000)=0.559, p=0.465
Tone 3+Tone 3    F(1.000, 17.000)=4.218, p=0.056    F(1.000, 17.000)=5.653, p=0.029
Tone 3+Tone 4    F(1.000, 17.000)=0.620, p=0.442    F(1.000, 17.000)=1.118, p=0.305

Table III One-way repeated-measures ANOVA results for the w1 rhyme duration in all tonal combinations: (a) real vs. wug disyllables; (b) real w1 vs. wug w1 words.

3.3 Discussion

Our third-tone sandhi results indicate a significant difference between real words and wug words in the contour shape of the sandhi tone; in particular, the contour shape of the sandhi tone in wug words shares a greater similarity with the original Tone 3 in having a lower and later turning point and a longer tone duration. Given that we did not judge any 3+3 tokens in the data to have non-application of the third-tone sandhi, the difference between real and wug words for the third-tone sandhi was due not to the non-application of the sandhi to a limited number of tokens/speakers, but to the incomplete application of the sandhi to a large number of tokens. The real vs. wug comparison for the half-third sandhi, however, showed identical contour shapes for the sandhi tone for Tone 1, an inconsistent contour-shape difference for Tone 4 (a difference at the p<0.05 level (p=0.016) for the disyllabic comparison, but no difference for the AO vs. AG comparison), and a significant contour shape difference

for Tone 2, which indicates incomplete application of the sandhi. This shows (a) that the half-third sandhi behaves differently in different environments, and (b) that the sandhi with the lowest type frequency (3+2) also applies less consistently to wug words than to real words. The real disyllable vs. wug disyllable and real-w1 vs. wug-w1 comparisons returned similar results. But the difference between the two sandhis is more apparent in the real-w1 vs. wug-w1 comparison, as indicated by the equal or more significant difference for the third-tone sandhi and the equal or less significant difference for the half-third sandhi between the two groups for all F0 measures.

Therefore, our hypothesis that the difference in sandhi tones between real words and wugs should be greater for cases of third-tone sandhi than half-third sandhi finds support in the facts that (a) the difference between real and wug words for the third-tone sandhi can be translated into incomplete application of the sandhi in wug words, and (b) there is no consistent difference between real and wug words for the half-third sandhi. We have also found an effect that is potentially due to type frequency: the half-third sandhi in 3+2 also applies incompletely to wug words. The effects overall, however, are not consistent with a frequency-only account, as the differences between real and wug words are more consistent for 3+3 than 3+2, as evidenced by the lack of rhyme duration difference in 3+2.

These results must be interpreted cautiously, however, for two reasons. First, the differences between real and wug words in the third-tone sandhi, although statistically highly significant, are quite small. It is thus important for us to be able to replicate these results in a separate experiment. Second, although all of our participants came from northern areas of Mainland China and spoke Standard Mandarin natively without any noticeable accent, they did have backgrounds in different Northern Chinese dialects. This could potentially have an effect on the results. Experiment 2 was designed to address these issues.
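As a cross-check on the one-way ANOVA statistics reported for Experiment 1, note that with a two-level factor the statistic F(1, n) is the square of a paired t statistic with n degrees of freedom, so its p value can be recovered from the t distribution. The minimal Python sketch below does this by numerical integration. The function names and the use of Simpson's rule are illustrative choices, not the procedure used in the study, which would normally be run in a statistics package.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def p_from_f(f_value, df_error, steps=2000):
    """p value for F(1, df_error), using F = t**2 and a two-sided t test.

    The area under the t density between 0 and t is computed with
    Simpson's rule (steps must be even); p is twice the upper-tail area.
    """
    t = math.sqrt(f_value)
    h = t / steps
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t


# Rhyme-duration statistics quoted in the text for 3+3 (error df = 17).
p_disyllables = p_from_f(4.218, 17)  # reported as p=0.056
p_w1 = p_from_f(5.653, 17)           # reported as p<0.05
```

Under these assumptions the sketch gives values close to those quoted in the text, which is also a quick way to check which p value belongs to which F value when a table has been flattened.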

4 Experiment 2

The goals of Experiment 2 are twofold: first, it serves as a replication of Experiment 1; second, it includes only participants who grew up in Beijing, and thus minimises the potential dialectal effects on the results.

4.1 Methods

The methods of Experiment 2 were identical to those of Experiment 1, except that the experiment was conducted in the Phonetics Laboratory of the Department of Chinese Language and Literature at Beijing University in China, and that the recordings were made by a Marantz solid state recorder PMD 671 using an EV N/D 767a microphone. The sampling rate of the solid state recorder was 44.1 kHz, and the digital recording was not further downsampled.10 Thirty-one native speakers of Beijing Chinese (9 male, 22 female), recruited at Beijing University, participated in the experiment. All subjects had grown up and gone through their primary and secondary schooling in Beijing, and none reported being conversant with any other dialects of Chinese. The subjects ranged from 19 to 37 years in age. Each subject was paid a nominal fee for participating in the study. Due to technical problems with Superlab, we were not able to use one male speaker’s data. We therefore report data from 30 speakers.

10 We manipulated the duration of the second syllable of the stimuli in Praat in the following way. We took the median rhyme duration of the 160 second syllables in the test stimuli (454 ms), and either expanded or shrank the duration of the rhymes of all second syllables to the same duration. We then calculated the expansion or shrinkage ratio of each rhyme and applied the same ratio to the VOT, frication duration or sonorant duration of its onset consonant. The duration of the fillers remained unchanged. This duration manipulation was conducted in order to minimise the allophonic durational differences among different tones so that the reaction time hypothesis could be better tested (cf. note 7).

4.2 Results

4.2.1 F0 contour. The F0 contour results for the half-third sandhi comparisons are given in Fig. 10. For both Tones 1 and 4, the subjects’ performance on the half-third sandhi on wug words is generally identical to that on real words in terms of both the average F0 and the F0 contour shape. This is true for both the disyllabic and w1 comparisons for Tone 1 and the w1 comparisons for Tone 4. For the disyllabic comparison for Tone 1, however, the p value for the F0 shape is right at 0.05, and this needs to be acknowledged. When w2 has Tone 2, the average F0 on w1 is significantly lower for wug words than real words for the disyllabic comparison, and the F0 shape is significantly different between real and wug words for the AO vs. AG comparison. The statistical results for these comparisons are given in Table IV. The difference in the F0 shapes of real-w1 and wug-w1 words for 3+2 lies in the fact that the F0 shape for the wug words has a turning point at around 80% into the tone, while the F0 shape for the real words falls monotonically throughout the rhyme. This is similar to the F0 shape difference in both real vs. wug comparisons in Experiment 1. It again indicates that there may be incomplete application, and hence lower accuracy/productivity, of the half-third sandhi in 3+2.

The results from the third-tone sandhi comparisons are given in Fig. 11. Two-way repeated-measures ANOVAs indicate that both the average F0 and the F0 contour shape are significantly different for real words and wug words, for both comparisons. The ANOVA results are summarised in Table V. We have replicated our major finding regarding the F0 contours in Experiment 1: the w1 in 3+3 sequences shows consistent contour-shape

differences for the real and wug words in the two comparisons. This experiment also shows that there is an average pitch difference for 3+3 between real and wug words. Moreover, other tonal sequences do not show differences between real words and wug words, except for 3+2 – the tonal combination that has the lowest type frequency. However, 3+2 differences between real and wug words are less consistent than 3+3 differences. This would not be consistent with a frequency-only account, but would be consistent with an account in which both phonetics and frequency are relevant.

[Figure 10 plots: F0 (Hz, 150–230) of the first syllable at 11 measurement points. (a) real vs. wug disyllables (AO-AO vs. others): 3+1, F0 average n.s., F0 shape n.s.; 3+2, F0 average ***, F0 shape n.s.; 3+4, F0 average n.s., F0 shape n.s. (b) real w1 vs. wug w1 words (AO vs. AG): 3+1, F0 average n.s., F0 shape n.s.; 3+2, F0 average n.s., F0 shape *; 3+4, F0 average n.s., F0 shape n.s.]

Figure 10 F0 curves of the first syllable for the half-third sandhi.

From Fig. 11, we can see that the contour shape difference between real and wug words for 3+3 is similar to that in Experiment 1: the turning

points for wug words are both lower and later than their counterparts in real words, indicating that there may be incomplete application of the sandhi in the wug words.

(a) real vs. wug disyllables
                       Tone 1                             Tone 2                             Tone 4
WdGr (F0 average)      F(1.000, 29.000)=0.024, p=0.878    F(1.000, 29.000)=19.561, p<0.001   F(1.000, 29.000)=0.616, p=0.439
Point                  F(1.773, 51.431)=68.996, p<0.001   F(1.460, 42.348)=128.525, p<0.001  F(1.855, 53.807)=121.127, p<0.001
WdGr×Point (F0 shape)  F(2.466, 71.504)=2.905, p=0.050    F(1.930, 55.958)=2.581, p=0.087    F(1.387, 40.243)=2.506, p=0.111

(b) real w1 vs. wug w1 words
                       Tone 1                             Tone 2                             Tone 4
WdGr (F0 average)      F(1.000, 29.000)=0.110, p=0.743    F(1.000, 29.000)=3.007, p=0.094    F(1.000, 29.000)=0.745, p=0.395
Point                  F(1.597, 46.324)=81.352, p<0.001   F(1.618, 46.918)=119.066, p<0.001  F(1.683, 48.807)=118.023, p<0.001
WdGr×Point (F0 shape)  F(1.454, 42.165)=1.655, p=0.207    F(2.319, 67.242)=4.646, p=0.010    F(1.804, 52.323)=1.954, p=0.156

Table IV Two-way repeated-measures ANOVA results for the first syllable F0 curves in the half-third sandhi: (a) real vs. wug disyllables; (b) real w1 vs. wug w1 words.

[Figure 11 plots: F0 (Hz, 150–230) of the first syllable at 11 measurement points for 3+3. (a) real vs. wug disyllables (AO-AO vs. others): F0 average *, F0 shape ***. (b) real w1 vs. wug w1 words (AO vs. AG): F0 average **, F0 shape ***.]

Figure 11 F0 curves of the first syllable for the third-tone sandhi.

The comparisons between real and wug disyllables and between real and wug w1’s on EF0 for 3+3 are given in Fig. 12. A one-way repeated-measures ANOVA indicates that AO-AO has a significantly smaller EF0

than other word groups (F(1.000, 29.000)=4.457, p<0.05), as does w1=AO in comparison with w1=AG (F(1.000, 29.000)=28.523, p<0.001).

Tone 3
                       (a) real vs. wug disyllables       (b) real w1 vs. wug w1 words
WdGr (F0 average)      F(1.000, 29.000)=4.946, p=0.034    F(1.000, 29.000)=11.153, p=0.002
Point                  F(1.643, 47.654)=154.695, p<0.001  F(1.720, 49.893)=192.180, p<0.001
WdGr×Point (F0 shape)  F(2.161, 62.678)=12.291, p<0.001   F(2.319, 67.250)=18.352, p<0.001

Table V Two-way repeated-measures ANOVA results for the first syllable F0 curves in the third-tone sandhi: (a) real vs. wug disyllables; (b) real w1 vs. wug w1 words.

[Figure 12 plots: bar charts of EF0 (Hz, 0–20). (a) real disyllables vs. wug disyllables; (b) real w1 words vs. wug w1 words. Both differences are marked as significant.]

Figure 12 EF0 results for 3+3.

Comparisons between real and wug words for TP duration of 3+3 are given in Fig. 13. A one-way repeated-measures ANOVA indicates that AO-AO has a significantly shorter TP duration than other word groups (F(1.000, 29.000)=28.793, p<0.001), as does w1=AO in comparison with w1=AG (F(1.000, 29.000)=56.235, p<0.001). Given that we will see in §4.2.2 that wug words generally have a longer w1 rhyme duration than real words, we also calculated the TP duration as a percentage of the entire w1 rhyme duration and compared the real words with wug words, to ensure that the longer TP duration in wug words is not simply due to the longer w1 duration. These comparisons are shown in Fig. 14. ANOVA results show that the AO-AO turning point is still significantly earlier than that of other word groups (F(1.000, 29.000)=5.082, p<0.05), as is w1=AO in comparison with w1=AG (F(1.000, 29.000)=34.617, p<0.001).
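The normalisation just described can be made concrete with a small Python sketch. The function name and the example durations are hypothetical and serve only to illustrate why the percentage measure is needed: a longer absolute TP duration could in principle arise from a longer rhyme alone.

```python
def tp_percentage(tp_duration_ms, rhyme_duration_ms):
    """TP duration as a percentage of the entire rhyme duration."""
    if rhyme_duration_ms <= 0:
        raise ValueError("rhyme duration must be positive")
    return 100.0 * tp_duration_ms / rhyme_duration_ms


# Hypothetical tokens: the second has a later turning point in absolute
# terms, but only because its rhyme is proportionally longer.
shorter_rhyme = tp_percentage(60, 200)
longer_rhyme = tp_percentage(69, 230)
same_relative_position = abs(shorter_rhyme - longer_rhyme) < 1e-9
```

A difference that survives this normalisation, as reported for Fig. 14, cannot be attributed to rhyme lengthening alone.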

[Figure 13 plots: bar charts of TP duration (ms, 0–100). (a) real disyllables vs. wug disyllables, ***; (b) real w1 words vs. wug w1 words, ***.]

Figure 13 TP duration results for 3+3.

[Figure 14 plots: bar charts of TP duration (%, 0–50). (a) real disyllables vs. wug disyllables, *; (b) real w1 words vs. wug w1 words, ***.]

Figure 14 TP duration as a percentage of the entire w1 rhyme duration in 3+3.

We have replicated our turning point results in Experiment 1: the w1 turning point in 3+3 sequences is significantly lower and later in wug words than real words, which makes the tone more similar to the original Tone 3 in wug words, indicating incomplete application of the sandhi in wug words.

4.2.2 Rhyme duration. The results for w1 rhyme duration for all the tonal combinations are given in Fig. 15. For the AO-AO vs. other comparison, a repeated-measures ANOVA shows that there is a significant WordGroup effect: F(1.000, 29.000)=58.058, p<0.001; the ANOVA results within each tone, summarised in Table VI, show that except for 3+1, the wug words have a significantly longer w1 rhyme duration than AO-AO. For the AO vs. AG comparison, the ANOVA again shows a significant WordGroup effect: F(1.000, 29.000)=58.576, p<0.001; the ANOVA results within each tone, also summarised in Table VI, show that the AG words have a significantly longer w1 rhyme duration than AO words for all of the tonal combinations.
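For readers who want to see what a one-way repeated-measures ANOVA of this kind computes, the following Python sketch partitions the total variance into condition, subject and error components and forms the F ratio. It is an illustration under stated assumptions, not the analysis pipeline used in the study: the function name and the four-subject data set are hypothetical, and no sphericity correction is applied (the fractional degrees of freedom reported in this paper indicate that a correction was used for factors with more than two levels).

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA without sphericity correction.

    data: one row per subject, one value per condition.
    Returns (F, df_condition, df_error).
    """
    n = len(data)      # number of subjects
    k = len(data[0])   # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Error is what remains after removing condition and subject effects.
    ss_error = ss_total - ss_cond - ss_subj
    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical w1 rhyme durations (ms): [real, wug] for four subjects.
durations = [[200, 210], [180, 200], [220, 225], [190, 205]]
f_value, df_cond, df_error = rm_anova_oneway(durations)
```

With two conditions the result equals the square of the paired t statistic, which is why the WordGroup comparisons have 1 numerator degree of freedom and n-1 error degrees of freedom (17 in Experiment 1, 29 in Experiment 2).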

                 (a) real vs. wug disyllables       (b) real w1 vs. wug w1 words
Tone 3+Tone 1    F(1.000, 29.000)=0.698, p=0.410    F(1.000, 29.000)=25.382, p<0.001
Tone 3+Tone 2    F(1.000, 29.000)=48.128, p<0.001   F(1.000, 29.000)=38.187, p<0.001
Tone 3+Tone 3    F(1.000, 29.000)=54.432, p<0.001   F(1.000, 29.000)=50.444, p<0.001
Tone 3+Tone 4    F(1.000, 29.000)=21.346, p<0.001   F(1.000, 29.000)=7.962, p=0.009

Table VI One-way repeated-measures ANOVA results for the w1 rhyme duration in all tonal combinations: (a) real vs. wug disyllables; (b) real w1 vs. wug w1 words.

[Figure 15 plots: bar charts of w1 rhyme duration (ms, 0–300) for 3+1, 3+2, 3+3 and 3+4. (a) AO-AO vs. others, three differences marked ***; (b) AO vs. AG, three differences marked *** and one marked **.]

Figure 15 Rhyme duration of w1 for all tonal combinations.

To compare the real vs. wug durational difference in different tonal combinations, we calculated the durational difference between AO-AO and other word groups, as well as between w1=AO and w1=AG for each tonal combination, as shown in Fig. 16, and we conducted a one-way

repeated-measures ANOVA, with Tone as the independent variable and the durational difference as the dependent variable for each real vs. wug comparison. The ANOVA results show that for the AO-AO vs. other comparison, Tone has a significant effect on the durational difference between the two word groups (F(2.441, 70.783)=22.032, p<0.001), and post hoc tests show that the 3+3 and 3+2 sequences exhibit significantly greater durational differences than 3+1 and 3+4 (p<0.001 for all comparisons except for 3+2 vs. 3+4, which is at p<0.01). No other pairwise differences were found. For the w1=AO vs. w1=AG comparison, Tone also has a significant effect on the durational difference between the two word groups (F(3.000, 87.000)=6.174, p<0.005), and post hoc tests show that 3+3 and 3+2 exhibit significantly greater durational differences than 3+4 (p<0.005 for 3+3 vs. 3+4; p<0.05 for 3+2 vs. 3+4).

The w1 rhyme duration data here differ from those of Experiment 1 in that wug words have an overall significantly longer duration than real words regardless of the tonal combination. But the durational difference in w1 rhyme between real and wug words is dependent on the tonal combination. 3+3 and 3+2 sequences induced significantly greater durational differences between real and wug words than the other tonal sequences. The numerical differences between 3+3 and 3+2 observed in Fig. 16, though in the expected direction, did not reach statistical significance. These results indicate that in wug words, 3+3 and 3+2 sequences may have involved incomplete sandhi application, which would give the first syllable a longer duration. They are again consistent with a synchronic approach that takes into account both phonetics and lexical frequency.

[Figure 16 plots: bar charts of the w1 rhyme duration difference (ms, 0–50) for 3+1, 3+2, 3+3 and 3+4, in panels (a) and (b).]

Figure 16 w1 rhyme duration differences for all tonal combinations: (a) real vs. wug disyllables; (b) real w1 vs. wug w1 words.

4.3 Discussion

Our data provide converging evidence with Experiment 1 for the lower application accuracy of the third-tone sandhi than the half-third sandhi. In all F0 comparisons between real and wug words in both Experiment 1

and Experiment 2, the contour shape of 3+3 sequences is the only comparison that consistently shows a significant difference. Moreover, the properties of the difference are consistent across comparisons and experiments: the turning point of the sandhi tone is significantly lower and later in wug words than in real words, and similarly to Experiment 1, these differences are not caused by the non-application of the sandhi to a limited number of tokens/speakers, indicating that the sandhi is incompletely applied to a large number of wug words.

The potential frequency effects observed in Experiment 1 are also replicated here. The 3+2 sequences exhibited differences between real and wug words, in that the sandhi tones in wug words showed properties of non-application – the existence of a turning point and a longer duration. But the difference in F0 shape is less consistent than in 3+3. This is consistent with an approach that encodes the effects of both phonetics and frequency, but not with a frequency-only approach, which would predict a more consistent difference between real and wug words for 3+2 than for 3+3.

Again as in Experiment 1, the difference between the third-tone sandhi and the half-third sandhi is more apparent in the real-w1 vs. wug-w1 comparison, as indicated by the equal or more significant difference for the third-tone sandhi and the equal or less significant difference for the half-third sandhi for the two word groups for all F0 measures.

5 General discussion

5.1 The relevance of phonetics to synchronic phonology

Our F0 pitch-track, turning point and duration data from the two wug-test experiments collectively support our hypothesis that there is a difference in productivity between the two tone-sandhi patterns in Mandarin: the more innovative sandhi, which has a stronger phonetic basis – the half-third sandhi – applies accurately to wug words, except for 3+2, which has the lowest type frequency; the sandhi with the longer history and more opaque phonetic basis – the third-tone sandhi – applies incompletely to wug words, as evidenced by the significantly lower and later turning point in the sandhi tone. The F0 data suggest that phonological patterns with different degrees of phonetic basis have different synchronic statuses: there is a bias that favours the pattern that has a stronger phonetic basis.

Lexical frequency by itself cannot account for the data patterns, for two reasons. First, the half-third sandhi behaves differently in different environments, indicating that speakers do not pool these environments together when they internalise the sandhi. It is therefore inaccurate to say that the third-tone sandhi has an overall lower frequency than the half-third sandhi; rather, it has a lower type frequency than the half-third sandhi in 3+1 and 3+4, but a higher type frequency than the half-third sandhi in 3+2. Second, the difference between real words and wug words is more consistently observed in 3+3 than 3+2, as indicated by the rhyme-duration data in Experiment 1 and the F0 data in Experiment 2. A frequency-only account would predict the opposite.

The phonetic effect manifests itself here gradiently in the following sense: the sandhi with a weaker phonetic motivation applies without fail to the wug words, but the application is incomplete, in that the sandhi tone bears more resemblance to the base tone than the sandhi tone in real words. In a way, this is a more subtle gradient effect than the one in which the pattern applies to only a percentage of the structures that satisfy its environment, as shown by other work on gradience and exceptionality in phonology (e.g. Zuraw 2000, 2007, Frisch & Zawaydeh 2001, Ernestus & Baayen 2003, Hayes & Londe 2006, Pierrehumbert 2006, Coetzee 2008a, Coetzee & Pater 2008, Zhang & Lai 2008, Zhang et al. 2009a, b). Methodologically, this result indicates the importance of detailed phonetic studies that can reveal patterns that traditionally escaped the attention of phonologists, but could potentially shed light on issues of theoretical contention. This finds a parallel in the discovery of incomplete neutralisation in many processes thought to be neutralising, such as final devoicing in a host of languages (e.g. Charles-Luce 1985, Slowiaczek & Dinnsen 1985, Port & Crawford 1989, Warner et al. 2004), English flapping (Zue & Laferriere 1979, Dinnsen 1984, Patterson & Connine 2001) and Mandarin third-tone sandhi (Peng 2000).11

5.2 Frequency effects

As argued above, frequency effects alone cannot account for our data. But frequency does seem to correlate positively with sandhi productivity: the half-third sandhi in 3+2, which has the lowest type frequency, has the lowest application accuracy in wug words among all half-third sandhi environments, and the inaccurate application can be characterised as incomplete application of the sandhi, just as we have observed for the third-tone sandhi.
The frequency effects here are also of a slightly different nature than the frequency matching of patterned exceptionality in the lexicon in wug tests (Zuraw 2000, Albright 2002, Albright & Hayes 2003, Ernestus & Baayen 2003, Hayes & Londe 2006, among others) – the pattern here is exceptionless in the lexicon, but is less frequent than other non-competing patterns. The effects are also subtler than a comparable case – Taiwanese tone sandhi – documented in Zhang & Lai (2008) and Zhang et al. (2009a, b), in which frequency differences in the lexicon cause application-rate differences in wug tests: the application rates here are consistently 100%, but the degree of application differs.12

11 An anonymous reviewer points out that the results here are in fact the opposite of what is expected of a comparison between a ‘phonological’ and a ‘phonetic’ process, as conventional wisdom would have us believe that a more ‘phonological’ process tends to be more categorical, while a ‘phonetic’ process is more likely to exhibit gradient properties (e.g. Keating 1984, 1990, Pierrehumbert 1990, Cohn 1993). However, as we mentioned in §2, the difference between the two sandhis in question lies in the degree of their phonetic motivation, not in a binary ‘phonological’ vs. ‘phonetic’ distinction. Both of the sandhis are ‘phonological’ in the sense that they involve language-specific tone changes that cannot be predicted simply by tonal coarticulation. But in the wug test results, both patterns show gradience – third-tone sandhi in 3+3, and half-third sandhi in 3+2. This mirrors the results from the incomplete neutralisation literature.

5.3 Alternative interpretations

Finally, we consider four alternative interpretations of our results, all of which were suggested by anonymous reviewers, to whom we are grateful.

An important alternative to consider is whether it is possible to treat Tone 3 as underlyingly 21 and insert a high pitch to the right when the tone occurs phrase-finally. The insertion of a pre- or post-[eT] is crosslinguistically attested, and referred to as a ‘bounce’ effect by Hyman (2007). The tone change in the third-tone sandhi can then be considered as OCP avoidance, and the half-third sandhi as simply non-existent. The 21 underlying form for Tone 3 is a particularly attractive option for Taiwan Mandarin, in which Tone 3 is pronounced as 21 even in final position. This position is technically workable for Beijing Mandarin, but difficult to defend from a typological perspective. First, Northern Chinese dialects, of which Mandarin is one, are known to have ‘right-dominant’ sandhis that protect domain-final tones and change non-final tones (Yue-Hashimoto 1987, Zhang 2007). It is not clear why Mandarin would be an exception. Second, while contour simplification in non-final positions is extremely common cross-linguistically, contour complication, even in final position, is quite rare. Yue-Hashimoto’s (1987) typology of Chinese tone-sandhi systems identifies close to 100 cases of contour levelling or simplification, but only three cases of contour complication. It is not clear why we would want to entertain a typologically odd analysis when a better-attested option is available.
These points are also made in Zhang (2007: 260). The second alternative relates to our observation above that the thirdtone sandhi is sensitive to syntactic information, while the half-third sandhi is not. Another manifestation of this is that the third-tone sandhi 12 An anonymous reviewer questions whether the lexical frequency diﬀerences be-

tween Tone 2 and other tones are big enough to have noticeable eﬀects in productivity. It is diﬃcult, and possibly impractical, to quantify a minimum diﬀerence in lexical frequency that can elicit an eﬀect on productivity. Studies that illustrate the eﬀects of frequency on phonological productivity (e.g. Zuraw 2000, Ernestus & Baayen 2003, Hayes & Londe 2006, Zhang & Lai 2008, Zhang et al. 2009a, b) and production (e.g. Bybee 2000, Jurafsky et al. 2001, Ernestus et al. 2006) typically use regression analyses or binary comparisons between high vs. low frequencies. However, in Hayes & Londe’s (2006) study on variable backness harmony in Hungarian, a lower than 8% harmony rate diﬀerence between two types of stems (N and NN, where N=neutral) in a web-based corpus translates into a comparable productivity diﬀerence in a wug test ; in Zhang & Lai’s (2008) and Zhang et al.’s (2009a, b) studies on tone-sandhi productivity in Taiwanese, type and token frequencies diﬀerences that are smaller than those observed here are also shown to correlate signiﬁcantly with the productivity results.

188 Jie Zhang and Yuwen Lai sometimes does not apply across a [NP][VP] boundary, as shown in (5a): the [li] syllable has the option of not undergoing the third-tone sandhi, thus giving a 21 21 sequence in the output. This makes the processing of the third-tone sandhi potentially more diﬃcult, as the speaker needs to access the syntactic information in order to determine whether the third-tone sandhi should apply. However, the stimuli that we used in the experiments were all disyllabic, and 3+3 disyllabic sequences do not have the option of not undergoing the sandhi even if the syntactic conﬁguration is [NP][VP], as shown in (5b). The syntactic information is therefore immaterial to the stimuli that we used in the experiments. (5) Third-tone sandhi in [NP][VP] ‘Old Li buys shoes.’ a. [[lAu li] [mai çjE]] old Li buy shoes input 213 213 213 35 output 1 35 21 21 35 output 2 35 35 21 35

b. input output

[[ni] you 213 35 *21

[xAu]]13 good 213 213 213

‘How are you?’
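The disyllabic case in (5b), which is the only configuration used in the experiments, can be sketched as a small function. This is a minimal illustration rather than the authors' analysis: the function name `disyllable_output` is hypothetical, the values 213, 35 and 21 are taken from the text, and 55 and 51 are assumed as the usual citation values for Tones 1 and 4. The sketch leaves aside longer sequences such as (5a), where syntactic structure makes the sandhi optional.

```python
# Citation tone values in Chao tone numbers. 213, 35 and 21 appear in the
# text; 55 (Tone 1) and 51 (Tone 4) are assumed standard citation values.
CITATION = {1: "55", 2: "35", 3: "213", 4: "51"}


def disyllable_output(t1: int, t2: int) -> tuple:
    """Return the surface tone values of a disyllable with base tones t1+t2.

    Only the first syllable can change:
      - third-tone sandhi: 213 -> 35 before another Tone 3;
      - half-third sandhi: 213 -> 21 before any other tone.
    """
    first, second = CITATION[t1], CITATION[t2]
    if t1 == 3:
        first = "35" if t2 == 3 else "21"
    return first, second


if __name__ == "__main__":
    # The four environments tested in the experiments: 3+1, 3+2, 3+3, 3+4.
    for t2 in (1, 2, 3, 4):
        print(f"3+{t2}:", disyllable_output(3, t2))
```

Because the function takes only the two base tones as input, it makes concrete the point that syntactic information plays no role for disyllabic stimuli.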

13 The adjective [xɑu] ‘good’ is traditionally treated as an adjectival verb in Chinese syntax (see Li & Thompson 1981).

The third alternative is that the productivity difference stems from the nature of lexical listing, in that the third-tone sandhi is lexically listed, while the half-third sandhi is productively derived from markedness and faithfulness interactions in an OT grammar. This is consistent with the fact that the third-tone sandhi has a long history and thus may have a higher degree of lexicalisation. Therefore, even if the two sandhis do differ in their synchronic phonetic motivation, it is their difference in lexical listing that causes the productivity difference.

There are two arguments against this alternative. First, if the nature of lexical listing is truly different between the two sandhis, then we would expect the third-tone sandhi to be entirely unproductive and the half-third sandhi to be entirely productive, regardless of lexical frequency. However, we observed a gradient difference between the two sandhis, and the half-third sandhi is affected by lexical frequency. These gradient effects, we believe, are better captured by an analysis that is gradient in nature rather than one that imposes a categorical distinction between the two sandhis based on the presence vs. absence of lexical listing. Second, despite the long history of the third-tone sandhi, its application to disyllabic words in Mandarin is in fact exceptionless, just like the half-third sandhi. Therefore, learners of Mandarin cannot conclude purely from input statistics that the former has a higher degree of lexicality than the latter. In order to reach this conclusion, it seems that the learner still has to access the phonetic nature of the sandhis, indicating the synchronic relevance of phonetics.

The final alternative capitalises on the observation that the subjects produced the half-third sandhi after hearing only one full Tone 3 in σ1 position followed by a different tone, but produced the third-tone sandhi after hearing two identical full third tones. It is thus possible that the production of the third-tone sandhi is influenced by a greater perceptual perseveration effect from the input than that of the half-third sandhi, which causes the nonce syllable in σ1 position of 3+3 to have more characteristics of Tone 3. Although this approach correctly predicts incomplete neutralisation in both real and wug words (see note 5 for results on incomplete neutralisation between 3+3 and 2+3 in real word productions), it cannot predict the difference between them, as it is not clear why the perceptual perseveration effect should be stronger for wug words than for real words. But more importantly, the approach assumes tone priming irrespective of segmental content, as it assumes that the two third tones both have an effect on the subjects’ production of Tone 3 affected by sandhi, even though the second syllable has completely different segmental contents from the syllable undergoing sandhi. However, whether tone by itself is an effective prime in a tone language is a controversial issue. Although Cutler & Chen (1995) show that tone and segments in Cantonese behave similarly as primes for lexical decision, other studies on Mandarin (Chen et al. 2002, Lee 2007) and Cantonese (Yip et al. 1998, Yip 2001) show that priming effects in lexical decision and production latency are only found when the prime and the target share either segmental contents or segmental contents and tone. Tone by itself is an ineffective prime. This casts further doubt on the workability of this alternative.

6 Conclusion

In this paper, we have proposed a novel research paradigm to test the relevance of phonetics to synchronic phonology – wug testing of patterns differing in phonetic motivations that coexist in the same language. By directly addressing existing native patterns and allowing easier control of confounding factors such as lexical frequency, the wug-test paradigm provides evidence which converges with other research paradigms that have been used to test this issue, such as the study of phonological acquisition in a first language and the artificial language paradigm. The language we used was Mandarin Chinese, which has two tone-sandhi patterns which differ in their degrees of phonetic motivation, and our wug tests showed that Mandarin speakers applied the sandhi with a stronger phonetic motivation, the half-third sandhi, to wug words with greater accuracy than the phonetically more opaque sandhi, the third-tone sandhi, thus supporting the direct relevance of phonetics to synchronic phonology. We also showed that lexical frequency is relevant to the application of the half-third sandhi in wug words, as reflected in the lower accuracy of the sandhi in the 3+2 environment. However, lexical frequency alone cannot account for the low sandhi accuracy of 3+3, as the sandhi tone differences between real and wug words are more consistent for 3+3 than 3+2, even though 3+2 has a lower lexical frequency.

We recognise that our position that phonetics, likely in the form of substantive biases, is part of the design feature of grammar construction complicates the search for phonological explanations in the following sense: it potentially creates a duplication problem for patterns whose explanation may come from either substantive bias or misperception; how, then, does one tease apart which is the true explanatory factor? This problem is pointed out by Hansson (2008: 886), for example. We surmise that the answer will not come from individual cases for which the explanation may truly be ambiguous, but from comprehensive experimental studies on many different patterns to establish which approach makes better predictions on both the speakers’ internal knowledge and the evolution of these patterns in general. Therefore, the study reported here can simply be viewed as food for future research into the phonetics–phonology relationship.

To conduct similar studies, we need the two patterns under comparison to satisfy the following conditions: (a) they have comparable triggering environments, (b) they are of comparable productivity in the native lexicon, (c) they have comparable frequencies of occurrence in the native lexicon and (d) they differ in their degrees of phonetic motivation.
There are many other Chinese dialects, especially the Wu and Min dialects, that have considerably more intricate patterns of tone sandhi than Mandarin, and we often find differences in the degree of phonetic motivation among the sandhi patterns in these dialects. We hope that our study on Mandarin will lead to similar research in other Chinese dialects, which will make further contributions to the phonetics–phonology interface debate.

Starting with Hsieh’s seminal work on wug-testing Taiwanese tone sandhi, the productivity of complicated tone-sandhi patterns has been a long-standing question in Chinese phonology. This is especially true for patterns involving phonological opacity (e.g. the tone circle in Southern Min; see Chen 2000 for examples) and syntactic dependency (e.g. the different sandhi patterns that subject-predicate and verb-object compounds undergo in Pingyao; see Hou 1980). We hope that our research will inspire more psycholinguistic testing of these patterns that will shed light on this question. Some results on how sandhi productivity is gradiently influenced by phonological opacity have already been obtained for Taiwanese (Zhang & Lai 2008, Zhang et al. 2009a, b).

Finally, our results here shed additional light on the nature of gradience in phonology. Not only are the phonetic and frequency effects observed here gradient, they are gradient in an interesting way: the sandhis may apply to all wug words, but they apply incompletely in that the sandhi tone bears more resemblance to the base tone than the sandhi tone in real words. This complements well-attested gradient effects whereby a phonological pattern only applies to a certain percentage of the experimental test items.14 This observation is both methodologically and theoretically significant: methodologically, it further demonstrates the importance of careful acoustic studies, which can reveal phonological patterns that have hitherto escaped our attention; theoretically, it forces us to rethink theoretical models of phonology, which need to provide a viable explanation for the multiple layers of gradience.

Appendix: Additional test-stimuli information

This appendix provides additional information and complete word lists for the five word groups (AO-AO, *AO-AO, AO-AG, AG-AO and AG-AG) and the fillers used in the experiments.

1 AO-AO

For AO-AO words, we controlled both the frequency and the mutual information score for the disyllables, using Da’s (1998) Feng Hua Yuan character and digram frequency corpus, which contains 4,718,131 characters and 4,159,927 digrams. All AO-AO disyllables fall within the raw frequency (raw number of occurrences) range of 31–62, and are relatively common words. The mutual information score is calculated as:

\[ I(x, y) = \log_2 \frac{p(x, y)}{p(x)\,p(y)} \]

where p(x, y) represents the digram frequency, and p(x) and p(y) represent the frequencies of the two characters respectively. A higher mutual information score indicates a higher likelihood for the two characters to co-occur, and hence to form real words. All AO-AO words fall within the range of 8–17 for the mutual information score, indicating that all these digrams are common words. Da (1998) provides the following guidelines on how to interpret mutual information scores: a score greater than 3 indicates that the two words have a strong collocation, a score less than 1 indicates that they are unlikely to be related and a score between 1 and 3 is in the grey area. For more information on mutual information scores, see Oakes (1998).
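The mutual information score and Da’s (1998) interpretive guidelines can be illustrated with a short sketch. It assumes that each probability is estimated as a relative frequency, i.e. a raw count divided by the corpus total of digrams or characters; the appendix does not spell out the normalisation, so this is an assumption, and the function names and example counts below are hypothetical rather than drawn from the corpus.

```python
import math


def mutual_information(digram_count, x_count, y_count, n_digrams, n_chars):
    """I(x, y) = log2( p(x, y) / (p(x) * p(y)) ).

    Assumption: probabilities are relative frequencies, with the digram
    count normalised by the number of digrams in the corpus and the
    character counts by the number of characters.
    """
    p_xy = digram_count / n_digrams
    p_x = x_count / n_chars
    p_y = y_count / n_chars
    return math.log2(p_xy / (p_x * p_y))


def interpret(score):
    """Guidelines attributed to Da (1998) in the text above."""
    if score > 3:
        return "strong collocation"
    if score < 1:
        return "unlikely to be related"
    return "grey area"


if __name__ == "__main__":
    # Corpus sizes are those reported for the corpus; the three counts
    # are made-up values for illustration only.
    score = mutual_information(
        digram_count=40, x_count=2000, y_count=1500,
        n_digrams=4_159_927, n_chars=4_718_131,
    )
    print(round(score, 2), interpret(score))
```

When the digram is exactly as frequent as chance co-occurrence of its two characters would predict, the ratio inside the logarithm is 1 and the score is 0, which falls in the "unlikely to be related" band.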

14 As one anonymous reviewer suggests, whether any predictions can be made about the nature of gradience in productivity is an independently interesting question. Previous work has shown that it may be influenced by multiple factors, including the nature of the gradience in the lexicon (Zuraw 2000, 2007, Hayes & Londe 2006, Pierrehumbert 2006, among others) and phonological opacity (Zhang & Lai 2008, Zhang et al. 2009a, b). But more empirical research is needed to identify both the factors and the mechanism with which the factors interact with each other.

base tones  transcription    gloss                  digram frequency  mutual info score
3+1         ku tʂʰwei        ‘to advocate’          45                 9·11
            tɕin pjɑu        ‘trophy’               44                 9·51
            ʂan ɕi           name of province       39                 8·79
            tʂan ɕin         ‘brand new’            38                 8·90
            nɑu tɕin         ‘brains’               34                 9·68
            jan kʰwɑŋ        ‘eye socket’           34                 9·09
            fɑŋ tʂɨ          ‘to spin and weave’    33                11·45
            sa tʰwɔ          ‘free and easy’        31                 8·97
3+2         ʂən jɑŋ          name of city           45                 9·89
            hwɑŋ jɛn         ‘lie’                  41                 8·95
            tu pwɔ           ‘to gamble’            39                10·30
            pu tʂɑŋ          ‘to compensate’        38                10·46
            li ji            ‘etiquette’            36                 8·79
            tɕjɛn fei        ‘to lose weight’       35                10·00
            jɛ man           ‘barbaric’             32                10·89
            jin ʂɨ           ‘food intake’          32                 9·50
3+3         tʂan lan         ‘exhibit’              60                 8·90
            tɕjɛn tʰɑu       ‘self-criticism’       62                 9·20
            kʰu nɑu          ‘worried’              41                 8·62
            mu tʂɨ           ‘thumb’                34                10·50
            tɕja pan         ‘ship deck’            41                 8·50
            tsu tɑŋ          ‘to obstruct’          39                10·48
            ɕi wan           ‘to wash dishes’       34                 9·15
            ma ji            ‘ant’                  33                16·69
3+4         tʂəŋ tɕju        ‘to rescue’            41                12·15
            fən swei         ‘to shatter’           39                10·32
            jɛn xu           ‘to cover’             38                 8·53
            ʐən nai          ‘to tolerate’          36                 9·28
            tɕʰjɑu mjɑu      ‘ingenious’            35                 9·56
            pɑŋ tɕja         ‘to kidnap’            34                11·08
            jin ljɑu         ‘drinks’               33                 8·82
            tʂʰɨ tsʰwən      ‘size’                 31                10·98
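As a consistency check, the digram frequencies and mutual information scores listed for the AO-AO words can be compared against the ranges stated in the text (31–62 for raw frequency, 8–17 for mutual information). The sketch below simply transcribes the two numeric columns, writing the decimal point as a full stop; the variable names are illustrative.

```python
# Digram frequencies and mutual information scores of the AO-AO words,
# transcribed from the table, keyed by base-tone combination.
FREQ = {
    "3+1": [45, 44, 39, 38, 34, 34, 33, 31],
    "3+2": [45, 41, 39, 38, 36, 35, 32, 32],
    "3+3": [60, 62, 41, 34, 41, 39, 34, 33],
    "3+4": [41, 39, 38, 36, 35, 34, 33, 31],
}
MI = {
    "3+1": [9.11, 9.51, 8.79, 8.90, 9.68, 9.09, 11.45, 8.97],
    "3+2": [9.89, 8.95, 10.30, 10.46, 8.79, 10.00, 10.89, 9.50],
    "3+3": [8.90, 9.20, 8.62, 10.50, 8.50, 10.48, 9.15, 16.69],
    "3+4": [12.15, 10.32, 8.53, 9.28, 9.56, 11.08, 8.82, 10.98],
}

all_freq = [f for values in FREQ.values() for f in values]
all_mi = [m for values in MI.values() for m in values]

# Stated ranges: raw frequency 31-62, mutual information score 8-17.
freq_in_range = all(31 <= f <= 62 for f in all_freq)
mi_in_range = all(8 <= m <= 17 for m in all_mi)

if __name__ == "__main__":
    print("frequency range:", min(all_freq), "-", max(all_freq), freq_in_range)
    print("MI range:", min(all_mi), "-", max(all_mi), mi_in_range)
```

Every score is also well above the threshold of 3 that Da (1998) gives for a strong collocation.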

2 *AO-AO

base tones  transcription
3+1         tʂʰɨ tsʰɑŋ
            ɥy tʂɑŋ
            ɕjɛ tʂuŋ
            luŋ tʂʰa
            pɑŋ tʂuŋ
            mu tsʰwən
            tɕiəŋ pʰi
            tɕjan tsʰɑŋ
3+2         tʂʰɨ wan
            ɥy liəŋ
            ɕjɛ tɕʰɥɛn
            luŋ tʂai
            pɑŋ ljɛn
            mu nwɔ
            tɕiəŋ pʰu
            tɕjan xɤ
3+3         tʂʰɨ sa
            ɥy lan
            ɕjɛ wu
            luŋ fa
            pɑŋ sa
            mu jin
            tɕiəŋ mjɛn
            tɕjɛn jɛ
3+4         tʂʰɨ tsɑŋ
            ɥy jɑu
            ɕjɛ ni
            luŋ ljɑu
            pɑŋ pɑu
            mu tsɑŋ
            tɕiəŋ mjɑu
            tɕjan xwei

3 AO-AG

base tones  Chinese digram  transcription
3+1         shun            tʂʰwɑŋ ʂwən
            mu              xwɔ mu
            lan             liəŋ lan
            re              tɕʰjɑu ʐɤ
            mai             pən mai
            liang           kʰu ljɑŋ
            lang            kʰwan lɑŋ
            rao             swən ʐɑu
3+2         te              tʂʰwɑŋ tʰɤ
            ka              xwɔ kʰa
            pie             liəŋ pʰjɛ
            jiu             tɕʰjɑu tɕju
            mie             pən mjɛ
            geng            kʰu kəŋ
            dui             kʰwan twei
            duan            swən twan
3+3         zeng            tʂʰwɑŋ tsəŋ
            suan            xwɔ swan
            huai            liəŋ xwai
            hang            tɕʰjɑu xɑŋ
            xun             pən ɕɥyn
            heng            kʰu xəŋ
            pan             kʰwan pʰan
            cuo             swən tswɔ
3+4         zhua            tʂʰwɑŋ tʂwa
            sen             xwɔ sən
            dei             liəŋ tei
            shua            tɕʰjɑu ʂwa
            dei             pən tei
            keng            kʰu kʰəŋ
            mang            kʰwan mɑŋ
            diu             swən tjəu

4 AG-AO

base tones  Chinese digram  transcription
3+1         ping            pʰiəŋ pa
            pan             pʰan tʂɑu
            xia             ɕja ɕjuŋ
            cang            tsʰɑŋ xei
            zhui            tʂwei mi
            chua            tʂʰwa tan
            run             ʐwən tɕʰjou
            shuan           ʂwan ɕɥyn
3+2         ping            pʰiəŋ xɑu
            pan             pʰan xu
            xia             ɕja lin
            cang            tsʰɑŋ ɥyɛn
            zhui            tʂwei lwən
            chua            tʂʰwa lin
            run             ʐwən pʰan
            shuan           ʂwan kʰwei
3+3         ping            pʰiəŋ ma
            pan             pʰan xai
            xia             ɕja na
            cang            tsʰɑŋ tʂʰɨ
            zhui            tʂwei fa
            chua            tʂʰwa kwei
            run             ʐwən tɕʰi
            shuan           ʂwan lɑu
3+4         ping            pʰiəŋ tʰɑu
            pan             pʰan ɥy
            xia             ɕja lei
            cang            tsʰɑŋ ly
            zhui            tʂwei pan
            chua            tʂʰwa lu
            run             ʐwən fei
            shuan           ʂwan nu

5 AG-AG

base tones  Chinese digram  transcription
3+1         ping shun       pʰiəŋ ʂwən
            pan mai         pʰan mai
            xia mei         ɕja mei
            cang re         tsʰɑŋ ʐɤ
            zhui mai        tʂwei mai
            chua liang      tʂʰwa ljɑŋ
            run lang        ʐwən lɑŋ
            shuan kuo       ʂwan kʰwɔ
3+2         ping te         pʰiəŋ tʰɤ
            pan ka          pʰan kʰa
            xia kong        ɕja kʰuŋ
            cang mie        tsʰɑŋ mjɛ
            zhui mie        tʂwei mjɛ
            chua geng       tʂʰwa gəŋ
            run dui         ʐwən twei
            shuan ta        ʂwan tʰa
3+3         ping zeng       pʰiəŋ tsəŋ
            pan seng        pʰan səŋ
            xia lue         ɕja lɥɛ
            cang xia        tsʰɑŋ ɕja
            zhui kuang      tʂwei kʰwɑŋ
            chua heng       tʂʰwa xəŋ
            run pan         ʐwən pʰan
            shuan sai       ʂwan sai
3+4         ping zhua       pʰiəŋ tʂwa
            pan sen         pʰan sən
            xia dei         ɕja tei
            cang shua       tsʰɑŋ ʂwa
            zhui dei        tʂwei tei
            chua keng       tʂʰwa kʰəŋ
            run mang        ʐwən mɑŋ
            shuan sen       ʂwan sən

6 Fillers

All filler syllables are real syllables in Mandarin; half of the disyllabic fillers were real words, and the other half were wug words. The tonal combinations of the fillers were chosen randomly and are given below.

            real fillers (σ2)        wug fillers (σ2)
σ1          T1   T2   T3   T4        T1   T2   T3   T4
T1           7    4    5    7         3    3    0    6
T2           3    6    6   11         2    8    4   14
T3           2    4    2    6         2    1    1    3
T4           2    2    5    8         6   10    0   17
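The statement that half of the fillers were real words and half were wug words can be checked against the counts above. The sketch assumes that each run of four numbers in the source is one σ2 column (T1 to T4), read down the σ1 rows; the grand totals do not depend on that orientation. Variable names are illustrative.

```python
# Filler counts by tonal combination; rows are the tone of syllable 1,
# entries within each row are the tone of syllable 2 (T1..T4).
# Assumption: each four-number run in the source is one sigma-2 column.
REAL = {
    "T1": [7, 4, 5, 7],
    "T2": [3, 6, 6, 11],
    "T3": [2, 4, 2, 6],
    "T4": [2, 2, 5, 8],
}
WUG = {
    "T1": [3, 3, 0, 6],
    "T2": [2, 8, 4, 14],
    "T3": [2, 1, 1, 3],
    "T4": [6, 10, 0, 17],
}

real_total = sum(sum(row) for row in REAL.values())
wug_total = sum(sum(row) for row in WUG.values())

if __name__ == "__main__":
    print("real fillers:", real_total)
    print("wug fillers:", wug_total)
```

The two totals come out equal, which matches the even split between real and wug fillers described in the text.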

REFERENCES

Albright, Adam (2002). Islands of reliability for regular morphology: evidence from Italian. Lg 78. 684–709.
Albright, Adam, Argelia Andrade & Bruce Hayes (2001). Segmental environments of Spanish diphthongization. UCLA Working Papers in Linguistics 7: Papers in Phonology 5. 117–151.
Albright, Adam & Bruce Hayes (2003). Rules vs. analogy in English past tenses: a computational/experimental study. Cognition 90. 119–161.
Anderson, Stephen R. (1981). Why phonology isn’t ‘natural’. LI 12. 493–539.
Archangeli, Diana & Douglas Pulleyblank (1994). Grounded phonology. Cambridge, Mass.: MIT Press.
Baayen, R. Harald (1992). Quantitative aspects of morphological productivity. Yearbook of Morphology 1991. 109–149.
Baayen, R. Harald (1993). On frequency, transparency, and productivity. Yearbook of Morphology 1992. 181–208.
Bach, Emmon & Robert T. Harms (1972). How do languages get crazy rules? In Robert P. Stockwell & Ronald K. S. Macaulay (eds.) Linguistic change and generative theory. Bloomington: Indiana University Press. 1–21.
Berko, Jean (1958). The child’s learning of English morphology. Word 14. 150–177.
Blevins, Juliette (2004). Evolutionary Phonology: the emergence of sound patterns. Cambridge: Cambridge University Press.
Blevins, Juliette (2006). A theoretical synopsis of Evolutionary Phonology. Theoretical Linguistics 32. 117–166.
Blevins, Juliette & Andrew Garrett (1998). The origins of consonant–vowel metathesis. Lg 74. 508–556.
Bley-Vroman, Robert (1988). The fundamental character of foreign language learning. In William E. Rutherford & Michael Sharwood Smith (eds.) Grammar and second language teaching: a book of readings. Rowley, Mass.: Newbury House. 19–30.
Boersma, Paul (1998). Functional phonology: formalizing the interactions between articulatory and perceptual drives. PhD dissertation, University of Amsterdam.
Boersma, Paul & David Weenink (2003). Praat: a system for doing phonetics by computer. http://www.praat.org/.
Buckley, Eugene (1999). On the naturalness of unnatural rules. UCSB Working Papers in Linguistics 9. 16–29.

Buckley, Eugene (2002). Rule naturalness and the acquisition of phonology. Paper presented at the 2nd North American Phonology Conference, Concordia University, Montreal.
Buckley, Eugene (2003). Children’s unnatural phonology. BLS 29. 523–534.
Bybee, Joan (1985). Morphology: a study of the relation between meaning and form. Amsterdam & Philadelphia: Benjamins.
Bybee, Joan (2000). The phonology of the lexicon: evidence from lexical diffusion. In Michael Barlow & Suzanne Kemmer (eds.) Usage-based models of language. Stanford: CSLI. 65–85.
Bybee, Joan (2001). Phonology and language use. Cambridge: Cambridge University Press.
Bybee, Joan (2006). Frequency of use and the organization of language. Oxford: Oxford University Press.
Bybee, Joan & Elly Pardo (1981). On lexical and morphological conditioning of alternations: a nonce-probe experiment with Spanish verbs. Linguistics 19. 937–968.
Chambers, Kyle E., Kristine H. Onishi & Cynthia Fisher (2003). Infants learn phonotactic regularities from brief auditory experience. Cognition 87. B69–B77.
Chao, Yuen Ren (1948). Mandarin primer: an intensive course in spoken Chinese. Cambridge, Mass.: Harvard University Press.
Chao, Yuen Ren (1968). A grammar of spoken Chinese. Berkeley: University of California Press.
Charles-Luce, Jan (1985). Word-final devoicing in German: effects of phonetic and sentential contexts. JPh 13. 309–324.
Chen, Jenn-Yeu, Train-Min Chen & Gary S. Dell (2002). Word-form encoding in Mandarin Chinese as assessed by the implicit priming task. Journal of Memory and Language 46. 751–781.
Chen, Matthew Y. (2000). Tone sandhi: patterns across Chinese dialects. Cambridge: Cambridge University Press.
Chomsky, Noam (1986). Knowledge of language: its nature, origin, and use. New York: Praeger.
Chomsky, Noam & Morris Halle (1968). The sound pattern of English. New York: Harper & Row.
Coetzee, Andries W. (2008). Grammaticality and ungrammaticality in phonology. Lg 84. 218–257.
Coetzee, Andries W. & Joe Pater (2008). Weighted constraints and gradient restrictions on place co-occurrence in Muna and Arabic. NLLT 26. 289–337.
Cohn, Abigail C. (1993). Nasalisation in English: phonology or phonetics. Phonology 10. 43–81.
Cook, Vivian J. (1969). The analogy between first and second language learning. International Review of Applied Linguistics 7. 207–216.
Cook, Vivian J. (1994). The metaphor of access to Universal Grammar in L2 learning. In Nick C. Ellis (ed.) Implicit and explicit learning of languages. San Diego: Academic Press. 477–502.
Court, Christopher (1985). Observations on some cases of tone sandhi. In Graham Thurgood, James A. Matisoff & David Bradley (eds.) Linguistics of the Sino-Tibetan area: the state of the art. Canberra: Australian National University. 125–137.
Cutler, Anne & Hsuan-Chih Chen (1995). Phonological similarity effects in Cantonese word recognition. In Kjell Elenius & Peter Branderud (eds.) Proceedings of the 13th International Congress of the Phonetic Sciences. Vol. 1. Stockholm: KTH & Stockholm University. 106–109.
Cutler, Anne & Hsuan-Chih Chen (1997). Lexical tone in Cantonese spoken-word processing. Perception and Psychophysics 59. 165–179.

Da, Jun (1998). Chinese text computing: bigrams and statistical measures. Available (December 2009) at http://lingua.mtsu.edu/chinese-computing/old-version1/statistics/mi.html.
Da, Jun (2004). Chinese text computing: syllable frequencies with tones. Available (December 2009) at http://lingua.mtsu.edu/chinese-computing/phonology/syllabletone.php.
Dell, Gary S., Kristopher D. Reed, David R. Adams & Antje S. Meyer (2000). Speech errors, phonotactic constraints, and implicit learning: a study of the role of experience in language production. Journal of Experimental Psychology: Learning, Memory, and Cognition 26. 1355–1367.
Dinnsen, Daniel A. (1984). A re-examination of phonological neutralization. JL 21. 265–279.
Duanmu, San (2000). The phonology of Standard Chinese. Oxford: Oxford University Press.
Dulay, Heidi, Marina Burt & Stephen Krashen (1982). Language two. New York & Oxford: Oxford University Press.
Ellis, Rod (1994). The study of second language acquisition. Oxford: Oxford University Press.
Ernestus, Mirjam & R. Harald Baayen (2003). Predicting the unpredictable: interpreting neutralized segments in Dutch. Lg 79. 5–38.
Ernestus, Mirjam, Mybeth Lahey, Femke Verhees & R. Harald Baayen (2006). Lexical frequency and voice assimilation. JASA 120. 1040–1051.
Fleischhacker, Heidi (2001). Cluster-dependent epenthesis asymmetries. UCLA Working Papers in Linguistics 7: Papers in Phonology 5. 71–116.
Flemming, Edward (2001). Scalar and categorical phenomena in a unified model of phonetics and phonology. Phonology 18. 7–44.
Frisch, Stefan A. & Bushra Adnan Zawaydeh (2001). The psychological reality of OCP-place in Arabic. Lg 77. 91–106.
Goldstein, Louis, D. H. Whalen & Catherine T. Best (eds.) (2006). Laboratory Phonology 8. Berlin & New York: Mouton de Gruyter.
Hale, Mark & Charles Reiss (2000). ‘Substance abuse’ and ‘dysfunctionalism’: current trends in phonology. LI 31. 157–169.
Hansson, Gunnar Ólafur (2001). Theoretical and typological issues in consonant harmony. PhD dissertation, University of California, Berkeley.
Hansson, Gunnar Ólafur (2008). Diachronic explanations of sound patterns. Language and Linguistics Compass 2. 859–893.
Hayes, Bruce & Zsuzsa Cziráky Londe (2006). Stochastic phonological knowledge: the case of Hungarian vowel harmony. Phonology 23. 59–104.
Hayes, Bruce, Robert Kirchner & Donca Steriade (eds.) (2004). Phonetically based phonology. Cambridge: Cambridge University Press.
Hayes, Bruce & Donca Steriade (2004). Introduction: the phonetic bases of phonological markedness. In Hayes et al. (2004). 1–33.
Hayes, Bruce (1999). Phonetically driven phonology: the role of Optimality Theory and inductive grounding. In Michael Darnell, Edith Moravcsik, Frederick Newmeyer, Michael Noonan & Kathleen Wheatley (eds.) Functionalism and formalism in linguistics. Vol. 1: General papers. Amsterdam & Philadelphia: Benjamins. 243–285.
Hou, Jing-Yi (1980). Pingyao fangyan de liandu biandiao. [Tone sandhi in the Pingyao dialect.] Fangyan [Dialects] 80. 1–14.
Hsieh, Hsin-I (1970). The psychological reality of tone sandhi rules in Taiwanese. CLS 6. 489–503.
Hsieh, Hsin-I (1975). How generative is phonology? In E. F. K. Koerner (ed.) The transformational-generative paradigm and modern linguistic theory. Amsterdam: Benjamins. 109–144.

Hsieh, Hsin-I (1976). On the unreality of some phonological rules. Lingua 38. 1–19.
Hume, Elizabeth & Keith Johnson (eds.) (2001). The role of speech perception in phonology. San Diego: Academic Press.
Hyman, Larry M. (2001). The limits of phonetic determinism in phonology: *NC revisited. In Hume & Johnson (2001). 141–185.
Hyman, Larry M. (2007). Universals of tone rules: 30 years later. In Tomas Riad & Carlos Gussenhoven (eds.) Tones and tunes. Vol. 1: Typological studies in word and sentence prosody. Berlin & New York: Mouton de Gruyter. 1–34.
Jun, Jongho (1995). Perceptual and articulatory factors in place assimilation: an Optimality-theoretic approach. PhD dissertation, University of California, Los Angeles.
Jun, Jongho (2004). Place assimilation. In Hayes et al. (2004). 58–86.
Jurafsky, Daniel, Alan Bell, Michelle Gregory & William D. Raymond (2001). Probabilistic relations between words: evidence from reduction in lexical production. In Joan Bybee & Paul Hopper (eds.) Frequency and the emergence of linguistic structure. Amsterdam & Philadelphia: Benjamins. 229–254.
Jusczyk, Peter W., Paul Smolensky, Karen Arnold & Elliott Moreton (2003). Acquisition of nasal place assimilation by 4.5-month-old infants. In Derek Houston, Amanda Seidl, George J. Hollich, Elisabeth Johnson & Ann Marie Jusczyk (eds.) Jusczyk lab final report. Available (December 2009) at http://hincapie.psych.purdue.edu/Jusczyk/pdf/Nasal.pdf.
Kang, Yoonjung (2003). Perceptual similarity in loanword adaptation: English postvocalic word-final stops in Korean. Phonology 20. 219–273.
Keating, Patricia A. (1984). Phonetic and phonological representation of stop consonant voicing. Lg 60. 286–319.
Keating, Patricia A. (1990). Phonetic representations in a generative grammar. JPh 18. 321–334.
Kenstowicz, Michael (2007). Salience and similarity in loanword adaptation: a case study from Fijian. Language Sciences 29. 316–340.
Kingston, John & Randy L. Diehl (1994). Phonetic knowledge. Lg 70. 419–454.
Kiparsky, Paul (1982). Lexical morphology and phonology. In The Linguistic Society of Korea (ed.) Linguistics in the morning calm. Seoul: Hanshin. 3–91.
Kiparsky, Paul (1985). Some consequences of Lexical Phonology. Phonology Yearbook 2. 85–138.
Kiparsky, Paul (2006). The amphichronic program vs. Evolutionary Phonology. Theoretical Linguistics 32. 217–236.
Kirchner, Robert (2000). Geminate inalterability and lenition. Lg 76. 509–545.
Kirchner, Robert (2001). An effort-based approach to consonant lenition. New York & London: Routledge.
Kirchner, Robert (2004). Consonant lenition. In Hayes et al. (2004). 313–345.
Lee, Chao-Yang (2007). Does horse activate mother? Processing lexical tone in form priming. Language and Speech 50. 101–123.
Li, Charles N. & Sandra A. Thompson (1981). Mandarin Chinese: a functional reference grammar. Berkeley: University of California Press.
Lin, Yen-Hwei (2007). The sounds of Chinese. Cambridge: Cambridge University Press.
MacWhinney, Brian (1978). The acquisition of morphophonology. Chicago: University of Chicago Press.
Mei, Tsu-lin (1977). Tones and tone sandhi in 16th century Mandarin. Journal of Chinese Linguistics 5. 237–260.
Menn, Lise & Carol Stoel-Gammon (1995). Phonological development. In Paul Fletcher & Brian MacWhinney (eds.) The handbook of child language. Oxford: Blackwell. 335–359.

Mohanan, K. P. (1982). Lexical Phonology. PhD dissertation, MIT. Distributed by Indiana University Linguistics Club.
Mohanan, K. P. (1993). Fields of attraction in phonology. In John Goldsmith (ed.) The last phonological rule: reflections on constraints and derivations. Chicago: University of Chicago Press. 61–116.
Moore, Corinne B. & Allard Jongman (1997). Speaker normalization in the perception of Mandarin Chinese tones. JASA 102. 1864–1877.
Moreton, Elliott (2008). Analytic bias and phonological typology. Phonology 25. 83–127.
Myers, James (2002). An analogical approach to the Mandarin syllabary. Chinese Phonology 11. 163–190.
Oakes, Michael P. (1998). Statistics for corpus linguistics. Edinburgh: Edinburgh University Press.
Ohala, John J. (1981). The listener as a source of sound change. In C. S. Masek, R. A. Hendrick & M. F. Miller (eds.) Papers from the parasession on language and behavior. Chicago: Chicago Linguistic Society. 178–203.
Ohala, John J. (1990). The phonetics and phonology of aspects of assimilation. In John Kingston & Mary E. Beckman (eds.) Papers in laboratory phonology I: between the grammar and physics of speech. Cambridge: Cambridge University Press. 258–275.
Ohala, John J. (1993). Coarticulation and phonology. Language and Speech 36. 155–170.
Ohala, John J. (1997). The relation between phonetics and phonology. In William J. Hardcastle & John Laver (eds.) The handbook of phonetic sciences. Oxford & Cambridge, Mass.: Blackwell. 674–694.
Onishi, Kristine H., Kyle E. Chambers & Cynthia Fisher (2002). Learning phonotactic constraints from brief auditory experience. Cognition 83. B13–B23.
Patterson, David & Cynthia M. Connine (2001). Variant frequency in flap production: a corpus analysis of variant frequency in American English flap production. Phonetica 58. 254–275.
Peng, Shu-Hui (2000). Lexical versus ‘phonological’ representations of Mandarin sandhi tones. In Michael B. Broe & Janet B. Pierrehumbert (eds.) Papers in laboratory phonology V: acquisition and the lexicon. Cambridge: Cambridge University Press. 152–167.
Peperkamp, Sharon & Emmanuel Dupoux (2007). Learning the mapping from surface to underlying representations in an artificial language. In Jennifer Cole & José Ignacio Hualde (eds.) Laboratory phonology 9. Berlin & New York: Mouton de Gruyter. 315–338.
Peperkamp, Sharon, Katrin Skoruppa & Emmanuel Dupoux (2006). The role of phonetic naturalness in phonological rule acquisition. In David Bamman, Tatiana Magnitskaia & Colleen Zaller (eds.) Proceedings of the 30th Annual Boston University Conference on Language Development. Somerville, Mass.: Cascadilla Press. 464–475.
Pierrehumbert, Janet B. (1990). Phonological and phonetic representation. JPh 18. 375–394.
Pierrehumbert, Janet B. (2003). Probabilistic phonology: discrimination and robustness. In Rens Bod, Jennifer Hay & Stefanie Jannedy (eds.) Probabilistic linguistics. Cambridge, Mass.: MIT Press. 177–228.
Pierrehumbert, Janet B. (2006). The statistical basis of an unnatural alternation. In Goldstein et al. (2006). 81–106.
Port, Robert F. & Penny Crawford (1989). Incomplete neutralization and pragmatics in German. JPh 17. 257–282.
Prince, Alan & Paul Smolensky (1993). Optimality Theory: constraint interaction in generative grammar. Ms, Rutgers University & University of Colorado, Boulder. Published 2004, Malden, Mass. & Oxford: Blackwell.

Pycha, Anne, Pawel Nowak, Eurie Shin & Ryan Shosted (2003). Phonological rule-learning and its implications for a theory of vowel harmony. WCCFL 22. 423–435.
Qian, Zeng-Yi & Guang-Qi Zhu (1998). Jinanhua yindang. [A record of the Jinan dialect.] Shanghai: Shanghai Jiaoyu Chubanshe.
Reber, Arthur S. (1967). Implicit learning of artificial grammars. Journal of Verbal Learning and Verbal Behavior 6. 855–863.
Reber, Arthur S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General 118. 219–235.
Redington, Martin & Nick Chater (1996). Transfer in artificial grammar learning: a reevaluation. Journal of Experimental Psychology: General 125. 123–138.
Seidl, Amanda & Eugene Buckley (2005). On the learning of arbitrary phonological rules. Language Learning and Development 1. 289–316.
Shen, Xiaonan Susan & Maocan Lin (1991). A perceptual study of Mandarin Tones 2 and 3. Language and Speech 34. 145–156.
Shen, Xiaonan Susan, Maocan Lin & Jingzhu Yan (1993). F0 turning point as an F0 cue to tonal contrast: a case study of Mandarin tones 2 and 3. JASA 93. 2241–2243.
Shih, Chilin (1997). Mandarin third tone sandhi and prosodic structure. In Wang Jialing & Norval Smith (eds.) Studies in Chinese phonology. Berlin: Mouton de Gruyter. 81–123.
Silverman, Daniel (2006a). A critical introduction to phonology: of sound, mind, and body. London: Continuum.
Silverman, Daniel (2006b). The diachrony of labiality in Trique, and the functional relevance of gradience and variation. In Goldstein et al. (2006). 133–152.
Slobin, Dan I. (1985). Crosslinguistic evidence for the language-making capacity. In Dan I. Slobin (ed.) The crosslinguistic study of language acquisition. Vol. 2: Theoretical issues. Hillsdale: Erlbaum. 1157–1256.
Slowiaczek, Louisa M. & Daniel A. Dinnsen (1985). On the neutralizing status of Polish word-final devoicing. JPh 13. 325–341.
Stampe, David (1979). A dissertation on Natural Phonology. New York & London: Garland.
Steriade, Donca (1999). Phonetics in phonology: the case of laryngeal neutralization. UCLA Working Papers in Linguistics 2: Papers in Phonology 3. 25–146.
Steriade, Donca (2001). Directional asymmetries in place assimilation: a perceptual account. In Hume & Johnson (2001). 219–250.
Steriade, Donca (2008). The phonology of perceptibility effects: the P-map and its consequences for constraint organization. In Kristin Hanson & Sharon Inkelas (eds.) The nature of the word: studies in honor of Paul Kiparsky. Cambridge, Mass.: MIT Press. 151–179.
Steriade, Donca & Jie Zhang (2001). Context-dependent similarity and the Romanian semi-rime. Paper presented at the 37th Annual Meeting of the Chicago Linguistic Society, Chicago.
Taft, Marcus & Hsuan-Chih Chen (1992). Judging homophony in Chinese: the influence of tones. In Hsuan-Chih Chen & Ovid J. L. Tzeng (eds.) Language processing in Chinese. Amsterdam: North-Holland. 151–172.
Thatte, Victoria Anne (2007). Phonetic motivation as a learning bias in phonological acquisition: an experimental study. MA thesis, University of California, Los Angeles.
Wang, H. Samuel (1993). Taiyu biandiao de xinli texing. [On the psychological status of Taiwanese tone sandhi.] Tsinghua Xuebao [Tsinghua Journal of Chinese Studies] 23. 175–192.
Warner, Natasha, Allard Jongman, Joan Sereno & Rachèl Kemps (2004). Incomplete neutralization and other sub-phonemic durational differences in production and perception: evidence from Dutch. JPh 32. 251–276.

Wen, Duan-Zheng & Ming Shen (1999). Taiyuanhua yindang. [A record of the Taiyuan dialect.] Shanghai: Shanghai Jiaoyu Chubanshe.
Wilson, Colin (2003). Experimental investigation of phonological naturalness. WCCFL 22. 533–546.
Wilson, Colin (2006). Learning phonology with substantive bias: an experimental and computational study of velar palatalization. Cognitive Science 30. 945–982.
Yang, Zi-Xiang, He-Tong Guo & Xiang-Dong Shi (1999). Tianjinhua yindang. [A record of the Tianjin dialect.] Shanghai: Shanghai Jiaoyu Chubanshe.
Ye, Yun & Cynthia M. Connine (1999). Processing spoken Chinese: the role of tone information. Language and Cognitive Processes 14. 609–630.
Yip, Michael C. W. (2001). Phonological priming in Cantonese spoken-word processing. Psychologia 44. 223–229.
Yip, Michael C. W., Po-Yee Leung & Hsuan-Chih Chen (1998). Phonological similarity effects in Cantonese spoken-word processing. Proceedings of ICSLP ’98. Vol. 5. Sydney. 2139–2142.
Yu, Alan C. L. (2004). Explaining final obstruent voicing in Lezgian: phonetics and history. Lg 80. 73–97.
Yue-Hashimoto, Anne O. (1987). Tone sandhi across Chinese dialects. In Chinese Language Society of Hong Kong (ed.) Wang Li memorial volumes: English volume. Hong Kong: Joint Publishing Co. 445–474.
Zhang, Jie (2002). The effects of duration and sonority on contour tone: typological survey and formal analysis. New York & London: Routledge.
Zhang, Jie (2004). The role of contrast-specific and language-specific phonetics in contour tone distribution. In Hayes et al. (2004). 157–190.
Zhang, Jie (2007). A directional asymmetry in Chinese tone sandhi systems. Journal of East Asian Linguistics 16. 259–302.
Zhang, Jie & Yuwen Lai (2008). Phonological knowledge beyond the lexicon in Taiwanese double reduplication. In Yuchau E. Hsiao, Hui-Chuan Hsu, Lian-Hee Wee & Dah-An Ho (eds.) Interfaces in Chinese phonology: Festschrift in honor of Matthew Y. Chen on his 70th birthday. Taiwan: Academia Sinica. 183–222.
Zhang, Jie, Yuwen Lai & Craig Sailor (2009a). Opacity, phonetics, and frequency in Taiwanese tone sandhi. In Current issues in unity and diversity of languages: collection of papers selected from the 18th International Congress of Linguists. Linguistic Society of Korea. 3019–3038.
Zhang, Jie, Yuwen Lai & Craig Sailor (2009b). Effects of phonetics and frequency on the productivity of Taiwanese tone sandhi. CLS 43:1. 273–286.
Zhang, Ning (1997). The avoidance of the third tone sandhi in Mandarin Chinese. Journal of East Asian Linguistics 6. 293–338.
Zue, Victor W. & Martha Laferriere (1979). Acoustical study of medial /t,d/ in American English. JASA 66. 1039–1050.
Zuraw, Kie (2000). Patterned exceptions in phonology. PhD dissertation, University of California, Los Angeles.
Zuraw, Kie (2007). The role of phonetic knowledge in phonological patterning: corpus and survey evidence from Tagalog infixation. Lg 83. 277–316.

Contributors

Ayla Applebaum, Department of Linguistics, University of California, Santa Barbara, Santa Barbara, CA 93106, U.S.A. ([email protected])

Karen Jesney, Department of Linguistics, University of Massachusetts, Amherst, Amherst, MA 01003, U.S.A. ([email protected])

Michael Becker, Department of Linguistics, Harvard University, Boylston Hall, 3rd Floor, Cambridge, MA 02138, U.S.A. ([email protected])

Yuwen Lai, Department of Foreign Languages and Literatures, National Chiao-Tung University, 3/F Humanities Building 2, 1001 Ta-Hsueh Road, Hsinchu 300, Taiwan ([email protected])

Rajesh Bhatt, Department of Linguistics, University of Massachusetts, Amherst, Amherst, MA 01003, U.S.A. ([email protected])

Joe Pater, Department of Linguistics, University of Massachusetts, Amherst, Amherst, MA 01003, U.S.A. ([email protected])

Juliette Blevins, Department of Linguistics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany ([email protected])

Andrew Pawley, Department of Linguistics, Research School of Pacific and Asian Studies, Australian National University, Canberra, ACT 0200, Australia ([email protected])

Matthew Gordon, Department of Linguistics, University of California, Santa Barbara, Santa Barbara, CA 93106, U.S.A. ([email protected])

Christopher Potts, Department of Linguistics, Margaret Jacks Hall, Building 460, Stanford University, Stanford, CA 94305, U.S.A. ([email protected])

Anne Pycha, Institute for Research in Cognitive Science, University of Pennsylvania, 3401 Walnut Street, Suite 400A, Philadelphia, PA 19104, U.S.A. ([email protected])

Jie Zhang, Department of Linguistics, University of Kansas, 427 Blake Hall, 1541 Lilac Lane, Lawrence, KS 66045-3129, U.S.A. ([email protected])

Orders, which must be accompanied by payment, may be sent to a bookseller, subscription agent or direct to the publisher: Cambridge University Press, The Edinburgh Building, Shaftesbury Road, Cambridge CB2 8RU. Orders from the U.S.A., Canada and Mexico should be sent to: Cambridge University Press, Journals Fulfillment Department, 100 Brook Hill Drive, West Nyack, NY 10994-2133, U.S.A. Japanese prices for institutions are available from Kinokuniya Company Ltd, P.O. Box 55, Chitose, Tokyo 156, Japan. Prices include delivery by air. Orders may also be placed through the website: http://titles.cambridge.org/journals.

Copying This journal is registered with the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, U.S.A. (www.copyright.com). Organisations in the U.S.A. who are registered with the CCC may therefore copy material (beyond the limits permitted by sections 107 and 108 of U.S.A. copyright law), subject to payment to the CCC. This consent does not extend to multiple copying for promotional or commercial purposes. Organisations authorised by the Copyright Licensing Agency may also copy material subject to the usual conditions. ISI Tear Sheet Service, 3501 Market Street, Philadelphia, PA 19104, U.S.A. is authorised to supply single copies of separate articles for private use only. For all other use, permission must be sought from the Cambridge or American branch of Cambridge University Press.

Policy Phonology is concerned with all aspects of phonology and related disciplines. Preference is given to papers which make a substantial theoretical contribution, irrespective of the particular theoretical framework employed, but the submission of papers presenting new empirical data of general theoretical interest is also encouraged. One of the three issues of a volume is occasionally devoted to a particular theme. The editors welcome proposals for themes and offers to act as guest editors for thematic issues.

Submission of papers Submissions should be sent to the editors in PDF format, preferably by e-mail. The editorial addresses are: Colin J. Ewen, Opleiding Engels, Universiteit Leiden, Postbus 9515, 2300 RA Leiden, The Netherlands ([email protected]); Ellen M. Kaisse, Department of Linguistics, University of Washington, Box 354340, Seattle, WA 98195-4340, U.S.A. ([email protected]). An abstract (no longer than 150 words) should be e-mailed to both editors when the manuscript is submitted. The author’s name should not appear on the paper itself, and, as far as possible, should not be identifiable from references in the text. A full set of notes for contributors is published on pp. 545–548 of Volume 26, and can also be found on the journal website. The language of submission and publication is English.

Internet access Phonology is included in the Cambridge Journals Online service, which can be found at www.journals.cup.org. Information on other Press titles may be accessed at www.journals.cambridge.org or www.cambridge.org.

This journal issue has been printed on FSC-certified paper and cover board. FSC is an independent, non-governmental, not-for-profit organization established to promote the responsible management of the world’s forests. Please see www.fsc.org for information.

Printed in the United Kingdom at the University Press, Cambridge. © Cambridge University Press, 2010

