Interlanguage Variation in Theoretical and Pedagogical Perspective

Author: H.D. Adamson

In this book H. D. Adamson reviews scholarship in sociolinguistics and second language acquisition, comparing theories of variation in first- and second-language speech, with special attention to the psychological underpinnings of variation theory. Interlanguage is what second language learners speak. It contains syntactic, morphological, and phonological patterns that belong to neither the first nor the second language and that can be analyzed using the principles and techniques of variation theory.

Interlanguage Variation in Theoretical and Pedagogical Perspective:

• Relates the emerging field of variation in second language learners’ speech (interlanguage) to the established field of variation in native speakers’ speech
• Relates the theory of linguistic variation to psycholinguistic models of language processing
• Relates sociolinguistic variation theory to the theory of Cognitive Linguistics
• Suggests teaching applications that follow from the theoretical discussion

At the forefront of scholarship in the fields of interlanguage and variation theory, this book is directed to graduate students and researchers in applied English linguistics and second language acquisition, especially those with a background in sociolinguistics.

H. D. Adamson is Professor of English at the University of Arizona, where he has served as the Director of the Ph.D. Program in Second Language Acquisition and Teaching. He has taught English as a second or foreign language in Ethiopia, Spain, and the United States.

Interlanguage Variation in Theoretical and Pedagogical Perspective

H. D. Adamson University of Arizona

First published 2009 by Routledge 270 Madison Ave, New York, NY 10016 Simultaneously published in the UK by Routledge 2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN Routledge is an imprint of the Taylor & Francis Group, an informa business This edition published in the Taylor & Francis e-Library, 2008. “To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.” © 2009 Taylor & Francis All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identiﬁcation and explanation without intent to infringe. Library of Congress Cataloging-in-Publication Data Adamson, H. D. (Hugh Douglas) Interlanguage variation in theoretical and pedagogical perspective / H.D. Adamson. p. cm. Includes bibliographical references and index. Language and languages—Variation. 2. Interlanguage (Language learning). 3. Psycholinguistics. 4. Language and languages—Study and teaching. I. Title. P120.V37A32 2008 401′93–dc22 2008021718 ISBN 0-203-88736-0 Master e-book ISBN

ISBN10: 0–8058–5576–9 (hbk) ISBN10: 0–203–88736–0 (ebk) ISBN13: 978–0–8058–5576–0 (hbk) ISBN13: 978–0–203–88736–3 (ebk)

Once again this book is dedicated, with love, to Alice

Contents

Preface
Phonetic Symbols Used in the Book
List of Figures and Tables

Part I Variation in Native Speaker Speech

1 Variation Theory
  Introduction—The Cartesian Mind
  Generative Grammar
  Variation Theory—A History

2 A Study of Variation in the Native Speaker Speech Community (H. D. Adamson and Vera Regan)
  Introduction
  Sociolinguistic Studies of -ing
  Methods of Data Collection and Analysis
  Results
  Discussion

3 Language Variation and Change
  Introduction
  Measuring Sound Change
  Five Problems in Explaining Sound Change
  Conclusions

Part II Variation in Nonnative Speaker Speech

4 The Study of Variation in Interlanguage
  Introduction
  Early Studies of Vertical Variation
  Studies of Horizontal Variation
  Conclusions

5 The Acquisition of English Irregular Past Tense by Chinese-speaking Children (Larry Berlin and H. D. Adamson)
  Introduction
  Constraints on Past Tense Marking
  Variation in Chinese-speaking Children’s Marking of English Irregular Past Tense
  Results
  Discussion
  Conclusions

Part III Variation in Theoretical Perspective

6 Psychological Theories of Linguistic Variation
  Introduction
  Psychological Studies of Probability-matching
  Psycholinguistic Models of Language Performance
  Elliott’s Study of Spanish Acquisition
  Elliott’s Results Interpreted as a Connectionist Network
  Conclusion

7 Cognitive Linguistics
  Introduction
  Prototype Schemas in Morphology
  Prototype Schemas in Syntax/Semantics: The Acquisition of Argument Structure
  A Pilot Study of Ditransitive Acquisition Among Korean Speakers
  Discussion: Prototype Schemas, Connectionist Networks, and Variable Rules

Part IV Variation in Pedagogical Perspective

8 Speaking Style and Monitoring
  Monitoring—Attention Paid to Speech
  Accommodation, Audience Design, and Self-identification
  Reconciling Monitoring and Audience Design

9 Teaching Implications
  Social Dimensions
  Psychological Dimensions
  A Philosophy of Language Teaching

Appendix: Variation and Change in Color Semantics
Notes
References
Index

Preface

Over the years, second language acquisition (SLA) scholars have followed in the footsteps of researchers in other branches of linguistics. For example, the theory of Universal Grammar was developed to explain first language acquisition, but that research has led to a large number of studies investigating whether the theory also applies to second language acquisition. The same is true in the field of quantitative sociolinguistics, a field that was developed by William Labov and his associates in the 1960s and 1970s to describe social and regional dialects. Labov (1966, 1969, 1972a, 1972b) found that these language varieties differed from each other not so much in terms of features that were always present in one variety and never present in another, but in terms of the frequency at which shared features occurred. For example, young Philadelphians of all social classes sometimes raise their front vowels so that “plate” can be pronounced [pliyt], but working-class speakers raise them more frequently than middle- and upper-class speakers. Labov also found that the way people talk can be correlated with factors besides their social class, such as their age, race, and gender, and also with the topic, the setting, and the linguistic context in which their speech occurs. Thus, a young working-class man telling a story in a bar is almost certain to say, “I’m runnin’ late,” whereas a young, middle-class woman apologizing for being late to a job interview is almost certain to say, “I’m running late.” Labov and his associates developed a number of sophisticated tools, including the Varbrul multivariate analysis program, for studying variation in speech and correlating alternating forms, like [pleyt] and [pliyt], with social and linguistic factors. Using these tools to study SLA was an obvious step because variation is the hallmark of interlanguage. Language learners, like mature native speakers, don’t always say the same thing in the same way. But, whereas native speakers alternate between a formal and an informal variant (for example, “I’m running” versus “I’m runnin’ ”), nonnative speakers often alternate between a grammatical and an ungrammatical variant. For example, Alberto, Schumann’s (1978) famous subject, sometimes said, “I don’t like it” and sometimes said, “I no like it.” SLA researchers asked whether the methods developed by mainstream variationists to study systematic variation in first language speech could uncover systematic variation in second language speech. The answer was “yes,” and the quantitative study of variation in interlanguage is now one of the recognized schools of SLA research, included in standard introductions to the field such as Ellis (1994) and Mitchell and Myles (2004). The question of how quantitative sociolinguistics and related theories of language use can be applied to SLA is the topic of this book.

Part I, Variation in Native Speaker Speech, describes the basics of Labovian sociolinguistics and shows how the field is relevant to the study of interlanguage. Labov began his work at a time when the reigning linguistic paradigm was Chomsky’s (1965) Standard Theory, and Labov considered his own work to be an extension and refinement of that theory. This extension was challenged by generative grammarians, as was the application of quantitative analysis to the study of SLA. These debates are reviewed in chapter 1, where it is suggested that generative theory and Labovian theory describe language at different levels of abstraction, with quantitative descriptions of language closer to performance models of language comprehension and production, as described in chapter 6. In order to give readers a concrete example of what a variation study looks like, chapter 2 presents a study of -ing variation in Philadelphia speech, which uses the Varbrul program. Chapter 3 shows how knowledge of synchronic variation can shed light on the processes of historical language change. The basic idea is that synchronic variation in a speech community is a snapshot of linguistic change in progress. For example, at an earlier time, Philadelphians did not raise their front vowels. This change emerged among working-class speakers and has spread to the other social classes. It is possible that front vowel raising will continue to increase in frequency among all social classes until the vowel system in Philadelphia English is reorganized, as happened earlier in the Great English Vowel Shift. Chapter 3 also features a discussion of five basic problems in understanding sound change that were identified by Weinreich, Labov, and Herzog (1968). These problems are revisited from the perspective of change in color category systems in the appendix.

Part II, Variation in Nonnative Speaker Speech, extends the study of variation to interlanguage.
Chapter 4 reviews the history and development of variation studies of interlanguage, and chapter 5 presents an example of such a study, the acquisition of English irregular past tense forms by Chinese-speaking children, again using the Varbrul program. Part III, Variation in Theoretical Perspective, explores the psychological underpinnings of quantitative sociolinguistics. Historically, Labovian sociolinguists have used the term Variation Theory to describe their school of linguistics, but the theoretical aspects of this work have lagged behind the data-based research and empirical ﬁndings. Nevertheless, I will use the term “Variation Theory” throughout the book, and in this section I suggest some of the theoretical underpinnings for developing a more comprehensive theory of language variation. As mentioned earlier, Variation Theory was originally conceived as an extension of generative grammar, but that attempt proved to be a category error. Generative grammar is not concerned with the probabilities at which linguistic forms are used, only with whether the forms are grammatical. However, psycholinguistic models of language production and comprehension are

very much concerned with probabilities, and in chapter 6 I show how Variation Theory is compatible with these models. Barsalou’s (1992) sentence production model (based on Levelt, 1989 and Garrett, 1975) and Townsend and Bever’s (2001) sentence comprehension model are reviewed. Barsalou’s model, when suitably updated, could employ connectionist networks to make probabilistic decisions about what forms to produce, and Townsend and Bever’s model employs connectionist networks to interpret incoming speech. Because both connectionist networks and Variation Theory deal with probabilities, their similarity is obvious, and Mitchell and Myles (1998, p. 178), among others, have suggested that the relationship between the two theories ought to be worked out. Like Variation Theory, connectionist linguistics is empirically oriented and has not been closely associated with a particular grammatical theory, with one exception. That exception is the theory of Cognitive Linguistics (CL), which, as Feldman (2006) shows, is compatible with connectionist networks. In fact, in some cases a CL description can be considered a more abstract representation of what connectionist networks do. Chapter 7 describes CL and shows how it is compatible with connectionist models. Then, the chapter goes on to show that both CL and connectionism are compatible with Variation Theory. This is perhaps the most innovative claim in the book because it suggests that variationists can look to CL for a more abstract characterization, a grammatical characterization if you like, of much of their work. Part IV, Variation in Pedagogical Perspective, describes some teaching implications of the theoretical perspectives and research ﬁndings presented in Parts I, II, and III. 
Chapter 8 compares Labov’s theory of monitoring (paying attention, either consciously or unconsciously, to the form of speech) to Krashen’s (1978, 1982, 1985, 1987) theory of monitoring, which has had enormous inﬂuence on language teaching practice. Both similarities and diﬀerences in the two theories are pointed out. Chapter 8 also discusses accommodation theory and related theories that were developed by sociolinguists who were not followers of Labov. These theories complement (their proponents say contradict) Labov’s theory of monitoring but better explain how a speaker’s sense of personal identity and how diﬀerent audiences aﬀect the way people talk. This chapter also discusses how accommodation theory has been applied to the study of interlanguage. Chapter 9 further explores the teaching implications of Variation Theory and CL. The chapter is divided into two parts. The ﬁrst part, Social Dimensions, reviews the debate on whether to teach informal and nonstandard forms of a foreign or second language. It is noted that, although communicative competence is usually taught in both foreign and second language settings, sociolinguistic competence (Bachman, 1990), that is, the ability to understand and produce informal and nonstandard forms, is seldom taught and is a highly controversial subject. The second part of the chapter, Psychological Dimensions, considers the teaching implications of Variation Theory proper and CL.

The chapter ends with a discussion of the pedagogical implications of the philosophy of Empirical Realism (Lakoff, 1987), a philosophy developed by cognitive linguists.

In the appendix, the similarities and differences between sound change, as discussed in chapter 3, and change in color category systems are pointed out. The theory of color category change has provided core concepts to the theory of CL. Berlin and Kay (1969) found that the languages of the world contain from as many as 11 basic color terms to as few as two. They also found that as a society develops in complexity and technology, it expands its inventory of basic color terms. In other words, color category systems evolve over time just as sound systems do. Because of this similarity, it is possible to discuss Berlin and Kay’s (1969) theory of color category change in terms of the five problems for understanding sound change that were introduced in chapter 3.

This book is intended for students who have a good foundation in linguistics but who are not familiar with Variation Theory or CL. The book attempts to place the quantitative study of SLA in the context of related disciplines, including generative linguistics, psycholinguistics, CL, and, of course, Variation Theory. Therefore, the discussion is wide-ranging, but I have tried to write at the level of an introductory textbook in each of these disciplines. Most of the chapters review, relate, and interpret the relevant literature, with three exceptions. Chapter 2 contains a data-based study of variation in native speaker speech, and chapter 5 contains a data-based study of variation in interlanguage. The suggestions for teaching in chapter 9, for the most part, have not been made previously. The book could be used as an auxiliary text in a course in SLA or mainstream sociolinguistics. It would be even better as the main text in a course in quantitative SLA, and it is offered in the hope that there will be more such courses in the future.
Acknowledgements

Thanks to my students and colleagues in the Interdisciplinary Ph.D. Program in Second Language Acquisition and Teaching at the University of Arizona. I have used material from the book in my classes, and my students’ comments have been most helpful. Thanks especially to my co-authors, Vera Regan and Larry Berlin, for their support and collaboration. Conversations with Roy Major, Norma Mendoza-Denton, and the late Robert MacLaury were also very helpful. The author is solely responsible for any shortcomings in the book.

Phonetic Symbols Used in the Book

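Vowels
High: iy, I (front); uw, u (back)
Mid: ey, e (front); ə (central); ow, o (back)
Low: æ (front); a (central)

Diphthongs: ay, aw, oy

Consonants
Stops: p, b (bilabial); t, d (alveolar); k, g (velar)
Fricatives: f, v (labiodental); θ, ð (dental); s, z (alveolar); š, ž (palatal); h (glottal)
Affricates: č (palatal)
Nasals: m (bilabial); n (alveolar); ŋ (velar)
Liquids: l, r (alveolar)
Glides: w (bilabial); y (palatal)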

List of Figures and Tables

Figures
2.1 Decision tree for stylistic analysis of spontaneous speech in the sociolinguistic interview (from Labov, 2001b).
3.1 Mean values of all Philadelphia vowels with age coefficients.
3.2 Mean values of all Philadelphia vowels with age coefficients.
6.1 Results of Hudson Kam and Newport’s (2005) experiment in the acquisition of determiners in an artificial language.
6.2 Barsalou’s (1992) model of speech production.
6.3 The Aphasia Model for naming cat.
6.4 Simplified version of Spivey and Tanenhaus’s (1998) connectionist model for selecting an RR or MC template.
6.5 Spivey and Tanenhaus’s (1998) connectionist model for selecting an RR or MC template (adapted).
6.6 A connectionist interpretation of Elliott’s (1975) data.
7.1 The prototype category BIRD.
7.2 The radial category “ditransitive verb” (from Goldberg, 1995).
7.3 Verbs that fuse with the ditransitive construction (from Goldberg, 1995).
7.4 A possible representation of prototype schema (6) as a connectionist network.
8.1 Class stratification of /r/ in guard, car, beer, beard, etc. for native New York City adults (based on Labov 1972a, p. 114).
8.2 The monitor model.
8.3 Bell’s categorization of different types of style shifting (from Bell, 2001).
9.1 Bachman’s model of communicative language ability.
9.2 The force image schema.
9.3 The transfer of possession schema.
A.1 Stages of evolution in color categories.

Tables
1.1 Deletion of final /t,d/ in Detroit African American English in informal style by social class.
1.2 /t,d/ deletion in Detroit African American English by social class and linguistic environment.
1.3 Varbrul results for /t,d/ deletion by African American English speakers from Detroit (all social classes).
2.1 Syntactic categories in which -ing occurs.
2.2 Probabilities of N in the Philadelphia native speaker data according to monitoring, gender, grammatical category, and following phonological environment.
2.3 Frequency of N according to style and gender.
2.4 Percentage of N by following phonological environment.
3.1 Frequencies of [æ] tensing for two words by age group.
3.2 Varbrul weights for the backing of // in Belten High, for social categories and genders.
4.1 Percentage of forms for L1 speakers of Quebec French, French language arts materials, French immersion teachers, French immersion students.
5.1 Three semantic features associated with Vendler’s (1967) four semantic categories of verbs.
5.2 Overview of individual subjects.
5.3 Percentage of accurate past tense marking by individual subject and time period. Subjects are ranked according to their overall accuracy.
5.4 Percentage of accurate past tense marking by individual subject and verb class.
5.5 Percentage of accurate past tense marking by individual subject and semantic type of verb.
5.6 Percentage of accurate past tense marking by individual subject and clause type.
5.7 Varbrul analysis for past tense marking.
5.8 Percentage of accurate past tense marking by time period and semantic type of verb.
6.1 Percentage of correct se usage by semantic domain in Elliott’s (1995) study.
6.2 Percentage of overgeneralization of se by semantic domain in Elliott’s (1995) study.
7.1 Classification of verbs on the grammaticality judgment task according to Goldberg’s (1995) and Gropen et al.’s (1991) criteria.
7.2 Native English and native Korean speakers’ ratings for the grammaticality of the ditransitive construction with various verbs.
8.1 Style shifting on four target language forms in narrative style, interview style, and grammar test style by Arabic speakers.
8.2 Varbrul p values for plural -s marking by Chinese speakers according to convergence with interlocutors.
8.3 Percentage of informal variants for sentence reading style versus phrase reading style for three native language groups.
8.4 Percentage of informal variants for males and females for three native language groups.

Part I Variation in Native Speaker Speech

1 Variation Theory

Introduction—The Cartesian Mind

The Logical Problem of Learning

Two of the oldest questions in philosophy are the ontological question and the epistemological question. The ontological question asks, “What is the nature of reality?” The epistemological question asks, “How do we know what we know?” Since Plato, philosophers have observed that we know more than we have evidence for. As Bertrand Russell put it, “How comes it that human beings, whose contacts with the world are brief and personal and limited, are nevertheless able to know as much as they do know?” (quoted in Chomsky, 1986, p. xxv). An example of something we know instinctively, without sufficient evidence, is that physical objects continue to exist when no one is looking at them. Yet no observation can confirm that objects persist when unobserved, as Hume pointed out. The problem of how we know things based on limited evidence has been called the logical problem of learning. The epistemological question and the logical problem of learning were debated by philosophers during the seventeenth century, who came up with some surprisingly modern answers. As Chomsky (1999, p. 36) observes:

These 17th century thinkers speculated rather plausibly on how we perceive the objects around us in terms of structural properties, in terms of our concepts of object and relation, cause and effect, whole and part, symmetry, proportion, the functions served by objects and the characteristic uses to which they are put. We perceive the world around us in this manner, they argued, as a consequence of the organizing activity of the mind, based on its innate structure and the experience that has caused it to assume new and richer forms.

John Locke and René Descartes proposed two different answers to the epistemological question, and the debate between their intellectual descendants continues to this day.
Locke believed that the mind is like a blank slate and that all of the ideas that it eventually contains are supplied by experience. He wrote:

Let us then suppose the mind to be, as we say, white paper void of all characters, without any ideas. How comes it to be furnished? Whence comes it by that vast store which the busy and boundless fancy of man has painted on it with an almost endless variety? Whence has it all the materials of reason and knowledge? To this I answer, in one word, from EXPERIENCE. (quoted in Pinker, 2002, p. 5)

Descartes, on the other hand, believed that if all human knowledge were based only on experience, there would be no way to be certain about the truths of mathematics, science, or anything else because no two people’s experience is the same. However, he believed that we can be certain of at least one thing: the existence of our own minds: “I think, therefore I am.” A corollary of this proposition is that the mind can also know its own ideas or representations. Knowledge consists of grasping what these ideas are, and working out the connections between them. Many of our ideas, Descartes believed, are imposed on our minds by the force of logical necessity. Such innate ideas included the basic concepts of mathematics and geometry, and, indeed, the idea of God.

Other philosophers took a middle position. Kant agreed with Locke that knowledge of the world is gained through sensory experience, but argued that this experience must be organized by principles that are inherent in the mind. Leibniz expressed the same idea this way: “There is nothing in the intellect that was not first in the senses except the intellect itself” (quoted in Pinker, 2002, p. 34).

Modern psychology has found considerable evidence that supports Kant’s and Leibniz’s position that the mind has innate ways of organizing experience. In a famous set of experiments Spelke, Vishton, and von Hofsten (1995) demonstrated that three- to four-month-old infants have a fairly well-developed ontological theory. They understand what an object is, that objects normally move along a continuous trajectory, and that objects cannot disappear from one place and reappear in another. The method used to demonstrate these facts is extraordinary. Before they can crawl, infants can turn their heads to observe their surroundings, and they like to watch new or unexpected things. They are easily bored and will turn their heads to see new things and will look longer at things that are unexpected.
By timing how long infants will look at a scene before turning their heads, researchers can infer what the infants consider new or unexpected. Using this methodology, Baillargeon (1995) showed that infants do not expect one object to pass through another. The babies she studied were fascinated by an animated scene in which it appeared that a panel placed in front of a cube fell flat, right through the space that the cube should be occupying.

Another well-studied example of innate mental organizing principles (which is examined in more detail in the appendix) comes from research on how languages name colors. At first glance color naming systems seem to be remarkably diverse. The language of the Dani of New Guinea has only two color terms (which roughly correspond to “light” and “dark”), whereas English has eleven basic color terms. According to Berlin and Kay (1969), a color term is basic if it is a single morpheme, is not derived from another color term (as reddish-brown is), and uniquely names a region of the color spectrum. The eleven basic color terms in English are:

black, white | red | yellow, green, blue | brown | purple, pink, orange, gray
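Read left to right, the chart defines an ordered sequence of stages. As a minimal sketch in Python, the function below reports which earlier-stage terms a given inventory lacks, assuming that a term from a later stage presupposes every term from the stages before it. The five-way grouping follows the chart as printed here, and the names `STAGES` and `missing_terms` are hypothetical, introduced only for illustration.

```python
# Stages read left to right from the chart; this grouping is an assumption
# based on the chart as printed in the text.
STAGES = [
    {"black", "white"},
    {"red"},
    {"yellow", "green", "blue"},
    {"brown"},
    {"purple", "pink", "orange", "gray"},
]


def missing_terms(inventory):
    """Return the earlier-stage terms that the inventory lacks.

    The latest stage with any term present sets the expectation:
    every term from the stages before it should also be present.
    """
    inventory = set(inventory)
    present = [i for i, stage in enumerate(STAGES) if stage & inventory]
    if not present:
        return set()
    expected = set().union(*STAGES[:present[-1]])
    return expected - inventory


english = set().union(*STAGES)           # all eleven basic terms
two_term = {"black", "white"}            # roughly "dark" and "light"
odd = {"black", "white", "red", "brown"}

print(missing_terms(english))            # set()
print(missing_terms(two_term))           # set()
print(sorted(missing_terms(odd)))        # ['blue', 'green', 'yellow']
```

An empty result means the inventory is consistent with the ordering. The third inventory is an invented counterexample, not an attested language.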

Berlin and Kay (1969) discovered a remarkable fact about the color-naming systems in all of the languages they studied. If a language has a color term on the right side of the chart above, it will also have the terms to the left of it. Thus, if brown is a basic color term in a language, that language will also have terms for yellow, green, blue, red, white, and black. The reason for this universal order among color systems is the human perceptual apparatus. People perceive six purest colors: red, yellow, green, blue, white, and black. The maximum contrasts between these colors determine which colors will be named in a system with a particular number of terms. For example, if a system has three terms, they will be roughly light, dark, and reddish, which divide the color spectrum up into three maximally contrasting regions. As discussed in the appendix, social factors are also important in how a language names basic colors, but these factors are constrained by principles of color perception and sensory processing that are innate in human beings.

The Logical Problem of Language Acquisition

One aspect of the logical problem of learning involves how we are able to learn a language. The problem applies to first language acquisition in a straightforward way. How can we explain the fact that children know grammatical patterns that they have not encountered in the input they receive? Evidence of such knowledge was found in a classic experiment by Crain and McKee (1986). They showed that three-year-olds understand that in the sentence, “He was eating pizza while Kermit the Frog was dancing,” he must refer to someone other than Kermit the Frog, say Cookie Monster. This finding is surprising because the reason he must have a referent that is not mentioned later in the sentence is very abstract. At first glance one might think that the reason is fairly simple: a pronoun cannot precede its referent.
But this is not the case, as shown by the sentence, “While he was dancing, Kermit the Frog was eating pizza.” The actual reason involves a principle of Universal Grammar (UG) called Binding Principle B, which states that a pronoun cannot refer to an NP that it c-commands. C-command, roughly put, means this: In a tree structure, node A c-commands node B if there is a node C that directly dominates node A and also dominates node B. For example, in the case of the tree on page 8, the ﬁrst NP (Marsha) c-commands VP and every node within VP. Similarly, V (picked) c-commands Part, the ﬁnal NP, and every node within the ﬁnal NP. Chomsky

claims that children know Binding Principle B and the other principles of UG innately. Chomsky’s answer to the logical problem of ﬁrst language acquisition, then, is that children can use grammatical knowledge that they could not have ﬁgured out from input because they are born with it. The logical problem of learning a language applies diﬀerently to adult second language acquisition because in this case it is more diﬃcult to claim that learners know grammatical patterns that they have not encountered; after all, they already know a whole language. Therefore, it is not clear what role, if any, UG plays in adult second language acquisition. Three positions are usually distinguished. The fundamental diﬀerence hypothesis (Bley-Vroman, 1990) claims that UG is not involved at all in second language acquisition. The evidence cited for this position is that, unlike ﬁrst language acquisition, second language acquisition is often not successful, as implied by the failure of many language teaching programs to produce ﬂuent speakers. The second position is that UG works for adults in much the same way that it works for children. Supporters of this view (Schwartz and Sprouse, 1996; Shi, 2003) point out that there are many cases of successful adult language learning and that the logical problem of language acquisition applies to these cases just as it applies to children. Therefore, UG must be at work. The third position is that only the principles and parameters employed during ﬁrst language acquisition are available to aid the second language learner. As an example, consider another principle of UG, the null subject principle, which states that a sentence subject either must appear in the surface structure or need not appear in the surface structure. Like many UG principles, the null subject principle comes with an associated parameter, which can be set to plus or minus. 
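The rough definition of c-command given above can be made concrete with a small tree sketch. The Python below is illustrative only: the `Node` class and the function names are hypothetical, and the Det and N nodes inside the final NP are assumed for the example. The check also adds the usual proviso, not part of the rough definition, that a node does not c-command itself or anything it dominates.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by label
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        # Record the mother node so we can walk upward later.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if b is a proper descendant of a."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """True if the node directly dominating a also dominates b."""
    mother = a.parent
    if mother is None or a is b or dominates(a, b):
        return False
    return dominates(mother, b)


# Assumed structure: S -> NP VP; VP -> V Part NP; NP -> Det N
det, noun = Node("Det"), Node("N")
final_np = Node("NP", [det, noun])
verb, part = Node("V"), Node("Part")
vp = Node("VP", [verb, part, final_np])
subject = Node("NP")  # e.g. "Marsha"
s = Node("S", [subject, vp])

print(c_commands(subject, vp))    # True
print(c_commands(subject, noun))  # True
print(c_commands(verb, part))     # True
print(c_commands(verb, subject))  # False
```

The results match the examples in the text: the subject NP c-commands VP and everything inside it, while V c-commands Part and the final NP but not the subject.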
In Spanish, the null subject parameter is set to plus, which means that Spanish allows null subject sentences, that is, it allows sentences without a surface subject, such as: Tengo un libro “I have a book” (literally “Have a book”). In English and French, on the other hand, the null subject parameter is set to minus, which means that these languages require that sentences have subjects on the surface, except in special cases like commands. Those who say that only the principles and parameter settings employed in the ﬁrst language are available in learning a second language would predict that a speaker of French will be able to learn this aspect of English easily because English has the same setting, but a speaker of Spanish will have more diﬃculty because Spanish has the opposite setting.

The Developmental Problem of Language Acquisition

Gregg (1996) makes a distinction between the logical problem of language acquisition and the developmental problem. The answer to the logical problem, he maintains, must involve a comprehensive theory of language, like UG. Such a theory is a property theory, which addresses the question, “How is acquisition possible?” A property theory not only explains how language is acquired but also goes a long way toward explaining what language is. “A property theory

Variation Theory

• 7

describes the components that constitute the system, and their interrelations” (Gregg, 1996, p. 51). The developmental problem of language acquisition, on the other hand, involves a transition theory, which addresses the question, “Why does child language or interlanguage change from state A to state B?” UG is not a transition theory. For example, notice that Binding Principle B sets limits on how children construct an internal grammar of pronoun reference, but it does not describe the course of this acquisition in detail, predicting, for example, which pronouns a child will use ﬁrst. A transition theory of pronoun reference might complement the UG property theory of pronoun reference by describing the order in which a particular speaker or group of speakers acquired pronouns, coupled with an explanation of why that order occurred, an explanation that might involve, for example, the frequency of pronouns encountered in the input. Gregg (1996) maintains that an adequate transition theory must be associated with a property theory, and that would, no doubt, be desirable. Unfortunately, however, we have no agreed-upon property theory. The property theory that Gregg endorsed in 1996, the Government and Binding version of generative grammar, was in the process of changing fundamentally even as he wrote. Therefore, it seems reasonable to investigate a transition theory of language development independently of any particular property theory, keeping in mind that the two kinds of theories must eventually be compatible. I will claim in the next section that Variation Theory, a way of modeling change in linguistic systems, can help to explain the developmental aspects of ﬁrst and second language acquisition, and thus serve as part of a transition theory. Generative Grammar Variation theory was developed in the 1960s and 1970s, largely by William Labov (1966, 1969, 1972a, 1972b), to explain the relationship between diﬀerent varieties of the same language. 
Labov and his colleagues took a special interest in nonstandard varieties, especially African American English, whose speakers often use both nonstandard forms and standard forms in the same discourse, for example, “He play basketball everyday” and “He plays basketball everyday.” It should be noted that similar variation is also common in standard varieties of English (“He’s playing basketball now” and “He is playing basketball now”). As we will see in chapter 2, variationist research has shown that such alternations are usually systematic and not random. Originally, Variation Theory was thought of as an extension of generative grammar, so in order to understand the early days of Variation Theory, we must take a look at the reigning generative theory of that time. The Standard Theory The generative model that had the most inﬂuence on Variation Theory was the Standard Theory proposed by Chomsky in 1965. Recall that in this


syntax-driven model re-write rules of the form S → NP VP; NP → det N; and VP → V NP produced tree structures that terminated in lexical items, as in the tree below, which was called a deep structure.
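The way re-write rules license a deep structure can be sketched in a few lines of Python. This is only an illustration: the nested-tuple representation and the function names are hypothetical, and two of the rules (a bare-noun NP for “Marsha” and a VP containing a particle) are assumptions added so that the example sentence can be built. They are not among the three rules listed above.

```python
# Re-write rules as (parent, children) pairs.
RULES = {
    ("S", ("NP", "VP")),
    ("NP", ("det", "N")),
    ("NP", ("N",)),                # assumption: bare NP for a proper noun
    ("VP", ("V", "Part", "NP")),   # assumption: VP with a particle
}

# A deep structure as nested tuples: (category, child, child, ...).
# A leaf is (category, word).
DEEP = ("S",
        ("NP", ("N", "Marsha")),
        ("VP", ("V", "picked"), ("Part", "up"),
               ("NP", ("det", "the"), ("N", "book"))))


def is_leaf(node):
    return len(node) == 2 and isinstance(node[1], str)


def terminals(node):
    """Read the lexical items off the bottom of the tree, left to right."""
    if is_leaf(node):
        return [node[1]]
    words = []
    for child in node[1:]:
        words.extend(terminals(child))
    return words


def licensed(node):
    """True if every branching in the tree matches one of the RULES."""
    if is_leaf(node):
        return True
    expansion = tuple(child[0] for child in node[1:])
    return (node[0], expansion) in RULES and all(licensed(c) for c in node[1:])


print(" ".join(terminals(DEEP)))  # Marsha picked up the book
print(licensed(DEEP))             # True
```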

Deep structures could be rearranged using transformational rules to produce surface structures. For example, the particle movement transformation applied to the tree structure above would change the sentence “Marsha picked up the book” into the sentence “Marsha picked the book up.” The particle movement transformation rule looked like this:

V        part   NP          →   V        NP          part
picked   up     the book        picked   the book    up
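As a rough illustration, the particle movement rule can be written as a function over a flat sequence of labeled constituents. The list-of-pairs representation and the function names are assumptions made for this sketch and are not part of the Standard Theory formalism.

```python
def particle_movement(constituents):
    """Move a particle to the right of the NP that follows it.

    `constituents` is a list of (category, words) pairs. A copy is returned;
    it is unchanged if no V + part + NP sequence is found.
    """
    result = list(constituents)
    for i in range(len(result) - 2):
        cats = [cat for cat, _ in result[i:i + 3]]
        if cats == ["V", "part", "NP"]:
            # V part NP -> V NP part
            result[i + 1], result[i + 2] = result[i + 2], result[i + 1]
            break
    return result


def surface(constituents):
    return " ".join(words for _, words in constituents)


deep = [("NP", "Marsha"), ("V", "picked"), ("part", "up"), ("NP", "the book")]
print(surface(deep))                     # Marsha picked up the book
print(surface(particle_movement(deep)))  # Marsha picked the book up
```

Note that the function, like the rule, says nothing about how likely the movement is. It applies equally to a short NP and to a long one.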

This rule says that in a sentence that contains a verb/particle combination followed by an NP, the particle can be moved to the right of the NP. The rule can, of course, be used to generate many other sentences, including “Marsha picked the book with the Corinthian leather binding up.” Whether a particle is likely to move is not indicated by the rule, even though we know that particles more often move around short NPs, like “the book,” than around long NPs, like “the book with the Corinthian leather binding.” However, probabilistic statements were outside the scope of generative grammar. The goal of the grammar was to generate sentences that were grammatical, not to address questions of probability.

Competence and Performance

Suppose you turn on your computer, boot up a math program, and punch in 2 + 2 =

The computer produces the answer: 4. You may have wondered how the machine does it. The explanation must be given at several levels of abstraction. At the most abstract level is the equation 2 + 2 = 4. This tells us conceptually what the machine is up to. At the most concrete level is a description of the actual circuits that go on and oﬀ as the computer performs the operation. There are also descriptions at intermediate levels of abstraction. For example, writing the equation in base 2 (10 + 10 = 100) is closer to the concrete level


because the machine does its calculations using the binary system. At a still more concrete level are lines of code in a programming language, which are actual instructions to the computer. For example, here is how you program a computer to add 2 and 2 in Basic:

1. int x = 0     This tells the computer to reserve a space in memory to hold an integer, to name the space “x”, and to place the integer 0 in the space.
2. x = 2 + 2     This tells the computer to combine 2 and 2 and place the result in space x.
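The same two steps, together with the base 10 and base 2 descriptions of the result, can be sketched in Python. Python is substituted for Basic here purely for illustration, and the variable names are assumptions.

```python
# The two program steps described above.
x = 0        # name a storage location "x" and place the integer 0 in it
x = 2 + 2    # combine 2 and 2 and place the result in x

# The same calculation described at two levels of abstraction.
decimal_form = f"{2} + {2} = {x}"
binary_form = f"{2:b} + {2:b} = {x:b}"

print(decimal_form)  # 2 + 2 = 4
print(binary_form)   # 10 + 10 = 100
```

Both lines of output carry the same information. They differ only in how close the notation is to what the machine actually manipulates.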

A level of abstraction between a programming code and the description of circuits is called the machine code, and it corresponds fairly closely to what the actual circuits do when making the calculation. All of these descriptions are conceptually equivalent in the sense that they share the same basic information. However, the descriptions at diﬀerent levels of abstraction are useful for diﬀerent purposes. A machine code description is useful to someone who is investigating an error message that may be caused by a virus. The Basic code description is useful to someone who wants to be able to add 2 and 2 on a computer that can run Basic. The mathematical equation is useful for understanding the logic of what the computer is doing. Many linguists and psychologists have adopted the computer metaphor of the mind, which Smolensky (2001, p. 323) describes as follows: “Just as a program is an abstract higher-level description of a computer, a mind is an abstract, higher-level description of a brain.” How does the mind produce sentences like “Marsha hit John”? The explanation is not straightforward but (like the explanation of how a computer performs a calculation) must be given at several levels of abstraction. The most concrete level, the level of “wetware,” is a description of which neurons in the brain ﬁre during the production of a particular sentence. At a more abstract level, and perhaps analogous to machine code, is what Fodor (1975) calls the “language of thought.” At a still more abstract level, is a Standard Theory type of explanation, which provides the derivational history of a sentence in terms of phrase structure rules, transformational rules, lexical insertion rules, and phonological rules. Such an explanation might be considered analogous to equations like 2 + 2 = 4 in a description of how a computer performs addition. 
Chomsky has singled out two levels of abstraction as important for understanding the mental processes involved in sentence production (and this claim applies to all versions of Chomskian theory, including the most recent): competence and performance. These terms have caused a lot of confusion. A competence theory is a theory of the mental system that underlies language behavior (Gregg, 1996, p. 53). A performance theory is a theory of the internal mental processes involved in language behavior. Generative linguistics is often considered to be the study of competence and psychology is considered to be


the study of performance, but Chomsky insists that both levels of description are important for understanding human language abilities and both are part of a psychological explanation. Stabler (1984) characterizes Chomsky’s position as follows: Chomsky . . . argu[es] that linguistics characterizes what is computed, the linguistic “data structures,” while other psychological investigations explain how those linguistic structures are computed. (p. 156) In terms of the addition analogy, we might consider competence to be the equivalent of 2 + 2 = 4 and performance to be the equivalent of the Basic code. Some psychologists, however, have claimed that generative grammar adopts a level of description that is too abstract. Valian (1979) summarizes this objection as follows: [Because] the grammar of a language does not have an automatic performance interpretation, . . . psycholinguists have attempted to specify performance independently of competence. To the extent that they have been successful, they have suggested that the distinction between competence and performance is unnecessary and that competence itself is not a useful notion. (p. 1) In terms of the computer analogy, it is as though computer scientists who were able to explain how a computer performs addition at the level of machine code went on to claim that a discussion of addition using numbers in base 10 was unnecessary. That claim would be wrong. It is true that base 10 mathematics does not get you far in describing the operation of a speciﬁc machine because diﬀerent computers perform addition in diﬀerent ways, but all of the ways are based on the same mathematical principles, which are usefully stated (in our culture, at least) in terms of base 10 mathematics. Similarly, many principles of language acquisition, such as Binding Principle B, can be usefully stated at the competence level. 
During the 1960s, some psychologists proposed an interesting possibility for narrowing the gap between competence and performance. As we have seen, the Standard Theory generates sentences, but generative grammarians have often pointed out that the term “generate” is intended in the mathematical sense of specifying all and only the grammatical sentences of a language. That is, the rules of the grammar are not supposed to apply in real time to produce sentences but rather to exist all at the same time to specify which word strings are grammatical and which are not. Such a deﬁning function is clearly in the realm of competence. But, these psychologists wondered, what if generative rules were thought of as applying in real time? That is, what if the competence rules (perhaps with some modiﬁcations) could work as a performance model of sentence production? If this turned out to be true, the competence/performance


distinction would still exist (competence rules would still “deﬁne” grammatical sentences while performance rules would model how speakers actually mentally produced sentences), but the gap between the two levels of explanation would be greatly narrowed. For a few heady years, it looked like this hypothesis, called the derivational theory of complexity (DTC), was right. Generative Grammar and Psycholinguistics In the 1960s, a number of experiments showed that the amount of time it takes to produce or comprehend certain sentences depends on how many transformations are in their derivations. For example, the passive transformation will change the base sentence “Marsha hit John” into “John was hit by Marsha,” and the negative transformation will change that sentence into “John was not hit by Marsha.” Thus, the negative passive sentence requires more transformations than either the active sentence or the declarative passive sentence. Miller and McKean (1964) found that subjects took longer to comprehend negative passives than declarative passives, and that declarative passives took longer to comprehend than actives. This ﬁnding suggested that to understand a sentence, subjects had to mentally undo the transformations that had been applied to it. Miller and McKean (1964) also found that when subjects were prompted with a declarative active sentence, it was easier for them to produce a declarative passive sentence than a negative passive sentence, suggesting that the negative passive sentence required additional mental processing (in the form of a mental negative transformation) just as the Standard Theory implied. 
Other studies (Clifton, Kurcz, and Jenkins, 1965; Clifton and Odom, 1966) supported the idea that derivationally complex sentences are more difﬁcult to process than their less complex counterparts, suggesting that, at least in regard to the sentences in question, the Standard Theory competence account of a derivation could also work as a performance model of sentence comprehension and production. Townsend and Bever (2001, pp. 29–30) remark, “The . . . hypothesis of a direct mapping from the structure of linguistic knowledge and language behavior was wildly successful. The golden age had arrived.” The golden age did not last long, as problems with the DTC were noticed immediately. The researchers just mentioned had not claimed that all transformations made sentences more diﬃcult to process, just the transformations they had studied, but it seemed possible that transformations usually had this eﬀect. However, experiments showed that some transformations seemed to make processing easier, not harder. For example, according to the Standard Theory, the second sentence below is derived from the ﬁrst sentence by applying the extraposition transformation. That John left early surprised Marsha. It surprised Marsha that John left early.


But Fodor and Garrett (1967) showed that subjects understood the extraposed sentence more easily. A second problem for the DTC was that the Standard Theory was replaced by other generative theories. In the Government and Binding Theory (Chomsky, 1981), for example, all transformations except one disappeared. Another problem was that Government and Binding Theory had no clear interpretation as a processing model (Townsend and Bever, 2001, p. 179), and so the distance between a competence grammar and performance model of sentence production increased. However, the latest version of generative grammar, the Minimalist Program (Chomsky, 1995) is less abstract than Government and Binding Theory and, like the Standard Theory, can be interpreted as part of a production model. As Townsend and Bever (2001) remark, “Perhaps a new ‘derivational theory’ of the psychological operations involved in assigning syntactic derivations is at hand” (p. 179). In chapter 6, we will examine this possibility. But now, having reviewed the linguistic and psychological background of the 1960s and 1970s, we can move on to consider Variation Theory, which emerged at this time. Variation Theory—A History Variable Rules Within a speech community, speakers who belong to diﬀerent age groups, social classes, ethnic groups, and genders show systematic diﬀerences in the way they talk. For example, words ending in -ing, such as running and darling, have an informal pronunciation (runnin’, darlin’) as well as a formal pronunciation. As we will see in chapter 2, studies (Cofer, 1972; Houston, 1985) have found that middle-class speakers and women use the formal variant more often than working-class speakers and men. Perhaps the most studied example of socially patterned variation involves the deletion of the sounds /t/ and /d/ when they occur in a consonant cluster in word ﬁnal position, so that the words mist and buzzed are pronounced mis’ and buzz’. 
Studies (Fasold, 1972; Wolfram, 1969) have found that men delete /t,d/ more often than women, that working-class speakers delete /t,d/ more often than middle-class speakers, and that almost all speakers delete /t,d/ more often when they are speaking informally. Wolfram (1969) found that diﬀerent rates of /t,d/ deletion correlated with the social class of African American English (AAE) speakers living in Detroit (who can delete /t,d/ from non-clusters, so that did can be pronounced [di]). This pattern is shown in table 1.1, where /t,d/ deletion rates range from 51 percent for upper-middle-class speakers to 84 percent for lower-working-class speakers. It is remarkable that speakers can learn the frequency at which they should produce variable linguistic forms like in’ and word ﬁnal /t,d/ in order to sound like other members of their demographic groups. But it turns out that this task


Table 1.1 Deletion of ﬁnal /t,d/ in Detroit African American English in informal style by social class.

Class             Deletion rate
Upper middle      .51
Lower middle      .66
Upper working     .79
Lower working     .84

is even more complex than has been suggested so far. The frequency at which a speaker uses variable forms depends not only on the speaker’s demographic characteristics, but also on the linguistic environment in which the form occurs. For example, all speakers delete ﬁnal /t,d/ more often when the following word starts with a consonant (this makes sense because it is more diﬃcult to pronounce a three-consonant cluster than a two-consonant cluster followed by a vowel). Final /t,d/ deletion is also more likely (in native speaker speech) if the ﬁnal /t,d/ does not serve as a past tense morpheme. Thus, deletion is most likely in a phrase like “test me,” where t is part of a three-consonant cluster and is not a past tense morpheme, and is least likely in a phrase like “missed over,” where t is not part of a three-consonant cluster and is a past tense morpheme (we will discuss the eﬀect of linguistic environment in more detail below). The eﬀect of both linguistic environment and social class on /t,d/ deletion is shown in table 1.2.

Table 1.2 /t,d/ deletion in Detroit African American English by social class and linguistic environment.

                                                     Social classes
Environments                                         Upper    Lower    Upper     Lower
                                                     middle   middle   working   working
Following vowel:
  /t,d/ is past morpheme (e.g. “missed in”)          .07      .13      .24       .34
  /t,d/ is not past morpheme (e.g. “mist in”)        .28      .43      .65       .72
Following consonant:
  /t,d/ is past morpheme (e.g. “missed by”)          .49      .62      .73       .76
  /t,d/ is not past morpheme (e.g. “mist by”)        .79      .87      .94       .97

But there are still other factors to consider. Many studies have found that the frequency at which a variable feature is used also depends on the circumstances


of speaking. A classic example is Labov’s (1966) study of /r/ deletion in New York City. New Yorkers can delete /r/ after a vowel (so that fourth ﬂoor is pronounced [foəθ ﬂoə]). Labov found that this deletion correlated not only with the linguistic environment and the speaker’s social class but also with the speaking task. He asked subjects to speak in three diﬀerent circumstances. First, he interviewed the subjects, asking them to provide demographic information (such as how old they were and where they were born) and also to tell stories about childhood ﬁghts, times when they were in danger of death, and other topics. Labov found that the speakers tended to delete /r/ more often when they were telling stories than when they were providing demographic information. He therefore distinguished between a casual style, which was used in narrative, and a formal style, which was used in other parts of the interview. Labov suggested that the speakers tended to delete /r/ more in the casual style because they paid less attention to how they sounded, concentrating instead on telling the story. However, in formal style the speakers monitored their speech, trying to avoid stigmatized forms like deleted /r/. The second speaking situation that Labov (1966) investigated involved reading a passage that contained a number of words from which /r/ could be deleted. He found that in this reading style the speakers deleted /r/ less often than in casual style, a fact that also supports the monitoring hypothesis because, when reading, speakers have more attentional resources to devote to checking for correctness. The third situation that Labov (1966) investigated was reading individual words from a list, a task that allowed the speakers to monitor even more carefully. The speakers ﬁrst read a list of words containing /r/. In this word list style, they deleted even less than in the reading style. 
The subjects then read a list of minimal pairs, one member of which contained /r/, such as god— guard. This minimal pair style, which could be most carefully monitored, contained the least amount of /r/ deletion. To summarize, Labov (1966) found that there are no single-style speakers. His subjects modiﬁed their natural way of speaking (or vernacular style) in circumstances that allowed them to do so. Thus, speaking style (as deﬁned by the context of speaking) is still another factor that sociolinguists must take into account if they wish to model variation in speech. The relationship between monitoring and speaking style will be discussed more fully in chapter 8. Researchers who wished to write a grammar that described probabilistic patterns in speech production, such as those found by Labov, faced a basic problem. How could frequency information be included in a Standard Theory grammar? The solution that Labov and his colleagues proposed was to modify the transformational rules of the Standard Theory so that they speciﬁed the linguistic factors that aﬀected rule application. At ﬁrst, this change appeared to be minor. Generative grammar already contained optional rules, like the rule for particle movement mentioned previously, which generated alternative forms. An optional rule for /t,d/ deletion would look like this:

(1) t,d → (Ø) / C (#) ___ ## {V,C}

Rule (1) says that /t,d/ at the end of a word is optionally deleted when it occurs after a consonant, when followed by either a vowel or a consonant, and regardless of whether it is a separate morpheme. But how could the grammar show that deletion is more likely before a consonant and when /t,d/ is not a morpheme? Labov’s (1969) answer was to propose the variable rule, which speciﬁes the environmental features (called constraints) that favor rule application. Rule (1) can be re-written as a variable rule as follows:

                  〈 Ø 〉            〈 C 〉
(2) t,d → Ø / C   〈 # 〉  ___  ##   〈 V 〉

Rule (2) says that /t,d/ deletion is optional, but that it is more likely before a consonant (which is indicated by writing the C above the V in the angled brackets), and when there is no morpheme boundary (which is indicated by writing the Ø above the #). Rule (2) generates four environments in which deletion can occur. These can be ordered from strongest to weakest (that is, from the environment that most favors deletion to the environment that least favors deletion) as follows:

Ø,C (no morpheme boundary, following C)
#,C (morpheme boundary, following C)
Ø,V (no morpheme boundary, following V)
#,V (morpheme boundary, following V)

Inspection of table 1.2 reveals that this is the order of the environments that favor /t,d/ deletion in Detroit AAE speech.

The Varbrul Program

A computer program called Varbrul (Rand and Sankoﬀ, 1990) was developed as a tool for discovering the constraints on variable rules. The program is similar to other programs used for analyzing variable data in the social sciences, such as ANOVA, but is specially adapted for use with linguistic data. The logic behind a Varbrul analysis is similar to the logic behind other kinds of correlational studies, such as those carried out in agriculture, where scientists might try to correlate various combinations of plant food and fertilizer (the independent variables) with the weight of the tomatoes a plant produces (the dependent variable). Similarly, rule (2) can be thought of as a hypothesis about which linguistic factors correlate with the deletion of /t,d/, where a following consonant and the lack of morpheme status are similar to independent variables and the deletion of /t,d/ is similar to the dependent variable. The function of the Varbrul program, like the function of ANOVA, is to test statistically whether a particular hypothesis is correct. 
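Before any statistical testing, the ordering of environments predicted by rule (2) can be checked directly against the rates in table 1.2. The sketch below is only a descriptive check, not a Varbrul analysis, and the data layout and names are assumptions made for illustration.

```python
# Deletion rates from table 1.2, keyed by (morpheme status, following segment).
# "0" stands for no morpheme boundary (Ø); "#" for a past tense morpheme.
CLASSES = ["upper middle", "lower middle", "upper working", "lower working"]
RATES = {
    ("0", "C"): [.79, .87, .94, .97],
    ("#", "C"): [.49, .62, .73, .76],
    ("0", "V"): [.28, .43, .65, .72],
    ("#", "V"): [.07, .13, .24, .34],
}

# Strongest to weakest environment, as predicted by rule (2).
PREDICTED_ORDER = [("0", "C"), ("#", "C"), ("0", "V"), ("#", "V")]


def follows_predicted_order(class_index):
    """True if the rates for one social class fall strictly in the predicted order."""
    rates = [RATES[env][class_index] for env in PREDICTED_ORDER]
    return all(a > b for a, b in zip(rates, rates[1:]))


for i, name in enumerate(CLASSES):
    print(name, follows_predicted_order(i))
```

For these data the check succeeds for all four social classes, which is what the uniform ordering in rule (2) requires.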
To do this, the analyst supplies the program with data regarding the frequency of /t,d/ deletion in the diﬀerent linguistic environments, such as the data in table 1.2. The program will then


determine whether a proposed feature, such as a following consonant, is correlated with deletion at a statistically signiﬁcant level, and the program will calculate the relative strengths of the linguistic features that favor deletion. In addition, the program calculates a ﬁgure called the input probability, which represents the likelihood that the rule will apply regardless of which constraints are present. The input probability is necessary because variable rule theory assumes that there is some inherent variation in performance that cannot be completely accounted for by all of the known variables. Preston (1989, p. 17) ran a Varbrul analysis on the data in table 1.2 and got the results shown in table 1.3. The decimal ﬁgures (called p values) associated with each of the independent variables in table 1.3 indicate how much that factor contributes to the probability of /t,d/ deletion. A p value greater than .5 indicates that a factor promotes deletion, while a p value less than .5 indicates that a factor inhibits deletion. Notice that the Varbrul analysis in table 1.3 shows how a speaker’s socioeconomic class aﬀects deletion, but that this information is not included in variable rule (2). Similarly, if the eﬀects of monitoring were known, this information could be included in a Varbrul analysis, which would assign a high p value to casual style and a low p value to formal style. However, this information is also not included in variable rule (2). Non-linguistic information was excluded from variable rules because, like Standard Theory rules, they were considered to be part of a competence grammar, whereas social patterning and monitoring were considered to be aspects of performance. Thus, a Varbrul analysis was not equivalent to a variable rule because it made no psychological claims. Rather, Varbrul was considered to be only an analytical tool that could help the researcher write a variable rule, which did make psychological claims. 
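To see how an input probability and a set of factor values might combine into a single predicted rate, consider the following sketch. The text does not say how the program combines them. The sketch assumes the logistic model, in which the values are multiplied together as odds, and the function names are hypothetical. The figures are those reported in table 1.3.

```python
def odds(p):
    return p / (1 - p)


def combine(input_prob, *factor_values):
    """Combine an input probability with factor values on the odds scale.

    Assumes a logistic model: a value of .5 has odds of 1 and so leaves
    the prediction unchanged.
    """
    total = odds(input_prob)
    for value in factor_values:
        total *= odds(value)
    return total / (1 + total)


INPUT = .60
VALUES = {
    "following vowel": .25, "following consonant": .75,
    "morpheme": .31, "nonmorpheme": .69,
    "upper middle": .29, "lower middle": .42,
    "upper working": .60, "lower working": .69,
}

most = combine(INPUT, VALUES["following consonant"],
               VALUES["nonmorpheme"], VALUES["lower working"])
least = combine(INPUT, VALUES["following vowel"],
                VALUES["morpheme"], VALUES["upper middle"])
print(round(most, 2), round(least, 2))  # 0.96 0.08
```

Under this assumption the predicted rates for the most and least favorable cells come out close to the observed rates of .97 and .07 in table 1.2.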
As previously mentioned, changing optional rules to variable rules seemed like a minor alteration in generative theory, but in fact it was a fundamental change and caused considerable controversy among both sociolinguists and generative

Table 1.3 Varbrul results for /t,d/ deletion by African American English speakers from Detroit (all social classes).

Following vowel        .25
Following consonant    .75
Morpheme               .31
Nonmorpheme            .69
Upper middle class     .29
Lower middle class     .42
Upper working class    .60
Lower working class    .69
Input probability      .60


linguists. But before looking at these objections, let us continue with the history of variation theory. The Scope of Variable Rules The variable rule speech community. Variation analysis, and especially the use of the Varbrul program, requires a large number of tokens of the variable being studied. For example, Wolfram’s (1969) study of Detroit African American speech, upon which table 1.2 is based, involved 12 informants, who produced a total of 377 tokens. Cedergren’s (1973) massive study of syllable ﬁnal -s deletion in Panamanian Spanish (where dos libros “two books” can be pronounced [dow liybrow]) involved some 79 speakers and over 22,000 tokens. Such large numbers of tokens cannot be gathered from a single informant; therefore, these studies, like most variation studies, combined the data from many informants, and they assumed that the linguistic constraints operated the same way for almost all of the individuals. In many cases, there are good reasons why this should be so. For example, as mentioned, the fact that a following consonant favors deletion of /t,d/ from a consonant cluster in word ﬁnal position makes sense because the sequence CCC is more diﬃcult to pronounce than the sequence CCV. However, in other cases it is not clear that all speakers in a speech community will share a variable rule with the same linguistic constraints and the same constraint ordering. Nevertheless, the uniform constraints assumption was accepted in early studies without being tested. In fact, several scholars (Cedergren & D. Sankoﬀ, 1974; G. Sankoﬀ, 1974; Wolfram, 1974) tacitly adopted the uniform constraints assumption as one of the features that deﬁned a speech community, a claim that was strongly disputed. Objections to variable rules. 
Kay (1978; Kay & McDaniel, 1979) objected to the uniform constraints assumption, noting that numerous studies had, in fact, found that the linguistic constraints on a variable rule were not similarly ordered for all of the demographic groups within a speech community, so a single variable rule could not describe the speech community as a whole. For example, in his study of Martha’s Vineyard, Labov (1972a) looked at three ethnic groups living on the island: Yankees, Portuguese, and Indians. He found that speakers in all of these groups centralized the ﬁrst vowel of the diphthongs [ay] and [aw], so that “about the house” could be pronounced [əbəwt ðə həws] and “pie in the sky” pronounced [pəy in ðə skəy]. This centralization could be represented by a single variable rule in which [a] is centralized to [ə] before a following glide, and where [y] and [w] are variable constraints, with [y] the stronger of the two. Such a rule would look like (3), which is written using a formalism diﬀerent from that in (2), but which was common in the variationist literature. It is included here to facilitate the discussion in later chapters. The ranking of the constraints in (2) was indicated by their order within the diagonal brackets, with the strongest constraint placed


highest. In (3) the ranking of the constraints is indicated by Greek letters, where α (alpha) marks the strongest constraint, and β (beta) marks the next strongest constraint.

(3) a → (ə) / ___ {αy, βw}

Rule 3 claims that centralization is more likely to occur in [ay] than in [aw]. However, Labov found that, as centralization became more common, speakers in the Portuguese and Indian communities had re-ordered the constraints so that [w] was stronger than [y], thus violating the uniform constraints assumption for the community as a whole. Romaine (1982) raised a similar objection to the uniform constraints assumption. She pointed out that within larger speech communities there exist separate social networks, whose speech patterns may diﬀer. Milroy (1982) seconded this observation, noting “the requirement that variable rules are stated in terms of generative rules imposes decisions as to the . . . structural description of, a rule—for the whole community [emphasis in original]—when our experience suggests that there may be diﬀerent variable ‘competences’ within the community” (p. 38). According to Milroy, speech in British cities is more varied than speech in American cities. He characterizes the regular variation observed by American sociolinguists as the “tip of the iceberg” and says that British sociolinguists, looking beneath the waterline in cities like Glasgow, Edinburgh, and Belfast, have observed a lot more irregularity. The contrast between the British and American patterns can be seen by comparing Labov’s (1966) study of New York City speech with Milroy’s (1982) study of Belfast speech. As we have seen, Labov found that the frequency of /r/ deletion correlated with social class and the amount of attention paid to speech. Middle-class speakers deleted /r/ less than working-class speakers, and speakers of all classes deleted /r/ less in monitored styles. These facts suggest that for New Yorkers of all social classes /r/ deletion is stigmatized and constitutes evidence that New York City is a single speech community. The situation is quite diﬀerent in Belfast. 
There, working-class speakers variably front low vowels (especially before velars), so that back and cab, which are normally pronounced [bak] and [kab], can be pronounced [bæk] and [kæb], as in American English. For these speakers, however, fronting is stigmatized, as shown by the fact that in their monitored speech the working-class informants front these vowels less frequently. Middle-class speakers, on the other hand, use back vowels for this class of words, in all styles. Complicating the picture still further is the fact that the most prestigious style, British Received Pronunciation, employs fronted vowels, and this style is regularly heard on local newscasts. Clearly, all Belfast speakers do not evaluate low front and back vowels in the same way. Therefore, Belfast can be described as a speech community that contains several social networks.


A diﬀerent kind of objection to variable rules was raised by Derek Bickerton (1971), the eminent creolist. He claimed that they were unlearnable, observing: If we accept the variable-rule principle, we must also accept that the mind possesses not only the apparatus necessary for framing two quite diﬀerent types of rule (i.e. standard grammatical rules and variable rules), but also some kind of recognition device to tell the speaker whether to interpret a particular set of data as rule-plus-exceptions or as area-of-variability. When we recall that the data on which non-variable rules are based is often incomplete and heterogeneous, the mode of operation of such a device must seem somewhat mysterious. (p. 460) Bickerton (1971) went on to point out that the learnability problem posed by variable rules is even more serious: In order that the average for [a group of speakers that share a variable rule] should remain constant, the variation of the individual must be conﬁned within a relatively narrow range. What keeps his percentage within those limits? And how can it keep within them unless something, somewhere is COUNTING ENVIRONMENTS and keeping a running score of percentages? (p. 461) Variationists (the ﬁrst was Anshen, 1975) were quick to point out that Bickerton’s second objection ignores the way probability laws operate. These laws govern, for example, how often various combinations of numbers will turn up on a pair of dice during an evening at a casino, but no one would claim that the dice must keep track of past outcomes. Rather, the probabilities of the diﬀerent combinations result from how the dice were manufactured (or subsequently altered). By analogy, the probability of deleting /t,d/ in a particular set of circumstances could result from how connections within the brain have been established and possibly altered by learning. Thus, this objection can be dismissed. 
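The dice analogy can be made concrete with a short simulation. The sketch below is purely illustrative: the deletion probability of 0.40, the number of speakers, the token counts, and the function names are all assumptions introduced here, not figures from any study cited in this chapter. Each simulated speaker makes every /t,d/ choice independently with a fixed probability and keeps no record of earlier choices.

```python
import random


def observed_rate(p_delete: float, n_tokens: int, rng: random.Random) -> float:
    """Share of tokens deleted when each choice is made independently.

    Nothing is remembered between tokens: every choice uses the same
    fixed probability, like each throw of a die.
    """
    deletions = sum(1 for _ in range(n_tokens) if rng.random() < p_delete)
    return deletions / n_tokens


def simulate_group(p_delete: float, n_speakers: int, n_tokens: int, seed: int = 0) -> list[float]:
    """Observed deletion rates for several speakers sharing one probability."""
    rng = random.Random(seed)
    return [observed_rate(p_delete, n_tokens, rng) for _ in range(n_speakers)]


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    rates = simulate_group(p_delete=0.40, n_speakers=10, n_tokens=2000)
    print([round(r, 3) for r in rates])
```

With 2,000 tokens per speaker, the observed rates cluster within a few percentage points of 0.40, although nothing in the code counts environments or keeps a running score of percentages.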
However, Bickerton’s first objection, that the “mode of operation of such a [mental] device must seem somewhat mysterious,” was certainly valid in 1971. Bickerton’s (1971) objections to variable rules were repeated by Gregg (1990). The journal Applied Linguistics published a face-off between Gregg, a second language acquisition (SLA) scholar from the generative camp, and Rod Ellis (1990) and Elaine Tarone (1990), SLA scholars from the variationist camp. Gregg’s article was titled “The Variable Competence Model and Why It Isn’t.” Gregg seconded Bickerton’s objection that variation theory does not include a theory of acquisition; that is, that variation theory has no explanation for how speakers can learn the probabilities embedded in variable rules. What is needed to answer Bickerton’s and Gregg’s argument is a theory that explains how speakers can learn and produce probabilistic patterns. Since Bickerton’s (1971) article, such a theory has appeared and has become an important part of
theories of language production and comprehension. It is called connectionism, and we will discuss it in some detail in chapter 6. Replies to objections. Sankoﬀ and Labov (1979) replied to Kay and McDaniel’s (1979) criticisms of variation theory in the same issue of Language in Society, and their reply can serve as an answer to Romaine’s (1982) criticism, as well. Sankoﬀ and Labov (1979) denied that most variation studies had adopted the uniform constraints assumption, pointing out ﬁrst of all that most studies that postulated a single shared variable rule, such as Cedergren and Sankoﬀ ’s (1974) discussion of s spirantization in Panama Spanish, were “rife with disclaimers.” Cedergren and Sankoﬀ (1974, p. 353) had, in fact, stated: This approach neatly solves the problem of community heterogeneity— perhaps too neatly; care should be taken to detect categorical rule diﬀerences where these exist . . . Further statistical methods must be developed in order to judge when small data sets on individual speakers can be aggregated without obscuring categorical distinctions between individual grammars. (quoted in Sankoﬀ and Labov, 1979, p. 203) Sankoﬀ and Labov (1979) also pointed out that many studies employing variable rules were speciﬁcally aimed at discovering which individuals and groups within a community shared which rules and which constraints. For example, in their study of South Harlem teenagers, Labov, Cohen, Robins, and Lewis (1968) constructed a single variable rule for /t,d/ deletion only after analyzing the data from each of their subjects individually. Because these were early days in variation studies, the researchers were surprised to ﬁnd that their subjects shared the same constraints and the same constraint ordering. Sankoﬀ and Labov (1979) also denied that variationists believe speech communities are deﬁned only by shared variable rules with uniform constraints, or that speech communities can even be precisely deﬁned. 
They said, “We know that every speaker is a member of many nested and intersecting speech communities” (p. 202). An example of such intersection is found in the speech of African Americans and European Americans in New York City. Both groups of speakers share a variable rule for copula contraction, allowing “He is a student” to be shortened to “He’s a student.” The linguistic constraints on this rule are similarly ordered for both groups. However, the African American speakers also have a variable rule for copula deletion, allowing “He’s a student” to be further shortened to “He a student.” European American speakers do not share this rule, nor do they share a number of other rules characteristic of AAE. Fasold (1985) put the sociolinguists’ debate about variable rules in perspective in a review of Romaine’s (1982) book. He observed that Labov and his followers approached the study of language by addressing the same basic question that Chomsky addressed: What is the nature of the speaker’s mental grammar? Labov answered the question, in part, in the same way that Chomsky did: The mental grammar contains (or at least can be modeled by) generative rules.


However, Labov went on to ask a second question: How does the speaker’s mental grammar allow for the patterned variation that is found in speech communities? His answer was to modify the optional rules of generative grammar to create variable rules, which could describe the probabilistic patterns observed in speech. Thus, Labovian linguistics, like Chomskian linguistics, is conceived as a psychological enterprise. Sociolinguists like Romaine and Milroy, on the other hand, approached linguistics from a sociological perspective. They looked at communities defined geographically, like Edinburgh and Belfast, and asked how speech varies within these communities. They reported their findings in terms of statistical patterns but not in terms of rules of a mental grammar. In other words, Labov and his colleagues’ top priority was theoretical and explanatory; Romaine and her colleagues’ top priority was social and descriptive.

Preston (2002) sorts out the differences between the psychological and social perspectives in variation studies in a somewhat different way. He distinguishes three levels of sociolinguistic theories. A level I theory concerns itself only with correlations between linguistic forms and social facts. Table 1.1 represents a level I description. A level II theory concerns itself only with correlations between linguistic forms and the linguistic environments in which they occur. Variable rules represent a level II description. A level III theory represents both social and linguistic factors, and also aims to describe the course of linguistic change over time. Chapters 6 and 7 contain level III studies.

The Logical Status of Variable Rules

The attempt to incorporate variable rules into a generative grammar also came in for some strong criticism. One objection, mentioned by Bickerton (1971), Kay and McDaniel (1979), and Gregg (1990), was that variable rules are incompatible with generative grammar because they represent a different logical object.
Berdan (1975) put it this way:

If variable rules . . . are to become part of the competence grammar, there needs to be serious rethinking of the way in which language is defined and the definition of the grammar that accounts for language. (p. 22)

At the time variable rules were introduced, generative grammar had two major goals: (1) to construct an algorithm for generating all and only the grammatical sentences of a language, and (2) to discover principles of Universal Grammar that explained how speakers can learn the grammar described by (1). Generative linguists believed that both of these goals could be accomplished by a competence grammar, and a competence grammar did not address questions of how often or under what linguistic and social circumstances a particular rule would be used, as we have seen. Generative research involved the study of types of structures (what are the possibilities for pronouncing the -ing morpheme?). Variation research involved the tabulation of tokens of a structure
(how many times does a speaker use in’ versus -ing?). This question was considered to be a matter of performance. Thus, according to Bickerton (1971), Labov was committing a category error by introducing probabilistic description into a generative grammar. Bickerton was right, and Sankoﬀ and Labov (1979) acknowledged that probabilistic grammars had a “diﬀerent logical status” than categorical grammars, and that “variable rules are rules of production” (p. 202). It remained for variationists to ﬁnd a performance level theory of production that would allow for probabilistic description. Such theories will be discussed in chapter 6, where probabilistic models of language production, comprehension, and acquisition from the ﬁeld of psychology are described. The psychologists who have proposed these models do not use the term “variable rule,” and that term has almost disappeared from the vocabulary of sociolinguists as well. Nevertheless, the variable rule is still useful for describing the constraints on alternating linguistic forms, and I will continue to use it in this book. It is understood, however, that the term is a heuristic device that does not make direct psychological claims. But before looking at the psychological underpinnings of variable speech behavior, we will turn to some speciﬁc studies of variation in the speech of native speakers and language learners.

2

A Study of Variation in the Native Speaker Speech Community

H. D. ADAMSON AND VERA REGAN

Introduction

In order to get a closer look at a quantitative sociolinguistic study, and to see how the Varbrul program works, we now turn to a small-scale study of variation in the speech of native English speakers living in Philadelphia. This study involves the well-researched variable -ing,1 which can occur in a number of different linguistic contexts. The contexts examined in this study, which include nouns (“darling”), gerunds (“skiing is fun”), adjectives (“an intriguing idea”), and others, are shown in table 2.1.

Table 2.1 Syntactic categories in which -ing occurs.

Category                    Example
Verbals
  progressive               He’s eating pizza.
  periphrastic future       He’s going to eat pizza.
  VP complement             I like watching rugby.
  WHIZ deletion             The man going home stopped.
  sentential complement     You’ve got to be quick, throwing answers back.
  participle                We go out there fishing.
Modifiers
  adjective                 This is a tempting idea.
  complex gerund            I want a swimming pool.
Preposition
  preposition               It was during the summer.
Gerund
  gerund                    I was amazed by Mary’s recovering her wallet.
Nominals
  place name (internal)     Washington is the capital.
  noun                      It’s on the ceiling.
  t-word (only two)         I saw something. I saw nothing.

In all of its contexts, -ing allows two
pronunciations: the informal [in], sometimes spelled darlin’, skiin’, etc., and the formal [iyŋ]. Hereafter, [in] will be referred to as N and [iyŋ] will be referred to as G.

Sociolinguistic Studies of -ing

Research has consistently shown that the -ing variable is widespread throughout the English-speaking world. In Philadelphia, -ing is a stable sociolinguistic variable, one that has existed in the community for many years and is not in the process of spreading or declining. In this respect, -ing contrasts with other kinds of variable features of Philadelphia speech. These include variables involved in changes in pronunciation that are nearly complete, such as the raising of /æ/ before nasals, so that planet is pronounced [plinət], and new and vigorous changes in pronunciation, such as the fronting of /ow/, so that know is pronounced [nəw].

In the first quantitative study of -ing, Fischer (1958) examined the factors that conditioned the variable. He found that, in the case of schoolchildren, gender and topic affected the proportion of N and G variants. Boys used N more than girls, and casual speech had a higher proportion of N. Anshen (1969) studied -ing in southern black and white speech, and found a number of similar features. In both varieties of speech, men used a higher percentage of N than women, and casual speech contained a higher percentage of N than careful speech. In addition, speakers with less education and less prestigious occupations used more N. African Americans had a higher percentage of N than European Americans. Labov (1966) first demonstrated the social stratification of -ing. In his study of New York City speech, he found a correlation between race and frequent N usage. He also found that southern African American speakers used more N than northern African American speakers.
In his Norwich study, Trudgill (1974) found, in common with the other studies, that men tended to use N more than women, and that casual speech contained higher frequencies of N than careful speech. Trudgill (1974) also found that -ing is a good indicator of social class and that the percentage of N can vary from 0 percent for middle-middle-class (MMC) and lower-middle-class speakers in word list style to 100 percent for lower-working-class speakers in casual style. Stylistic variation was greatest in the case of the upper working class (UWC), with a range from 5 to 87 percent. Trudgill (1974) suggested that this is due to “U[pper] W[orking] C[lass] L[ower] M[iddle] C[lass] awareness of the social significance of the linguistic variable because of the border-like nature of their social class position” (p. 100). He observed that -ing differentiated between the five social classes he isolated, but that it particularly marked the distinction between middle-class and working-class speakers. UWC speakers showed the greatest amount of stylistic variation and MMC speakers showed the least. Because -ing pronunciation distinguishes between social classes, speaking styles, and genders, Trudgill (1974) concluded that this phonological
variable reﬂects part of the value system of English speakers, a point that will be taken up in chapter 8. Perhaps the most extensive study of -ing is reported in Labov (2001a), a study that is of particular relevance to the research reported here because it involved residents of the same Philadelphia neighborhood as the present study. Labov (2001a) found results similar to those of previous researchers. There was considerable gender stratiﬁcation, with men using higher frequencies of N than women, as well as social class stratiﬁcation. Like Trudgill (1974), Labov (2001a) found that stylistic variation was not great for lower- and working-class speakers, but increased among the middle social classes. Furthermore, he found that the diﬀerence between men’s and women’s N usage was far higher for middle-class speakers than for working-class speakers. Echoing Trudgill’s (1974) theory of the linguistic behavior of the border-like classes, Labov observed, “In general, the second highest social group shows the greatest gender diﬀerence and the sharpest [diﬀerence in] style shifting” (2001a, p. 272). In regard to the linguistic constraints on -ing, several studies have found evidence for assimilation to the place of articulation of the following sound. Shuy, Wolfram, and Riley (1968) and Cofer (1972) found that a following velar stop favored G, and a following alveolar stop favored N. Labov (2001a), however, did not ﬁnd this eﬀect. Houston (1985) found a grammatical eﬀect on the distribution of N and G among British speakers according to the syntactic category of the ing word. When -ing occurred in nouns or in the pronouns something and nothing, G was used at a high frequency. However, when -ing occurred in verbals, such as the periphrastic future tense or a progressive tense, N occurred more frequently. Houston (1985) was able to arrange the categories in which -ing occurred along a continuum ranging from noun to verb that reﬂected the frequency at which a category took N. 
A simplified version of this continuum is shown in (1).

(1) progressives > participles > gerunds > pronouns > proper nouns

The hierarchy in (1) is implicational, making the claim that the frequency of N in any particular category is higher than the frequency of N in all the categories to the right of it. Labov (2001a) found a similar grammatical effect, and set up an implicational hierarchy similar to (1), where verbal elements correlated with higher N usage and nominal elements correlated with higher G usage. To explain this ordering, Houston (1985) claimed that the grammatical categories in which -ing can occur are not discrete, but form a continuum ranging from noun to verb, and that this continuum is the result of a historical merger. Prior to the fourteenth century, the present participle in English did not take the suffix -ing, but rather -ind.2 The suffix -ing occurred with verbal nouns, such as lufiung (loving), and concrete nouns, such as farthing. However, verbal nouns began to acquire features of verbs, as shown in (2).

(2) Mary’s frequently playing the drums bothered George.

In (2) the gerund playing acts like a noun in that it is part of the subject of bothered, but it acts like a verb in that it takes the adverbial modifier frequently and the object the drums. Gradually, the functional difference between nominals and verbals was blurred, with gerunds occupying the fuzzy area between the two categories. The partial coalescing of the categories verbal noun and verbal led to a blurring in the phonetic distinction between the two categories, which were already very similar; thus, [ind] began to be pronounced [in]. By the end of the fourteenth century, this sound change had progressed to the point that it was reflected in the orthography, so that all the forms in table 2.1 were spelled with -ing. Despite the similarity in spelling, however, nominal and verbal -ing forms continued to be pronounced variably, with nominals favoring G and verbals favoring N, a pattern that has continued up to the present day.

To summarize, all of the studies have found the variable -ing to be sensitive to social and linguistic constraints. On the social level, -ing is sensitive to a speaker’s gender, speaking style, and socioeconomic class. On the linguistic level, -ing is sensitive to the syntactic environment in which it occurs and may be sensitive to the phonological environment, as well. As Houston (1985) concluded, “not only external, social factors influence the realization of -ing in a regular stable way across diverse speech communities, but internal linguistic factors exhibit such stable patterns as well” (p. 50).

Methods of Data Collection and Analysis

The subjects for the present study of -ing were 31 native English speakers, 10 men and 21 women, whose ages ranged from 17 to 90, with a median age of 38. The data were collected by graduate students in a sociolinguistics field methods course at the University of Pennsylvania in 1988.
The students tape-recorded interviews conducted using the standard question modules developed at the University of Pennsylvania (Labov, 1984). These modules are intended to control for shifts in formality, topic, and audience by using a standard format in which one or two interviewers ask memorized questions about topics that include “danger of death,” “community services,” “childhood games,” and so on. The variation of -ing in the speech samples was analyzed using the Varbrul 2 computer program (Cedergren and Sankoff, 1974). In order to use the program, the analyst must first specify the linguistic and extralinguistic factors believed to constrain the variation. Following Cofer (1972) and Houston (1985), four factors were examined for their possible effect on -ing variation: grammatical category, following phonological environment, speaking style, and gender. These broad factors, called factor groups, were divided into their smaller constituent factors, as shown in table 2.2. For example, the factor group gender contains two factors: men and women. Next, each token of -ing that
appeared in the data was coded according to which variant of the dependent variable (N or G) occurred, and which of the independent variables (the proposed factors in table 2.2) co-occurred with it. In regard to the coding of speaking style, it should be noted that the definition and separation of speaking styles have been a continuing controversy in sociolinguistics (see the discussion in chapter 8). As discussed in chapter 1, Labov has defined style in terms of the context of speaking. One such context is reading a list of words; another context is reading a paragraph; a third context is speaking to a researcher during a sociolinguistic interview. Word list style and reading style are, of course, easy to identify. The major problem has been to identify styles within the sociolinguistic interview. Over the years, researchers at the University of Pennsylvania have developed what is called the style tree, shown in figure 2.1 (Labov, 2001b). The tree is a coding device that allows the analyst to determine whether an utterance is likely to be in careful or casual style according to the topic being addressed. The definitions of the monitored speech styles are as follows: response = the first sentence in response to a question; language = discourse about language; soapbox = persuasive discourse: when the subject mounts a “hobby horse”; careful = other discourse that the researcher believes to be monitored on the basis of “channel cues” (Labov 1972a, pp. 79–99), such as a change in tempo, pitch range, volume, or rate of breathing. The definitions of the unmonitored speech styles are as follows:

Figure 2.1 Decision tree for stylistic analysis of spontaneous speech in the sociolinguistic interview (from Labov, 2001b). Reprinted with permissions.


quote = telling what someone else said (note that this style is not represented in the style tree in ﬁgure 2.1; if it were, it would be placed above narrative); narrative = recounting a past event; group = when addressing an audience other than just the interviewer(s); kids = discourse about children; tangent = an aside; casual = other discourse that the researcher believes to be unmonitored on the basis of the channel cues described earlier. In addition to these cues, laughter is a cue of casual style. Notice that the upper branches of the style tree represent more objective decisions than the lower branches. It is easy to tell if an utterance is an immediate response to a question, but more diﬃcult to tell if a topic is tangential to the main subject of discussion. Having coded all tokens of -ing in the data for the factors in table 2.2, the analyst can run the Varbrul 2 program, which identiﬁes the contribution of each proposed factor to the probability of N being produced. Table 2.2 also displays the results of the Varbrul analysis. As mentioned, the ﬁrst column in table 2.2 shows the four proposed factor groups: speaking style, gender, grammatical category, and following phonological environment. Column 2 shows the individual factors that make up each factor group. The Varbrul 2 program reﬂects the claim of Variation Theory that many factors simultaneously inﬂuence a speaker’s choice of a particular variant. As we have seen, in previous studies casual style favored N; men favored N; verb-like syntactic categories favored N; and in some studies, a following apical favored N. The Varbrul 2 program calculates the probability (if any) that each proposed factor contributes to the occurrence of N and displays its ﬁnding by attaching a decimal number, or coeﬃcient (p), to each factor. The p values are shown in column 4 of table 2.2. A p value greater than .50 indicates that the factor favors N, whereas a p value less than .50 indicates that the factor disfavors N. 
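To see how coefficients of this kind might combine into a single prediction, the following sketch assumes the logistic model commonly associated with Varbrul 2, in which the odds of N equal the input odds multiplied by the odds of each applicable factor weight. That formula is not stated in this chapter, so it should be read as an assumption. The function names are hypothetical, and the example weights are the p values and the input probability (.39) reported in table 2.2.

```python
from math import prod


def odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        # A categorical ("knockout") factor has p = 0 or p = 1,
        # for which odds are undefined.
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def combine(input_prob: float, factor_weights: list[float]) -> float:
    """Predicted probability of N under the assumed logistic model.

    The input odds are multiplied by the odds of every factor weight,
    and the product is converted back into a probability.
    """
    combined_odds = odds(input_prob) * prod(odds(w) for w in factor_weights)
    return combined_odds / (1.0 + combined_odds)


if __name__ == "__main__":
    INPUT = 0.39  # input probability from the note to table 2.2
    # Male (.77), casual style (.72), progressive (.63)
    print(round(combine(INPUT, [0.77, 0.72, 0.63]), 2))
    # Female (.24), careful style (.32), nominal (.29)
    print(round(combine(INPUT, [0.24, 0.32, 0.29]), 2))
```

Under this assumption, a weight of .50 has odds of 1 and leaves the input probability unchanged, weights above .50 raise it, and weights below .50 lower it. A weight of 1.00, as for a categorical factor, has undefined odds and cannot enter the calculation at all.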

Table 2.2 Probabilities of N in the Philadelphia native speaker data according to monitoring, gender, grammatical category, and following phonological environment.

Factor group            Factor        Original factors                                                  p      %     n
Speaking style          Careful       language, response, soapbox, careful                              .32    28    228
                        Casual        quote, narrative, group, kids, tangent, casual                    .72    72    231
Speaker’s gender        Female                                                                          .24    20    269
                        Male                                                                            .77    65    251
Grammatical category    Future (a)                                                                      1.00   100   20
                        Progressive                                                                     .63    55    209
                        Verbal        participle, verb complement, sentence complement, WHIZ deletion   .47    33    96
                        Gerund                                                                          .46    24    67
                        Modifier      adjective, complex gerund                                         .45    43    35
                        Nominal       noun, t-form, internal                                            .29    23    78
                        Preposition                                                                     .13    2     9
Following phonological  Apical                                                                          .61    49    137
environment             Labial                                                                          .56    53    79
                        Back          velar, palatal                                                    .50    45    22
                        Semivowel                                                                       .46    42    45
                        Pause                                                                           .43    33    79
                        Vowel                                                                           .42    36    151

Note: input probability .39; chi-square per cell .870.
(a) Since N occurs categorically for this factor, future is a “knockout” constraint and, therefore, was not included in the actual Varbrul analysis. It is included here to give a complete picture of the effect of grammatical categories.

The program also provides statistical measures of how well the linguist’s analysis of the data (that is, the proposed factors and factor groups) actually fits the data. One measure is a chi-square per cell figure which, according to Preston (1989, p. 15), should be no higher than 1.5, and preferably below 1.0. The second measure is a stepwise regression analysis that calculates the extent to which each factor group accounts for the variability in the data. In coding the data for a Varbrul 2 analysis, it is a good idea to code as broadly as possible, so that all factors that are suspected to affect the variable are included. This is so because factors for which there are insufficient data can be combined with similar factors during the data analysis. However, if it is suspected that a factor is at work that has not been coded for, the entire corpus must be coded again. Table 2.2 shows that many of the original factors have been combined. For example, in the factor group style there were originally ten factors, four of which represented a careful style, and six of which represented a casual style. In the final analysis, these factors were combined to form only two factors: careful style and casual style. The next step in the analysis is to determine whether there are any factors in whose presence N always or never occurs. If so, these knockout factors must be excluded from the input to the Varbrul 2 program because it can handle only variable data. As is often the case, several knockout factors occurred in the initial analysis due to an insufficient number of tokens involving that factor. However, when the factors were conflated in the way shown in table 2.2, all of
the knockout factors except future, which showed 100 percent N, disappeared. Thus, although future is included in table 2.2 for convenience, it was not part of the Varbrul 2 analysis.

Results

The chi-square per cell score for the Varbrul 2 analysis is .870, which falls below the preferred ceiling of 1.0 and thus satisfies Preston’s (1989, p. 15) criterion for a good fit between the theory of which factors constrain the variation represented in table 2.2 and the actual data. The stepwise regression showed that the effects of three of the four factor groups were significant: style, grammatical category, and gender. The effect of following phonological environment was not significant, as was the case in Labov (2001a).

The Effects of Gender and Style

Table 2.2 shows that men and women produced N at very different rates: for women, p = .24 (20%), whereas for men, p = .77 (65%). Before looking at the effect of style, it is necessary to ask whether style shifting had the same effect for both genders. It is possible, for example, that in casual speech women lower their frequency of N, whereas men raise their frequency. If this were the case, the variables gender and style would be said to interact. The Varbrul 2 program assumes that independent variables do not interact, so if an interaction is detected, one of the interacting variables must be eliminated from the Varbrul 2 analysis. Interaction can be checked for by cross-tabulating the factor groups gender and style, as shown in table 2.3, which reveals that style shifting has the same effect for both sexes. In careful style, women produce N at 8 percent versus 42 percent in casual style. Men produce N at 51 percent in careful style versus 85 percent in casual style. Thus, for both genders, switching from careful to casual style results in a higher percentage of N. Since this is the case, gender and style do not interact, so including both factor groups in the Varbrul 2 analysis is justified.
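The cross-tabulation check described above can be sketched in a few lines. The percentages are those reported in table 2.3; the function names and the same-direction criterion are illustrative assumptions. The check is an informal screen, not a statistical test of interaction and not a part of the Varbrul 2 program itself.

```python
# Percentage of N by gender and style, as reported in table 2.3.
TABLE_2_3 = {
    "women": {"careful": 8, "casual": 42},
    "men": {"careful": 51, "casual": 85},
}


def style_shift(rates: dict[str, int]) -> int:
    """Change in the percentage of N when moving from careful to casual style."""
    return rates["casual"] - rates["careful"]


def shifts_agree(table: dict[str, dict[str, int]]) -> bool:
    """True if every group shifts in the same direction between styles."""
    shifts = [style_shift(rates) for rates in table.values()]
    return all(s > 0 for s in shifts) or all(s < 0 for s in shifts)


if __name__ == "__main__":
    for gender, rates in TABLE_2_3.items():
        print(gender, style_shift(rates))
    print(shifts_agree(TABLE_2_3))
```

For these figures both shifts are positive (34 percentage points for women and 34 for men), so the informal check finds no sign that gender and style interact.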

Table 2.3 Frequency of N according to style and gender.

            Careful         Casual          Total
Gender      %      n        %      n        %      n
women       8      157      42     85       20     242
men         51     130      85     95       65     225
total       28     287      65     180      58     467

The Effect of the Grammatical Category

As we have seen, Houston (1985) and Labov (2001a) found that the frequency of N was conditioned by the grammatical category to which individual tokens belonged. More noun-like categories favored G, and more verb-like categories favored N. In our study, the data for the Philadelphia native speakers contain only 520 tokens of -ing compared to Houston’s 2,363 tokens; therefore, it is not possible for us to make distinctions in grammatical categories that are as fine as Houston’s. However, the continuum of grammatical categories shown in the third factor group in table 2.2 is similar to Houston’s continuum. Table 2.2 shows that the two verbal categories progressive and periphrastic future are most favorable to N, whereas the nominal category is highly unfavorable. As in Houston’s data, the verbal, gerund, and adjective forms fall in between these two extremes. In the Philadelphia data, however, the forms in between verbal and nominal are not distinguished: they take N at approximately the same frequency. A major difference between the use of -ing by our Philadelphia speakers and Houston’s (1985) British speakers is the pronunciation of prepositions. For prepositions, the British speakers highly favor N (77%), whereas the Philadelphia speakers highly favor G (98%). In sum, our analysis of the Philadelphia data, like Houston’s (1985) analysis of the British data, shows a grammatical effect on the production of -ing.

The Effect of the Following Phonological Environment

As mentioned, several studies of -ing have reported the effect of regressive assimilation. Shuy, Wolfram, and Riley (1968), Cofer (1972), and Houston (1985) found that a following velar stop favored G and a following apical favored N.
There is some tendency toward this effect in the present study as well. As table 2.4 shows, when the sound following -ing is produced at the front of the mouth, as in the case of labials and apicals, N occurs at the rate of 53 percent and 49 percent, respectively. However, when the following sound is produced at the back of the mouth, as in the case of palatals and velars, N is produced at the rate of only 45 percent.

Table 2.4 Percentage of N by following phonological environment.

Following environment    %     n
apical                   49    137
labial                   53    79
velar and palatal        45    22
semi-vowel               42    45
pause                    33    79
vowel                    36    151

This difference,
however, is not large and does not reach significance. Thus, the present study supports Labov (2001a), which found no significant effect of the following phonological environment.

Discussion

This analysis of a small sample of Philadelphia speech has revealed a number of patterns that have been found in more extensive sociolinguistic studies, including those of Labov (2001a), which included a much larger number of Philadelphia speakers. One finding is that there are no single-style speakers. All of our subjects produced higher proportions of N in casual speech than they did in careful speech. A second, related, finding is that -ing is evaluated in the same way throughout the speech community, as shown by the fact that all of our subjects, both men and women, used less of the stigmatized form N in careful speech. Labov (2001a, p. 214) notes that uniform evaluation throughout the speech community is characteristic of stable sociolinguistic variables like -ing (but recall the discussion of the variable rule speech community in chapter 1). A third finding is that different social groups set themselves apart from each other by the frequency at which they use sociolinguistic variables. The two groups examined in this study, men and women, used N at very different rates: men at an average of 65 percent and women at an average of 20 percent, a difference of 45 percentage points. A fourth finding of this study is that speakers’ internal grammars can affect the production of linguistic variables. As we have seen, the frequencies of N and G are affected by the grammatical category of the word to which the -ing suffix is attached. Other studies of sociolinguistic variables have discovered a similar effect. As we saw in chapter 1, Wolfram and Fasold (1974) found that final /t,d/ is less often deleted from consonant clusters when it represents the past tense morpheme, as in missed, than when it does not represent a morpheme, as in mist.
This ﬁnding supports the intuitive notion that the mental mechanisms for learning grammatical categories and the mechanisms for learning probabilities of variable forms are connected. This topic will be explored further in chapter 6. In conclusion, this small-scale study has revealed a number of important principles of language variation within a speech community. However, several important factors, such as the eﬀect of age and social class on variation, have not been addressed. These topics, and the important question of how variation relates to language change, are the subject of the next chapter.
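The statement earlier in this chapter that the front versus back difference in table 2.4 does not reach significance can be illustrated with a simple calculation. The sketch below is an illustration under stated assumptions, not the test used in the study (which is not specified here): it recovers approximate counts of N by rounding the printed percentages, pools apicals and labials as "front," treats the velar and palatal row as "back," and computes a Pearson chi-square for the resulting 2 x 2 table. The function name chi_square_2x2 is hypothetical.

```python
# Pearson chi-square for a 2x2 table built from table 2.4.
# Assumptions: counts of N are recovered by rounding the printed percentages;
# "front" pools apicals and labials; "back" is the velar and palatal row.

def chi_square_2x2(a, b, c, d):
    """Rows are environments, columns are (N, G) counts: [[a, b], [c, d]]."""
    total = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / total, row1 * col2 / total,
                row2 * col1 / total, row2 * col2 / total]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

front_tokens = 137 + 79                                # apical + labial
front_with_n = round(0.49 * 137) + round(0.53 * 79)    # tokens pronounced N
back_tokens = 22                                       # velar and palatal
back_with_n = round(0.45 * 22)

chi2 = chi_square_2x2(front_with_n, front_tokens - front_with_n,
                      back_with_n, back_tokens - back_with_n)
CRITICAL_05_DF1 = 3.841   # chi-square critical value at p = .05, 1 degree of freedom
print(f"chi-square = {chi2:.2f}, significant: {chi2 > CRITICAL_05_DF1}")
```

Under these assumptions the statistic falls far below the critical value, which is consistent with the report that the difference is not significant.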

3

Language Variation and Change

Introduction

Traditional theories of language have described linguistic forms and systems by abstracting away from the probabilistic nature of speech. In this regard, they contrast with the kind of study in chapter 2. For example, generative grammar might analyze -ing forms by saying that the underlying form is /iŋ/ for all the grammatical contexts listed in table 2.2, and that this form can be optionally changed to /in/ by means of a phonological rule. Thus, the grammar would specify that -ing has two forms and the description would end at that point. But, as we saw in chapter 2, more can be said. By studying natural speech, the probabilistic patterns of -ing usage can be discovered, and the two forms of -ing can be correlated with linguistic and social factors. Such studies show that the linguistic system used in a speech community is more complex than traditional theories of language have implied. The Philadelphia system of -ing usage is not uniform and neat, but variable and messy.

Charles Darwin (1859/1998) pointed out that the messiness of linguistic variation is in some ways comparable to the messiness of variation among species of horses, and that there is an important relationship between variation and evolution in both languages and living things. Natural variation among members of the same species is an essential component of the theory of evolution, and Darwin studied it at length, remarking on such examples as the differences in color among members of the same species of shellfish (they are brighter in southern waters), and the differences in bar markings among members of the same species of horse. Darwin believed that the faint stripes observed on some members of the species were evidence that zebras and horses descended from a common ancestor (1998, p. 208). Darwin also observed variability in language use, remarking, “We see variability in every tongue, and new words are continually cropping up” (quoted in Labov, 2001a, p. 8).
He also believed that language change results from a kind of natural selection, endorsing Müller’s (1861) view that:

    A struggle for life is constantly going on amongst the words and grammatical forms in each language. The better, the shorter, the easier forms are constantly gaining the upper hand, and they owe their success to their own inherent virtue. (quoted in Labov, 2001a, p. 9)

In this chapter, we will discuss the mechanisms by which variation in language is related to language evolution and change. A longstanding approach to the relationship of language variability and language change was presented in an article by Weinreich, Labov, and Herzog in 1968. These scholars divided the topic into five component problems: the constraints problem, the transition problem, the evaluation problem, the embedding problem, and the actuation problem. These problems apply to all types of linguistic change, including syntactic change and semantic change, but they have most often been studied in relation to sound change, and our discussion will focus on that area. Parallels will also be drawn to change in the semantics of color category systems, which is discussed at greater length in the appendix. All of the problems identified by Weinreich, Labov, and Herzog (1968) (except the constraints problem, which concerns only ease of articulation) have both a linguistic aspect and a social aspect. We will discuss both aspects of each of the problems in turn after some introductory remarks on the study of sound change.

Measuring Sound Change

Articulatory phonetics and acoustic phonetics are two approaches to the study of speech sounds. Articulatory phonetics analyzes the shape of the vocal tract when different sounds are pronounced, while acoustic phonetics analyzes the sound waves produced by the vocal tract. Modern studies of sound change in progress are carried out by acoustic phoneticians using the sound spectrograph, an instrument that measures the frequencies at which air molecules vibrate during the production of speech sounds. A single vowel (like a single note on a guitar) does not consist of a single frequency of vibration, but rather of many frequencies: a basic frequency and a number of secondary frequencies, which are called harmonics or overtones.
When a vowel is pronounced, the sound energy is supplied by the vocal cords, which set the air in the vocal tract to vibrating. The vocal tract (like the sounding board of a guitar) ampliﬁes some of the frequencies of vibrating air and dampens other frequencies. The ampliﬁed frequency with the lowest pitch is called the ﬁrst formant (abbreviated F1). The ampliﬁed frequency with the next lowest pitch is called the second formant (F2), and so on. When the shape of the vocal tract is changed, as when it moves from the articulatory gesture for [a] to the articulatory gesture for [æ], diﬀerent frequencies are ampliﬁed and dampened. Thus, the F1 and F2 for [a] are diﬀerent from the F1 and F2 for [æ], which is why we hear these vowels as diﬀerent from each other. In analyzing vowels using a spectrograph, all of the vowels can be distinguished by specifying just the values of F1 and F2. Formants are measured in terms of vibrations per second, or hertz (Hz). Here are the formant values for two vowels spoken by a man (from Ladefoged 1975, quoted in Clark and Clark 1977, p. 184):

Vowel    As in    F1        F2
[iy]     heed     280 Hz    2250 Hz
[æ]      had      690 Hz    1660 Hz
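The two rows of formant values above can be compared directly. The sketch below is an illustration only: it encodes the correspondences described in the paragraph that follows (a lower F1 goes with a higher tongue position, a higher F2 with a more fronted one) and applies them to the heed and had values. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vowel:
    symbol: str
    f1_hz: int   # first formant, in hertz
    f2_hz: int   # second formant, in hertz


def is_higher(a: Vowel, b: Vowel) -> bool:
    """A lower F1 corresponds to a higher tongue position."""
    return a.f1_hz < b.f1_hz


def is_fronter(a: Vowel, b: Vowel) -> bool:
    """A higher F2 corresponds to a more fronted tongue position."""
    return a.f2_hz > b.f2_hz


# Values taken from the table above.
iy = Vowel("iy", 280, 2250)   # as in heed
ae = Vowel("æ", 690, 1660)    # as in had

print(is_higher(iy, ae), is_fronter(iy, ae))
```

With these values, [iy] comes out as both higher and more fronted than [æ], which agrees with the articulatory description of the two vowels.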

In articulatory phonetics differences between vowels are specified mainly in terms of the position of the tongue. For example, for the vowel [iy], the blade of the tongue is high in the front of the mouth; for the vowel [æ], the blade of the tongue is low in the front of the mouth. There is a helpful correspondence between the articulatory system and the acoustic system of classifying vowels. It turns out that higher vowels correspond to lower frequencies of the F1 formant and more fronted vowels correspond to higher frequencies of the F2 formant, as can be seen in the chart above. Thus, the familiar vowel triangle can be represented in acoustic terms as in figure 3.1 (which represents the average formant values of several Philadelphia vowels), with the F1 formant values shown on the y axis and the F2 formant values shown on the x axis.

Figure 3.1 Mean values of all Philadelphia vowels with age coefficients. Circles: mean F1 and F2 values. xxC = vowel followed by an obstruent consonant (“checked”). xxF = vowel not followed by an obstruent consonant (“free”). xxO = vowel followed by a voiceless consonant (from Labov, 2001a). Reprinted with permissions.

In chapter 2, we noted that -ing was a stable variable in Philadelphia speech; that is, the overall frequency of G versus N and the stratification according to speaking style and gender is not changing. For this reason, -ing cannot be used as an indicator of change in progress. For such an indicator, we must turn to the vowel system, in which Labov (2001a, p. 140) has identified no fewer than 15 changes in progress. These can be divided into four different types of change:
nearly completed changes, partly completed changes, new and vigorous changes, and incipient changes. Figure 3.2 shows some examples of these ongoing changes in vowel positions. The data in figure 3.2 come from a cross-sectional study of Philadelphians (Labov, 2001a), which included speakers of all ages, ranging from under 20 to over 50. The tails of the arrows in figure 3.2 represent the position of the F1 and F2 formants for speakers 25 years older than the average age of all the informants in the study, and the heads of the arrows represent the formant values for speakers 25 years younger than the average age. As figure 3.2 shows, the younger speakers are usually pronouncing the vowels in question at a higher and more fronted position than the older speakers (though /uw/ and /ow/ are moving in the opposite direction). Our discussion will focus on some of the changes shown in figure 3.2. We will consider these changes in relation to the five problems identified by Weinreich, Labov, and Herzog (1968) involved in understanding how (and perhaps why) languages change.

Figure 3.2 Mean values of all Philadelphia vowels with age coefficients. Circles: mean F1 and F2 values. Heads of arrows: expected values for speakers 25 years younger than the mean. Tails of arrows: expected values for speakers 25 years older than the mean. xxC = vowel followed by an obstruent consonant (“checked”). xxF = vowel not followed by an obstruent consonant (“free”). xxO = vowel followed by a voiceless consonant (from Labov, 2001a). Reprinted with permissions.

Five Problems in Explaining Sound Change

The Constraints Problem

The search for linguistic constraints on sound change has focused on chain shifts. A chain shift occurs when a vowel changes its position in phonetic space (the space shown in figures 3.1 and 3.2), thus forcing other vowels to change
their positions in order to maintain contrasts. For example, in the Great English Vowel Shift /iy/ was lowered and centralized to /ay/, and /ey/ was raised to fill the resulting empty phonetic space. The search for linguistic constraints on such changes is parallel to the search for linguistic universals in generative theory, and, as discussed in the appendix, to the search for cognitive universals in the evolution of color category systems. Investigations of sound change in progress have discovered several principles of chain shifting. One of the most robust of these principles is: In chain shifts, back vowels move to the front (Labov, 1994, p. 116). Attested in many studies of historical change, this principle makes sense from a physiological standpoint because the asymmetry of the mouth gives more room in the front for phonetic distinctions than in the back. As figure 3.2 shows, in Philadelphia speech many vowels are moving to more fronted positions. But, note that there are exceptions to the fronting principle, as there are to all of the principles of sound change that Labov has found. For example, in Philadelphia speech /uw/ and /ow/ are not moving forward, but rather toward more backed and lower positions. Thus, the principles of language change have the character of tendencies rather than inviolate laws. Labov (2001a) concludes that, although physiological factors have a strong influence on sound change, they interact with and can be overridden by social factors. In his words, “General linguistic principles . . . form the favorable undercurrent, or perhaps prevailing wind, for changes now in progress. Given enough social motivation or contrary linguistic pressures, retrograde movements can be set in motion, just as a boat may tack into the wind” (2001a, p. 499).

The Transition Problem

Linguistic aspects. The transition problem concerns the route by which (or how) a sound change moves from one phonetic position to another.
Figure 3.2 illustrates two possibilities. The first possibility is where the change is not conditioned by the linguistic environment, as illustrated by the movement of (aw).1 The arrow associated with (aw) shows that its nucleus is being fronted and raised. Note that there are no environmental constraints on (aw) movement: all words that contain (aw) are affected by this change (though some lexical items, called outliers, may initially be affected more than others). A second possible manner of sound change is where the linguistic environment does affect movement. An example from figure 3.2 is the new and vigorous change of raising [ey] before an obstruent (which is represented by the notation (eyC)) as in plate and raise. This change is not occurring in words where [ey] is not followed by an obstruent, as in say. If this sound change continues, it is possible that [ey] raising will spread to other environments and eventually to all words that contain the [ey] sound. These two examples of sound change are instances of regular sound change, a change that affects classes of words, not individual lexical items. Another type
of sound change is lexical diffusion, a change that progresses on a word-by-word basis. In lexical diffusion some words containing a particular sound in a particular environment are affected while other words are not. One result of lexical diffusion can be seen in figure 3.2. Notice that the figure contains two variables with the phone [æ]: (æh) and (æ). The first of these, (æh), is a tensed vowel with an in-glide represented by [h]. (Figure 3.2 shows only the value of (æhN), that is, (æh) followed by a nasal. The values of (æh) in other environments would be placed near the tail of the (æhN) arrow; see the discussion below.) Of the two variables, (æ) is the older sound, so words that are now pronounced as (æh) used to be pronounced as (æ). This (now nearly complete) sound change spread through the Philadelphia community in two stages. First, regular sound change affected all [æ] phones in syllables that ended in a voiceless stop, such as cat and flack. Next, syllables that did not end in a voiceless stop were affected, and in this stage the change spread by means of lexical diffusion. Thus, some syllables not ending in [p,t,k] were tensed, including bad, glad, and the first syllable in planet, but other syllables were not, including sad, glass, and the first syllable in Janet. The word can used as a noun or main verb was tensed, but the word can used as an auxiliary verb was not. Thus, lexical diffusion has produced a split in homonyms, so that can (noun, verb) is tensed /kæhn/ but can (AUX verb) is not /kæn/. In this way a new phoneme, /æh/, has been created in Philadelphia speech. As figure 3.2 shows, this new phoneme is now itself being raised and fronted, especially before nasals, whereas the older phoneme /æ/ is being lowered.

Social aspects.
A fundamental question in regard to transition is whether the locus of sound change is individual or generational; that is, do individuals change their internal systems during their lifetimes or do children construct systems that are different from those of their parents? Labov (2001a) concludes that, in general, individual systems remain fairly stable after the age of nine or ten. This is especially true in regard to the underlying phonemic system. Payne (1980) found that adults who moved to the Philadelphia area from out of state could acquire the Philadelphia rules for regular sound change, such as the raising of (eyC) and (aw), which, as we have seen, produce new allophones rather than new phonemes and therefore require learning only one rule that affects all of the words in a class. However, these adults were not able to acquire the tensing rule for changing /æ/ to /æh/, which involves the lexical diffusion mechanism and must be learned, for the most part, one word at a time. Thus, the adults approximated the Philadelphia pattern, but did not completely match it. Payne (1980) found that acquiring the true Philadelphia dialect, with its major innovation of a new phoneme, was possible only for children born to parents from Philadelphia.

The transition problem is intimately connected with the problem of language acquisition. Studies of first language acquisition carried out within the generative paradigm have focused on the acquisition of invariant features, such
as the fact that English, as a minus pro-drop language, requires that every sentence have a subject. As discussed in chapter 1, it is claimed that children are born with a Universal Grammar containing the knowledge that the target language will be subject-mandatory (like English) or subject-optional (like Spanish). Given this initial knowledge, the child can set the pro-drop parameter at either plus or minus on the basis of just a few examples, or triggers. But, learning the appropriate use of a variable feature like -ing or (eyC) raising is very diﬀerent. Here the learning is not categorical but probabilistic: the child must learn the frequencies of the variable that are appropriate to his or her age, gender, and social class. Labov (2001a, p. 419) identiﬁes the mental mechanism of such learning as probability matching, an ability that is observed in many species, which enables an individual to duplicate the probabilistic behavior of other members of the species. We will discuss how probabilities can be learned in regard to language production in chapter 6. The transition problem in regard to learning stable sociolinguistic variables like -ing is diﬀerent from the problem in regard to learning variables in the process of change like the raising of (eyC). In the ﬁrst case, children need only to learn to talk like their parents. But in the second case, children must learn to talk diﬀerently from their parents in order to move the change in the direction of the arrows for (eyC) in ﬁgure 3.2. That is, they must learn to raise (eyC) to a greater height than their parents do. Let us ﬁrst consider the simpler case of a stable variable, where children must learn to match the parents’ frequency of variation. Roberts (1993) found that by age three, children matched the adult grammatical constraints on -ing discussed in chapter 2, and that they style shifted, using N more frequently in informal speech. 
The first finding implies that the mental mechanism for probability matching cannot be independent of the mental mechanism for grammar learning because children must match the probabilities of N usage associated with different grammatical categories. In regard to style shifting, Labov (2001a, p. 418) suggests that at first children do not learn the adult concepts of formal style versus casual style. Rather, they learn a more primitive and more relevant opposition: instructional style versus playful style. Children are likely to hear more G in instructional and disciplining situations, as “when the child is in trouble and/or is being instructed” (Labov, 2001a, p. 420). Conversely, children are likely to hear more N in intimate and friendly situations. At a later stage, children learn that G is associated with formal style and middle-class speech, and that N is associated with casual style and working-class speech (see the broader discussion in chapter 8).

We now turn to the more complex case of the transition problem: How can children learn to talk differently from their caretakers in order to advance a sound change in progress? As an example, let us take the nearly completed change (ohr) raising and backing, as in the word corner. Notice that for this change to increase with each generation, it is not enough for children to match
the frequency of their parents, nor even the frequency of slightly older peers. To advance the change, children must understand the direction of the change, and to do that, they must sample the frequency at which their parents raise and back (ohr) and compare it to the frequency used by slightly older peers. Only by noticing that the latter is greater than the former can learners know the direction of the sound change and adjust their own production accordingly. The fact that they do so is convincingly demonstrated by the (ohr) arrow in figure 3.2. Furthermore, it appears that, while very young children can intuit the direction of a change, it takes several years of exposure to age stratification in the speech community for them to internalize all aspects of the change, as the following example illustrates. Recall that in Philadelphia speech /æ/ split into two phonemes /æh/ and /æ/, in part by means of lexical diffusion. In this change, tensing applied categorically in closed syllables, but only to certain words in open syllables (e.g. planet was tensed, but Janet was not). Table 3.1 shows the frequencies of tensing for these words by adults, very young children, and slightly older children. Notice that the children in the youngest age group incorrectly apply tensing in the word Janet at a rate of 65 percent. However, children in the next oldest age group have moved toward the adult norm, tensing Janet at a rate of only 10 percent.

Table 3.1 Frequencies (%) of [æ] tensing for two words by age group.

          Adults    Children, ages 3:11–4:11    Children, ages 3:2–3:10
planet    18        96                          90
Janet     0         10                          65

The Embedding Problem

Linguistic aspects. Regardless of how it begins, a new phone must be embedded within an existing system. The linguistic aspect of the embedding problem is usually discussed in functionalist terms: it is believed that the linguistic system arranges itself to preserve the maximum number of oppositions. The Neogrammarians and Saussure claimed that phonemes comprise a system of maximum contrasts, and that change in the value of any individual phoneme necessitates changes in the values of all the other phonemes in the system. This claim is usually true in the long run, as is illustrated in chain shifts. As mentioned, in the Great English Vowel Shift, /iy/ was lowered and centralized to /ay/, and /ey/ was raised to fill the resulting empty phonetic space. Long-term sound change usually results in a reshuffling of phonetic values within the same number of phonemic units, rather than in an increase of phonemes. There is no maximum number of phonemes in a language, but there is a fuzzy upper
threshold. This limit, however ill-defined, will discourage the addition of basic units. Three concepts, then, are required to explain why equilibrium in phonetic space is disturbed and then re-established in sound change: (1) the combination of social and phonetic factors that activate sound change, (2) an upper threshold of complexity, and (3) the functional pressure to maintain maximum contrast between units. A system in equilibrium can be upset by the phonetic and social factors discussed in this chapter. In the long run, however, the ceiling of complexity and the principle of maximum opposition are acknowledged and equilibrium is re-established, perhaps by reducing complexity elsewhere in the system. Labov (1982) likens this stabilizing process to the force of gravity. We can overcome gravity for a time by jumping or flying in an airplane, but eventually the force reasserts itself and we must return to earth.

Social aspects. Sociolinguists have found that in order to understand how a new sound is embedded within a speech community it is necessary to characterize speakers according to social class, ethnicity, age, gender, and locality. The best-known study of social embedding is Eckert’s (1988, 1999) research at “Belten” High School in Michigan. Eckert identified two main social groups at Belten: jocks and burnouts. The jocks were not just athletes, but students who participated in approved high school activities, including sports, student council, clubs, and so on. These students were generally college-bound, and they accepted the legitimacy of the academic program of courses and grades. The burnouts were generally headed for working-class jobs after high school, and they did not buy into the academic program nor the extra-curricular activities sponsored by the school. The two groups could be distinguished by their dress, the places they hung out, the kinds of music they listened to, and, Eckert discovered, the way they pronounced their vowels.
It should be noted that the jock and burnout social categories were not sharply defined, but rather constituted two poles of social orientation to which students were attached in varying degrees. In this sense, group membership at Belten High was like class membership in the larger society, and Labov (2001a, p. 433) suggests that the social continuum from burnout to jock was the high school reflection of class stratification in the community.

Belten High is located in a suburb of Detroit, a city that is participating fully in the Northern Cities Vowel Shift, a chain shift that is ongoing in cities in the Northern U.S. dialect region including Rochester, Cleveland, and Chicago. One aspect of this shift is the backing of the mid-central vowel /ʌ/, so that busses can be misheard as bosses. Eckert found that the degree of /ʌ/ backing correlated with a speaker’s social group and gender, with female burnouts showing the most radical change in vowel position, and male jocks showing the least. The Varbrul weights for /ʌ/ backing at Belten High were as shown in table 3.2.

Table 3.2 Varbrul weights for the backing of /ʌ/ in Belten High, for social categories and genders.

Male jocks                     .32
Female jocks                   .42
Male burnouts                  .54
Main female burnouts           .47
Burned-out female burnouts     .93

Notice that in table 3.2 the female burnout category has been divided into two groups: main burnouts and burned-out burnouts. The latter group
contains those girls whose social characteristics placed them closest to the burnout pole of the jock–burnout continuum. Eckert found that this group had by far the most advanced backing of /ʌ/. She was also able to distinguish other fine divisions in the social categories at Belten High. Particularly interesting was a subcategory of the jocks containing individuals who had both jock and burnout friends, whom Eckert labeled brokers. Labov (2001a, p. 435) suggests that the brokers form the important link that moved the change in progress from the originating group (the burnouts) to the conservative group (the jocks).

Belten High is a reflection of the larger society in several ways. Studies of Philadelphia neighborhoods representing different social classes have shown that women are generally the agents of change. In regard to new and vigorous changes, such as the raising of (eyC), women are a full generation ahead of men (that is, 50-year-old women show the same phonetic values as 30-year-old men). Furthermore, the leaders of change are found in the middle socioeconomic classes, particularly the upper working class. These leaders, who are almost entirely women, are the equivalent of Eckert’s brokers because they are respected in their neighborhoods and also have contacts beyond their neighborhoods. Labov (2001a, pp. 77–78) sums up the embedding of a new linguistic form within the speech community as follows:

a) Linguistic changes originate in an intermediate social group: the upper working class or the lower middle class.
b) Within these groups, the innovators are usually people with the highest local status, who play a central role in community affairs.
c) The study of communication networks shows that the innovators have the highest density of social interaction and also the highest proportion of contacts outside the local neighborhood.
d) For most linguistic changes women are in advance of men, usually to the extent of a generation.
As the discussion of Belten High shows, speech patterns can serve as an emblem of group identity and status. Within the larger community, speech patterns serve as a symbolic claim to local rights and privileges. It appears that
new ethnic groups entering a community, such as the Portuguese on Martha’s Vineyard (see chapter 1), or African Americans in urban centers, do not participate in local sound changes until they begin to gain these rights and privileges.

The Evaluation Problem

Linguistic aspects. The linguistic evaluation of change addresses the question of how a communicative system can change without a loss of communication among speakers. Such loss has only recently been observed. As late as 1982, Labov stated that no study so far had documented a loss of communicative efficiency due to sound change, and that this fact suggested that speakers have an ability to avoid confusion in talking to those with radically different systems. In his words, “A recognition of structural heterogeneity seems to be built into the competence of members of the speech community. . . . The ability to speak and comprehend includes a knowledge of linguistic variation” (1982, p. 80). Labov’s argument here is functionalist. He implies that it is in the interest of a speech community to prevent misunderstandings, and therefore that its members develop the ability to comprehend distinctions that they themselves do not produce. This is no doubt true, or New Yorkers could not understand Philadelphians, Alabamans could not understand Bostonians, etc. However, more recent studies (Labov, 1994) have documented some loss of communicative efficiency among speakers with partially different phonological systems. An example can be found in the Northern Cities Vowel Shift. One aspect of this change is that /æh/ is variably raised to [eh] and [ih]. This causes misunderstandings among speakers of other dialects, even when ample context is supplied. For example, in describing a séance during a sociolinguistic interview, Debbie S., a 13-year-old Chicagoan, said:

    a  But then they did this other thing
    b  that [ðiht] they would ask the candle a question
    c  an’ if it was yes, they would tell the candles to move,
    d  and if it was no, they would have [hehv] it stand still.
    e  And so, nobody really got scared of that [ðiht].
    (Labov, 1994, pp. 188–189)

When this passage was played to speakers of other dialects, the word that in line b was frequently heard as yet, and the word have in line d was frequently heard as hear. Because linguistic change does sometimes cause miscommunication, the functionalist hypothesis described above may be too strong, and we must ask whether there are other ways in which meaning can be maintained within the speech community. In a weaker hypothesis, Labov (1994) claims that functional forces eventually assert themselves over the course of a linguistic change. When forms that convey a meaning are lost, other ways of conveying
the meaning appear. For example, French formerly signaled the feminine plural of nouns with the article las, which was opposed to the feminine singular la just as in present-day Spanish. However, the radical reduction of ﬁnal consonants in French eliminated the la, las distinction (except when the sound following las was a vowel, in which case the /s/ was preserved by liaison). So, in order to maintain the singular/plural distinction, the vowel of las was changed from /a/ to /e/ by a process not entirely understood, which is reﬂected in the present spelling, as in les ﬁlles (the girls). Labov (1994) now says that when a language changes, its information-carrying capacity can be threatened, and communicative eﬃciency between speakers can be temporarily lost. But in the long run, languages preserve their means of conveying information. Social aspects. In order to understand how speakers evaluate their own and others’ speech, it is helpful to distinguish three kinds of sociolinguistic variables, which represent three ways a form can be evaluated by the community. The ﬁrst kind of variable is the sociolinguistic indicator, a form that is below the level of a speaker’s consciousness and therefore neither noticed nor evaluated. An example is a variant of -ing found in western speech (but very rare in Philadelphia and therefore not included in the study in chapter 2). In some western dialects (including the author’s own), G, the formal variant, is pronounced [iyn] rather than [iyŋ]. Thus, G is distinguished from N by the tenseness of the vowel, not by the nasal. Speakers of other American English dialects do not notice that [iyn] is diﬀerent from [iyŋ], and they attach no (conscious) stigma to the western form. Sociolinguistic markers are above the level of consciousness, and are commented on by speakers. In addition, markers show both class and style stratiﬁcation. An example is the form N in Philadelphia speech. 
As we saw in chapter 2, working-class speakers use more N than middle-class speakers, and all speakers use less N in monitored speech. The third type of sociolinguistic variable is the stereotype, which is a marker that has become so noticed and commented on that it is associated with a particular speech variety. One example is the /r/-less speech of Boston, where “park your car on the curb” can be pronounced [pæk yə kæ on ðə kəhb]. Another example is the southern “y’all.” There is a common (though not necessary) progression in the development of the three kinds of variables. As we have seen, linguistic change usually originates among members of the upper working class. Typically, at this early stage a new pronunciation is an indicator, below the level of social awareness. For example, raised (aw) is a new and vigorous sound change that is an indicator in Philadelphia speech. It is also possible to find indicators that are long-entrenched, such as the [iyn] of western dialects that was just discussed. In the next stage, a change reaches the level of social awareness and becomes a marker. At this point, social stratification appears, with the upper classes using the new form less frequently than the middle and lower classes. In this stage or the next, the change may become stabilized and not continue to completion,
leaving two variants in alternation. In the final stage, a marker may become a stereotype, commented on by the public and evaluated unfavorably by all classes, as is the case with N. Stereotypes may even become associated with lower-class speech.

The Actuation Problem

The actuation problem asks why a change is initiated at a particular time and place. Weinreich, Labov, and Herzog (1968) characterized the actuation problem as the most recalcitrant of the five problems, and this assessment has not changed. Labov (2001a) states, “The beginnings of change are as mysterious as ever” (p. 466). Undoubtedly, there are many reasons for a change to begin at a particular place and time, but one important motivation arises when speakers adopt a new phone as an emblem of identity because they wish to align themselves with a particular social group. One example of such a change occurred after World War II when New Yorkers began to pronounce postvocalic /r/, as in forth and floor, in order to sound more like the majority of Americans. As described in chapter 1, this was an instance of change from above. However, it appears that by far the most common type of change is change from below. An example is the raising and centering of /aw/ on Martha’s Vineyard (Labov, 1972a). During the 1960s, young residents of the Vineyard had to decide whether to stay on the island or to pursue greater educational and economic opportunities elsewhere. Those who chose to stay expressed their identity as Vineyarders by reviving a local pronunciation, centralized /aw/, so that “about the house” is pronounced /əbəwt/ the /həws/. This pronunciation was used by old-timers but had almost died out. Thus, an older pattern was revived when it took on emblematic value.
A third way in which speech patterns can become group emblems, and thus actuate a change, is when the patterns of one group diﬀer from those of a stigmatized group, as can be seen in the trend toward raising the nuclei of (iy) and (ey) in Philadelphia, shown in ﬁgure 3.2. The traditional Philadelphia pattern was a lowering and centering of these vowels. Labov (1982) reports that in a sample of 110 speakers, those who were 25 years older than the average age continued the lowering trend, but, as ﬁgure 3.2 shows, speakers who were 25 years younger than the average age had reversed the trend and were raising the nuclei of (iy) and (ey). Labov explains this reversal by noting two facts: (1) the raising pattern is typical of other Northern cities, such as New York City, Chicago, and Buﬀalo, and (2) since World War II a large number of African Americans from the South have moved to Philadelphia—speakers whose lowering of (iy) and (ey) is even more pronounced than that of the older Philadelphians. Thus, Labov suggests that the reversal of the older pattern may be a reaction against the pattern of the incoming Blacks.

Conclusions

Labov (2001a, p. 437) summarizes how changes are transmitted in American urban communities in the following Principles of Transition.

1. Children begin their language development with the pattern transmitted to them by their female caretakers, and any further changes are built on or added to that pattern.
2. Linguistic variation is transmitted to children as stylistic differentiation on the formal/informal dimension, rather than as social stratification. Children associate formal speech variants with instruction and punishment and informal variants with intimacy and fun.
3. At some stage of socialization, depending on class status, children learn that variants favored in informal speech are associated with lower social status in the wider community.
4. Linguistic changes from below develop first in spontaneous speech at the most informal level. New variants are unconsciously associated with nonconformity to sociolinguistic norms, and are advanced mostly by youth who resist conformity to adult institutional practices.
5. Linguistic changes are further promoted in the larger community by speakers who have earlier in life adopted symbols of nonconformity without taking other actions that lessen their socioeconomic mobility.

This chapter has summarized some of the findings of quantitative sociolinguistic research on the speech of native speakers. In the next chapter we will see how the methods and findings discussed here have been applied to studying the speech of nonnative speakers.

II Variation in Nonnative Speaker Speech

4

The Study of Variation in Interlanguage

Introduction

In part I, we looked at mainstream sociolinguistic studies of how linguistic variables are embedded in the native speaker speech community, and how they are conditioned by linguistic, stylistic, and demographic factors. These studies are mainly sociological, not psychological, in nature (although there have been psychologically oriented studies of first language acquisition carried out within the variationist framework such as Labov and Labov [1976], Payne [1980], and Kovac and Adamson [1981]). In contrast, quantitative sociolinguistic studies of second language acquisition focus on the learning of new linguistic forms and are therefore mainly psychological in nature. Early studies, including Dickerson (1974, 1975), Adamson (1980, 1988), Huebner (1985), and Tarone (1988), looked at alternation between a nonnative form (I no play baseball) and a native form (I don’t play baseball), charting the gradual replacement of the former by the latter. In other words, these studies focused on the acquisition of the categorical rules of the target language. Corder (1981) called this kind of variation, which involves the acquisition of the basic forms of the language, vertical variation. A more recent strand of variationist research in second language acquisition (Adamson and Regan, 1991; Bayley, 1996; Mougeon, Rehner, and Nadasdi, 2004; Regan, 1996; Wolfram, Carter, and Moriello, 2004; Bayley and Regan, 2004) is both psychological and sociological in nature and therefore more closely resembles mainstream variationist studies in the native speaker community. This research examines the alternation between two or more native forms that are socially significant (such as G versus N), investigating whether learners have acquired the frequencies of usage appropriate to their speech community in terms of age, gender, and social class; that is, whether learners have acquired what Bachman (1990) calls sociolinguistic competence.
As we saw in chapter 3, sociolinguistic competence involves learning finely tuned frequencies of production of alternating forms. Corder (1981) called such variation between two native forms horizontal variation. This book is mostly about vertical variation. Chapter 5 provides an example of research in vertical variation, and other studies of vertical variation will be discussed throughout the book. But, in order to introduce the reader to the more recent strand of SLA research on
horizontal variation, this chapter reviews several representative studies. To set the stage for this review, we will first briefly examine three early studies of vertical variation, one involving first language acquisition and two involving second language acquisition.

Early Studies of Vertical Variation

Among the first scholars to study vertical variation in a learner’s speech were none other than William and Teresa Labov (1976), who studied the development of wh- questions in the speech of their three-year-old daughter Jessie. They found that Jessie variably inverted the subject and AUX verb in wh- questions, so that questions like “When can we go?” alternated with questions like “When we can go?” However, Jessie did not invert all wh- questions with equal frequency. Rather, the frequency of inversion depended upon the particular wh- word. Thus, individual wh- words could be considered constraints on a variable inversion rule that would look like (1):

(1) WH NP AUX → AUX NP / WH ⟨how, where, what, when⟩
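One way to read rule (1) is as a categorical rewrite paired with a list of constraints ordered from the most to the least favorable environment for inversion. The following is a minimal sketch of that reading, not the Labovs' analysis: the inversion rates are invented for illustration (only their relative order follows the rule), and the names `inversion_rates`, `constraint_order`, and `knockout_constraints` are hypothetical.

```python
# Inversion rates per wh- word. These numbers are invented for illustration
# and are NOT the Labovs' data; only their relative order follows rule (1).
inversion_rates = {"how": 0.90, "where": 0.70, "what": 0.50, "when": 0.30}


def constraint_order(rates):
    """Return wh- words ordered from most to least favorable to inversion."""
    return sorted(rates, key=rates.get, reverse=True)


def knockout_constraints(rates):
    """Return wh- words after which inversion is categorical (rate of 1.0)."""
    return [word for word in constraint_order(rates) if rates[word] >= 1.0]


# If the ordering of constraints stays constant during development, this is
# also the predicted order in which inversion would become categorical.
print(constraint_order(inversion_rates))
print(knockout_constraints(inversion_rates))
```

Keeping the rates in a single mapping makes the point of the variable rule visible: a categorical rule would record only that inversion is optional, whereas the mapping records where it is more and less likely.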

Variable rule (1) says that inversion is most likely after how, next most likely after where, and so on. The rule not only describes Jessie’s performance at a particular stage of development, but also makes a prediction about the future course of development. The prediction is that (1) will “go to completion” (that is, that Jessie will produce categorical inversion) first in how questions, next in where questions, next in what questions, and finally in when questions. In variationist jargon, how will become the first knockout constraint, followed by where, what, and when. Of course, that prediction might not turn out to be true. The ordering of constraints on rule (1) might change, implying a different sequence of acquisition. But the variable rule gives us a way to model Jessie’s language development at a more detailed level than a categorical rule, which would describe subject–verb inversion in wh- questions merely as optional, without specifying where inversion was more likely to occur. The first quantitative study of vertical variation in second language learners’ speech was conducted by L. Dickerson (1974), who looked at how Japanese college students studying in the United States acquired the English phoneme /r/. She found that certain linguistic environments favored accurate /r/ production and that the effect of these environments changed as the subjects gained proficiency in English. A second early study, and the first one to use the Varbrul program to analyze interlanguage data, was Adamson (1980; Adamson and
Kovac, 1981), which reanalyzed the data from Schumann’s (1978) influential study of the speech of Alberto, a 33-year-old Spanish-speaking immigrant from Costa Rica. Schumann (1978) had found that Alberto used two main strategies to negate verbs:

no + verb       I no can see.
don’t + verb    He don’t like it.

A Varbrul analysis showed that over the nine months of the study, Alberto’s use of don’t increased and that linguistic factors and speaking style affected whether Alberto would use no or don’t. In a more statistically sophisticated reanalysis of Alberto’s data, Berdan (1996) also found that these factors affected don’t production, although his analysis somewhat modified Adamson’s (1980) and Adamson and Kovac’s (1981) list of linguistic constraints. The most important modification was the discovery that Alberto tended to use several don’ts in a row. That is, if he used don’t in a particular negative sentence, he was more likely to use don’t in the next negative sentence. This fact suggests that when Alberto activated a mental program for negation with don’t, that program stayed partially activated for immediate subsequent use. A similar phenomenon was found by Young (1991, p. 138), whose Chinese-speaking subjects were more likely to mark English nouns as plural if plurality was marked earlier in the noun phrase by a demonstrative pronoun, numeral, or quantifier. For example, plural -s was more likely to appear on the word book in the sentence “He wants several of those books for Christmas” than in the sentence “He wants books for Christmas.”

Studies of Horizontal Variation

Studies of how second language learners acquire sociolinguistically significant forms have been carried out in both naturalistic settings and classroom settings. In the following pages, we will briefly review studies of both kinds.

Naturalistic Learners

Bayley (1996) looked at both vertical and horizontal variation in the speech of Chinese speakers, focusing on a classic sociolinguistic variable, final /t,d/ production in consonant clusters, as in mist and raised (see the discussion in chapter 1).
He found that the surrounding linguistic environment constrained the production of these consonants in basically the same way that it constrained /t,d/ production in the speech of native English speakers (as discussed in Labov, 1994) and in the speech of Vietnamese speakers of English (as discussed in Wolfram, 1985). Bayley (1996) also took into account three factors of particular interest to SLA research: his subjects’ English proficiency, their speaking style, and whether they frequently interacted with native English speakers. The subjects of the study were 20 adult native speakers of Mandarin
from both China and Taiwan, mostly students, who had lived in the United States for periods ranging from 2 to 61 months. As we saw in chapter 1, for speakers of African American English the strongest constraint on /t,d/ deletion is a following consonant, and Bayley (1996) found this to be true in the speech of his subjects, as well. In addition, he found that a preceding obstruent or nasal strongly favored deletion, as did voicing agreement between the sounds preceding and following /t,d/. That is, if the preceding and following sounds were both voiced (as in mild overbite) or both voiceless (as in mist from the sea), deletion was favored, but if the two sounds had dissimilar voicing (as in mist over the river), deletion was disfavored. Bayley also found, as had Wolfram (1985) in his study of Vietnamese English, that the grammatical status of /t,d/ had the opposite eﬀect on tense marking (when /t,d/ signals past tense) than it has for native speakers. Recall from chapter 1 that Wolfram (1969) had found that for African Americans in Detroit, when ﬁnal /t,d/ marked past tense, it was less likely to be deleted than when it did not mark past tense (as in mist). However, Bayley (1996) found exactly the opposite pattern: past tense /t,d/ was more likely to be deleted. This makes sense because adult native speakers of English do not have to learn to mark past tense; that aspect of the grammar has been completely acquired. Therefore, the major cause of deletion for these speakers is the phonological tendency to reduce consonant clusters. However, deleting a past tense marker can eliminate important information, so native speakers resist this tendency. For nonnative speakers, on the other hand, there are two causes of deletion: the phonological tendency to delete consonants from clusters, and imperfect mastery of the past tense. 
Thus, nonnative speakers who have not mastered the past tense rule have two obstacles to overcome, and therefore are more likely to omit final /t,d/ from a word like fined than from a word like find. Bayley (1996) also looked at the effect of speaking style. He elicited data in three speaking contexts: a one-hour sociolinguistic interview, a reading passage, and the retelling of a story that the subjects had learned in Chinese. He found the most deletion in casual conversation (the least monitored style), less deletion in story retelling (a more monitored style), and the least deletion in reading (the most monitored style). The relationship between style and monitoring will be discussed more fully in chapter 8. A particularly interesting feature of Bayley’s study was that he divided the subjects into two groups according to the amount of contact they had with native English speakers. One group belonged to an almost entirely Chinese social network and had little contact with Americans outside of the classroom, while the other group belonged to a social network that included Americans. Bayley (1996) made two comparisons between these groups. The first comparison was which group marked past tense more accurately in irregular verbs. He found that the speakers who had more contacts with native speakers were more accurate. The second comparison was which group deleted /t,d/ more
from regular verbs. He found that the group with English social contacts deleted /t,d/ more often, a result that is surprising at ﬁrst glance. Shouldn’t the accuracy of past tense marking simply be a function of English proﬁciency? Why would the group that is more accurate with irregular verbs be less accurate with regular verbs? Bayley (1996) pointed out that, while past tense marking of irregular verbs is wholly a developmental process (movement on the vertical continuum), marking with /t,d/ is subject to community speech norms (movement on the horizontal continuum). He concluded that the group that had more American contacts had better acquired the general grammatical rule for past tense marking, and that they applied it to the underlying structure of all past tense verbs. However, this group had also acquired community speech norms to some extent, resulting in appropriate variable deletion of /t,d/. Wolfram, Carter, and Moriello (2004) studied Spanish-speaking English learners who, like Alberto, had come to the United States largely for economic reasons, intending to establish residence. These researchers focused on two communities in North Carolina: Raleigh and Siler City, where recent immigration from Mexico and Central America has resulted in large Hispanic communities. Wolfram, Carter, and Moriello (2004) note that within these communities Spanish speakers are learning English almost entirely in the American surroundings, and that their language abilities range from monolingual Spanish to fully proﬁcient bilingualism. According to the 2000 census (Wolfram, Carter, and Moriello, 2004, p. 356), Hispanic immigrants to North Carolina show the lowest degree of English proﬁciency of immigrants to any state, and the children who are born to these immigrants in the United States are Spanish dominant. The research question of this study was whether these immigrants and their children were acquiring features of Southern English or of general American English. 
Specifically, the researchers looked at the diphthong /ay/, which in Southern Piedmont speech is fronted and unglided before voiced segments, so that time and side are pronounced [ta:m] and [sa:d]. Spanish has phonemic /ay/ in words like bailar (dance) and hay (there is), though in the speech of two monolingual Spanish speakers whom the researchers examined the glide had a longer trajectory and a higher and more fronted endpoint than the English glide. The fact that Spanish has /a/, as in dama (woman), as well as /ay/, suggests that neither the Southern nor the non-Southern pronunciation of /ay/ is favored by transfer from Spanish. The researchers examined the speech of ten Spanish-speaking residents of Siler City and seven Hispanic residents of Raleigh, mostly adolescents and teenagers, with lengths of residence in the United States of between two and seven years. They note that among native English speakers glide reduction in /ay/ is more prominent in rural Siler City than in urban Raleigh, where it is not heard in the speech of many residents who came from northern states. A convenient measure of the difference between Southern and non-Southern /ay/ is the duration of the glide compared to the duration of the entire vocalic
segment. In non-Southern English the glide is much longer, accounting for 47 percent of the vocalic segment, whereas in Southern English the glide accounts for only 17.5 percent. Based on the length of the glide in /ay/, Wolfram, Carter, and Moriello (2004) found that the Spanish speakers in rural Siler City had more Southern-sounding diphthongs than the Hispanic speakers in urban Raleigh, concluding that “several Siler City speakers show some accommodation to the local Southern norm, unlike their Raleigh counterparts” (p. 353). As this statement suggests, there was considerable variation among the individual speakers. A particularly telling contrast was that between two siblings, an 11-year-old girl and a 13-year-old boy. Both of these children were the oﬀspring of Mexican immigrants, and both had lived in Piedmont, North Carolina, all of their lives. In sociolinguistic interviews the girl produced only 5.9 percent unglided /ay/, while the boy produced 62.8 percent. Wolfram, Carter, and Moriello (2004) note that the boy identiﬁed with the non-Hispanic “jock culture” of adolescent boys, whereas his sister was more oriented toward mainstream American culture. Despite individual exceptions, the participants in this study were not, in general, accommodating to the Southern norm. Wolfram, Carter, and Moriello (2004) suggest that this was largely due to the residential and social segregation of the Hispanic communities, and to the fact that the constant stream of Spanish dominant immigrants reinforced the importance of communication in Spanish. Wolfram, Carter, and Moriello (2004) also found evidence that learners acquire the phonology of the target language, in part, by learning the pronunciation of individual words, a process similar to lexical diﬀusion (see chapter 3). 
For example, Martin, a Raleigh resident with excellent proficiency in English, showed much shorter glide paths for the diphthongs in the words five and outside than for the diphthongs in other words, which suggests that these lexical items had been learned individually. Other evidence that pronunciation was learned by lexical items rather than by word classes involved the Southern merger of [i] and [e] before nasals, so that him and hem are homonyms, both pronounced [him]. Wolfram, Carter, and Moriello (2004) found subjects who pronounced only certain lexical items in this word class, such as pen, with the Southern [i].

Formal Learners

In formal settings, there have been more studies of French acquisition than of English acquisition, and we now review two studies that are representative of this research. The first longitudinal variationist study of instructed learners was Regan (1996), which looked at the developing French of seven Irish university students, all of whom had studied French in high school and college and spoke French at an advanced level, and none of whom had lived in France for longer than several months. These students (six women and one man) had been selected to take part in a study abroad program that would allow them to spend
a year in France, and Regan took advantage of this fact to study their sociolinguistic competence before and after the immersion in French. Regan (1996) looked at the variable ne, which occurs in negative expressions such as:

Je ne pourrais pas        “I wasn’t able”
Elle ne travaille plus    “She didn’t work anymore”
Je n’aimais rien          “I didn’t like anything”

In spoken French, it is possible to delete ne in such constructions, especially in informal style. The rates of deletion range from 98 percent in Montreal French (Sankoff and Vincent, 1977) to 44.1 percent in Parisian upper-middle-class French (Ashby, 1996).1 In addition, Ashby (1981) found evidence of age grading (young people delete more) and gender stratification (women delete more). Thus, ne appears to be a classic sociolinguistic variable, constrained by demographic and stylistic factors. Regan (1996) compared her subjects’ rates of ne deletion before and after their year in France to see whether they were accommodating to French sociolinguistic norms. She found that there was a significant change. The Varbrul p value for ne deletion before immersion was .32. After the stay in France this value increased to .67. Regan (1996) also found interesting results regarding the constraints on deletion, including monitored versus unmonitored style. Native French speakers delete ne slightly more in unmonitored style (p = .47) than in monitored style (p = .52) (p. 189). Before immersion, the learners showed a much larger difference in ne deletion in these two styles: p = .63 for unmonitored style versus p = .35 for monitored style. After the stay in France, however, this difference had narrowed considerably and almost matched the rates of the native speakers (p = .57 in unmonitored style versus p = .44 in monitored style). Thus, the subjects had learned to increase the rate of deletion in their monitored style. After the year in France, their revised hypothesis seemed to be: delete less in monitored style, but delete at a high rate in both styles. Regan also found significant effects for phonological constraints (place and manner of the following segment), and morphological/syntactic constraints (presence of object clitic, pronoun subject, and clause type).
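The Varbrul p values reported here are factor weights rather than percentages: a weight above .5 favors deletion and a weight below .5 disfavors it. In the logistic form of the variable rule model, factor weights are combined with an overall input probability on the log-odds scale. The following is a minimal sketch of that combination, not Regan's analysis: the input probability of 0.5 is an assumption made purely for illustration (no input probability is reported in this discussion), and the function names are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_rate(input_probability, factor_weights):
    """Combine an input probability with factor weights on the log-odds scale."""
    total = logit(input_probability) + sum(logit(w) for w in factor_weights)
    return inverse_logit(total)


# ASSUMPTION: a neutral input probability of 0.5, chosen only for illustration.
ASSUMED_INPUT = 0.5

# Factor weights for ne deletion before and after the year in France.
before = predicted_rate(ASSUMED_INPUT, [0.32])
after = predicted_rate(ASSUMED_INPUT, [0.67])
print(round(before, 2), round(after, 2))
```

Under this model a weight of exactly .5 contributes nothing on the log-odds scale, which is why .5 serves as the dividing line between favoring and disfavoring factors.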
A particularly interesting finding involved whether ne was deleted from formulaic phrases, such as je ne sais pas (I don’t know). Deletion was strongly favored in such phrases, both before the stay in France (p = .74) and after the stay (p = .80). Both of these p values are higher than the values for native speakers (p = .63). Thus, the learners overgeneralized ne deletion in these phrases. Such overgeneralization has also been observed in native speaker speech. In his study of /r/ in New York City (see chapter 1 and chapter 8), Labov (1966) found that the second highest social class overgeneralized the pronunciation of postvocalic /r/ in highly monitored styles. In their attempt to match the style of upper-class New Yorkers, these speakers overshot the mark. Regan (1996, p. 191) notes that [ʃeypa] is a
stereotype for je ne sais pas (I don’t know), and [seypa] is a stereotype for ce n’est pas (that’s not so). She speculates that the learners may overgeneralize these easily produced stereotypes in an effort to accommodate to the overall ne deletion they encountered in France. Such learning of phrases is a form of lexical learning (the phrase is mentally stored as a long word) and has been noticed by other scholars including Wolfram (1985) and Adamson and Regan (1991). Howard, Lemee, and Regan (2006) investigated the French acquisition of 19 university students in Ireland who were native speakers of English studying French as a foreign language. The researchers elicited their data by means of a sociolinguistic interview. All of the subjects had studied French for five or six years before university and for three years at university. In addition, 15 of the subjects had spent a year in France. The researchers focused on the sociolinguistic variable /l/, which can be deleted in words such as il (he) and elle (she). Several studies have identified linguistic constraints on /l/ deletion in the speech of native French speakers. Poplack and Walker (1986), for example, found that Quebec French speakers deleted /l/ more frequently in the third person pronouns, with rates of deletion ranging from 33 percent with elle to 100 percent for impersonal il. Ashby (1981) found that continental French speakers in Tours deleted at rates ranging from 63 percent for elle to 88 percent for impersonal il. Rates of deletion have been found to be lower for object pronouns and definite articles. Howard, Lemee, and Regan (2006) looked to see if their students deleted /l/ in frequencies similar to those of native speakers and, if they did, whether their /l/ deletion was constrained by the linguistic environment, the context of elicitation, the speaker’s sex, and the degree of contact with native French speakers.
The researchers found that the degree of contact with native speakers had a strong inﬂuence on the percentage of /l/ deletion. The students who had never been to France deleted /l/ at only 6 percent, while those who had studied in France for a year deleted it at 30 percent, which, it should be noted, is still far less than native speakers. A number of the linguistic constraints on /l/ deletion appeared to pattern like the constraints in native speaker speech. As in native French, impersonal il strongly favored deletion (at 45 percent), whereas elle was the least favoring pronoun (at 6 percent). Howard, Lemee, and Regan (2006) also found gender stratiﬁcation, similar to that in native French, with females deleting /l/ twice as often as males (45 to 22 percent). However, there was no signiﬁcant diﬀerence in formal versus informal speaking styles. These results are similar to those for the Japanese speakers studied by Major (2004) (see chapter 8) because the subjects had learned gender stratiﬁcation but not style stratiﬁcation. Thus, as Major (2004) suggests, it may be that gender stratiﬁcation is learned before style stratiﬁcation in second language acquisition. Mougeon, Rehner, and Nadasdi (2004) discussed several previous studies (Mougeon, Nadasdi, and Rehner, 2002; Rehner, Mougeon, and Nadasdi, 2003)

that examined the spoken French of 41 adolescent native English speakers who were enrolled in a French immersion program in Toronto. These programs feature 50 percent French medium instruction in grades 5 through 8, followed by 20 percent French medium instruction in high school.2 The subjects of the study were equally divided among high, intermediate, and low proﬁciency levels in French and, as questionnaires showed, had only marginal exposure to French outside the classroom. The researchers looked to see how these students used 13 sociolinguistic variables like ne deletion in their speech and then compared the frequencies of the students’ usage to the frequencies of these variables in the speech of native Quebec French speakers, the speech of the students’ French instructors, and the textbooks that the students were using. We will discuss only three of the variables examined in this study, but they are representative of the overall results. The researchers divided the 13 linguistic forms they studied into three groups depending on their degree of formality or stigmatization: vernacular forms, mildly marked forms, and formal forms. Vernacular forms do not conform to the rules of standard French and are socially stigmatized. An example is m’as + inﬁnitive, as in M’as aller a l’école (I’m going to school). Mildly marked forms are used in informal contexts. One example is ne deletion, as discussed earlier. Another example is the construction je vas, as in “je vas à l’école” (I’m going to school). Formal forms conform to the rules of standard French and are used in writing and formal speaking contexts. An example is the form je vais, as in je vais à l’école (I’m going to school). The results for these forms, which are representative of the overall results, are found in table 4.1, where it can be seen that the texts, teachers, and students

Table 4.1 Percentage of forms for L1 speakers of Quebec French, French language arts materials, French immersion teachers, French immersion students.

| Linguistic variables | L1 Quebec French | Materials: Texts | Materials: Dialogues | Immersion teachers | Immersion students |
|---|---|---|---|---|---|
| M’as + inf. (vernacular) | 28 | 0 | 0 | 0 | 0 |
| Je vas + inf. (marked) | 60 | 0 | 0 | 1 | 10 |
| Ne use (marked) | 1 | 99.9 | 97 | 71 | 70 |
| Je vais + inf. (formal) | 12 | 100 | 100 | 99 | 90 |

Note: Very small percentages of forms unique to the immersion students’ interlanguage have been omitted.

overwhelmingly used the formal forms. Based on all of the data, the authors reached these six conclusions:
1. The immersion students never use vernacular forms or use them only marginally.
2. The immersion students use mildly marked forms considerably less frequently than native speakers of Quebec French.
3. The immersion students use formal forms considerably more frequently than native speakers of Quebec French.
4. Female students use some of the formal forms more frequently than male students, and middle-class students use some of the formal forms more frequently than upper-working-class students. Similar correlations are found in native speaker communities.
5. Students who had contacts with native speakers outside of school display a better mastery of mildly marked forms (a finding similar to that of Regan [1996]).
6. There is a correlation between the frequency of forms in the educational input and students’ speech (as is obvious in table 4.1).
The researchers note that “the teachers and, to an even greater extent, the pedagogical materials make no or only marginal use of vernacular variants [and] use mildly marked variants only infrequently . . . by and large we saw . . . these patterns . . . inflated in the students’ own patterns of variant use” (p. 426). Perhaps the most obvious finding of Mougeon, Rehner, and Nadasdi (2004) is that, in most cases, the frequency of variable forms in the input was reflected in the frequency of these forms in the students’ output. For example, the learners were exposed to variable ne production in the speech of their instructors at the rate of 70 percent, and they produced ne in their own speech at the rate of 71 percent. As expected, forms that the learners were not exposed to, such as the Quebec vernacular form m’as + infinitive (though used 28 percent of the time by native Quebec French speakers) were never produced by the learners. An exception to the input/output match involved the marked form je vas + infinitive.
In this case, the learners had only the slightest exposure to the form: 1 percent in the speech of their instructors. Nevertheless, they produced the form at 10 percent in their own speech. This overgeneralization of frequency was also found in Regan’s (1996) study in regard to stereotyped forms like je sais pas. Recall that Regan speculated that je sais pas may be a frozen form, which students use at high frequency in an attempt to make up for their overall overuse of ne. This explanation presupposes that the learners desired to use informal forms in order to sound appropriately casual. It may be that the same motivation applies to the overgeneralization of je vas + inﬁnitive. That is, because the students wished to sound informal, they overgeneralized a marked, frozen form.

Conclusions The studies reviewed in this chapter found similarities between variation in the interlanguage of both formal and naturalistic learners and variation in native speaker speech. While the interlanguage spoken by the participants was far from native-like, it varied along some of the same dimensions as native speech. For example, both interlanguage variation and native language variation can be constrained by universal articulatory processes. Bayley (1996) found that a following consonant favored ﬁnal /t,d/ deletion by his Chinese-speaking subjects, just as this environment favors /t,d/ deletion in the speech of native English speakers. This constraint was also found by Wolfram (1985) in the speech of his Vietnamese-speaking subjects. Also, speaking style and topic can constrain interlanguage variation in ways similar to native speaker speech. Bayley (1996) and Regan (1996) found that learners produced more native-like speech in circumstances that encouraged more attention to speech, or monitoring. In addition, all of the researchers found that learners were able, to some extent, to internalize constraints on the variation that they encountered in input. At the most basic level, classroom learners approximated the percentages of variable forms in the speech of their instructors and in their textbooks. Similarly, Regan’s (1996) and Howard, Lemee, and Regan’s (2006) subjects altered their rates of variation after being exposed to diﬀerent frequencies of input during a year in France. At a ﬁner level, the researchers discovered that learners were able to internalize “unnatural” constraints (that is, constraints not motivated by universal articulatory processes) that were present in input. For example, Howard, Lemee, and Regan (2006) found that, like native speakers, French learners deleted /l/ more frequently in impersonal il than in elle. 
At a still finer level, Howard, Lemee, and Regan (2006) found that female learners matched the pattern of /l/ deletion of female native speakers, and male learners matched the pattern of male native speakers. This finding is consistent with Major’s (2004) suggestion that language learners acquire gender stratification before they acquire style stratification, as discussed in chapter 8. It should also be noted that Wolfram, Carter, and Moriello (2004) found some dissimilarity in learner and native speaker variation. Some of their Spanish-speaking subjects used longer glides in the words five and outside than in other words containing /ay/, a fact that suggests that these two words were learned individually. Similar findings have been reported in other studies of second language acquisition, as also discussed in chapter 8.

5

The Acquisition of English Irregular Past Tense by Chinese-speaking Children LARRY BERLIN AND H. D. ADAMSON

Introduction In this chapter, we deal with the matter of vertical variation as it applies to a group of Chinese-speaking children in their acquisition of an obligatory grammatical feature. This type of work is consistent with previous studies (Bayley, 1996; Wolfram, 1985; Young, 1991) and adds to the work amassed in an eﬀort to understand the developmental process in acquiring grammatical features of a second language. Our longitudinal study examined irregular past tense marking using the Varbrul multivariate analysis program. The power of Varbrul lies in its ability to examine many possible combinations of constraints on past tense marking in order to determine which combination best accounts for the variation observed in our data. Constraints on Past Tense Marking Phonological Constraints To begin, we brieﬂy review studies by Wolfram (1985) and Bayley (1994) of the phonological constraints on second language learners’ acquisition of English past tense mentioned in chapter 4. Wolfram (1985) studied Vietnamese adolescents and adults living in the United States and found that their past tense marking of regular verbs was constrained by the surrounding phonological environment in much the same way as it is constrained in native speakers’ speech. That is, regular past tense /t,d/ is less likely to be produced when followed by a consonant than when followed by a vowel because the articulation of consonant clusters is more diﬃcult than a consonant–vowel sequence. Bayley’s (1994) study of Chinese adults living in the United States yielded the same conclusion. In a more recent study Hansen (2001) used Varbrul to determine accuracy orders and production modiﬁcations in three Chinese female learners’ production of English regular past tense over a six-month period, focusing on the consonant clusters that are created by the addition of a past tense morpheme. 
She found that “Chinese learners of English employ different production strategies based on the length of the coda [that is, the
syllable ﬁnal consonant(s) that follows the vowel]” (p. 362), but that saliency proved a dominant criterion with various (and at times multiple) linguistic constraints as well as “natural phonological processes” contributing to the learners’ acquisition. In regard to irregular past tense verbs, Wolfram (1985) and Bayley (1994) found that the underlying verb class was most likely to indicate which verbs would be marked for past. The four irregular verb classes these researchers identiﬁed were suppletives, doubly marked verbs, verbs that undergo an internal vowel change, and verbs with replacive consonants. Suppletive verbs have idiosyncratic past tense forms (as in go → went). Doubly marked verbs undergo two phonological changes between the base or present tense form and the past tense form, typically that of an internal vowel change and either the addition of the past tense morpheme (as in do → did; tell → told) or the replacement of the ﬁnal consonant with the past tense morpheme (as in bring → brought and teach → taught). Internal vowel change verbs undergo a change in the nucleus (vowel) of the syllable rather than in the coda (as in know → knew and lie → lay). Replacive consonant change involves changing a ﬁnal consonant. This kind of change is more salient when it involves a diﬀerent place of articulation (as in have → had and make → made) and less salient when it involves only voicing (as in send → sent and lend → lent). The phonological studies conducted by Wolfram (1985) and Bayley (1994) revealed that irregular verbs whose past forms were maximally diﬀerent from their present forms, such as suppletives (e.g., go → went; am/are/is → was/ were), were more likely to be marked for past tense than irregular verbs with relatively similar past and present forms (e.g., come → came). 
These findings led Wolfram (1985) to propose the principle of saliency, suggesting that second language learners will first acquire the past tense of verbs that show maximal difference between the base or present tense form and past tense form. The resultant order for English irregular past tense, then, would be expressed as follows:
suppletive > doubly marked > internal vowel change > replacive consonant
Both researchers found that the principle of saliency roughly held true for both low and high proficiency subjects, though Wolfram (1985) found there was more individual variation among the low proficiency subjects.
Semantic Constraints
Vendler (1967) proposed a classification for verbs according to their semantic type. He suggested that there are essentially four types of verbs based on their inherent meaning; these are states, activities, accomplishments, and achievements. Stative verbs have no dynamic nature, and thus maintain the same relative quality throughout their duration. Examples include sensory, cognitive, emotive, and possessive verbs, such as see, remember, love, and have. Activity verbs are those which have some duration and are therefore not punctual (i.e., instantaneous). The activity described is homogeneous throughout its duration, or it has the same quality at every moment it is occurring. Another feature of activities is that they have an arbitrary endpoint rather than an inherent one; in other words they are atelic, as opposed to telic verbs, which have a clear endpoint (see below). Examples of activities include run, sing, and play. Accomplishments, like run a mile and build a house, are like activities in that they are not punctual, but unlike activities in that they are telic; that is, they have a clear endpoint or they result in a product. Usually, verbs that express accomplishments are transitive, whereas verbs that express activities are intransitive. The final verb type identified by Vendler (1967) is achievements. Achievements differ from the other three verb types because they are punctual. That is, at the moment they occur, the action of the verb is simultaneously completed. Consequently, they also have an inherent endpoint and are therefore also telic. Examples of this verb type are start and finish. Although they can take the progressive form, these two actions are inherently instantaneous because, once an activity is started, the actors are engaged in the activity, not in the starting. Other examples include reach the summit, die, and recognize. Andersen (1993; Shirai and Andersen, 1995) studied semantic constraints on the acquisition of English past tense by native-speaking children. Using Vendler’s (1967) original classifications, he proposed a reclassification using only three semantic features: punctual, telic, and dynamic (see table 5.1 for a comparison of the two classification schemes).
As a result of this reclassiﬁcation, Andersen (1993; Shirai and Andersen, 1995) posited that telicity is the most relevant feature in past tense marking, especially for novice learners, because events with an inherent endpoint can more naturally be construed as being completed than verbs without a clear endpoint. For the purpose of our study, we will classify the verbs only as telic or atelic.

Table 5.1 Three semantic features associated with Vendler’s (1967) four semantic categories of verbs.

| | State | Activity | Accomplishment | Achievement |
|---|---|---|---|---|
| Punctual | − | − | − | + |
| Telic | − | − | + | + |
| Dynamic | − | + | + | + |
| Example | see | run | run a mile | die |
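The feature bundles in table 5.1 can be read as a small lookup table. The following is a minimal sketch, not part of the original study: the names `Features`, `VENDLER`, `classify`, and `telicity` are hypothetical, and the only data used are the plus and minus values and example predicates from table 5.1. It also shows the collapse into the telic/atelic split that the chapter adopts.

```python
from typing import NamedTuple, Optional


class Features(NamedTuple):
    punctual: bool
    telic: bool
    dynamic: bool


# Feature values copied from table 5.1 (True = "+", False = "-").
VENDLER = {
    "state": Features(False, False, False),
    "activity": Features(False, False, True),
    "accomplishment": Features(False, True, True),
    "achievement": Features(True, True, True),
}


def classify(punctual: bool, telic: bool, dynamic: bool) -> Optional[str]:
    """Map a feature bundle to its Vendler category, or None if no category matches."""
    target = Features(punctual, telic, dynamic)
    for name, feats in VENDLER.items():
        if feats == target:
            return name
    return None


def telicity(category: str) -> str:
    """Collapse the four categories into the two-way telic/atelic split."""
    return "telic" if VENDLER[category].telic else "atelic"


# Example predicates from table 5.1.
EXAMPLES = {
    "see": "state",
    "run": "activity",
    "run a mile": "accomplishment",
    "die": "achievement",
}

for predicate, category in EXAMPLES.items():
    print(f"{predicate}: {category}, {telicity(category)}")
```

Under this reading, states and activities fall together as atelic, and accomplishments and achievements fall together as telic, which is the only distinction carried into the analysis that follows.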

Discourse Constraints
Pragmatic work in narrative discourse structure (Berman and Slobin, 1994) has identified two prominent types of clauses found in narratives: background and foreground clauses. Background clauses are those which set the scene and provide exposition; in contrast, foreground clauses are those which advance the action of the story as it unfolds. Examples from our own data include the following.
(1) [But she draw the, eh little man on the TV] (foreground)
    [and she wasn’t listening.] (background)
(2) [And then the mole was very scared, and scared the lion would eat him.] (background)
    [But the lion only picked him up and pat him] (foreground)
(3) [Mother was watching the eggs.] (background)
    [And one of the eggs cracked.] (foreground)
    [And she was really happy.] (background)
    [Then the two cracked.] (foreground)
As these examples show, the past tense can be used in both background and foreground clauses. However, the present tense can also be used in background clauses (as in evaluative statements like “I hate when that happens”) and in foreground clauses in the historical present tense. Thus, while the exclusive use of past tense is not requisite in narratives, its use is clearly predominant (see also Labov, 1972b; Schiffrin, 1981; and Wolfson, 1982, for a discussion of the historical present in foreground clauses).
In a study of second language learners’ acquisition of English past tense, Bardovi-Harlig (1995) examined the narratives produced by 37 subjects from various language backgrounds. The study used a cross-sectional design where the subjects represented different levels of proficiency. The primary research question was whether past tense would be used first in background clauses or in foreground clauses. The subjects were asked to view a short silent film and then to retell the story both verbally and in writing. The analysis revealed that past tense marking occurred more frequently in foreground clauses than in background clauses, regardless of the subjects’ proficiency level, leading Bardovi-Harlig (1995) to conclude that “learners mark foreground events for past first and use a variety of forms in the background” (p. 286).
In a cross-linguistic study of narrative development, Aksu-Koç and von Stutterheim (1994) concluded that children under ﬁve years old from various language backgrounds tend to express early narratives as a simple sequence of equally weighted events—typically the foregrounded action—gradually adding

background as they begin to complexify the grammatical structure into more elaborate, hierarchical discourse structures. Moreover, an examination of the developmental stages of discourse competence acquisition of English, German, Hebrew, Spanish, and Turkish speakers from childhood to adulthood indicated that “switches [in verb forms] found in the 3-year-olds’ stories are aspectual rather than temporal [with] English-speaking preschoolers [exhibiting] diﬃculty in adhering to an anchor tense” (p. 452). Variation in Chinese-speaking Children’s Marking of English Irregular Past Tense In our study, we wanted to learn which phonological, semantic, and/or pragmatic factors constrained the acquisition of the English irregular past tense by our subjects. The transcripts we examined were comprised of spoken narratives derived from a retell protocol similar to that conducted by Bardovi-Harlig (1995). Mandatory past tense contexts were coded for a variety of factors including those identiﬁed by earlier research. Our longitudinal data represented a three-year period in the lives of eight Chinese children living in the United States and acquiring English as a second language. Subjects and Data Collection Our subjects, four boys and four girls, ranged in age between 3 and 11 years old at the beginning of the data collection period (see table 5.2 for an overview), and were all children of graduate students attending the University of Arizona. The children were all native speakers of Chinese,1 having lived with their parents in China prior to coming to the United States. At the time the data collection began, the children possessed varying levels of proﬁciency in English. Over a three-year period, the children were individually interviewed approximately one to two times each semester.2 On those occasions, they were shown one or two short, animated cartoons. After viewing the cartoons,

Table 5.2 Overview of individual subjects.

| Subject | Age* | Sex |
|---|---|---|
| X | 11:0 | M |
| F | 7:2 | M |
| D | 5:8 | F |
| L | 5:5 | F |
| M | 5:0 | M |
| T | 4:10 | M |
| Y | 3:11 | F |
| J | 3:0 | F |

* Refers to age at the initiation of the study

the children were asked to retell the story they had just seen. The narratives produced by the children were tape-recorded and transcribed. Data Analysis The transcripts were analyzed for obligatory past tense contexts (for examples see the discussion of foreground and background clauses above). The question of what constitutes an obligatory context in a narrative can be somewhat complicated because native speakers and highly proﬁcient second language learners may use the historical present in foreground clauses to refer to past events (Adamson, Fonseca-Greber, Kataoka, Scardino, and Takano, 1996; Schiﬀrin, 1981). In the present study, however, there was no expectation that the children would have any knowledge of historical present because it requires considerable familiarity with native speaker norms, and it is highly unlikely that our subjects had much exposure to this stylistic device. We therefore felt conﬁdent that it would be appropriate to code all foreground clauses as obligatory for past marking, following Wolfram (1985) and Bayley (1994). We coded each verb in an obligatory past tense context as marked or unmarked for past. This marking could be considered the dependent variable in our correlational study. We also coded for the linguistic and other factors that we thought would constrain past tense marking. These factors, which could be considered independent variables, were contained within ﬁve factor groups. Initially these factor groups were: 1) individual subjects; 2) time period; 3) phonological verb class; 4) semantic verb type (telic or atelic); and 5) clause type. As in the Varbrul analysis described in chapter 2, each factor group contained a set of mutually exclusive, independent factors that exhausted all of the data. In other words, each verb could be coded for only one factor per factor group (e.g., a verb must be either telic or atelic). 
Analyses were conducted using Goldvarb 2001 (Robinson, Lawrence, and Tagliamonte, 2001), an updated version of the Varbrul 2 multivariate analysis program for Windows (Cedergren and Sankoﬀ, 1974). Our predictions, based on the research cited above, were as follows: 1. Past tense would be more frequent in later time periods; in other words, the subjects’ interlanguage would move closer to native norms over time. 2. Salient verb classes (classes in which the base/present form and the past form are maximally diﬀerent) would be marked more frequently. 3. Telic verbs would be marked more frequently than atelic verbs. 4. Foreground clauses would be marked more frequently than background clauses.
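The coding scheme described above (one binary dependent variable plus exactly one factor from each factor group per verb) can be sketched as a small data structure. This is an illustration only, not the authors' coding software: the class name `Token`, the factor labels, and the four sample tokens are all hypothetical, and the sample tokens are invented rather than drawn from the study's transcripts.

```python
from collections import Counter
from dataclasses import dataclass

# Allowed factors in each factor group. Labels are illustrative assumptions;
# the study's actual coding symbols are not given in the text.
FACTOR_GROUPS = {
    "subject": {"X", "D", "F", "L", "J", "T", "Y", "M"},
    "period": {1, 2, 3},
    "verb_class": {
        "suppletive",
        "doubly marked",
        "internal vowel change",
        "replacive consonant",
    },
    "semantic_type": {"telic", "atelic"},
    "clause_type": {"foreground", "background"},
}


@dataclass(frozen=True)
class Token:
    """One verb in an obligatory past tense context."""
    verb: str
    marked: bool  # dependent variable: marked for past or not
    subject: str
    period: int
    verb_class: str
    semantic_type: str
    clause_type: str

    def __post_init__(self):
        # Each token carries exactly one factor per group, and that factor
        # must belong to the group's closed set.
        for group, allowed in FACTOR_GROUPS.items():
            value = getattr(self, group)
            if value not in allowed:
                raise ValueError(f"{value!r} is not a factor in group {group!r}")


def percent_marked(tokens, group):
    """Percentage of tokens marked for past, broken down by one factor group."""
    totals, marked = Counter(), Counter()
    for token in tokens:
        factor = getattr(token, group)
        totals[factor] += 1
        marked[factor] += token.marked  # True counts as 1, False as 0
    return {factor: 100 * marked[factor] / totals[factor] for factor in totals}


# Invented tokens, for illustration only.
TOKENS = [
    Token("went", True, "X", 1, "suppletive", "telic", "foreground"),
    Token("was", True, "M", 2, "suppletive", "atelic", "background"),
    Token("come", False, "M", 1, "internal vowel change", "telic", "foreground"),
    Token("know", False, "Y", 1, "internal vowel change", "atelic", "background"),
]

print(percent_marked(TOKENS, "verb_class"))
```

Storing each factor group as a single field makes the "mutually exclusive" requirement structural: a token cannot be both telic and atelic, and an unknown label is rejected when the token is created.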

Cross-tabulation: Checking the Factor Groups for Interaction
The Varbrul program assumes that the independent variables do not affect one another. That is to say that if a telic verb favors marking and a suppletive verb favors marking, then a verb that is both telic and suppletive will favor marking even more. If that is not the case, then the factors are said to interact. Constraint interaction can indicate very interesting linguistic phenomena, but interacting constraints should not be simultaneously analyzed by Varbrul. Thus, before running a Varbrul analysis, the researcher should check for constraint interaction by cross-tabulating the two factor groups in question. Table 5.3 shows the cross-tabulation of the time periods against the proficiency level of the individual subjects. As previously mentioned, the data were collected over a three-year period, which we divided into three one-year time periods for the purpose of analysis. Individual subjects should improve over time if they continued to acquire English regardless of their proficiency level at the time the data collection was initiated. As can be seen in table 5.3, this was indeed the case with the exception of subject X; nonetheless, if 80 percent accuracy is considered an indicator of acquisition, as has been done in other studies, subject X does not show any appreciable difference between his performance from time 1 to time 2.
We also cross-tabulated the individual subjects against the different verb classes. If consistent with the findings of Wolfram (1985) and Bayley (1994), the frequencies obtained in this analysis should indicate that the more salient verb classes are more frequently marked. As can be seen in table 5.4,3 however, the data do not provide strong support for the principle of saliency. Four of the subjects do mark the suppletives more frequently than the doubly-marked verbs, but the other four do the opposite.
However, three of those subjects—X, D, and F—all possess a high level of proﬁciency, and we see again that all these subjects approach or surpass the criterion level for acquisition of 80 percent

Table 5.3 Percentage of accurate past tense marking by individual subject and time period. Subjects are ranked according to their overall accuracy.

| Subject | Time period 1 | Time period 2 | Time period 3 | Total |
|---|---|---|---|---|
| X | 86 | 85 | 95 | 86 |
| D | 78 | 89 | no data | 84 |
| F | 57 | 87 | 96 | 79 |
| L | 52 | 78 | no data | 63 |
| J | no data | 41 | 75 | 47 |
| T | no data | 9 | 63 | 45 |
| Y | 6 | 43 | 59 | 39 |
| M | 10 | 49 | no data | 30 |

Table 5.4 Percentage of accurate past tense marking by individual subject and verb class.

| Subject | Suppletive | Doubly-marked | Internal vowel change | Total |
|---|---|---|---|---|
| X | 83 | 96 | 80 | 86 |
| D | 93 | 94 | 70 | 84 |
| F | 78 | 82 | 78 | 79 |
| L | 73 | 61 | 56 | 63 |
| J | 83 | 60 | 28 | 47 |
| T | 59 | 17 | 38 | 45 |
| Y | 44 | 45 | 35 | 39 |
| M | 25 | 8 | 44 | 30 |

* Percentages reported indicate accurate past tense marking

marking. Subject Y does not exhibit much difference in her marking of suppletives and doubly-marked verbs. Two of the subjects, T and M, mark internal vowel change verbs more frequently than doubly-marked verbs, with M marking internal vowel change most frequently overall. The only claim that can be made is that six of the eight subjects mark suppletives and doubly-marked verbs, which are relatively more salient, more frequently than vowel-changing verbs, which are relatively less salient. Thus, the principle of saliency, which found strong support in Wolfram’s (1985) and Bayley’s (1994) studies of adult learners, is not strongly supported here. It may be that verb saliency is less important for acquisition in children than it is for adolescents and adults, yet one similarity to Wolfram’s (1985) data can be observed: the exceptions to the weak generalization are low proficiency subjects. Wolfram (1985) also found that the saliency hierarchy was less prevalent among his low proficiency subjects than among his high proficiency subjects. As a result of this cross-tabulation, then, the independent variable of verb saliency will not be tested in our Varbrul run because it does not constrain our data in a uniform way.
The next factor group to cross-tabulate against the individual subjects is the semantic type of the verb. Table 5.5 shows that six of the eight subjects mark telic verbs more frequently than atelic verbs. This finding lends support to the work of Andersen (1993; Shirai and Andersen, 1995), who suggested that initially learners do not mark past time but rather telicity, which may be an innately known language property. Telicity does not favor marking for subjects D and T, though D is not really an exception because she marks both semantic types at 84 percent. Subject T marks the semantic types in the opposite manner from that expected, but because the difference is only 6 percentage points we feel

Table 5.5 Percentage of accurate past tense marking by individual subject and semantic type of verb.

| Subject | Atelic | Telic | Total | Difference |
|---|---|---|---|---|
| X | 83 | 92 | 86 | 9 |
| D | 84 | 84 | 84 | 0 |
| F | 72 | 84 | 79 | 12 |
| L | 61 | 65 | 63 | 4α |
| J | 38 | 52 | 47 | 14 |
| T | 48 | 42 | 45 | −6 |
| Y | 32 | 47 | 39 | 15 |
| M | 18 | 48 | 30 | 30β |
| Total | 55 | 66 | | |

α Total difference for four high proficiency subjects = 25
β Total difference for four low proficiency subjects = 53

justiﬁed in including both semantic type of verb and individual subject in the Varbrul analysis. We now consider the eﬀect of clause type. Table 5.6 shows that all of the subjects except F mark past tense more frequently in background clauses than in foreground clauses. Though this ﬁnding is opposite to Bardovi-Harlig’s (1995) earlier work (see the discussion below), the data are constrained in a consistent manner and are eligible for inclusion in the Varbrul analysis.

Table 5.6 Percentage of accurate past tense marking by individual subject and clause type.

| Subject | Foreground | Background | Total | Difference |
|---|---|---|---|---|
| X | 84 | 88 | 86 | 4 |
| D | 79 | 89 | 84 | 10 |
| F | 81 | 77 | 79 | −4 |
| L | 55 | 70 | 63 | 15α |
| J | 38 | 62 | 47 | 24 |
| T | 44 | 45 | 45 | 1 |
| Y | 36 | 45 | 39 | 9 |
| M | 24 | 40 | 30 | 16β |
| Total | 54 | 68 | | |

α Total difference for four high proficiency subjects = 25
β Total difference for four low proficiency subjects = 50
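The Difference columns and the α and β totals in tables 5.5 and 5.6 can be recomputed directly from the reported percentages. The sketch below does so; the percentages are copied from the two tables, while the function name `summarize` and the convention of treating the first four subjects as the high proficiency group follow the tables' ranking and footnotes and are otherwise our own framing.

```python
# Subjects in the order used in tables 5.5 and 5.6 (ranked by overall accuracy).
SUBJECTS = ["X", "D", "F", "L", "J", "T", "Y", "M"]

# Percentages of accurate past tense marking, copied from tables 5.5 and 5.6.
ATELIC = [83, 84, 72, 61, 38, 48, 32, 18]
TELIC = [92, 84, 84, 65, 52, 42, 47, 48]
FOREGROUND = [84, 79, 81, 55, 38, 44, 36, 24]
BACKGROUND = [88, 89, 77, 70, 62, 45, 45, 40]


def summarize(favoring, disfavoring):
    """Per-subject differences, group totals, and subjects who go against the trend."""
    diffs = [f - d for f, d in zip(favoring, disfavoring)]
    return {
        "diffs": dict(zip(SUBJECTS, diffs)),
        "high": sum(diffs[:4]),  # four high proficiency subjects (alpha)
        "low": sum(diffs[4:]),   # four low proficiency subjects (beta)
        "exceptions": [s for s, d in zip(SUBJECTS, diffs) if d < 0],
    }


semantic = summarize(TELIC, ATELIC)
clause = summarize(BACKGROUND, FOREGROUND)

print("telic - atelic:", semantic)
print("background - foreground:", clause)
```

A factor group is treated here as constraining the data consistently when few subjects show a negative difference; on that informal criterion only T (semantic type) and F (clause type) run against the trend.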

Results
The independent variables ultimately chosen for the Varbrul analysis are displayed in table 5.7, which shows the overall percentage of marking for each factor as well as the Varbrul p values. As explained in chapter 2, “p” does not stand for probability but rather for the relative strength of each factor within the factor group. As such, the Varbrul program tests how well the hypothesis represented by the proposed factors actually fits the data and also serves as a test for constraint interaction. The overall goodness of fit is provided by a chi-square per cell score. If this score is below 1.5 (Preston, 1989), the hypothesis fits the data fairly well, and it is unlikely that the constraints interact. The lower the score, the better the fit. The chi-square per cell score for our analysis is 1.26, a relatively good fit. This suggests that our choice of factor groups to include in the final analysis accurately explains what influences our subjects’ past tense marking. We now consider the effect of the individual factor groups. For factor group 1, time period, Varbrul assigned successively higher p values to each subsequent time period, as expected. While this may not actually be deemed a finding, it does represent an important safeguard in the reliability of the findings as it clearly demonstrates that the subjects improved over

Table 5.7 Varbrul analysis for past tense marking.

| Factor | p | % | n |
|---|---|---|---|
| Factor Group 1: Time period | | | |
| 1 | .27 | 46 | 199 |
| 2 | .56 | 66 | 371 |
| 3 | .81 | 75 | 152 |
| Factor Group 2: Individual subjects | | | |
| X | .82 | 86 | 178 |
| D | .82 | 84 | 90 |
| F | .69 | 79 | 142 |
| L | .60 | 63 | 123 |
| J | .25 | 47 | 16 |
| M | .25 | 30 | 55 |
| Y | .19 | 39 | 89 |
| T | .13 | 45 | 29 |
| Factor Group 3: Semantic type | | | |
| Telic | .60 | 66 | 380 |
| Non-telic | .41 | 55 | 342 |
| Factor Group 4: Clause type | | | |
| Background | .59 | 68 | 388 |
| Foreground | .42 | 54 | 334 |

Input = .65, χ2 per cell = 1.26

time. For factor group 2, Varbrul ranked the subjects by relative proﬁciency in almost the identical order that they were ranked by overall percentage of past marking. For factor group 3, semantic type, Varbrul assigned a much higher p value to telic verbs than to atelic verbs, as expected. Finally, for factor group 4, clause type, Varbrul assigned a much higher p value to background clauses than to foreground clauses. This result also reﬂects the percentage ﬁgures, but is the opposite of our original hypothesis based on the earlier ﬁndings of Bardovi-Harlig (1995). Discussion A continuing criticism of Varbrul analysis has been that data from diﬀerent subjects should not be lumped together, as this may obscure individual diﬀerences. For this reason, we have looked at each individual subject and found that, with minor exceptions, all of them behaved in similar ways in regard to three of the four proposed constraints on past marking. In the case of verb saliency, however, considerable individual variation was found, making this factor group inappropriate for the Varbrul analysis. We are not sure why our result in regard to saliency diﬀers from those of Wolfram (1985) and especially Bayley (1994), who studied adult Chinese speakers. But, as suggested earlier, perhaps it is because verb saliency aﬀects pre-adolescent and post-adolescent second language learners diﬀerently, with pre-adolescents learning the forms of individual irregular verbs rather than verb classes. In regard to clause type, our ﬁnding that marking is more frequent in obligatory contexts in background clauses appears to contradict the ﬁndings of Bardovi-Harlig (1995), who claimed that “the simple past emerges ﬁrst in the foreground” (p. 272). However, the contradiction is only apparent because the two studies used diﬀerent methods of tabulating data, thus addressing diﬀerent research questions. 
The present study, in the tradition of Wolfram (1985), Bayley (1994), and others, asked the question: Is past tense more accurately marked in foreground or background clauses? Like Wolfram (1985) and Bayley (1994), we coded the verbs in background and foreground clauses only in obligatory past tense contexts. Bardovi-Harlig (1995), on the other hand, asked the question: Does past tense marking ﬁrst emerge in foreground or background clauses? Therefore, she coded all of the verbs in background and foreground clauses, regardless of whether they occurred in an obligatory context. But speakers are free to use present tense more frequently in background clauses than in foreground clauses. Foreground clauses trace the events of the story line, all of which happened in the past. But background clauses can be used to describe a present state of aﬀairs that is relevant to the story line or to interject the narrator’s opinion. For example, the ﬁrst clause in sentence (3) could be expanded as follows. “I think the mother likes eggs, and she was watching the eggs.” The revised sentence adds two verbs that do not require past marking, yet these verbs would be coded as not marking past in


Bardovi-Harlig’s (1995) system. It is not surprising, then, that she found a higher percentage of marking in foreground clauses. We also suggest that the higher rate of accuracy in the background results from the greater freedom of verb choice in background clauses. Background clauses are not absolutely required in narrative discourse, especially that of younger speakers (Aksu-Koç and von Stuttenheim, 1994), and so the speaker has considerable freedom to use familiar verbs. Indeed, by far the most common verb our speakers used in the background was be. In the foreground, on the other hand, certain actions must be expressed to tell the story, whether or not the past tense form of the verb has been mastered. It is also worth noting that, despite Aksu-Koç and von Stuttenheim’s (1994) ﬁnding that age is a factor in native-speaking children’s acquisition of discourse competence, it did not appear to make a diﬀerence for our subjects, all but one of whom demonstrated a higher percentage of accurate past tense marking in background clauses, regardless of their age at the beginning of the study. In fact, the only subject who demonstrated a tendency to favor marking slightly in the foreground, subject F, was 7:2 when data collection began. Finally, the Varbrul analysis showed that telic verbs strongly favored marking for past tense. As the percentages in table 5.5 indicate, this tendency is more pronounced in the earlier stages of acquisition (i.e., among less proﬁcient speakers). This claim is further supported by the cross-tabulation of time period and semantic prototype of verb shown in table 5.8, which indicates that at time period 1, telic verbs are marked much more frequently than atelic verbs but that this discrepancy diminishes until, at time period 3, the two semantic types are marked with nearly equal frequency by all subjects. 
Table 5.8 Percentage of accurate past tense marking by time period and semantic type of verb.

                 Semantic type
Time Period      Atelic    Telic    Total    Difference
1                39        55       46       16
2                63        70       66       7
3                74        75       75       1
Total            55        66

There appear to be two possible, yet related, explanations for the tendency to mark telic verbs more frequently at earlier stages of acquisition. The first is Shirai and Andersen’s (1995) claim that inherent telicity represents a prototype of the larger semantic category pastness, and that in the early stages of acquisition children mark the prototype more frequently. The second explanation is a transfer explanation suggested by Bayley (1994), who notes that Chinese does

not mark the pastness of an event but does mark perfective aspect, which is very similar to telicity. In either case, it appears that as subjects become more proficient they rely less on telicity and/or perfective aspect as a cue to past tense and mark verbs for pastness more consistently.

Conclusions

There are three main findings to this study of Chinese-speaking children’s acquisition of the English irregular past tense.

1. The principle of saliency, which was strongly supported in Wolfram’s (1985) study of Vietnamese adolescents and adults and Bayley’s (1994) study of Chinese adults, is only weakly supported here. We have suggested that perhaps children tend to learn irregular past tense verbs individually while adults tend to learn them in classes. It is also possible that there are multiple factors not examined here which particularly influence Chinese learners (cf. Hansen, 2001).

2. Our subjects marked verbs in obligatory past tense contexts in background clauses more frequently than in foreground clauses. This difference is especially pronounced for lower proficiency subjects. We suggest that our subjects were more likely to express background information when they had the linguistic resources to do so and could therefore be more accurate.

3. Andersen’s prototype hypothesis and Bayley’s transfer hypothesis find strong support in our study. Our subjects appear to rely on telicity (or perfective aspect) as a cue to pastness in the earlier stages of acquisition and to abandon this strategy as they become more proficient.

III Variation in Theoretical Perspective

6

Psychological Theories of Linguistic Variation

Introduction

In this chapter we pick up the story of the search for the psychological underpinnings of linguistic variation that we began in chapter 1, where we observed that the phenomenon of variation in first and second language speech has puzzled many linguists. Bickerton (1971), for example, noted that there was no existing psychological theory that explained how the mind could keep track of the probabilities associated with variable linguistic forms. More recently, Preston (2001, 2002) has suggested what such a theory would look like. But he adds (2002, p. 141) that variationists have not dealt extensively with how the facts of language variation relate to psycholinguistic models of speech production and comprehension. In this chapter we will see that psychologists have been interested in probabilistic behavior for a long time, and we will explore possible connections between psycholinguistic theories that account for probabilistic behavior and the variable speech behavior of native and nonnative speakers that was documented in the previous chapters.

Psychological Studies of Probability-matching

During the 1960s and 1970s a number of experiments in probability-matching showed that people can accurately gauge the proportion of different events. For example, Robinson (1964) presented his participants with two flashing lights, one of which flashed more often than the other, and asked them to estimate the proportion of left light flashes to right light flashes. He found that the estimates were very accurate, falling within 2 percent of the true proportions.

In a more recent experiment in this tradition, Hudson Kam and Newport (2005) taught adults an artificial language that contained a variable rule.1 In the artificial language, which had Verb, Subject, Object word order, nouns could variably take a following determiner. Possible sentences in one dialect of the language included:

(1) Leymz       gerko   ka.
    be yellow   sand    det
    “The sand is yellow.”

(2) smit        nerk    pow    mæuwzner   pow
    be beside   frog    det    boat       det
    “The frog is beside the boat.”

Notice that in these sentences, mass nouns (like gerko/“sand”) take the determiner ka and count nouns (like nerk/“frog”) take the determiner pow. During the training sessions, one group of participants received input in which 100 percent of the nouns were followed by a determiner. Other groups received input in which 75 percent, or 60 percent, or 45 percent of the nouns were followed by determiners. Because determiners were optional in the grammar of the language, all of the participants received fully grammatical input. It was expected that the participants who received 100 percent noun + determiner input would construct a categorical rule like (3), whereas the participants who received variable noun + determiner input would construct a variable rule like (4), which they would apply at rates that approximately matched the input they received.

(3) NP → N det
(4) NP → N (det)
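The difference between rules (3) and (4) can be sketched in a few lines of Python. This is only an illustration, not part of Hudson Kam and Newport's procedure: the function names, the trial count, and the use of a single input probability (`p_det`, in the spirit of a Varbrul input probability) are our own assumptions.

```python
import random

def apply_variable_rule(noun, det, p_det, rng):
    """Rule (4), NP -> N (det): supply the determiner with probability p_det.
    With p_det = 1.0 this reduces to the categorical rule (3), NP -> N det."""
    return [noun, det] if rng.random() < p_det else [noun]

def determiner_rate(p_det, n_trials=10_000, seed=0):
    """Proportion of simulated noun phrases that surface with a determiner."""
    rng = random.Random(seed)
    produced = sum(
        len(apply_variable_rule("gerko", "ka", p_det, rng)) == 2
        for _ in range(n_trials)
    )
    return produced / n_trials

if __name__ == "__main__":
    # The four input conditions described in the text (hypothetical simulation).
    for p in (0.45, 0.60, 0.75, 1.00):
        print(f"input rate {p:.2f} -> simulated production rate {determiner_rate(p):.2f}")
```

A simulated speaker who has learned only the input probability reproduces the input rate closely, which is the probability-matching behavior the experiment was designed to test.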

Because there were no variable constraints on where a determiner was likely to occur (for example, animate nouns did not favor determiner marking more than inanimate nouns, as is the case in some creole languages), the experimental task involved learning only the correct percentages at which the determiners occurred in the input. In terms of the Varbrul program, the participants only had to learn the right input probability for variable rule (4). Of course, it would be possible for the subjects to incorrectly construct a variable rule with a linguistic constraint. For example, they might construct a rule that incorrectly associated higher determiner frequency with animate nouns, or mass nouns, or whether a particular noun occurred for the first time in the sentence. One goal of the experiment was to see if the participants created such unsupported associations.

The 40 participants in the study, who were native English-speaking university students, were randomly assigned to one of eight experimental groups, which differed along two dimensions. The first dimension was frequency of determiner input. As mentioned, for the low input group determiners accompanied nouns 45 percent of the time; for the mid input group determiners accompanied nouns 60 percent of the time; for the high input group determiners accompanied nouns 75 percent of the time; and for the perfect input group determiners accompanied nouns 100 percent of the time, as shown in figure 6.1. The first question the researchers posed was: Can the participants accurately reproduce these percentages on a production task?

Figure 6.1 Results of Hudson Kam and Newport’s (2005) experiment in the acquisition of determiners in an artificial language. Percent of determiner presence according to input group. Reprinted with permissions.

The second dimension of variation involved semantics. In one dialect of the artificial language, mass nouns took ka and count nouns took pow, as in (1) and (2). Half of the participants (assigned equally to each input frequency group) learned this dialect. The other half of the participants (also assigned equally to each input frequency group) received input in which ka and pow were associated not with countability but with a noun class. In this dialect class 1 nouns (which might be thought of as feminine) took ka, and class 2 nouns (which might be thought of as masculine) took pow. Assignment to a noun class was entirely arbitrary, with no basis in semantics or phonology. Thus, the second research question was: Can the participants more accurately internalize the frequency of variation in input when a grammatical feature marks a natural semantic relationship like the count–mass distinction or when a grammatical feature marks an arbitrary, conventional relationship like grammatical gender?

The language was taught by showing the participants pictures that illustrated the meaning of sentences like (1) and (2) while they heard the sentence spoken. Then, they were asked to repeat the sentence. On the production test, the experimenter showed each participant a scene with real objects that could be described with a sentence in the artificial language and prompted the participant by supplying the verb. For example, the prompt to elicit sentence (1) would be /lemz/. The participant could then complete the sentence, possibly supplying determiners for the nouns. The results of the experiment appear in figure 6.1. The dotted line in the figure represents the percentage of articles in the input for both dialect


groups. The solid line connecting squares shows the percentage of determiner production for the gender agreement dialect, and the solid line connecting circles shows the percentage of determiner production for the count/mass agreement dialect. In regard to the ﬁrst research question Hudson Kam and Newport (2005) note “the participants generally used determiners in their production about as often as they heard them in the input” (p. 171). The study found two other interesting patterns in the participants’ responses. The ﬁrst pattern involved the semantically based agreement rule. Notice that for three of the four input groups in ﬁgure 6.1, whether noun + determiner agreement was based on countability or was arbitrary appears to make no diﬀerence in the accurate learning of determiner frequency. However, the high input group is an exception. This group produced count–mass based determiners at 80 percent but class 1 (feminine) and class 2 (masculine) based determiners at only 55 percent. This ﬁnding suggests that given suﬃcient input, learners can better construct a variable rule that accurately reﬂects the frequencies of variation in the input when the rule involves a language universal, such as the count–mass distinction. The second interesting pattern involved how the participants produced determiners following identical nouns depending on whether the noun phrases appeared for the ﬁrst time or the second time in a test sentence. When a particular noun was mentioned for the second time, it was more likely to be marked with a determiner by all input groups except the perfect group. Furthermore, this tendency was more pronounced in the lower input groups. In other words, participants exposed to fewer determiners were more likely to produce a determiner in the second obligatory context that they encountered on a test sentence. 
ESL teachers will not be surprised at this ﬁnding because it could result from transferring the pattern of English determiner usage, where the is used to mark nouns that are known to the hearer. Therefore, in English discourse the is sometimes not required when an NP is ﬁrst mentioned but is required the second time (“Sam wanted to buy coﬀee and milk, but the coﬀee was too expensive”). Hudson Kam and Newport’s (2005) study is a laboratory demonstration of what variationists have known for a long time: adults can construct a variable rule that produces alternating forms in the proportions that these forms occur in the surrounding language. But, as we saw in chapter 1, it is not clear how this ability should be modeled in a grammar. Variable rules cannot be part of a Chomskian competence grammar, which is intended to model a speaker’s knowledge of a language at a high level of abstraction. Therefore, the mechanisms of probabilistic learning must be part of a performance grammar. In the discussion of the Derivational Theory of Complexity in chapter 1, we noted that in the 1960s and 1970s psycholinguists adapted the competence grammars proposed by generative linguists to the needs of performance models, thereby creating a performance grammar that was very similar to a corresponding Chomskian competence grammar. Recently, however, models of sentence


production and comprehension have become more distinct from competence grammars because they have added probabilistic mechanisms. It therefore seems promising for variationists to look to such mechanisms in order to understand how the mind can learn to produce variable forms at the appropriate frequencies. The most popular probabilistic mechanism in current psycholinguistic theory is the connectionist network, which we will discuss in the next section.

Psycholinguistic Models of Language Performance

Production Models

The production model we will examine is the one described in Barsalou (1992), based on Levelt (1989) and Garrett (1975). We will first briefly outline the production of the sentence “The cat eats the food” and then look at proposals for expanding the model by adding probabilistic mechanisms. A diagram of the model is shown in figure 6.2. The production process in the figure is divided into seven levels, but we will be concerned mainly with the first four.

Figure 6.2 Barsalou’s (1992) model of speech production. Reprinted with permissions.


LEVEL 1: Conceptualize a message.
A mental module called the conceptualizer draws from background information, discourse information, and the speaker’s communicative intentions to conceive a message in the form of a proposition with a predicate and arguments marked with case roles. At this point EAT is the abstract concept of consuming food, which might be lexicalized as gulped or gobbled, not the actual lexical item. It may be considered a bundle of semantic features. Similarly, CAT and FOOD are abstract feature bundles that could be lexicalized in several ways. The output of the conceptualizer, called the preverbal message, serves as input to the next level.

LEVEL 2: Formulate an abstract sentential representation.
In this module, called the formulator, lemmas that match the abstract conceptual representations of the preverbal message are retrieved. A lemma is that part of a lexical entry that contains semantic, functional, and syntactic information, but not phonological information. In the example sentence the lemma for cat is retrieved first. It looks like this:

    cat
    conceptual specifications    FELINE (i.e., a set of semantic features that identify cats)
    syntactic category           NOUN
    diacritic parameters         countable
                                 definite
                                 singular

The lemma for the verb eat contains the syntactic information that this verb requires a subject argument and a direct object argument. Linking rules assign the agent of eat to the subject argument slot and the theme to the direct object argument slot.

LEVEL 3: Construct a phrase marker.
The speaker constructs a phrase marker for the sentence using the abstract sentential representation from level 2 as a guide and employing phrase structure rules like those in Chomsky’s (1965) Standard Theory (see chapter 1). These rules are employed in the reverse of the order that they are usually presented in textbooks, as will be discussed later in the chapter.

LEVEL 4: Retrieve the phonological representation associated with the selected lemmas.


Using the lemma as a cue, the model retrieves the entire lexical entry, or lexeme, associated with each lemma. The lexemes, which contain the appropriate phonological forms in addition to the information contained in the lemmas, are now inserted into the appropriate slots in the phrase structure marker.

In levels 5, 6, and 7 the lexemes are segmented into individual phonemes. Then, affixes and function words are added to the syntactic string, and the phonological representations are converted into appropriate phonetic representations. The phonetic string at level 7 can then serve as a set of instructions to the articulators.

Notice that, in this model, sentence production is deterministic: The preverbal message marches on to the spoken sentence without any mechanism that would allow for probabilistic behavior. But, recent additions to similar models (Levelt, Roelofs, and Meyer, 1999) have added such mechanisms, and we will consider this possibility in the next section.

Let us now return to level 3 to see in more detail how phrase markers are constructed. This discussion is based on Levelt (1989) rather than on Barsalou’s (1992) adaptation of Levelt’s model. In Levelt’s (1989) model a phrase structure frame is built by means of productions, or IF THEN statements. Productions can be considered phrase structure rules that have been turned into a step-by-step set of instructions. These instructions take advantage of the projection principle, which says that a lemma of grammatical category X must wind up as part of an X phrase, and, conversely, that an X phrase must contain a member of category X. For example, IF a lemma is marked as a Noun, THEN the lexical item corresponding to that lemma must be part of a Noun Phrase. In the present example, the lemma for cat is marked as a noun. Therefore, the NP production is called up, which builds a structure like this:

    NP
    |
    N
    |
    cat

Because the lemma for cat is also marked as + definite, the definite article production is called up. It takes advantage of the possible determiner slot in an NP and plugs in the lemma for the. The NP now looks like this:

        NP
       /  \
    det    N
     |     |
    the   cat

Next, the higher grammatical category NP (which according to the projection principle must be part of an S) calls up the procedure for building an S. Then, because an S must have an NP and a VP, the S procedure builds a structure that looks like this:

            S
          /   \
        NP     VP
       /  \
    det    N
     |     |
    the   cat

We now have a VP and therefore there must be a V, which is added to the tree. Fortunately, we have a lemma marked as V, namely eat, so eat gets plugged into the V slot. The structure of the sentence under construction now looks like this:

            S
          /   \
        NP     VP
       /  \     |
    det    N    V
     |     |    |
    the   cat  eat
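The step-by-step character of these productions can be illustrated with a short Python sketch. It is a toy reconstruction under our own assumptions, not Levelt's (1989) formalism: the lemma records, the field names `category` and `definite`, and the helper functions are all hypothetical.

```python
# Hypothetical lemma records; the field names are illustrative only.
LEMMAS = {
    "cat": {"category": "N", "definite": True},
    "eat": {"category": "V"},
}

def build_np(noun):
    """IF a lemma is marked N, THEN project an NP around it;
    IF the lemma is also +definite, THEN fill the determiner slot with 'the'."""
    lemma = LEMMAS[noun]
    if lemma["category"] != "N":
        raise ValueError(f"{noun} is not a noun lemma")
    children = []
    if lemma.get("definite"):
        children.append(("det", "the"))
    children.append(("N", noun))
    return ("NP", children)

def build_s(subject_np, verb):
    """IF an NP exists, THEN project an S, which must contain an NP and a VP;
    the VP in turn must contain a V, filled here by the verb lemma."""
    if LEMMAS[verb]["category"] != "V":
        raise ValueError(f"{verb} is not a verb lemma")
    vp = ("VP", [("V", verb)])
    return ("S", [subject_np, vp])

def bracket(node):
    """Render a (label, content) tree as a labeled bracketing."""
    label, content = node
    if isinstance(content, str):
        return f"[{label} {content}]"
    return f"[{label} " + " ".join(bracket(child) for child in content) + "]"

if __name__ == "__main__":
    tree = build_s(build_np("cat"), "eat")
    print(bracket(tree))  # [S [NP [det the] [N cat]] [VP [V eat]]]
```

Note that the sketch is as deterministic as the model it imitates: the same lemmas always yield the same tree.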

As mentioned earlier, the production model just described does not contain probabilistic mechanisms, but such mechanisms could be added in the form of connectionist networks, and we will now consider how a simple connectionist network works. A connectionist network, which is run on a computer, can be thought of as an electrical circuit in which current spreads through a maze of wires lighting up certain light bulbs along the way. Dell, Chang, and Griﬃn (2001) have developed a connectionist network that models the task of naming an object in a picture. It is called the Aphasia Model because it can model the behavior of aphasiac as well as healthy adults. A picture-naming task involves retrieving a lemma and matching it with its phonological representation. The experiments that led to the development of the Aphasia Model involved asking adult native English speakers to name an object in a picture while various kinds of distractors were provided. For example, an informant might see a picture of a cat and also see the written word dog. By measuring how long it took to name the object in the picture with diﬀerent kinds of distractors, the experimenters were able to get an idea of the mental processes involved in naming. The Aphasia Model uses the connectionist network shown in ﬁgure 6.3 to retrieve a lemma and connect it with its phonological representation.


Figure 6.3 The Aphasia Model for naming cat (adapted from Dell, Chang, and Griﬃn, 2001).
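A rough computational analogue of the lemma-selection part of figure 6.3 is sketched below in Python. It is a noisy winner-take-all selection over shared semantic features, not the Aphasia Model itself: the feature sets for dog and rat beyond the three shared features, the Gaussian noise, the parameter `noise_sd`, and the phoneme lists for dog and rat are all assumptions made for illustration, and phoneme retrieval is reduced to a simple lookup.

```python
import random

# Hypothetical feature sets; only the cat features and the three shared
# features ([mammal], [furry], [has a tail]) come from the text.
FEATURES = {
    "cat": {"mammal", "furry", "domestic", "has a tail", "catches mice"},
    "dog": {"mammal", "furry", "has a tail", "barks"},
    "rat": {"mammal", "furry", "has a tail", "gnaws"},
}
PHONEMES = {"cat": ["k", "æ", "t"], "dog": ["d", "ɔ", "g"], "rat": ["r", "æ", "t"]}

def name_picture(active_features, noise_sd, rng):
    """Spread activation from active feature nodes to lemma nodes, add noise,
    and let the most active lemma fire; then look up its phonemes."""
    activation = {
        lemma: len(feats & active_features) + rng.gauss(0.0, noise_sd)
        for lemma, feats in FEATURES.items()
    }
    winner = max(activation, key=activation.get)
    return winner, PHONEMES[winner]

def error_rate(noise_sd, n_trials=5000, seed=1):
    """Proportion of trials on which a picture of a cat is misnamed."""
    rng = random.Random(seed)
    picture = FEATURES["cat"]
    errors = sum(
        name_picture(picture, noise_sd, rng)[0] != "cat" for _ in range(n_trials)
    )
    return errors / n_trials

if __name__ == "__main__":
    for sd in (0.0, 0.5, 1.0):
        print(f"noise sd {sd:.1f}: misnaming rate {error_rate(sd):.3f}")
```

With no noise the cat lemma always wins because it receives the most activation; as noise grows, the competitors that share features with CAT occasionally fire instead, which is the kind of slip the model is meant to capture.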

Connectionist theory adopts the metaphor of a neural network within the brain, where electrical activation spreads from a top layer of neurons to a bottom layer.2 Figure 6.3 contains three layers of nodes that might be thought of as three layers of neurons. The top layer contains ten nodes that correspond to the semantic features of CAT, such as [mammal], [furry], [domestic], [has a tail], and [catches mice]. In experiments, when a subject is shown a picture of a cat, these nodes will be activated, and will serve as input to the middle layer. The nodes in the middle layer correspond to lemmas, and include the lemmas for cat and similar animals like dog and rat. In terms of the brain metaphor, when an informant sees a picture of a cat, neurons in the brain that correspond to the appropriate semantic features of CAT are activated. In the Aphasia Model, nodes corresponding to these features are activated, and they pass their activation down to the cat node. When that node receives suﬃcient activation, it will ﬁre, thus retrieving the lemma for cat. But the cat node is not the only middle level node to receive activation. As ﬁgure 6.3 shows, some activation will also spread to the lemmas for dog and rat because they share some of cat’s semantic features, such as [mammal], [furry], and [has a tail] as represented by the black nodes in the top level. This activation is increased, of course, if the participant reads the word dog or rat while seeing the picture of the cat. Usually, however, only the cat node ﬁres because it receives the most activation. The nodes in the third level represent phonemes, including those in the lexical entry


for cat, namely /k/, /æ/, and /t/. Activation from the cat node spreads primarily to those phonemes, which are then activated and can eventually be pronounced.

An important feature that the Aphasia Model shares with other connectionist models is that lemma selection and phoneme selection are not deterministic. Even when connections between semantic features and lemmas are very well established, as in our example, making the connections involves an element of chance. Dell, Chang, and Griffin (2001) refer to this inherent variability as “noise in the system.” In the language of Variation Theory we could say that the semantic features at the input level are strong constraints on the production of “cat,” but there is always some inherent variability at each level, which can result in a slip of the tongue producing the wrong word.

Let us now consider how connectionist networks could be added to Barsalou’s (1992) production model in figure 6.2. My suggestions are straightforward and necessarily simplified; more sophisticated suggestions for using connectionist networks in production models can be found in Levelt, Roelofs, and Meyer (1999). The Aphasia Model uses connectionist networks to traverse two of the steps in the production model. The first step is lemma retrieval, where at level 2 lemmas corresponding to particular semantic features are selected. Barsalou’s production model assumes that there will always be a perfect match between semantic features and lemmas, but Dell, Chang, and Griffin’s (2001) research shows that this is not the case. Therefore, Barsalou’s model could be improved by specifying that lemmas are retrieved using a two-layer connectionist network at level 2 that corresponds to the top and middle layers of the Aphasia Model in figure 6.3. The second step in the production model that is relevant to the Aphasia Model is phoneme retrieval, where at levels 4 and 5 individual phonemes that match the selected lemmas are called up.
Again, the production model assumes that there will always be a perfect match because each lemma is associated with a lexeme that contains the appropriate phonological representation. The Aphasia Model replaces lexeme retrieval with the connectionist network between the middle layer and the bottom layer of figure 6.3. A similar two-layer network could be added to the production model, replacing levels 4 and 5.

Before examining how a connectionist network might model production in a second language, let us take a look at a connectionist network used in a model of language comprehension. This network explicitly incorporates probabilistic constraints and, like a variable rule or the Varbrul program, claims that constraints of different weights combine to determine the probability that a particular form will be chosen.

A Comprehension Model

Townsend and Bever’s (2001) model of sentence comprehension uses both a probabilistic mechanism and a categorical generative mechanism to analyze input sentences. This model employs the method of analysis by synthesis


to assign a meaning to the continuous stream of incoming speech. Townsend and Bever (2001) use the following metaphor to explain how analysis by synthesis works.

    Producing speech is like taking an ordered lineup of different kinds of eggs, breaking them so each overlaps with its neighbors, then scrambling them up a bit so there is a continuous egg belt, and then cooking them. Comprehension is analogous to the problem of figuring out how many eggs there were originally, exactly where each was located, and what kind it was . . . (p. 161)

They continue:

    It is clear how [an analysis-by-synthesis] scheme approaches the scrambled egg analogy. [It] starts with a particular hypothetical egg sequence, scrambles and cooks them in a virtual kitchen, and then compares the resulting virtual omelet with the actual input. When the virtual omelet matches the actual omelet, the input and cooking sequence producing the virtual omelet is confirmed as the correct analysis. (p. 164)

The analysis by synthesis procedure involves the following four steps.

Step 1 The model stores the input string of words in short-term memory.

Step 2 Using probabilistic procedures (to be described below), the model assigns a syntax-like structure (called pseudosyntax) and a likely meaning to the input string.

Step 3 Using the output of step 2 as input, the model attempts to generate a grammatical derivation using the machinery of the Minimalist Program (Chomsky, 1995). This generative procedure is referred to as real syntax.

Step 4 If the real syntax generates a grammatical derivation, it means that the pseudosyntax analysis is correct, and the meaning generated by the real syntax is assigned to the input string. If the real syntax cannot generate a grammatical derivation, it means that the pseudosyntax analysis is not grammatical, and the model tries again having eliminated one possible pseudosyntax analysis.

As an example of this procedure, we will consider how the comprehension model works on the input sentence “Athens attacked Sparta.”

Step 1 The model stores “Athens attacked Sparta” in short-term memory.

Step 2 The model looks up the meanings and grammatical categories of the lexical items in the input sentence. This results in a mental representation like this:

    [Athens]N [attacked]V [Sparta]N

This mental representation is then segmented into phrases. Segmentation is accomplished using the “projection principle,” mentioned earlier, which states that nouns must be part of a noun phrase, verbs must be part of a verb phrase, etc., and that together these constituents make up a clause (that is, S). The resulting string looks like this:

    ((((Athens)N)NP) (((attacked)V)VP) (((Sparta)N)NP))S

Recall that in the production model in figure 6.2, existing semantic case roles were matched with NPs before a syntactic frame was constructed. The comprehension model reverses that process, assigning semantic case roles to the existing NPs in the syntactic structure above. This assignment is based on the fact that in English the prototypical (i.e. most likely) relationship between the syntactic order of major constituents and the case assignment of NPs is:

    NPagent V NPpatient

After semantic case roles are assigned, the pseudosyntax analysis looks like this:

    ((((Athens)N)NP)  (((attacked)V)VP)  (((Sparta)N)NP))S
         AGENT                               PATIENT

Step 3 The pseudosyntax representation above is used as input for generating a complete derivation using the procedures of the Minimalist Program. These procedures will not be described here, but basically they accept as input candidate strings of words marked for grammatical category, such as the pseudosyntax string above, and then generate a full derivation that maps a surface structure onto a meaning.

Step 4 If the real syntax derivation does not crash, it means that the pseudosyntax analysis of the sentence (the virtual omelet) matches the input string (the input omelet), so the meaning internally generated by the Minimalist procedure of “Athens attacked Sparta” is correct, and that meaning is assigned to the input string.

Now let us consider how a connectionist network is involved in the pseudosyntax analysis. First, it is important to point out that in the process of analyzing an input string the model does not proceed entirely serially, as is implied by the four-step description above. Rather, parallel processing occurs while the input is being received. This allows for future input to be anticipated on the basis of the input that has been received so far. For example, the sentence “Athens attacked Sparta” might be analyzed as NP V NP after the model has received only the morphemes “Athens attack . . .” The NP V NP analysis is then assigned because in English the sequence NP V is very frequently followed by another NP. The pseudosyntax component of the comprehension model possesses a number of syntactic templates, such as NP V NP, which correspond to grammatical constructions in the language, and an entire syntactic template can be called up after only the initial elements have been received. However, it is possible for the pseudosyntax to miss its guess and to call up the wrong template. For example, if the input string continues as “Athens’ attack on Sparta failed,” the NP V NP template must be discarded and the following template substituted:

(Ngenitive NP)NP   PP   V
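The guess-and-revise behavior just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy lexicon, the chunking of “on Sparta” into a single PP, the flattened category labels, and the template frequencies are all invented here and are not taken from Townsend and Bever’s model.

```python
# Toy lexicon (assumed): the categories each input chunk could belong to.
LEXICON = {
    "Athens": {"NP", "Ngen"},   # "Athens" and "Athens'" sound alike
    "attack": {"V", "N"},
    "Sparta": {"NP"},
    "on Sparta": {"PP"},
    "failed": {"V"},
}

# Templates with invented relative frequencies (not figures from the text).
TEMPLATES = {
    ("NP", "V", "NP"): 0.7,
    ("Ngen", "N", "PP", "V"): 0.3,
}

def fits(template, chunks):
    """True if every chunk received so far can fill its slot in the template."""
    if len(chunks) > len(template):
        return False
    return all(slot in LEXICON[chunk] for slot, chunk in zip(template, chunks))

def best_template(chunks):
    """Most frequent template still compatible with the input received so far."""
    candidates = [t for t in TEMPLATES if fits(t, chunks)]
    return max(candidates, key=TEMPLATES.get) if candidates else None

print(best_template(["Athens", "attack"]))               # ('NP', 'V', 'NP')
print(best_template(["Athens", "attack", "on Sparta"]))  # ('Ngen', 'N', 'PP', 'V')
```

After two chunks both templates are still possible, so the more frequent NP V NP template is called up; once the PP arrives, that template no longer fits and the genitive template replaces it.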

The on-the-fly selection of templates is governed by probabilistic associations of word strings with possible syntactic structures. One cue that is used to select a particular syntactic template is the kind of complement structure that is likely to follow a particular verb. To illustrate this process, we will review a study of reading comprehension by Spivey and Tanenhaus (1998), which looked at sentences like (5).

(5) The actress selected by the director believed that her performance was perfect.

When a reader of this sentence reaches the word selected, two interpretations of the sentence are possible. Selected might be the past participle of a passive construction in a reduced relative clause (as, in fact, is the case in (5)), so that the complement of selected is “actress.” In other words, the sentence would be interpreted as synonymous with “The actress that was selected by the director believed that her performance was perfect.” The second possible interpretation is that selected is the past tense verb of a main clause, so that its complement will turn out to be some new NP, as in the sentence “The actress selected a new hair dryer.” When the reader has finished reading the word selected, the pseudosyntax will provisionally assign a template matching one of these interpretations, but which one? In Spivey and Tanenhaus’s (1998) model, the decision of whether to choose a reduced relative clause (RR) template or a main clause (MC) template on the basis of the input string “The actress selected . . .” is made by a two-layer connectionist network which, unlike the Aphasia Model, incorporates probabilistic mechanisms. We will now take a look at a simplified version of that network. Let us first ask what factors in the linguistic environment might favor the selection of the RR or the MC template when the reader reaches the word selected. One factor could be whether the reader has already picked up the word by in peripheral vision.
By would favor the RR interpretation, though the MC interpretation would still be possible, as in the sentence “The actress selected by lot a new hair dryer.” Suppose that an examination of an English corpus revealed that by following a verb ending in -ed results in an RR in 85 percent of the cases and in an MC in 15 percent of the cases. This situation could be modeled in the connectionist network in figure 6.4. Figure 6.4 says that when the by node is activated, it sends different amounts of activation to the RR node and the MC node: the strength of the activation to the RR node is .85 and the strength of the activation to the MC node is .15. This activation results in different weightings of the RR node and the MC node. Either node could be activated, resulting in either template being called up, but the RR node’s chance of being activated is 85 percent, and the MC node’s chance of being activated is 15 percent. As Preston (2002) points out, this is like flipping a loaded coin.

Figure 6.4 Simplified version of Spivey and Tanenhaus’s (1998) connectionist model for selecting an RR or MC template (see text).

But what if the human being whose neural connections we are modeling changes majors from psychology to English? The academic prose style in the new discipline contains far fewer passive constructions than the prose style in psychology. Consequently, the weightings in the connections in the network in figure 6.4 are no longer optimal and need to be adjusted. This is possible by means of a process called “error back propagation,” which is a lot like the old behaviorist notion of operant conditioning. When the comprehension model reaches the word selected and activates either the RR or the MC node, the model will quickly learn whether its choice was correct because it will encounter more words in the input string which may or may not fit the template it has called up. For example, suppose that the network chooses the RR node, but the word after by in the input string turns out to be lot, meaning that the network has guessed wrong. In this case, the weights connecting by to the output nodes can be slightly adjusted, with the connection to the RR node being slightly decreased and the connection to the MC node being slightly increased. Using the method of error back propagation over many thousands of trials, a connectionist network can adjust its weights to reflect the probabilities with which features of the input correlate with the various output possibilities, so that the network can most efficiently analyze incoming data.
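The loaded coin and the gradual reweighting can be simulated with a short Python sketch. It is a simplification under stated assumptions: a plain delta-rule update stands in for error back propagation (the network has no hidden layer), the two weights are assumed to sum to 1, and the 60 percent RR rate for the new prose style, the learning rate, and the function names are invented for illustration.

```python
import random

def choose_template(p_rr, rng):
    """Flip the loaded coin: 'RR' with probability p_rr, otherwise 'MC'."""
    return "RR" if rng.random() < p_rr else "MC"

def reweight(p_rr, outcome, rate=0.01):
    """Move the by->RR weight a small step toward what the input showed.

    The by->MC weight is taken to be 1 - p_rr, so raising one lowers the other.
    """
    target = 1.0 if outcome == "RR" else 0.0
    return p_rr + rate * (target - p_rr)

def simulate(p_rr=0.85, new_rr_rate=0.60, trials=5000, seed=0):
    """Expose the network to a new prose style and return the adjusted weight."""
    rng = random.Random(seed)
    wrong_guesses = 0
    for _ in range(trials):
        guess = choose_template(p_rr, rng)
        # What the rest of the sentence turns out to be (assumed new rate).
        outcome = "RR" if rng.random() < new_rr_rate else "MC"
        wrong_guesses += guess != outcome
        p_rr = reweight(p_rr, outcome)
    return p_rr, wrong_guesses

final_p_rr, wrong = simulate()
print(f"by->RR weight after training: {final_p_rr:.2f}")
```

Starting from .85, the weight should drift toward the assumed new rate and then hover around it, since each trial nudges it only slightly.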
It is thought that something like error back propagation is the mechanism by which Hudson Kam and Newport’s (2005) informants learned the frequencies of articles in the input provided to them. However, the neural mechanisms in human beings must be much more efficient than error back propagation run on a computer, because Hudson Kam and Newport’s (2005) informants required only dozens of trials, not thousands.

Of course, the connectionist network in figure 6.4 is far too simple to effectively analyze input strings that might contain an RR or an MC. Factors besides the presence or absence of by must be included in the network, and Spivey and Tanenhaus’s (1998) model includes three other factors, which are shown in figure 6.5. The black (activated) node to the right of the by node in figure 6.5 represents information in the discourse previous to the input sentence. In the case of the example sentence, if two actresses were mentioned in the preceding discourse, the RR interpretation would be expected in order to identify which of the two actresses is being referred to. Suppose an examination of a corpus showed that when two agents were mentioned in the previous discourse, an RR occurred in 67 percent of the cases and an MC occurred in 33 percent of the cases. This means that the weight of the connection from the discourse node to the RR node should be .67 and the weight of the connection to the MC node should be .33. The third factor that the network takes into account is the probability that an RR or an MC will follow the particular verb in the sentence. This factor is represented by the node labeled “verb.” Suppose that an examination of discourse showed that the verb selected is followed by an RR in 60 percent of cases and by an MC in 40 percent of cases. This means that the weight of the connection to the RR node should be .60 and the weight of the connection to the MC node should be .40. The fourth factor, shown as the rightmost activated node in the constraint layer, is the bias toward interpreting an initial noun phrase + verb sequence as a main clause rather than a reduced relative. Again, suppose a corpus shows that 85 percent of sentence-initial sequences of noun phrase + verb are main clauses and 15 percent are reduced relatives. The appropriate weights will be .15 for the RR node and .85 for the MC node. The four activated nodes on the top layer of figure 6.5 represent four independent events (the presence of by, the presence of two possible referents in the previous discourse, etc.), and it is assumed that each of these events contributes its probability equally to the overall probability of activating either the RR node or the MC node. Therefore, the overall probability of activating one of the bottom layer nodes can be calculated by multiplying the probability of each top node by one quarter and adding the four products. The results of these calculations for the test sentence are .57 for activating the RR node and .43 for activating the MC node, as shown in figure 6.5.

Figure 6.5 Spivey and Tanenhaus’s (1998) connectionist model for selecting an RR or MC template (adapted).

Now let us compare the connectionist network in the Aphasia Model to the connectionist network for pseudosyntax just described. Both of these networks, like all connectionist networks, are non-deterministic; that is, they incorporate an element of chance. In this respect they are similar to the Varbrul program and different from categorical linguistic rules or Barsalou’s (1992) unmodified production model in figure 6.2, both of which are deterministic systems that march along a predetermined path from input to output. Thus, chance plays a role in both the Aphasia Model and the pseudosyntax model, but it plays a greater role in the pseudosyntax model.
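The equal-share averaging described for figure 6.5 can be checked with a brief Python sketch. The node names are labels chosen here for illustration, and the main-clause bias is entered as .15 toward RR and .85 toward MC, which is the assignment that reproduces the reported totals of .57 and .43.

```python
# Connection weights from each constraint node to the (RR, MC) output nodes.
weights = {
    "by":        (0.85, 0.15),
    "discourse": (0.67, 0.33),
    "verb":      (0.60, 0.40),
    "mc_bias":   (0.15, 0.85),  # 85 percent of initial NP + V sequences are main clauses
}

def combine(weights):
    """Give each constraint an equal share and sum the weighted activations."""
    share = 1 / len(weights)
    p_rr = sum(share * rr for rr, _ in weights.values())
    p_mc = sum(share * mc for _, mc in weights.values())
    return p_rr, p_mc

p_rr, p_mc = combine(weights)
print(f"RR = {p_rr:.2f}, MC = {p_mc:.2f}")  # RR = 0.57, MC = 0.43
```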
This is because the Aphasia Model is designed to show how a connectionist network can model slips of the tongue and other infrequent errors in the speech of native speakers. As discussed earlier, these errors result only from “noise in the system.” In Townsend and Bever’s (2001) template selection process, on the other hand, there is no “correct” output. The task is to calculate the probability of selecting the RR node or the MC node when the four events represented by the black top layer nodes of figure 6.5 occur. The network can make an informed guess, but it might turn out to be wrong. To suggest a gambling metaphor similar to Preston’s (2002) loaded coin, it is like when a blackjack player has been dealt two face cards and must decide whether to hit or stand. The odds certainly favor standing, but that might not turn out to be the right thing to do. This situation is exactly the same as that modeled by the Varbrul program or a variable rule, where independent constraints associated with differing probabilities combine to determine the likelihood of alternating forms, such as N or G.

An intriguing feature of Townsend and Bever’s (2001) comprehension model is that in step 3 of the process the generative machinery of the Minimalist Program is used to mentally generate sentences in real time. As in the Derivational Complexity Theory, discussed in chapter 1, a generative theory that was supposed to account for competence has been pressed into service for use in a performance model. Thus, the degree of abstraction between the competence theory and the performance theory has been narrowed to almost zero. Townsend and Bever (2001) comment:

[Our model] appears to be a quite promising reunification of psychological modeling and linguistic theory . . . Such unification has been lacking generally, since the formulation of the Aspects model in the mid-1960s. Later syntactic architectures, especially government and binding, provided a hodgepodge of theoretical systems of constraints, each of which might correspond to psychological operations . . . [But], hope springs eternal: perhaps a new “derivational theory” of the psychological operations involved in assigning syntactic derivation is at hand. (p. 179)

Elliott’s Study of Spanish Acquisition

We now turn to a study of second language acquisition that found evidence of probabilistic learning, which we will attempt to model using a connectionist network. Elliott (1995) studied the acquisition of the Spanish clitic se, when it is used reflexively. His database consisted of computer conferencing messages written by American college students, which he analyzed using the Varbrul program. Eighty-six students were represented in his study, and they produced a total of 2,247 tokens. The students wrote messages to their classmates at least weekly and often daily. They wrote mostly about class assignments, but they also discussed personal topics and campus events. The students received credit from their instructors for their postings, but were not graded on grammatical accuracy.
In Spanish, the clitic se has a number of related uses, one of which is as a direct object pronoun in a transitive sentence, as in (6).

(6) Nuria me cortó.
    Nuria me cut
    “Nuria cut me.”

When the agent and theme are the same in such a sentence, the object pronoun functions as a reflexive, as in (7).

(7) a. Yo me corté.
       I myself cut
       “I cut myself.”
    b. Tú te cortaste.
       you yourself cut
       “You cut yourself.”
    c. Él/Ella se cortó.
       he/she him-/herself cut
       “He/She cut him-/herself.”
    d. Nosotros nos cortamos.
       we ourselves cut
       “We cut ourselves.”
    e. Ellos/Ellas se cortaron.
       they (masc./fem.) themselves cut
       “They cut themselves.”

Notice that in the sentences above, the reflexive pronoun has four different forms: me, te, nos, and se, depending on the referent. For convenience I will use se as a generic to refer to all of these forms. Se also has a number of other uses that are conceptually related to the transitive, reflexive use illustrated above, but which are not equivalent to English reflexives. One of these is the so-called “middle experiencer” use, which is illustrated in (8).

(8) a. Ella se molesta con tus preguntas.
       she herself annoys with your questions
       “She’s getting annoyed with your questions.”
    b. Yo me alegro de verte.
       I myself make happy of to see you
       “I’m happy to see you.”
    c. Yo me puse triste.
       I myself put sad
       “I got sad.”

According to Bull (1965) the transitive use of se in (6) and (7) is “logical” because the agent can act upon the patient, who, in the reflexive case, happens to be the same as the agent. Therefore, according to Bull, the reflexives in (7) are “true” reflexives. But, he says, the middle experiencer use of se, as in (8), is not “logical” because the verbs do not refer to a notion that an agent can perform on a patient. Thus, the verb molestar, “annoy,” is used in its “logical,” transitive use in (9) because one person can annoy another person with questions.

(9) Juan la molesta a Nuria con sus preguntas.
    Juan her annoys to Nuria with his questions
    “Juan annoys Nuria with his questions.”

But, in (8)a molestar is used in its middle experiencer, “non-logical” use because one person cannot annoy herself with another person’s questions. Another difference between the true reflexive and the middle experiencer use of se has to do with the semantic notions of the verbs in question. Kemmer (1993) provides a cognitive linguistics account (see chapter 8) of se that categorizes verbs into semantic domains. Semantic domains include the physical domain as in (7), the emotional domain as in (8), the social domain as in (10), and the ideational domain as in (11).

(10) Juan se casó.
     Juan himself married
     “Juan got married.”

(11) Juan se preguntó dónde estaba el dinero.
     Juan himself asked where was the money
     “Juan wondered where the money was.”

According to Kemmer (1993), the middle experiencer use of reflexive se occurs only in the emotional domain. Another fact about the middle experiencer use of se often noted by grammarians is that the referent of se is somehow changed by the experience named by the verb. This is clear in (8)b and (8)c, but less so in (8)a.

Elliott performed a Varbrul analysis using a number of factors as independent variables. The demographic variables included information about the informants, such as age, sex, and the number of years of Spanish instruction. The linguistic variables included the forms of the pronoun se (that is, me, te, nos, or se), the tense of the verb, and the semantic domain of the verb. In order to keep things simple, I will not present Elliott’s results using Varbrul p values because there were a number of interacting factors that made the analysis complicated. Instead, I will just use percentage figures, but the Varbrul results are consistent with what I will claim. Also, I am only going to discuss the effect of the semantic domain of the verb. Elliott (1995) found that his subjects used se least accurately with emotional domain verbs. The main reason for the inaccuracy with these verbs was overgeneralization. Thus, students tended to turn emotional domain verbs that are not middle experiencer reflexives into middle experiencer reflexives, as when (12)a is erroneously turned into (12)b and (13)a is erroneously turned into (13)b.

(12) a. Odio esta comida.
        I hate this food
        “I hate this food.”
     b. *Me odio esta comida.
        myself I hate this food
        “I hate this food.”

(13) a. Amo a Julia.
        I love to Julia
        “I love Julia.”
     b. *Me amo a Julia.
        myself I love to Julia
        “I love Julia.”


The percentage of correct usage of se within each semantic domain is shown in table 6.1. As the table shows, the students were least accurate in the emotional domain. As already noted, the main reason for the inaccuracy in the emotional domain was overgeneralization. The percentage of overgeneralization of se within each semantic domain is shown in table 6.2. As table 6.2 shows, students had a strong tendency to incorrectly use se with transitive emotional domain verbs, which should not take se.

Table 6.1 Percentage of correct se usage by semantic domain in Elliott’s (1995) study.

                 Semantic domains
Physical    Social    Ideational    Emotional
   54         46          38           24

Table 6.2 Percentage of overgeneralization of se by semantic domain in Elliott’s (1995) study.

Physical    Social    Ideational    Emotional
   52         20          69           86

Elliott’s Results Interpreted as a Connectionist Network

Let me now offer an informal explanation of why Elliott’s (1995) informants overgeneralized se in the emotional domain. Notice that some emotional domain verbs work as true, “logical” reflexives. Odiar in (14) and amar in (15) are used in this way:

(14) Me odio. “I hate myself.”
(15) Me amo. “I love myself.”

But, as we have seen, many emotional domain verbs are middle experiencer reflexive verbs. Although Bull (1965) calls these reflexives “non-logical,” there is some logic in their use because, as mentioned, they imply a change in the emotional state of the experiencer, as illustrated in (8). Thus, a learner’s unconscious hypothesis about emotional domain verbs might be: If a verb causes a change of state in the subject of the sentence, show this by using se. But it may be very difficult to decide, consciously or unconsciously, whether a particular verb is considered to cause such a change of state. For example, laugh is considered to cause a change in the one who laughs, as in (16), but love is not considered to cause a change in the one who loves, as in (17).

(16) Me reí de mi hermano. “I laughed at my brother.”
(17) Amo a Julia. “I love Julia.”

It is easy to see how the fuzzy boundary between transitive verbs in the emotional domain (which are not reﬂexive) and middle experiencer verbs in the emotional domain (which are reﬂexive) can be crossed, and transitive verbs within the domain erroneously produced with se. We can now attempt to provide a connectionist explanation for this behavior. Figure 6.6 shows a connectionist network where the top layer represents the semantic features in the preverbal message that activate the lemmas for particular verbs. Among these features is information regarding the semantic domain and whether the situation causes a change of state in the experiencer, as represented by the two black nodes. Notice that for the learner represented in this model, amar is erroneously

Figure 6.6 A connectionist interpretation of Elliott’s (1995) data.


connected to the experiencer changes node. The bottom layer of the network contains a node representing the activation of se and a zero node representing no clitic particle. Overgeneralization of se for use with amar can occur in two ways. One way is the so called cascading phenomenon, or random activation of nodes throughout the system. During the acquisition period, se can be randomly activated with any verb until connections between the verb nodes and the se node or zero node have been ﬁrmly established. This could account for Elliott’s subjects’ overgeneralizing se to verbs outside the emotional domain. But, as we have seen, overgeneralization within the emotional domain was far more frequent. This could be explained by the second way in which se could be activated. In order to activate amar, all of the top layer semantic nodes that connect to amar are normally activated ﬁrst, including the emotional domain node. Because this node is also connected to reir, molestar, and other middle experiencer verbs, some activation will continue through these middle layer nodes (though they will not be activated) on to the se node, which may then receive enough activation to be selected instead of the zero node. This is the connectionist account of overgeneralization. Furthermore, because activation is a two-way street, spreading upward as well as downward, every time the speaker erroneously produces amar with se, the dotted line connecting the two nodes will be strengthened and the line between amar and zero will be weakened. Earlier, I claimed that a connectionist production story was also an acquisition story. Let me now return to that topic. As we have seen, a connectionist network can learn the appropriate connections and weightings using the mechanism of error back propagation. 
When a learner hears or reads amar used correctly (without se), the connection between the amar node and the zero node is strengthened, and the connection between the amar node and the se node is weakened. This is the connectionist instantiation of Krashen’s input hypothesis in SLA theory. Furthermore, when a learner utters or writes amar correctly and notices that it seems right, or uses amar incorrectly and notices that it seems wrong, the same connections are strengthened and weakened. This is the connectionist instantiation of Swain and Lapkin’s (1989) output hypothesis. Thus, the reweighting of connections in a network is compatible with two inﬂuential learning theories in SLA. Furthermore, the connectionist account grounds these abstract theories in actual mental processes and perhaps (if like Feldman [2006] we go beyond the metaphorical interpretation of connectionist networks) in actual brain functioning. Notice that in the account above, I have claimed that the connectionist component of a production model is also a component of a comprehension model because the same connections are strengthened or weakened when learners both comprehend and produce sentences. This claim is generally believed, but the details are controversial. The connectionist interpretation of Elliott’s (1995) data shows how a connectionist model can learn categorical rules, such as which emotional domain


verbs take se. Now let us consider how it might learn a variable rule. Over time the connection between amar and se will be weakened by correct input and output until it disappears. But what if we are dealing with a variable linguistic feature, say pronouncing -ing words as either G or N, as discussed in chapter 2? Just like the human participants in Hudson Kam and Newport’s (2005) experiment, the connectionist network could be trained to reproduce the frequencies of each of these forms as encountered in the input. But, of course, learning the basic frequencies (or in Varbrul terms input frequencies) of G and N is only the beginning of what a speaker needs to know. Real speakers must accurately learn the percentages of linguistic features that are associated with many factors in the speaking context, including the formality of the speaking situation, the speaker’s age, social class, gender, etc. In principle, a connectionist network could handle this ﬁnely tuned frequency learning by dedicating nodes in the top layer to each of the relevant features in the speaking context. That is, the network could learn not only the overall probability of connecting a verb stem with G or N, it could also learn to modify this baseline probability depending on the identity of the speaker, the listener, the formality of the speaking situation, and other contextual factors. In terms of Preston’s (2002) metaphor, this is how the coins are loaded. Thus, a connectionist network shared by the production and comprehension systems may be the mysterious score keeper that Bickerton (1971) called for, which connects Variation Theory to psycholinguistics. 
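As a rough illustration of how top-layer context nodes could shift a baseline probability, the Python sketch below trains a tiny network to reproduce different rates of G in formal and casual speech. Everything specific in it is an assumption made for the example: the two context nodes, the logistic output unit, the learning rate, and the invented input in which 80 percent of formal tokens and 30 percent of casual tokens use G.

```python
import math

def prob_g(weights, context):
    """Probability of the G variant given the active context nodes."""
    total = weights["baseline"] + sum(weights[node] for node in context)
    return 1 / (1 + math.exp(-total))

def train(samples, rate=0.02, epochs=500):
    """Adjust weights a little after each token, in proportion to the error."""
    weights = {"baseline": 0.0, "formal": 0.0, "casual": 0.0}
    for _ in range(epochs):
        for context, used_g in samples:
            error = used_g - prob_g(weights, context)
            weights["baseline"] += rate * error
            for node in context:
                weights[node] += rate * error
    return weights

# Invented input: 8 of 10 formal tokens use G, 3 of 10 casual tokens use G.
samples = (
    [(("formal",), 1)] * 8 + [(("formal",), 0)] * 2
    + [(("casual",), 1)] * 3 + [(("casual",), 0)] * 7
)
weights = train(samples)
print(round(prob_g(weights, ("formal",)), 1))
print(round(prob_g(weights, ("casual",)), 1))
```

With these settings the two printed probabilities should come out close to the input rates. Further nodes for speaker age, social class, or gender could be added in the same way, each loading the coin a little differently.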
Conclusion

In chapter 1, we discussed the relationship between a competence grammar and a performance model of speech production or comprehension, noting that although Chomskian generative grammars can be consistent with performance models, they are conceived at the level of competence, a level of abstraction that is not compatible with modeling probabilistic patterns. In this chapter we have seen how the facts of language variation in the speech of both native speakers and language learners can be accounted for in performance models of production, comprehension, and learning that use the probabilistic mechanism of a connectionist network. However, it would be nice to have a grammatical theory (not just a performance model) that allows for probabilistic behavior and is compatible with connectionist modeling. In the next chapter, we will take a look at such a theory, which is called cognitive linguistics. We have already had a glimpse of this theory because it was the framework in which Elliott (1995) conducted his study of middle experiencer verbs.

7

Cognitive Linguistics

Introduction

In chapter 6, we saw that psycholinguists have incorporated probabilistic mechanisms in models of language production and comprehension. The mechanism used in both of the models we reviewed was the connectionist network. In this chapter, we will review a school of linguistics that is compatible with probabilistic mechanisms called cognitive linguistics (CL). We will see that CL is compatible with connectionism and can provide an abstract characterization (a grammatical characterization if you like) of the workings of connectionist networks. I will further claim that both CL and connectionism are compatible with variation theory. CL can provide a theoretical rationale for some variable language phenomena, and connectionism can provide at least the beginning of an answer to Bickerton’s (1971) question, discussed in chapter 1, of how the mind can learn and keep track of the probabilistic patterns studied by variationists.

CL was developed in the 1980s by a loosely associated group of linguists who sometimes disagreed on certain principles and who did not always use the same terms and symbols. Therefore, several branches of CL have emerged. However, all the branches agree on basic principles, which contrast with those of generative grammar. These principles include the following: (1) considerations of meaning are necessary in doing grammatical analysis; (2) categorization is basic to human understanding and human beings construct grammatical categories like “noun” and “ditransitive construction” in the same way that they construct natural semantic categories like “cup” and “bird;” and (3) the boundaries between semantic and linguistic categories are often fuzzy, and therefore language processing can involve making decisions about what category a particular linguistic form belongs to based on probabilities.
CL is like generative grammar in that it aims to show the relationship between an utterance (or phonological representation) and a meaning (or semantic representation). However, CL is unlike generative grammar in that it attempts to show this relationship as directly as possible, without using highly abstract devices like empty categories and traces. A CL description involves only three kinds of structures: phonological, semantic, and symbolic. We will examine examples of all of these, but in order to provide the background for that discussion, we will first consider the CL story of categorization.


Prototype Categories

The question of what to call the objects around us is central to language study. What makes a container for liquid a cup and not a mug or a bowl? What makes a flying creature a bird and not a bat? Questions like these involve the mental process of categorization, and people engage in it all the time. At the subconscious level, we must decide whether to call the color of our lost suitcase “green” or “blue.” At the conscious, and even legal, level we must decide whether a college student is in-state (and entitled to reduced tuition) or out-of-state. The traditional theory of categorization in linguistics and philosophy goes back to Aristotle, who said that members of a category share certain defining features. An even counting number is any number that can be divided by two. A human being is a featherless biped. But Wittgenstein (1953) pointed out that there are some categories that don’t seem to have defining features, such as “game.” It might seem that a game is any activity involving competition with other people, like basketball, or competition against odds, like roulette. But bouncing a ball against the side of a building just for fun could be called a game, and in this case competition seems to be absent. Ball bouncing is a game because it has some of the features of more typical games, such as the manipulation of a ball, and certain rules (you have to hit the building). Wittgenstein (1953) theorized that members of the category “game” do not have defining features, but rather share a “family resemblance.” In fact, two games may have nothing in common, but each may have different features of more typical games, just as two sisters may not look alike but are obviously members of the same family because one has her mother’s hair and the other her father’s skin. Labov (1973) carried out a psycholinguistic experiment where he showed subjects pictures of cup-like objects and asked them to name the objects.
He found that there were some pictures that everyone called a cup, but as the vessel became shorter and larger in diameter more and more subjects began calling it a bowl. Characteristics besides the vessel’s dimensions also inﬂuenced naming. If the vessel had a handle, it was more likely to be called a cup, but if it was ﬁlled with potatoes it was more likely to be called a bowl. Labov’s (1973) experiment showed that there are some categories, like cup and bowl, that have fuzzy boundaries and gradually blend into other categories. Nevertheless, people do not have trouble deciding what vessel to serve coﬀee in because these categories have prototypical members about which everyone agrees. Such categories are called prototype categories. The peripheral members of a prototype category are of particular interest to variation theory because they allow one to observe prototype eﬀects, or variable judgments of category membership. In order to discuss prototype eﬀects let us take a look at some of the pioneering experiments conducted by Rosch in the 1970s. Rosch (1973) asked subjects to identify pictures of diﬀerent kinds of


birds. She found that they could more quickly identify robins and sparrows than chickens and hawks. This ﬁnding suggests that people think about birds in terms of best examples, or prototypical birds, rather than in terms of atypical birds. CL proposes that the mental category BIRD has a prototype structure, with typical members at the center and atypical members at the periphery, as shown in ﬁgure 7.1. This kind of prototype category is called a radial category, and we will encounter further examples later in the chapter. Figure 7.1, which is for purposes of illustration and only loosely based on research, is a schematic representation of the category BIRD, showing its central and peripheral members and their features. Robins and sparrows are the two central members of the category. They share the features + ﬂies, + small, and + widely distributed. Swallows and doves come close to the prototypes because they share the ﬁrst two of these features. Ostriches and penguins are the most peripheral members of the category because they lack a widely shared characteristic: the ability to ﬂy. CL claims that central members of the category, or exemplars, are used for informal reasoning about the category as a whole. For example, if someone in Tucson says, “I saw a bird on the patio,” the listener will picture a small ﬂying bird, not a quail or a roadrunner, even though these species are common in Tucson. Rips (1994) demonstrated a similar phenomenon experimentally. He told one group of subjects that the robins on an island were infected with a

Figure 7.1 The prototype category BIRD. Bold type indicates features shared by subcategories.


particular disease and asked whether they thought the ducks on the island would catch it. Then he told another group of subjects that the ducks on an island were infected with a disease and asked whether they thought the robins would catch it. The subjects were more likely to say that the ducks would catch the disease from the robins than vice versa. This experiment suggests that people consider a characteristic of a central member of a prototype category common to the whole category, but that they consider a characteristic of a peripheral member particular to that member. As we will see, speakers also use prototype categories for learning and producing morphological and syntactic structures. It should also be noted that the category BIRD is not a fuzzy or graded category like CUP and BOWL because it does not have fuzzy edges. All of the creatures in ﬁgure 7.1 are birds. Nevertheless, BIRD has prototypical members and people use these exemplars when learning and thinking about categories. Perhaps for paleontologists who study birds at the time when they were evolving from dinosaurs, the category BIRD is a graded category, blending into the category PTERODACTYL. Symbolic Structures As mentioned earlier, symbolic structures are one of the three kinds of structures postulated by CL. An example of such a structure is the lexical item “bird.” Following a tradition going back to Saussure [1915] (1974), CL claims that a word is represented in a speaker’s mind as a symbolic structure that shows an association between a semantic representation (that is a concept, which is customarily written in capital letters) and a phonological sequence. The symbolic representation for the lexical item bird is shown in (1). (1) BIRD

/bird/

The semantic representation in (1) refers to the schema for BIRD represented in ﬁgure 7.1, and the phonological representation is straightforward. According to Langacker (1991, p. 14), speakers relate the symbolic representation for BIRD to symbolic representations for similar concepts, such as CAT, DOG, and PLANET, using an even more abstract symbolic structure. What the concepts CAT, DOG, and PLANET have in common is that they involve material objects that can be perceived as distinct from the background space in which they are located. In other words, they are things. The abstract category THING is represented in a speaker’s mind in a way that is compatible with the


following symbolic structure (where /x/ stands for phonological content of some kind, which at this abstract level is unknown). (2) THING

/x/

Like the symbolic representation for BIRD in (1), the symbolic representation for THING in (2) associates a semantic structure (in this case a very abstract one) with some kind of phonological representation. When speakers learn a new word for something, say aardvark, they construct a sound/meaning symbolic structure like the bottom half of (3), and they associate this with symbolic structure (2), creating a more complex symbolic structure, which is shown in (3) as a whole. In other words, they note that an aardvark is a thing, like a cat, a dog, or a planet. Thus, part of learning the word aardvark is learning the association shown in (3). (3) THING

/x/

AARDVARK

/ardvark/

Translating (2) and (3) into more common terminology (and for the moment simplifying a great deal), (2) corresponds to the notion of noun, and (3) says that the word aardvark is a noun. Thus, (3) is similar to the lexical entry of a generative grammar shown in (4).

(4) aardvark [AARDVARK Noun /ardvark/] (where AARDVARK stands for a semantic representation).

However, an important diﬀerence between the CL approach represented in (3) and the generative approach represented in (4) is that in generative grammar nounhood is a purely syntactic property, based on considerations such as whether a word can occur after a determiner or whether it can be modiﬁed to agree with a verb. CL recognizes these syntactic properties as part of what makes a word a noun but claims that a word’s meaning is the main consideration in determining its grammatical class, as explained below. So far the discussion of the semantics of nouns has been greatly simpliﬁed. Of course, there are many nouns that are not things, like anger, moment, and yellow (as in “The painter used a bright yellow”). Langacker (1991, p. 16) points out, however, that at an abstract level these words share the essential property of nouniness mentioned above: they can be construed as entities that stand out from a background. A prototypical noun like bird is situated in the cognitive domain of physical space, where it can be distinguished from its background mainly by its shape. To understand the nouniness of yellow, we must shift to the cognitive domain of color, where the yellow region of the spectrum can be distinguished from the background of other colors (or proﬁled in CL terminology) by its hue. Similarly, the word moment is a noun, in part, because within the cognitive domain of time an individual moment can be proﬁled or singled out from the moments that precede and follow it. In a similar way, anger can be proﬁled as a distinct emotion within the emotional domain. Thus, nouns are any kind of entity that, like things, can be proﬁled in some cognitive domain. In (2), THING should be understood as representing any such entity: bird, anger, yellow, moment, etc. Prototype Schemas in Morphology English Irregular Past Tense For a look at how CL handles morphology, let us consider a form that has been much studied by variationists: the English past tense. 
Most English past tense forms are either regular or irregular although some verbs, like dive, are in the process of being regularized, so that some speakers say dove, other speakers say dived, and still others alternate between the two forms. Let us consider how the ﬁrst group of speakers retrieve dove according to Barsalou’s (1992) production model (see ﬁgure 6.2). At level 3 of the production model, the lemma for dive (which is marked to show that this is an irregular verb) is joined with the appropriate lexeme. The phonological representations of irregular verbs are stored in the lexeme, so at this point the sequence /dowv/ is activated. In other


words, according to the model, dove, like all irregular verb forms, is memorized. Now let us consider a speaker who says dived. For this speaker, the lemma marks dive as regular verb. At level 3 of the production model, the lemma is joined to the appropriate lexeme, but in this case the lexeme does not contain the complete phonological representation, only the stem /dayv/. The fact that the lemma is marked for regular past causes an abstract PAST marker to be attached to the verb stem. At level 6 in the production model the familiar morphosyntactic rule for regular past tense chooses the appropriate ending from the set of regular past tense endings: /d/, /t/, or /id/. Now consider the case of the speaker who alternates between dived and dove. Pinker and Prince (1994) suggest that one possibility for this speaker is that the lemma for dive is marked both regular and irregular, so that either /dayvd/ or /dowv/ could eventually be produced. In this case, the two forms would alternate randomly, a case of free variation.1 However, Pinker and Prince (1994) also allow for another possibility, one that was proposed by Bybee and Moder (1983; see also Bybee and Slobin, 1982) working within the framework of CL. The formal mechanism that they propose is the same as the mechanism for representing lexical items: the prototype schema, which we will now look at in more detail. As mentioned earlier, a schema is any mental representation. Symbolic structures are schemas. Additional examples of schemas can be found in Barsalou’s (1992) production model in ﬁgure 6.2. For example, CAT, the abstract concept of the animal (which might call up the lemma for cat, tabby, or tom, depending on the rest of the preverbal message), is a schema, as is the phonological representation /kæt/. Similarly, the tree structure at level 3 is a schema, as is the equivalent phrase structure rule S → NP VP. As noted in chapter 6, none of the schemas in ﬁgure 6.2 can model probabilistic patterns. 
For example, if the schema for the phonological representation of cat is accessed at level 4, only the phones speciﬁed in that schema can be sent on to level 5. Thus, the version of the production model shown in ﬁgure 6.2, which has not been modiﬁed to include connectionist networks, cannot model the fact that New York City speakers sometimes say [kæt] and sometimes say [kiyət]. A prototype schema, on the other hand, can model probabilistic patterns. To see how this is possible, let us look at the details of Bybee and Moder’s (1983) research on the learning of English irregular past tense forms. Bybee and Moder (1983) presented adult subjects with nonce (made up) verbs like strig and asked them what the past tenses would be. If irregular past forms were simply memorized, we would expect that the nonce words would elicit regular past endings because they are not associated with a memorized irregular form. But Bybee and Moder (1983) found that this was not the case. Although their subjects sometimes produced the regular forms, they more often produced irregular forms, such as strug.


Historically, grammarians have suggested that such innovative forms are produced by analogy with real pairs of the present and past tenses of irregular verbs, like spring – sprung, and clearly this is the case. But the question remains: What is the mental mechanism by which this analogy works? How did Bybee and Moder’s (1983) subjects know that if strig were a real verb, its past tense would probably be strug? Bybee and Moder (1983) proposed that irregular verb classes are mentally represented as prototype schemas. The prototype schema for the past form of strung-type verbs (technically class II irregular verbs) is shown in (5).

(5) /s/ C (C) /ʌ/ /ŋ/

Notice that (5) is not an exemplar of a class II irregular verb (as robin is an exemplar of the bird category) because it is abstract. Exemplars are concrete examples, like strung. It may be that learners can use both abstract prototype schemas like (5) and exemplars when learning prototype categories (Goldberg, 2006, p. 47).2 According to (5), the prototype form for the past tense of a class II irregular verb has an initial /s/ followed by a consonant, optionally followed by another consonant (the parentheses indicate that the presence or absence of this third consonant does not affect the prototypicality of the form), followed by the vowel /ʌ/, followed by a velar nasal. As suggested, the past form sprung fits the prototype exactly. However, forms that have only some of the features of the prototype can also qualify as class II irregular past verbs. An example is flung, which contains a final /ŋ/, the vowel /ʌ/, and a preceding consonant, but lacks the initial /s/. Thus, flung shares a family resemblance with sprung. It is a member of the prototype category of class II past forms, but not a central member. It occupies a position similar to that of the roadrunner in the prototype category of birds shown in figure 7.1.
In the case of class II past forms (as in all graded prototype categories) there are no necessary and sufficient features that distinguish its members from other past forms; rather, membership in the category is a matter of degree. Bybee and Moder (1983) suggest that in their experiment, when the subjects were read a cue word like strig, they mentally constructed both a regular past tense form /strigd/ and an irregular class II past form /strʌg/. Then they compared the irregular form to the prototype in (5). If the irregular form was sufficiently similar to (5), they uttered it; otherwise, they uttered the regular past form. The fuzziness of the criteria for membership in the category of class II past forms accounts for the fact that Bybee and Moder’s (1983) subjects showed prototype effects in the form of disagreement about forms like sigged versus sug, because sug is not close to the prototype. In later sections, I will point out the similarities between prototype schemas and variable rules and how prototype schemas can be written in variable rule notation.
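The comparison step that Bybee and Moder describe can be illustrated with a rough sketch. The four features below are read off schema (5); the equal weighting, the 0.75 threshold, the vowel inventory, and the function names are assumptions made only for illustration and are not part of Bybee and Moder's proposal. A fixed threshold also makes the choice deterministic, whereas the subjects' behavior was variable.

```python
# Hypothetical sketch: score a candidate past form against schema (5),
# /s/ C (C) /ʌ/ /ŋ/. Features, equal weights, and threshold are assumptions.
VOWELS = set("ɪʌæaeiou")

def prototype_score(phones):
    """Return the fraction of schema (5) features matched by a list of phones."""
    onset = []
    for p in phones:
        if p in VOWELS:
            break
        onset.append(p)
    rest = phones[len(onset):]
    has_initial_s = bool(onset) and onset[0] == "s"
    # Consonants other than the initial /s/, e.g. /fl/ in "flung".
    other_consonants = onset[1:] if has_initial_s else onset
    features = [
        has_initial_s,                        # initial /s/
        len(other_consonants) >= 1,           # at least one further consonant
        bool(rest) and rest[0] == "ʌ",        # the vowel /ʌ/
        bool(phones) and phones[-1] == "ŋ",   # final velar nasal
    ]
    return sum(features) / len(features)

def choose_past(regular, irregular_phones, threshold=0.75):
    """Utter the irregular candidate only if it is close enough to the prototype."""
    if prototype_score(irregular_phones) >= threshold:
        return "".join(irregular_phones)
    return regular

print(prototype_score(["s", "p", "r", "ʌ", "ŋ"]))          # sprung: 1.0
print(prototype_score(["f", "l", "ʌ", "ŋ"]))               # flung: 0.75
print(choose_past("strɪgd", ["s", "t", "r", "ʌ", "g"]))    # strʌg
print(choose_past("sɪgd", ["s", "ʌ", "g"]))                # sɪgd
```

Under these assumed settings sprung is a central member, flung a peripheral one, and sug falls below the threshold, which mirrors the graded membership described above.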


The Compatibility of Prototype Schemas and Connectionist Networks – Elliott’s Study

In chapter 6 we saw how Elliott’s (1995; Adamson and Elliott, 1997) study of the acquisition of middle experiencer verbs in Spanish could be explained at the connectionist level, and I have noted that connectionist models are considered to be compatible with the prototype schemas used in CL (Feldman, 2006). Let us therefore ask how Elliott’s results could be discussed at the CL level, using prototype schemas. The discussion will illustrate how CL provides an abstract characterization of the workings of connectionist networks. In the discussion of Bybee and Moder’s (1983) experiment, I suggested that variation in morphology could occur when prototype schemas that control incompletely learned forms are accessed during production. Variation in Elliott’s data could occur in a similar way. Native speakers and advanced learners have memorized which emotional domain verbs are reflexive middle experiencer verbs requiring se and which are not. But less advanced learners must make an (unconscious) informed guess. To do so, they may access a prototype schema or an exemplar for emotional domain verbs, which can result in the variable production of the clitic particle. Recall that Elliott (1995) found that his subjects’ learning of reflexive verbs was less accurate in the emotional domain than in the physical domain. The main reason for the inaccuracy within the emotional domain was overgeneralization. Subjects tended to turn transitive emotional domain verbs that do not take se (like odiar “hate”) into middle experiencer verbs that do take se, as in (6)a and (7)a (the correct versions of these sentences are supplied in (6)b and (7)b).

(6) a. *Me odio esta comida.
       myself I hate this food
       “I hate this food.”
    b. Odio esta comida.
       I hate this food
       “I hate this food.”
(7) a. *Me encanto la comida de Japon.
       myself I enchant the food of Japan
       “I love Japanese food.”
    b. Me encanta la comida de Japon.
       me enchants the food of Japan
       “I love Japanese food.”

As mentioned in chapter 6, it can be difficult for learners to distinguish emotional domain verbs that are just transitive and not reflexive, like odiar “hate” and encantar “enchant,” from emotional domain middle experiencer verbs that are reflexive, like reirse “laugh” and divertirse “have fun.” The difference is


that middle experiencer verbs require the participant to undergo a change of state while other kinds of emotional domain verbs do not. Such a change is clearly necessary to the meaning of verbs like volverse loco “to go crazy” and ponerse triste “become sad.” But, for some of the middle experiencer verbs such a change is not so obvious. These include reirse “laugh,” divertirse “have fun,” and quejarse “complain.” In chapter 6 overgeneralization was modeled using a connectionist network, and it was suggested that because reﬂexive particles are so often activated with verbs in this domain, some activation can spread to these particles regardless of which verb is activated. This claim is compatible with saying that a prototypical emotional domain verb is a middle experiencer verb, which takes a reﬂexive particle and implies a change of emotional state in its subject. A prototype schema for such a verb can be written using the conventions of variable rules, as in (8). (8) Emotional domain: verb + <se> [] Schema (8) says that a prototypical emotional domain verb takes a reﬂexive particle and implies a change of state in the Experiencer NP. The angled brackets around two of the components indicate that an emotional domain verb can have both of these components, only one, or neither, but that the prototype has both. These possibilities reﬂect Elliott’s (1995) results. Recall that his subjects did not always incorrectly attach se to emotional domain verbs. Rather, their overgeneralization was variable. Such variable performance could be a prototype eﬀect resulting from the imperfect ﬁt of peripheral members of the category. 
Sometimes the subjects did not use se with middle experiencer verbs when they did not consider the experiencer to have changed, as in reirse “laugh,” and sometimes they used se with regular transitive verbs that seemed to imply a change in the experiencer, like amar “love.” In chapter 6 we saw how Elliott’s (1995) results could be represented in the connectionist network in ﬁgure 6.5. This chapter has shown that this information is also compatible with a prototype schema, which can be written using the conventions of variable rules. So far, I have argued that language learners can make use of prototype schemas for constructing phonological and morphological representations of forms they have not yet memorized and that such schemas can model variation in interlanguage. For this reason, CL is a promising grammatical theory for variationist research. We will now consider how CL handles syntax, and how the theory can model syntactic variation in learners’ speech.


Prototype Schemas in Syntax/Semantics: The Acquisition of Argument Structure

Introduction

Recently, there has been much interest in how first and second language learners acquire the correct argument structures for verbs. The relationship between a verb and its arguments can be complex. For example, the locative verbs fill and pour take NP PP complements, as in (9) and (10).

(9) Marsha poured water into the glass.
(10) Marsha filled the glass with water.

But notice that for pour the first complement NP, water, is the theme of the event of pouring and the second complement NP, glass, is the goal of the event. However, for fill the first NP, glass, is the goal and the second NP, water, is the theme. Notice also that for both verbs the order of the arguments cannot be reversed, as in (11) and (12).

(11) *Marsha poured the glass with water.
(12) *Marsha filled water into the glass.

To complicate matters further, some locative verbs do allow both thematic orders, as in (13) and (14).

(13) Marsha loaded the truck with hay.
(14) Marsha loaded hay onto the truck.

Perhaps the most studied complement structure is associated with dative verbs, and it also presents difficulties. Some dative verbs, such as give, allow two complement patterns: a prepositional phrase or a ditransitive. Example (15) illustrates the prepositional phrase complement, and (16) illustrates the ditransitive complement.

(15) The Council gave/sent/faxed $3,000 to Marsha.
(16) The Council gave/sent/faxed Marsha $3,000.

However, other dative verbs, like donate, allow only the prepositional phrase, as in (17) and (18).

(17) The Council donated/presented/credited $3,000 to Marsha.
(18) The Council *donated/*presented/?credited Marsha $3,000.

How can language learners master the subtleties of verbs and their complements? In particular, how can they avoid overgeneralizing and producing structures like (18)? Pinker (1989) calls the problem of avoiding overgeneralization “Baker’s Paradox,” after C. L. Baker, whose 1979 article brought widespread attention to the problem. Baker’s Paradox has three aspects. The first aspect involves


productivity. If speakers never constructed productive rules that allowed them to generate forms they had not heard before, there would be no problem. But, as we have seen, Bybee and Moder’s (1983) subjects did generate such forms, and Pinker (1989) documents that children overgeneralize ditransitive constructions, producing sentences like “*You ﬁnished me lots of rings” instead of “You ﬁnished lots of rings for me” (p. 21). The second aspect of Baker’s Paradox is the lack of negative evidence. If speakers were corrected when they said things like (11) and (12), they could avoid overgeneralizing in the future, but apparently such correction does not occur for L1 learners or for many L2 learners. The third aspect of the Paradox is the question of arbitrariness. The fact that nearly synonymous verbs like donate and give have diﬀerent complement structures means that there is no simple semantic guideline for pairing verbs with complements. Diﬀerent scholars have suggested diﬀerent solutions to Baker’s Paradox, and we will take a look at three of them: Baker’s (1979) Strict Constructivism Hypothesis, Pinker’s (1989) Lexical Rule Hypothesis, and Goldberg’s (1995, 2006) Construction Grammar Hypothesis, which is done within the CL framework. Then, I will present a pilot study of ditransitive use by native speakers and ditransitive acquisition by adult Korean speakers. Baker’s Strict Constructivism Hypothesis Baker (1979) challenges part 1 of the Paradox. He claims that children do not overgeneralize but rather memorize verbs and their complements as they are encountered. However, as mentioned, longitudinal studies of child acquisition (Bowerman, 1988; Gropen, 1989; Pinker, 1989), as well as experimental studies using nonce verbs (Gropen, Pinker, Hollander, and Goldberg, 1991) show that children do, in fact, overgeneralize to some extent, so the strict constructivism hypothesis has been abandoned. 
Pinker’s Lexical Rule Hypothesis Pinker (1989) challenges part 3 of the Paradox, claiming that complement patterns are not arbitrary. Rather, complement structure is signaled by complex morphophonemic and semantic clues. Before looking at the clues to dative alternation, let us consider the theory that Pinker adopts, which is an adapted version of Lexical-Functional Grammar (Bresnan, 1982). Early generative theories accounted for dative sentences with a syntactic transformational rule like (19), which would change a sentence like (20) into a sentence like (21). (19) V NP1 to NP2 → V NP2 NP1 (20) Marsha threw the ball to John. (21) Marsha threw John the ball. One reason that syntactic transformations fell out of favor was that they were not supposed to change the meaning of the structure. But people perceived that


rules like (19) did change meanings. For example, according to Pinker (1989), (20) can be used where John did not catch the ball or could even be asleep, but (21) entails that John was meant to receive the ball and invites the inference that he did. Similarly, “She taught Amharic to the students, but they didn’t learn anything” sounds all too natural, whereas “She taught the students Amharic, but they didn’t learn anything” sounds odd. Lexical-Functional Grammar represents the different meanings in (20) and (21) with the semantic representations in (22) and (23) respectively. These semantic representations, which consist of universal semantic primitives, are stored with the lexical entries for verbs. Notice that this claim entails that throw has two different meanings:

(22) throw1 → x causes y to go to z
(23) throw2 → x causes z to possess y

Pinker (1989) says that children notice that many dative verbs besides throw (for example give, send, teach, tell, and get) have both of these meanings. They are then able to abstract a lexical rule that relates the two meanings, as shown in (24).

(24) a. x causes y to go to z →
     b. x causes z to possess y
     (where the arrow means “entails that”)

Then, when children hear a new verb, say fax, with a meaning similar to send, they are able to apply rule (24) to produce (25) without hearing fax used in the ditransitive form.

(25) Marsha faxed John the letter.

To continue briefly with the derivation of (25), Pinker claims that the semantic structure represented by (24)b, which is stored in the new lexical entry for fax, is projected onto an x-bar template by so-called “linking rules,” which indicate that the first argument in the logical structure becomes the sentence subject, the second argument becomes the indirect object, and the third argument becomes the direct object. Thus, for sentence (25), x = Marsha, z = John, and y = the letter.
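The derivation of (25) can be sketched as two steps: the lexical rule in (24) and a simplified linking step. The class and function names below are hypothetical, the predicate labels "GO_TO" and "POSSESS" are shorthand invented for this sketch, and the linking step stands in for the x-bar projection rather than implementing it. Variable names follow (24): x is the causer, y the thing transferred, and z the goal or possessor.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Meaning:
    predicate: str  # "GO_TO" for (24)a, "POSSESS" for (24)b (assumed labels)
    x: str          # causer
    y: str          # thing that moves or is possessed
    z: str          # goal or possessor

def lexical_rule_24(m):
    """Map 'x causes y to go to z' onto 'x causes z to possess y'."""
    if m.predicate != "GO_TO":
        raise ValueError("rule (24) applies only to the 'go to' meaning")
    return Meaning("POSSESS", m.x, m.y, m.z)

def link(m, verb_past):
    """Simplified linking: order the arguments as the surface sentence does."""
    if m.predicate == "GO_TO":
        # Prepositional dative: subject, verb, theme, 'to' goal.
        return f"{m.x} {verb_past} {m.y} to {m.z}."
    # Ditransitive: subject, verb, possessor (indirect object), theme.
    return f"{m.x} {verb_past} {m.z} {m.y}."

fax_a = Meaning("GO_TO", "Marsha", "the letter", "John")
print(link(fax_a, "faxed"))                    # Marsha faxed the letter to John.
print(link(lexical_rule_24(fax_a), "faxed"))   # Marsha faxed John the letter.
```

The sketch applies the rule to any verb it is given; the constraints that stop it from applying to verbs like donate are a separate matter.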
We now return to the question that began this section: How do learners learn that rules like (24) can apply to verbs like fax and give but not to verbs like donate and credit? As mentioned, Pinker claims that there are morphophonological and semantic constraints, or clues, as to which verbs allow the ditransitive. We will consider the morphophonological constraint ﬁrst. Old English had only the ditransitive dative, and there was considerable freedom as to the order of the two objects. Misunderstanding was avoided because NPs were marked for case. The prepositional dative was introduced from French and became widespread in the thirteenth and fourteenth centuries. During that period, the case markers eroded, and word order became less


ﬂexible. For a time there was almost complementary distribution of the two dative forms, with prepositional datives occurring only with latinate verbs borrowed from French and ditransitive datives occurring only with native English verbs. Later, both types of verbs extended their range to the other pattern, but, as we will see, this extension was not complete, and the ditransitive form still cannot be used with many latinate verbs. Because children are not aware of English etymology, they must have some synchronic clues as to which verbs are native and which verbs are latinate. According to Pinker (1989), native verbs are single syllables or, if polysyllabic, take stress on the ﬁrst syllable. Also, preﬁxes and suﬃxes signal latinate forms. Since this constraint involves both morphology and phonology, Pinker calls it the “morphophonological constraint.” This constraint does not apply to all ditransitive verbs, as discussed below. The second type of constraint on which verbs allow the ditransitive is semantic. We have seen that all ditransitive verbs require basic semantics that match (24), that is, the goal argument must be a potential possessor of the theme. Pinker (1989) calls this requirement the broad-range semantic constraint. This requirement explains the ungrammaticality of (26)b, where “Chicago” cannot possess the car. (26) a. I drove the car to Chicago. b. *I drove Chicago the car. Notice that possession need not be literal; for example, verbs of communication are treated as denoting the transfer of messages which the recipient metaphorically possesses, as in “He told her the story” and “She showed him the answer.” Pinker (1989) calls (24) a broad-range lexical rule, but there must be narrowrange lexical rules as well to disallow verbs like push, whisper, and say, which are semantically compatible with causing a goal to be viewed as a recipient, but which do not allow the ditransitive. 
Pinker identifies nine semantic subclasses of verbs which allow the ditransitive. These are shown in (27). Notice that subclasses 5 and 7 are immune from the morphophonological constraint.

(27) Verbs that take the ditransitive construction:
1. Verbs that inherently signify acts of giving: e.g., give, pass, hand, sell, trade, lend, serve, feed.
2. Verbs of instantaneous causation of ballistic motion: e.g., throw, toss, flip, slap, poke, fling, shoot, blast.
3. Verbs of sending: e.g., send, mail, ship.
4. Verbs of continuous causation of accompanied motion in a deictically specific direction: e.g., bring, take.


5. Verbs of future having (involving a commitment that a person will have something at a later point): e.g., offer, promise, bequeath, leave, refer, forward, allocate, guarantee, allot, assign, advance, award, reserve, grant.
6. Verbs of communicated message: e.g., tell, show, ask, teach, pose, write, spin, quote, cite.
7. Verbs of instrument of communication: e.g., radio, e-mail, telegraph, wire, telephone, netmail, fax.
8. Verbs of creation: e.g., bake, make, build, cook, sew, knit, toss (when a salad results), fix (when a meal results), pour (when a drink results). Notice that in prepositional phrase form these verbs take for not to: “They baked a cake for Marsha.”
9. Verbs of obtaining: e.g., get, buy, find, steal, order, win, earn, grab.

The verbs in each of these nine classes must participate in a different narrow-range lexical rule in order to license the ditransitive. That is, verbs of instantaneous causation of ballistic motion, such as throw as in “Marsha threw John the ball,” must undergo a lexical rule like (28).

(28) a. x CAUSES y to GO to z (by means of INSTANTANEOUS BALLISTIC MOTION) →
     b. x CAUSES z to HAVE y

Rule (28) would also license toss, slap, kick, etc. It would not license push, pull, lower, haul, etc. because these are verbs of “continuous causation of accompanied motion in some manner,” as discussed below. Pinker also identifies five subclasses of verbs that are compatible with the broad-range constraint but not with the narrow-range rules, and therefore cannot take the ditransitive. These are shown in (29).

(29) Verbs that do not take the ditransitive construction:
1. Verbs of fulfilling (X gives something to Y that Y deserves, needs, or is worthy of): e.g., *I presented him the award; *I credited him the discovery; *Bill entrusted/trusted him the sacred chalice; *I supplied them a bag of groceries.
2. Verbs of continuous causation of accompanied motion in some manner: e.g., *I pulled/carried/pushed/schlepped/lifted/lowered/screamed/hauled John the box.
3. Verbs of manners of speaking: e.g., *John shouted/screamed/murmured/whispered/yodeled Bill the news.
4. Verbs of proposition and propositional attitudes: e.g., *I said/asserted/questioned/claimed/doubted her something.
5. Verbs of choosing: e.g., *I chose/picked/selected/favored/indicated her a dress.
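One way to picture the narrow-range classes in (27) and (29) is as a lookup table from semantic subclass to verbs. The sketch below is only an illustration: the dictionary names, the class labels, and the three-way result are assumptions, only a few verbs from each list are included, and the morphophonological constraint is not modeled.

```python
# Hypothetical lookup over a few verbs from lists (27) and (29).
DATIVIZING = {
    "giving": {"give", "pass", "hand", "sell"},
    "instantaneous ballistic motion": {"throw", "toss", "flip", "slap"},
    "sending": {"send", "mail", "ship"},
    "instrument of communication": {"radio", "e-mail", "fax"},
}
NON_DATIVIZING = {
    "fulfilling": {"present", "credit", "entrust", "supply"},
    "continuous accompanied motion in some manner": {"pull", "carry", "push", "haul"},
    "manner of speaking": {"shout", "murmur", "whisper"},
    "choosing": {"choose", "pick", "select"},
}

def ditransitive_status(verb):
    """Return ('allowed' | 'blocked' | 'unknown', subclass name or None)."""
    for name, verbs in DATIVIZING.items():
        if verb in verbs:
            return ("allowed", name)
    for name, verbs in NON_DATIVIZING.items():
        if verb in verbs:
            return ("blocked", name)
    # A verb outside the table: the learner has no narrow-range rule to consult.
    return ("unknown", None)

print(ditransitive_status("throw"))  # ('allowed', 'instantaneous ballistic motion')
print(ditransitive_status("push"))   # ('blocked', 'continuous accompanied motion in some manner')
```

A table of this kind assigns each verb to exactly one class and so predicts categorical judgments, which is the assumption of the lexical rule account.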

Pinker’s (1989) discussion shows that learners are faced with a formidable task in learning the English ditransitive. They must learn a morphophonological constraint, a set of narrow semantic constraints, and how the two interact.

Construction Grammar

Introduction

Construction Grammar (C×G) is a branch of CL that has been developed by a number of researchers, many working at universities on the West Coast of the United States, including Fillmore (1988), Kay (1990), Lakoff (1987), Goldberg (1995, 2006), Shibatani (1996), and Feldman (2006). Here I will follow the account presented in Goldberg (1995, 2006). As we have seen, the lexical rule account requires that if a verb participates in more than one type of argument structure, it must have more than one meaning. This requirement can lead to an unwieldy proliferation of meanings. For example, the verb kick participates in no fewer than eight argument structures, shown in (30).

(30) a. Pat kicked the wall.
     b. Pat kicked Bob black and blue.
     c. Pat kicked the football into the stadium.
     d. Pat kicked at the football.
     e. Pat kicked his foot against the chair.
     f. Pat kicked Bob the football.
     g. The horse kicks.
     h. Pat kicked his way out of the operating room.

But we have the sense that the basic meaning of kick does not change in the sentences in (30). C×G posits that, in general, verbs retain a basic meaning, but that a new element of meaning is added by the construction itself. Thus, we understand that the physical action in (30)a is the same as the physical action in (30)b. The sentences in (30), then, contain not eight diﬀerent lexical verbs, but the same lexical verb in eight diﬀerent syntactic constructions. For example, (30)a is the transitive construction, (30)b is the resultative construction, (30)c is the caused motion construction, etc. Notice that (30)h, the way construction, contains not only speciﬁc grammatical categories but also a speciﬁc word: way. The way construction looks like this: verb + possessive pronoun + way + PP. Other sentences that exemplify this construction include “George painted his


way through the apartment” and “Rocky couldn’t punch his way out of a paper bag.” We will return to the way construction below. Constructions are often presented in a formal notation using boxes, as shown in (31), which represents the ditransitive construction.

(31) Ditransitive Construction

     Sem   CAUSE-RECEIVE   <  agt     rec     pat  >
                 |             |       |       |
               PRED        <                       >
                 ↓             ↓       ↓       ↓
     Syn         V           SUBJ     OBJ     OBJ2

This construction maps a particular semantics onto a particular syntax. The semantic component of the construction says that an agent causes a recipient to receive a patient. The mapping says that the agent is realized syntactically as the subject, the recipient as the first object, and the patient as the second object. This mapping is the equivalent of linking rules in Lexical-Functional Grammar. A verb, such as to hand, can be inserted in (or fused with) the construction and mapped onto the ditransitive surface structure, as shown in (32).

(32) Composite Fused Structure: Ditransitive + hand

     Sem   CAUSE-RECEIVE   <  agt       rec       pat    >
                 |             |         |         |
                 ↓             ↓         ↓         ↓
               HAND        <  hander    handee    handed >
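The fusion shown in (31) and (32) can be pictured as a small data structure. The following is only an illustrative sketch, not Goldberg's formalism: the class names, the `fuse` function, and the hand-coded `compatible` set (a stand-in for real semantic compatibility) are assumptions made for the example, and the participant labels for drive are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construction:
    name: str
    sem: str        # semantic predicate of the construction
    roles: tuple    # argument roles, e.g. ("agt", "rec", "pat")
    syntax: tuple   # grammatical functions, aligned with roles


@dataclass(frozen=True)
class Verb:
    form: str
    participants: tuple    # participant roles, aligned with the construction's roles
    compatible: frozenset  # hand-coded stand-in for semantic compatibility


DITRANSITIVE = Construction(
    name="ditransitive",
    sem="CAUSE-RECEIVE",
    roles=("agt", "rec", "pat"),
    syntax=("SUBJ", "OBJ", "OBJ2"),
)

HAND = Verb("hand", ("hander", "handee", "handed"), frozenset({"ditransitive"}))
# "drive" is marked incompatible by hand; its participant labels are invented.
DRIVE = Verb("drive", ("driver", "beneficiary", "driven"), frozenset())


def fuse(verb, cxn):
    """Return (argument role, participant role, grammatical function) triples, or None."""
    if cxn.name not in verb.compatible:
        return None
    return list(zip(cxn.roles, verb.participants, cxn.syntax))


print(fuse(HAND, DITRANSITIVE))
print(fuse(DRIVE, DITRANSITIVE))
```

In this sketch the verb keeps a single entry and the construction contributes the role-to-function mapping, which is the division of labor the text attributes to C×G.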

As in Lexical-Functional Grammar, for a verb to be used in a particular construction, its semantics must be compatible with the semantics of the construction. This requirement prevents drive from taking the ditransitive, as in (26)b. Recall that Pinker (1989) enforced this semantic compatibility requirement by means of his broad-range constraint on lexical rules. As we have seen, Pinker (1989) also postulated a number of narrow-range constraints on lexical rules, thus proliferating the polysemy of verbs. For example, in his theory, the meaning of throw in (20) diﬀers from the meaning of throw in (21). Because C×G allows verbs to retain a single basic sense, we may ask how it accounts for the narrow-range semantic constraints on the


ditransitive. The answer is the converse of Pinker’s answer. The constraints are found not in the meanings of the verbs but in the meanings of the ditransitive construction, which is a radial category like BIRD with a number of related meanings, as shown in figure 7.2. The central meaning of the construction, labeled “A,” is expressed by verbs of giving, instantaneous causation of ballistic motion, and continuous causation in a deictically specified direction. The extended senses of the construction are shown in B to F. Goldberg (1995) says that the extended senses of the construction are not predictable from the central meaning, but rather are “motivated” by it. That is, they must be learned conventionally, but learning is facilitated because the connections are not entirely arbitrary: they make sense because they are in a family resemblance relationship. Pinker (1989) also uses the term “motivated” to express the relationship between the broad-range semantic constraint and the narrow-range rules.

Goldberg’s Usage-based Account of the Ditransitive

Goldberg’s solution to Baker’s Paradox is similar to Pinker’s. She believes that the fusing of verbs with constructions must be conventionally learned, but that verb semantics provide clues as to which constructions a verb can be fused with. However, there are important differences between the two accounts. A problem for the lexical rule account of ditransitive learning is that it assumes that narrow semantic classes are mutually exclusive. A verb can fit into only one of the classes and therefore either dativizes (that is, undergoes lexical rule (23)) or does not. But native speakers’ intuitions regarding verbs in the same narrow semantic class are variable. For example, according to Pinker (1989), push is nondative (it is on the list of nondative verbs above as a verb of continuous causation of accompanied motion in some manner).
But “John pushed me a beer” sounds acceptable to me and to some of the native English-speaking subjects of the experiment reported in the next section. Perhaps the reason for this disagreement about grammaticality is that push can be construed as a verb of instantaneous causation of ballistic motion, similar to shove. This results in prototype effects in the form of uncertain judgments. Goldberg (1995) remarks, “the determination of which narrowly-defined class a given verb belongs in is not always entirely clear-cut. [. . .] In general, in the case of verbs that may fall into one of two classes, one which can appear ditransitively and one which cannot, we would expect to find some dialectal variation in whether the verbs can be used ditransitively” (p. 42). Notice that in Goldberg’s (1995, 2006) account, verbs still fall into semantic classes that can be fused with the ditransitive construction. She suggests that these classes are learned through usage. For example, give is one of the most frequent verbs in mothers’ speech to children (Goldberg, 2006, p. 76) and so may serve as an exemplar around which a category of verbs which take the


Figure 7.2 The radial category “ditransitive verb” (from Goldberg, 1995). Reprinted with permission.

ditransitive is constructed. As children encounter other examples of verbs with similar meaning that are used ditransitively, they will be added to this radial category. Such a category is shown in ﬁgure 7.3. The clusters of verbs in this ﬁgure are the counterparts of the structural descriptions of narrow lexical rules. If, for example, a verb’s semantics are compatible with an agent enabling a patient to go to a goal by means of instantaneous ballistic motion, then that verb is eligible to be fused with the ditransitive construction. This fusing adds an element of meaning to the structure, namely that the goal successfully receives the patient and becomes a recipient. The end result is the same as the result of a lexical rule. What is diﬀerent in the CL account is the possibility for


Figure 7.3 Verbs that fuse with the ditransitive construction (from Goldberg, 1995). Reprinted with permission.

verbs that are loosely attached to their narrow semantic categories to be construed as belonging to another category.

A Pilot Study of Ditransitive Acquisition Among Korean Speakers

Psycholinguistic experiments in speech production (Hare and Goldberg, 1999) and speech comprehension (Ahrens, 1995; Kaschak and Glenberg, 2000; Bencini and Goldberg, 2000) provide evidence for the psycholinguistic reality of syntactic constructions. For example, Ahrens (1995) asked 100 native English speakers to decide what the nonce verb moop meant in the sentence “She mooped me something.” Sixty percent of the subjects said that “moop” meant “give,” even though several ditransitive verbs have a higher overall frequency in English usage, including take and tell. These results suggest that the subjects equated “moop” with the central sense of ditransitive verbs. In order to explore whether learners of English as a second language have similar intuitions about (real) ditransitive verbs, I conducted a pilot study involving native speakers of Korean and English, which is described in the next sections.

Research Design

A questionnaire eliciting intuitions regarding 29 potential ditransitive sentences (as well as a number of masking sentences) was administered to 24 native English speakers (NSs) and 24 Korean speakers (NNSs). The potential ditransitive sentences were chosen with the help of a Korean linguist who is


also an ESL teacher, in order to avoid patterns that could be literally translated from Korean or that are focused on in the Korean schools. The written instructions told the subjects to read each sentence carefully and to rate its grammaticality along a five-point scale, where 1 was completely ungrammatical and 5 was completely grammatical. Examples of how to complete the rating task were provided using a grammatical, an ungrammatical, and a questionable sentence, none of which involved dative sentences. Written instructions for the Korean subjects were provided in English and Korean. In addition, each subject was given a cloze test in English. The NSs’ scores on this test ranged from 18 to 24; the NNSs’ scores ranged from 3 to 25. In order to control for English proficiency, all subjects scoring below 17 were excluded. This procedure left 14 NNSs who were highly proficient in English. All of the subjects were living in Tucson, and almost all of them had been in the U.S. for more than five years. Most were students at the University of Arizona.

NSs’ Results

It should first be said that the results of the pilot study are only suggestive and are intended to indicate areas of interest for a larger and more rigorous study. The verbs included in the study are shown in table 7.1, where they are divided into verbs that dativize and verbs that do not dativize, according to Goldberg (1995) and Gropen et al. (1991). Verbs that fail to dativize can be subdivided into three groups: (1) verbs that fail to meet the broad-based semantic constraint; (2) verbs that meet the broad-based semantic constraint but do not fall into one of the narrow semantic classes that allow the ditransitive; and (3) verbs that violate the morphophonological constraint.
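The screening step in the research design can be sketched in a few lines. The cutoff of 17 and the five-point scale come from the text; the subject records are invented, and the `rating_index` rescaling is purely an assumption, since the text does not say how the index in table 7.2 was computed. A linear mapping of the mean rating onto the range from −1.0 to +1.0 is simply one possibility consistent with a unanimous rating of 5 yielding +1.0.

```python
CLOZE_CUTOFF = 17  # from the text: subjects scoring below 17 were excluded


def screen(subjects, cutoff=CLOZE_CUTOFF):
    """Keep only subjects whose cloze score is at or above the cutoff."""
    return [s for s in subjects if s["cloze"] >= cutoff]


def rating_index(ratings):
    """ASSUMED rescaling of mean 1-5 ratings onto -1.0..+1.0 (not given in the text)."""
    mean = sum(ratings) / len(ratings)
    return (mean - 3) / 2


# Invented subjects, for illustration only.
subjects = [
    {"id": "NNS-01", "cloze": 3},
    {"id": "NNS-02", "cloze": 17},
    {"id": "NNS-03", "cloze": 25},
]

kept = screen(subjects)
print([s["id"] for s in kept])
print(rating_index([5, 5, 5]), rating_index([1, 1, 1]), rating_index([2, 3, 4]))
```

Under this assumed mapping, an index of exactly +1.0 or −1.0 can arise only when every subject gives the same extreme rating.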
First, consider the grammaticality judgments of the NS subjects, which appear in the left column of table 7.2.³ The ordering of the verbs in this column appears to match fairly well the grammaticality claims of Goldberg (1995) and Gropen et al. (1991) shown in table 7.1. All of the verbs in the top third of the column are listed by these authors as ditransitive verbs, and all of the verbs in the bottom third are listed as nonditransitive verbs. The study also included five verbs that appeared to violate the broad-range semantic constraint (that the indirect object must be construable as the possessor of the direct object), namely: walk (Nick walked the dog for her → *Nick walked her the dog), drive (Elaine drove the car for them → *Elaine drove them the car), operate (George operated the projector for him → *George operated him the projector), play (a trick on), and intercept (Jim intercepted the message to her → *Jim intercepted her the message). The NSs gave the lowest possible rating to all of these verbs except drive, which was rated in the bottom third of the scale, but well above the other four verbs. In hindsight, it seems that “Elaine drove them the car” could be construed as meaning that Elaine drove the car to them and they took over possession. “Nick walked her the dog,” on the other hand, does not seem to imply transferred possession of the dog.


Table 7.1 Classification of verbs on the grammaticality judgment task according to Goldberg’s (1995) and Gropen et al.’s (1991) criteria. Verbs that violate the morphophonological constraint are marked with *.

Verbs that dativize (according to Goldberg’s [1995] categories, see key below)
   B. owe, promise
   C. deny, refuse
   D. save, award, reserve, book, forward
   E. *approve
   F. (verbs of obtaining) get, win, steal, grab; (verbs of creating) create, *discover, *improve, *erect
   Metaphors: fax, quote, *communicate

Verbs that do not dativize
   (1) Fail to meet broad-based semantic constraint: play (a trick on Jill), walk (the dog for Tom), improve (the recipe for Marsha), *intercept (the message to Nora), operate (the machine for George)
   (2) Meet the broad-based semantic constraint but do not fall into one of the narrow semantic classes:
       Verbs of fulfilling: *present, credit
       Verbs of continuous causation of accompanied motion in some manner: push
       Verbs of manner of speaking: say

Key to semantic categories:
   B. Conditions of satisfaction imply that agent cause recipient to receive patient
   C. Agent causes recipient not to receive patient
   D. Agent acts to cause recipient to receive patient at some future point in time
   E. Agent enables recipient to receive patient
   F. Agent intends to cause recipient to receive patient

From a variationist point of view, the verbs that fall in the middle area of table 7.2 with scores ranging between .5 and −.5 are the most interesting because they appear to exhibit prototype effects. Recall that Pinker (1989) and Goldberg (1995, 2006) said that ditransitive verbs are learned conventionally, though they are motivated by semantic similarity with one of the narrow semantic classes. Why, then, don’t the NSs rate all of the verbs near the top or bottom of the scale, as either grammatical or ungrammatical? Instead, six verbs, create, steal, push, credit, erect, and grab, are rated between +.5 and −.5. It should be noted that because the judgments of all the subjects were lumped together, these indeterminate ratings could result from both uncertain judgments on the part of individual subjects and disagreement among the subjects. But the reason that some verbs could cause both uncertain judgments and dialect differences is that they do not clearly match the criteria of the ditransitive construction, and so prototype effects result. Of course, no


Table 7.2 Native English and native Korean speakers’ ratings for the grammaticality of the ditransitive construction with various verbs.

Index         English (n = 24)                                        Korean (n = 14)
 1.0          fax, get, save, ship, award, deny, owe, reserve, win
 0.9          book                                                    fax
 0.8          forward, quote                                          owe, award
 0.7 to 0.5   refuse                                                  ship, get
 0.4 to 0.1   create                                                  quote, forward, credit
 0.0          steal, push
−0.1          credit                                                  save, book
−0.2          erect                                                   operate, approve
−0.3                                                                  create
−0.4          grab                                                    play, erect
−0.5          drive                                                   present, steal, discover, reserve
−0.6          present, communicate
−0.7                                                                  communicate, win, intercept, deny, grab, improve, refuse
−0.8          approve                                                 push
−0.9          discover                                                drive
−1.0          play, improve, say, walk, intercept, operate            walk
Mean          0.1                                                     0.29

definitive claims can be made on the basis of this small study, but I would like to suggest some possibilities. The first thing to note about the six questionable verbs is that, by design, none of them fits Goldberg’s central sense of the ditransitive construction. They are all, to some extent, outliers. In fact, credit and present appear in class 1 (verbs of fulfilling) of Gropen et al.’s (1991) list of verbs that fail to dativize. Yet, credit was rated in the middle of the scale with a score of −.1 and present just escaped the middle ground with a score of −.6. One reason that credit sounds better than present is that present violates the morphophonological


constraint. But why aren’t both of these verbs judged fully ungrammatical? Pinker (1989, p. 156) notes that both credit and present are fully grammatical if with follows the object, as in “They will present you with an award” and “I will credit you with the full amount.” Thus, these verbs pattern like locative verbs that take a direct object followed by a prepositional phrase (“Marsha poured water into the glass”). Pinker (1989) suggests that the requirement that these verbs take with may be eroding. Construction grammar could perhaps better explain this shift by noting that the verbs are semantically very similar to verbs that express the central sense of the ditransitive construction (they signify acts of giving), so they are becoming attached to the construction as an additional sense. Until they become fully attached, they cannot be easily classified and prototype effects will result, causing variation in dialect usage and grammaticality judgments. Create and erect fit semantically in Goldberg’s sense F of the ditransitive construction (verbs involved in scenes of creation) but they both violate the morphophonological constraint, which may account for the variation in acceptability judgments for these verbs. Push should be unacceptable because it falls into Gropen et al.’s (1991) forbidden class 2, verbs of continuous causation of accompanied motion. But, as discussed earlier, in our test sentence, “Joe pushed me a beer,” push could be construed as a verb of instantaneous causation of ballistic motion, analogous to slap in “Wayne slapped me the puck.” Because the scene depicted by the test sentence is ambiguous, prototype effects result. It is less clear why our subjects did not like steal and grab in the ditransitive construction. Both are verbs of obtaining, which in the prepositional construction take for. However, Pinker’s (1989, p. 116) observations may suggest a reason.
He points out that verbs in this class carry “an overlay of benefaction,” in other words, in both the ditransitive and prepositional phrase constructions verbs of obtaining imply that the transfer of possession will benefit the recipient. But our test sentence, “George stole her $5,” may not result in benefit to the recipient. And perhaps the same consideration applied to “She grabbed me a sandwich.” Who wants a grabbed sandwich? In sum, the uncertainty in the NSs’ judgments regarding six of the verbs on our test suggests that their assignment to semantic categories is not as straightforward as Pinker (1989) and Gropen et al. (1991) suggest. However, Goldberg’s (1995) grammatical construction account is compatible with fuzzy grammaticality intuitions and dialect differences because these verbs do not clearly meet the criteria for fusing with the ditransitive construction.

NNSs’ Results

The NNSs judged far fewer of the sentences to be grammatical than the NSs. NNSs’ conservative judgments of grammaticality have also been observed by many researchers, including Tarone (1985) and Shi (2003). The NNSs judged only five verbs to be acceptable in the ditransitive (receiving an index score of .5


or higher): fax, owe, award, ship, and get. Table 7.2 shows that the NNSs’ judgments were also more variable than the NSs’ judgments. The NSs rated 15 of the 29 verbs in either the highest (+1.0) or lowest (−1.0) category, indicating unanimous agreement for these verbs. But the NNSs rated no verbs in the highest category and only one verb, walk, in the lowest category. The ﬁve verbs that the NNSs judged to be compatible with the ditransitive construction are in diﬀerent semantic classes. According to Goldberg’s categories shown in table 7.1, these are: fax (metaphor), owe (B), award (D), ship (A), and get (F). The fact that only ﬁve verbs, from ﬁve diﬀerent semantic categories, were judged to be clearly ditransitive suggests that these verbs were individually memorized, not learned as part of a semantic class. As mentioned earlier, ﬁve verbs that violate the broad-range semantic constraint were included in the study: walk, drive, operate, play (a trick on), and intercept. The NSs gave the lowest possible rating to all of these verbs except drive, which was discussed above. The NNSs gave very low ratings to walk and drive and rated intercept quite low. But play and operate are rated in the middle of the scale. These facts also suggest that the NNSs have not mastered the broad-range semantic constraint but have learned which verbs ﬁt the ditransitive on a verb-by-verb basis. We now consider the morphophonemic constraint. As table 7.2 shows, the NSs appear to abide by this constraint. Three verbs which violate the constraint, present, approve, and discover are rated at −.6 or lower (compare “John gave/*presented, found/*discovered her the perfect dress”; “John promised/ *approved her the raise”). The NNSs did not give high ratings to any of these verbs, though approve (which could be construed either as a verb of permission to which the morphophonemic constraint applies or a verb of future having, to which the constraint does not apply) was rated 12th out of the 29. 
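The three-way reading of the index used in this discussion (acceptable at .5 or higher, indeterminate strictly between +.5 and −.5, unacceptable otherwise) can be made explicit with a short sketch. The NS values below are a subset read off table 7.2; the band labels and the function name `band` are assumptions introduced for illustration, not terms from the study.

```python
# Subset of NS index scores from table 7.2 (create is omitted because its
# exact row is not legible in the table as reproduced here).
NS_INDEX = {
    "fax": 1.0, "steal": 0.0, "push": 0.0, "credit": -0.1, "erect": -0.2,
    "grab": -0.4, "drive": -0.5, "present": -0.6, "approve": -0.8,
    "discover": -0.9, "walk": -1.0,
}


def band(index, cutoff=0.5):
    """Classify an index score; the label names are illustrative only."""
    if index >= cutoff:
        return "acceptable"
    if index <= -cutoff:
        return "unacceptable"
    return "indeterminate"


indeterminate = sorted(v for v, i in NS_INDEX.items() if band(i) == "indeterminate")
print(indeterminate)
```

On this subset the indeterminate band contains credit, erect, grab, push, and steal, that is, the verbs discussed above as showing prototype effects, minus create.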
Note also that the NSs rated reserve and award, verbs of future having that violate the morphophonemic constraint, fully grammatical. However, the NNSs gave a much higher rating to award than to reserve, again suggesting that they are learning ditransitives on a word-by-word basis. The case of deny and refuse, however, provides evidence that semantics plays some role in the NNSs’ ditransitive learning. These two verbs of refusal form an especially interesting class because (according to Pinker [1989]) they appear only in the ditransitive form, not the prepositional form (compare “Annette denied/refused him a promotion” and “*Annette denied/refused a promotion to him”). If only memorization were involved in learning which verbs are ditransitive, these verbs should be prime candidates to be memorized, but as table 7.2 shows, the NNSs judged both deny and refuse ungrammatical, despite the fact that they can have heard them used only in the ditransitive construction. Pinker (1989) considers deny and refuse completely unlike other ditransitive verbs because they do not undergo a lexical rule. Goldberg (1995, 2006), however, considers them no different from other ditransitive verbs; their meaning is a metaphorical extension from the central meaning of the construction.


As table 7.1 shows, she considers them members of semantic class C, “Agent causes recipient not to receive patient,” which is an outlying member of the radial category of ditransitive verbs. But deny and refuse are special members of this category because their meaning is the opposite of the meaning of the other ditransitives: they mean that the recipient does not receive or possess the patient. In other words, deny and refuse violate the broad-range semantic constraint. It appears that the NNSs have not learned these verbs individually but have excluded them based on their meaning. Thus, just as learning a new ditransitive can be “motivated” by a similarity in meaning to a prototypical member of a category, perhaps learning can be inhibited by a dissimilarity of meaning.

Discussion: Prototype Schemas, Connectionist Networks, and Variable Rules

In this chapter, I have argued that prototype schemas are compatible with variation in speech and grammaticality judgments. This claim is reasonable at the conceptual level because, unlike generative formalisms, prototype schemas are non-deterministic. I have also pointed out that in some cases even the formalisms used to represent prototype schemas and variable rules are similar. In the discussion of Elliott’s (1995) study in chapter 6, I showed that prototype schemas like (8)a can be written using variable rule notation, as in (8)b. Let us now see if this is possible for another prototype schema we have discussed. Bybee and Moder’s (1983) prototype schema for class II irregular verbs was shown in (5), which is repeated here.

(5)  /s/ C (C) /ʌ/ /ŋ/

Recall that (5) says the prototype form for the past tense of a class II irregular verb has an initial /s/ followed by a consonant, optionally followed by another consonant, followed by the vowel /ʌ/, followed by a velar nasal. It would be interesting to rank the features of (5) to show which of them most favored a candidate form like strug being analyzed as the past tense of a class II irregular verb by Bybee and Moder’s (1983) subjects. In fact, Bybee and Moder (1983) were able to discover these features by varying the initial and final consonants of the cue words in their experiment. In variationist terms, they were able to discover the ordering of the constraints. They presented this information in tables, but it could be included in the prototype schema by using the conventions for variable rules, as shown in (33).

(33)  γ[</s/>]  C  (C)  /ʌ/  C  α[<+velar>]  β[<+nasal>]

Prototype schema (33) speciﬁes the necessary (though not the suﬃcient) features and the optional features of the prototype past tense form for class II


irregular verbs. The necessary features are a mid-central vowel, a preceding consonant, and a following consonant. A form with only some of these features, such as cut, which has the essential features but none of the optional features, would be far from the prototype. A more prototypical form would have at least some of the optional features, which in (33) are ranked in order of their importance using the Greek letter notation discussed in chapter 1. A past tense form in which the alpha and gamma features are present (such as strug) is closer to the prototype than a form in which the beta and gamma features are present (such as strum). Thus, the prototype schema for the past form of class II verbs can be written using variable rule notation and used to model knowledge of linguistic forms. Let us now consider the implications of Bybee and Moder’s (1983) experiment for language acquisition. As discussed previously, these authors suggest that in language learning, variation in the production of irregular verbs could occur when learners are confronted with the present tense of a new verb that somewhat resembles the prototype, for example slink. When called upon to produce the past tense of this verb, learners would go through the same mental process as Bybee and Moder’s (1983) subjects. First, they would consult the lexical representation of slink to see if an irregular past form had been stored. If no form was stored, they would compute the possible form slunk and compare it to prototype schema (33). But, because slunk only loosely matches the prototype, sometimes slunk would be accepted and sometimes the regular past tense alternative slinked would be accepted. As acquisition proceeds, the learner will encounter the correct past tense of slink, and store it with the lexical entry. According to Bybee and Moder (1983), even after language acquisition is complete, the prototype schema will remain as a backup device, which can still be used if novel verbs are encountered.
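The ranking logic of schema (33) can be illustrated with a toy scoring function. This is a sketch under stated assumptions, not Bybee and Moder's procedure: only the ordering alpha > beta > gamma is taken from the text, the numeric weights are invented, the features of each form are coded by hand rather than derived from a transcription, and gamma is read here as the initial /s/.

```python
from dataclasses import dataclass

# Only the ordering alpha > beta > gamma comes from the text; the numbers are invented.
WEIGHTS = {
    "velar_final": 3,  # alpha
    "nasal_final": 2,  # beta
    "initial_s": 1,    # gamma (assumed to be the initial /s/)
}


@dataclass(frozen=True)
class Form:
    spelling: str
    has_core: bool            # consonant + mid-central vowel + consonant
    initial_s: bool = False
    velar_final: bool = False
    nasal_final: bool = False


def closeness(form):
    """Sum the weights of the optional features present; None if the core is missing."""
    if not form.has_core:
        return None
    return sum(weight for feature, weight in WEIGHTS.items() if getattr(form, feature))


STRUNG = Form("strung", True, initial_s=True, velar_final=True, nasal_final=True)
STRUG = Form("strug", True, initial_s=True, velar_final=True)
STRUM = Form("strum", True, initial_s=True, nasal_final=True)
CUT = Form("cut", True)
WALKED = Form("walked", False)

for f in (STRUNG, STRUG, STRUM, CUT, WALKED):
    print(f.spelling, closeness(f))
```

With these invented weights strug scores higher than strum, and cut, which has only the necessary features, scores zero; any weights that respect the alpha > beta > gamma ordering would reproduce the strug over strum ranking described in the text.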
The psycholinguistic process just described of choosing between two internally generated forms, slunk and slinked, is similar to the choice described in Spivey and Tanenhaus’s (1998) study discussed in chapter 6. Recall that the hypothetical learner in that study had to choose whether to tentatively analyze the input string “the actress selected” as part of a reduced relative (RR) or a main clause (MC). Recall also that in Spivey and Tanenhaus’s (1998) model of sentence comprehension this choice was made by a connectionist network, where diﬀerent features of the input were represented by weighted connections between nodes. The choice of RR or MC was never certain, but diﬀerent combinations of input features favored one choice or the other. This situation is exactly the same as when a learner must choose between slunk and slinked, and, as the reader may have guessed, the prototype schema in (33) could be converted into a connectionist network, as shown in ﬁgure 7.4. Figure 7.4 represents a network that makes a guess as to whether a candidate verb form should be produced as a legitimate class II past tense or whether it should be rejected and the regular past tense rule employed instead. The


Figure 7.4 A possible representation of prototype schema (33) as a connectionist network.

network is intended to be parallel to the network constructed by Spivey and Tanenhaus (1998) shown in ﬁgure 6.5. Now, let us look at the CL/connectionist story of regular past tense learning by children. The traditional account of this process is as follows. At ﬁrst, children learn the -ed forms of verbs as memorized chunks, just as they learn the irregular forms. At some point, they notice that many verbs come in two diﬀerent versions: walk–walked, push–pushed, etc., and that the -ed version is used to describe past events. Many children then go through a stage of adding -ed to all verbs, producing forms like goed instead of the previously learned went. Eventually, the two types of verbs are sorted out, and regular verbs are produced by the rule and irregular verbs are produced from memory. However, Pinker (1989) notes that this story skips over two important questions. The ﬁrst question is how the child knows “to look out for ‘present–past’ instead of ‘hot–cold,’ ‘indoor–outdoor,’ ‘good mood–bad mood,’ and hundreds of other interesting distinctions?” (p. 193). The second question is “how a child deduces that the rule is obligatory” (p. 193). Pinker’s answer to the ﬁrst question is that children are hard-wired to look for certain linguistic distinctions, such as the diﬀerence between the past and nonpast (but not “indoor–outdoor,” etc.), and are also innately programmed to correlate these distinctions with “minor diﬀerences in words, such as walk and walked” (1999, p. 210). The CL/connectionist account is similar. Although CL denies the existence of UG, it does acknowledge innate mechanisms of perceiving and processing information. For example, as discussed in chapter 1 and the appendix, the color category systems in languages expand over time in predictable ways because of the hard-wiring of the human perceptual system. 
Furthermore, the connectionist programs that have been written to mimic children’s learning of the past tense build in information about how to detect the diﬀerences in past and present forms and how to correlate these diﬀerences with past and present time. This is tantamount to assuming that children


innately understand the past–nonpast distinction and correlate this distinction with minor differences in words. Thus, the built-in programming in connectionist networks is very much like Pinker’s (1999) claim about hard-wiring in children’s brains. We have already seen the CL/connectionist answer to Pinker’s second question of how a child can deduce that a rule is obligatory. In chapter 6 we saw that the Aphasia Model learned a categorical rule using the method of back-propagation. When the desired connections between the input layer and output layer were invariably reinforced, the weightings of those connections became so strong that probabilistic output ceased, except that which was comparable to “noise in the system,” or, in generative grammar terms, “performance error.” The human counterpart of this kind of learning was observed in Hudson Kam and Newport’s (2005) research, which was reviewed at the beginning of chapter 6. That study showed that even adults can learn the frequencies at which rules occur in input and that if a rule occurs in input 100 percent of the time, it will be produced categorically. In fact, that study found that categorical learning results when input overwhelmingly favors a rule but does not quite reach 100 percent. To summarize, in this chapter we have seen that conceptually CL is compatible with connectionist theory, and that both CL and connectionism are compatible with Variation Theory. I have therefore suggested that CL is a promising grammatical framework in which to discuss variable linguistic phenomena.
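The point about invariable reinforcement producing near-categorical output can be illustrated with a toy model. The sketch below is not the Aphasia Model and does not implement back-propagation through hidden layers; it trains a single sigmoid unit by gradient descent, and the function name `train`, the learning rate, and the number of epochs are all assumptions chosen for illustration.

```python
import math


def train(p_rule, epochs=2000, lr=0.5):
    """Train one sigmoid unit on input in which the rule applies with proportion p_rule.

    Returns the unit's output probability of applying the rule after training.
    """
    w = 0.0  # single connection weight (bias-like); starts at chance (0.5)
    for _ in range(epochs):
        out = 1.0 / (1.0 + math.exp(-w))
        # Expected gradient of the cross-entropy loss for this unit is (out - p_rule).
        w -= lr * (out - p_rule)
    return 1.0 / (1.0 + math.exp(-w))


print(round(train(1.0), 3))  # rule always reinforced
print(round(train(0.6), 3))  # rule reinforced 60 percent of the time
```

When the rule is reinforced on every trial, the weight keeps growing and the output approaches 1, so residual variation shrinks toward the level of noise. A unit this simple merely matches the input proportion otherwise, so it does not capture the finding that learners regularize input that falls somewhat short of 100 percent.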

IV Variation in Pedagogical Perspective

8

Speaking Style and Monitoring

Monitoring: Attention Paid to Speech

In chapter 4, we noted that early studies of variation in second language acquisition were psychologically oriented but that more recent studies have looked at the social dimensions of SLA. The same is true in regard to the study of style. In this chapter we will first look at some studies that considered speaking style mainly in psychological terms in both native and nonnative speech. This strand of research has had a profound effect on the field of SLA because it includes Stephen Krashen’s influential monitor model. But before Krashen began talking about monitoring, the term was used by William Labov to explain different speaking styles in native speaker speech. Labov’s notion of monitoring compares to Krashen’s notion in interesting ways. After the discussion of monitoring, we will review some more recent studies, which look at the social dimension of style. Finally, we will explore the relationship between the psychologically based notion of monitoring and the socially based notion of speech as an expression of identity.

Labov’s Account of Style and Monitoring

In his study of New York City speech, Labov (1972a) proposed that there are no single-style speakers: everyone alters the way they speak based on a number of different factors in the speaking situation. One of these factors is the topic under discussion. As we saw in chapter 2, topics like “language” and “soapbox” elicit a more formal style of speech, with higher percentages of G, than topics like “kids” and “narrative.” Thus, in Labov’s (1972a) framework speaking styles vary along a continuum of prestige. Styles at the lower end of the continuum contain high percentages of informal or stigmatized features, and styles at the upper end of the continuum contain lower percentages of these features. Labov’s (1972a) definition of style was directly connected to his method of eliciting the different styles.
The principal method was the sociolinguistic interview, which was described in chapter 2. Two styles were distinguished within the interview proper, formal and informal. Additional styles were elicited by asking informants to read different kinds of texts, including reading style, word list style, and minimal pair style. Thus, for Labov style is defined by the context in which it is elicited. Labov (1972a) also showed that style is related to a speaker’s social class.


Working-class speakers use higher percentages of informal or stigmatized features in all speaking styles than middle-class speakers, who use higher percentages of these features than upper-class speakers. One of the variable features that Labov (1972a) looked at was postvocalic r, which New Yorkers often delete, so that guard can be pronounced [goəd] and fourth ﬂoor can be pronounced [foəð ﬂoə]. The relationship between /r/ deletion, speaking style and social class can be seen in ﬁgure 8.1. Consider the style of the highest social class (the top solid line). These speakers produced the lowest percentage of /r/ in the casual style of the sociolinguistic interview, a higher percentage of /r/ on the reading task, and the highest percentage of /r/ on the minimal pairs task. This pattern is shown in the speech of all of the social classes, a fact which implies that notions of correctness and prestigious speech are embedded within the entire speech community and aﬀect all social classes in the same way. Labov (1972a) explained the patterning shown in ﬁgure 8.1 by invoking a construct from psychology: monitoring, or attention paid to speech. He claimed that in informal speech speakers pay attention to the substance of what they are saying and not to the way they sound. Under these circumstances their basic, or vernacular, style emerges. On the other hand, when reading, especially when reading a list of words, speakers

Figure 8.1 Class stratiﬁcation of /r/ in guard, car, beer, beard, etc. for native New York City adults (based on Labov 1972a, p. 114).


are able to pay more attention to how they sound, and they adjust their pronunciation in the direction of the prestige norm. Notice that this explanation of the relationship between prestige forms, social class, and monitoring assumes that the informants were trying to use the prestige variant. Presumably, their motive for doing so was to sound like educated members of a higher social class. This explanation seems particularly appropriate in regard to the hypercorrect pattern of the second-highest class. Notice in figure 8.1 that in word list style, which allows a high degree of monitoring, these speakers produced higher frequencies of /r/ than the upper-class speakers. To summarize, Labov (1972a) claimed that speaking styles can be arranged along a continuum of prestige that embodies the judgments of correctness shared by members of a speech community. In the speaking situation used in Labov’s (1966) research, namely the sociolinguistic interview and associated reading tasks, all members of a speech community changed the way they spoke to avoid the less prestigious features that are associated with vernacular speech. These facts are summarized by Rickford and Eckert (2001) as follows: “The speaker’s stylistic activity . . . [is] directly connected to the speaker’s place in, and strategies with respect to, the socio-economic hierarchy” (p. 2).

Krashen’s Monitor Model

Labov’s (1972a) claim that attention to speech results in a shift toward more prestigious variants is related to the influential theory of second language acquisition proposed by Stephen Krashen (1978, 1982, 1985, 1987) called monitor theory. Krashen also found that some features of an informant’s speech (in this case the informant was a second language learner) varied with the elicitation task. Unlike Labov, Krashen did not investigate variation among semantically equivalent forms like G and N.
Rather, he looked at whether learners produced particular morphemes, like plural -s, regular past tense -ed, and progressive -ing (in either form) in required contexts. In other words, he investigated the vertical continuum of language acquisition (see chapter 4). Several studies (e.g., Dulay and Burt, 1974; Bailey, Madden, and Krashen, 1974) had found that these morphemes could be ordered according to how accurately they are used by English learners, but that the accuracy order was diﬀerent on diﬀerent elicitation tasks. For example, on a speaking task progressive -ing was ranked ﬁrst in accuracy (that is, it was supplied most frequently in required contexts), but on a grammar test where the learners had to ﬁll in the blanks in a story, progressive -ing was ranked signiﬁcantly lower (Larsen-Freeman, 1975). Thus, both Labov and Krashen found that speakers vary their speech according to the elicitation task and that they can produce more “correct” speech on tasks that allow them to pay more attention to the form of their language. Both scholars called this attention to form “monitoring.” In these respects their ideas are similar, but upon closer inspection Krashen’s notion of monitoring turns out to be somewhat diﬀerent from Labov’s.


Krashen’s monitor model attempts to explain why diﬀerent elicitation tasks produce diﬀerent morpheme accuracy orders in the output of second language learners. Its basic claim is that second language production involves using one or both of two separate psycholinguistic systems, as shown in ﬁgure 8.2. Notice that the monitor model is a lot sketchier than the production model shown in ﬁgure 6.2. The ﬁrst system is the Language Acquisition Device (LAD), which is similar to the LAD proposed by Chomsky (1965) to explain language acquisition by children. The LAD constructs a mental grammar of the target language by analyzing input, a process that is automatic and unconscious. (This mental grammar is invoked at level 3 in the language production model in ﬁgure 6.2.) The second psycholinguistic system is the monitor. The monitor is a general learning device, not dedicated to language. Using the monitor, adults can learn about language in the same way that they learn about physics and history: by consciously learning facts and the connections between them. An example is memorizing the rule that regular English plurals end in -s. (Conscious knowledge is not represented in the production model in ﬁgure 6.2.) The monitor model works in the following way. The model assumes that an intended message has been already formed, and its processing begins at the stage when the unconscious mental grammar in the LAD is accessed. If the speaker is not focusing on form, the interlanguage grammar contained in the LAD will govern production. If this grammar contains a rule for plural -s, that morpheme will be attached to a plural noun (this path is indicated in ﬁgure 8.2 by the dotted line). However, if the speaker is focusing on form, the LAD output can be modiﬁed by the conscious rule in the monitor, as indicated by the solid line in ﬁgure 8.2. If the output from the LAD does not include a required plural morpheme, -s can be added by the monitor. 
However, it is important to note that the monitor can also delete -s from a plural noun if the speaker has not consciously learned the plural rule correctly (see below). Monitor theory also claims that grammatical knowledge in the LAD and grammatical knowledge in the monitor are internalized in entirely diﬀerent ways. The unconscious process involving the LAD is called acquisition, and the conscious process involving the monitor is called learning. Monitor theory claims that consciously known rules (the rules in the monitor) are of very limited use. Successful monitoring can occur only if three conditions are

Figure 8.2 The monitor model.


met. First, the speaker must be attending to form; second, the speaker must have sufficient processing time; and third, the speaker must accurately know the rule for producing the correct form. In later modifications of the model, Krashen (1982, 1987) added a proviso to this last condition: the consciously known rule must be simple and easy to apply, a rule of thumb. If the speaker does not accurately know an easy-to-apply rule, monitoring can actually reduce accuracy. The monitor model has been criticized by many. Early criticisms (Spolsky, 1985, p. 274; Adamson, 1988, p. 82) focused on the fact that Krashen conceived of monitoring as an entirely conscious process. But the notion of attention to speech seems to be broader than that. As Smith (1982) noted, “Attention sometimes implies conscious knowledge and sometimes not . . . Attention simply means a kind of orientation, concentration, or focus” (p. 43). In response to such criticisms, Krashen (1982) modified the monitor model to include the possibility of unconscious monitoring, based on a “feel for correctness.” In the revised model Monitoring, with a capital M, represented conscious rule application, as described above, and monitoring, with a small m, represented the unconscious “feel for correctness” process. To illustrate this new theory, a small m monitoring box was attached to the right of the LAD box in figure 8.2. (The distinction between Monitoring and monitoring has not caught on, and the typography is confusing. Hereafter, I will use the word “monitoring” to refer only to conscious monitoring unless otherwise stated.) This change brought monitor theory more into line with Labov’s (1966) notion of monitoring, which implies that there is a continuum between conscious and unconscious monitoring. In reading a list of words, a speaker can consciously apply a learned rule of pronunciation, but in speaking there is not sufficient time, so the speaker must rely on an unconscious “feel for correctness.”
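As a rough illustration of the production path described above, the following Python sketch models conscious (capital M) monitoring of plural -s. It is a toy under stated assumptions, not Krashen’s own formalization: the names `Learner` and `produce_plural` and the rule labels "add_s" and "omit_s" are hypothetical, and the conditions for monitor use are reduced to simple true/false values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Learner:
    # Hypothetical fields, introduced only for this illustration.
    acquired_plural_rule: bool          # does the unconscious (LAD) grammar mark plurals?
    learned_rule: Optional[str] = None  # conscious rule: "add_s", "omit_s" (mislearned), or None


def produce_plural(noun: str, learner: Learner,
                   attending_to_form: bool, enough_time: bool) -> str:
    """Return the form a learner produces for a plural noun."""
    # Step 1: the acquired grammar in the LAD generates the initial output.
    form = noun + "s" if learner.acquired_plural_rule else noun

    # Step 2: the monitor can edit that output only when the speaker is
    # attending to form, has enough processing time, and has a conscious rule.
    if attending_to_form and enough_time and learner.learned_rule is not None:
        if learner.learned_rule == "add_s":
            form = noun + "s"   # accurately learned rule supplies a missing -s
        elif learner.learned_rule == "omit_s":
            form = noun         # inaccurately learned rule can delete a correct -s
    return form


# A learner who has memorized the plural rule but has not yet acquired it.
classroom_learner = Learner(acquired_plural_rule=False, learned_rule="add_s")
careful = produce_plural("book", classroom_learner, attending_to_form=True, enough_time=True)
casual = produce_plural("book", classroom_learner, attending_to_form=False, enough_time=True)
```

In this sketch `careful` comes out as "books" and `casual` as "book", which mirrors the claim that learned rules affect output only when the conditions for monitoring are met. A learner with an acquired plural rule but a mislearned conscious rule would lose the -s under monitoring, which is the case where monitoring reduces accuracy.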
A second criticism of monitor theory was directed against Krashen’s (1978) claim that conscious knowledge can never become unconscious knowledge. This claim seemed even more unlikely after the monitor theory was modiﬁed to include a capacity for unconscious monitoring. In the language processing theory proposed by Anderson (1980, 1983) and adapted to second language acquisition by McLaughlin (1980, 1987) and others, it is axiomatic that conscious knowledge becomes automatized and turned into unconscious knowledge. Krashen oﬀered no convincing evidence why this process could not happen in second language acquisition. Monitoring in interlanguage was explored by a number of researchers in the 1970s and 1980s, who looked only at vertical variation (the alternation between a correct and an incorrect form), and who assumed that monitoring in a second language was similar to monitoring in a ﬁrst language because in both cases it produced more prestigious forms. We will now examine some of these studies.


Studies of Monitoring in Interlanguage Dickerson (1974) was the ﬁrst SLA researcher to suggest that speakers might monitor in their second language in the same way that they monitor in their ﬁrst language. She studied the pronunciation of /r/ by ten Japanese students studying ESL at an American university. She found that her subjects produced /r/ with 100 percent accuracy when reading word lists but with only 50 percent accuracy in free conversation. So, Dickerson (1974) claimed that attention to form resulted in the more target-like production. Tarone (1979, 1982, 1985, 1988) looked at monitoring in morphology and syntax. Recall that Labov claimed that native speakers have a vernacular style that underlies speech production, and that they modify this style appropriately to ﬁt the diﬀerent contexts of speaking, as shown in ﬁgure 8.1. Labov also claimed that in language change it is the vernacular style that is most susceptible to new forms. This notion was mentioned in chapter 3 as Labov’s (2001a, p. 437) fourth principle of transition: “Linguistic changes from below develop ﬁrst in spontaneous speech at the most informal level.” Similarly, Tarone (1982) claimed that interlanguage speakers have a vernacular style (we might think of it as the most automatic style at a particular stage of acquisition), which is not native-like, but which can be modiﬁed to include more native-like variants in circumstances that allow attention to speech. This position was similar to Krashen’s notion of small m monitoring. On the basis of further research, however, Tarone (1985, 1988) modiﬁed her position. To see why, consider the ﬁndings of her 1985 study. She collected data from ten speakers of Arabic and ten speakers of Japanese, in three elicitation contexts that diﬀered in the degree of monitoring they allowed. 
These were: (1) a multiple choice grammar test; (2) an interview focusing on the informants’ ﬁeld of study and academic plans; and (3) a spoken narrative in which the subjects recounted a series of events shown on a video screen. It seems reasonable that the grammar test would be the best context for accurate monitoring. Next best would be the interview, where the informants were describing familiar notions and could use vocabulary and sentence patterns of their own choosing. Retelling the narrative would seem to be the worst context for accurate monitoring because the informants may have had to use vocabulary that they were not familiar with and to frame events in unexpected ways. Tarone (1985) looked at the subjects’ production of four forms: third person -s, plural -s, articles, and presence or absence of the third person pronoun it (as in sentences like “I won’t know what is in the package until I receive it”). The results of the study were, in Tarone’s (1985) words, “Surprisingly complex” (p. 98). Because the Japanese speakers showed less variation than the Arabic speakers, we will consider only the Arabic speakers’ results, though the results for the Japanese speakers were generally compatible. The results for the Arabic speakers appear in table 8.1, which shows that only in the case of


Table 8.1 Style shifting on four target language forms in narrative style, interview style, and grammar test style by Arabic speakers. Numbers represent percent of correct usage.

Morpheme              Elicitation task              Result of monitoring
                      Test   Interview   Narr.
3rd person -s          67       51         39       Improves accuracy
Noun plural            70       83         71       No change
Article                58       85         91       Decreases accuracy
Object pronoun it      77       92        100       Decreases accuracy

third person -s did monitoring improve accuracy. Increased monitoring did not affect the accuracy of plural marking, and it decreased the accuracy of article and object pronoun it use. Tarone (1988) explained this pattern by suggesting that the grammatical morphemes she looked at fall into two classes according to the role they play in discourse, and that the two classes are affected differently by monitoring. Third person singular -s is different from the other morphemes because it is redundant. Therefore, in casual speaking or recounting a narrative, speakers can ignore this form and still communicate effectively. Tarone suggests that this is what her subjects did on the speaking tasks. On the grammar task, however, discourse coherence was not at stake, so the informants could focus their cognitive resources on even the redundant third person -s. But articles, object pronoun it, and plurals are not redundant in discourse; rather, they are important for conveying an intended meaning and preserving cohesion. Therefore, speakers must focus their cognitive resources on getting these morphemes right on the speaking tasks, even at the expense of accuracy with redundant forms. Both Adamson (1988, p. 83) and Preston (1989, p. 259) have a somewhat different interpretation of Tarone’s (1985) data. Recall that one of Krashen’s (1987) conditions for successful monitoring was that a rule must be easily stated and remembered, that is, a rule of thumb. The rule for using third person -s fits this description, and, in addition, subject–verb agreement is a major focus of grammar instruction in ESL programs. Therefore, Krashen might predict that monitor use would be helpful here. The rules for article use, on the other hand, are far from simple, and so monitoring these forms should result in less accurate production. Plural -s is a monitorable form, and Tarone’s (1985) subjects used it more accurately in the interview but less accurately on the test.
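The “Result of monitoring” labels in table 8.1 can be reproduced with a few lines of Python. This is only a sketch: the function name `monitoring_effect` and the 5-point `tolerance` for “No change” are assumptions made for illustration, not part of Tarone’s analysis, and the comparison uses only the most monitored task (the test) and the least monitored task (the narrative).

```python
# Percent correct for the Arabic speakers, copied from table 8.1.
accuracy = {
    "3rd person -s":     {"test": 67, "interview": 51, "narrative": 39},
    "Noun plural":       {"test": 70, "interview": 83, "narrative": 71},
    "Article":           {"test": 58, "interview": 85, "narrative": 91},
    "Object pronoun it": {"test": 77, "interview": 92, "narrative": 100},
}


def monitoring_effect(scores, tolerance=5):
    """Label the change from the least monitored task to the most monitored one.

    `tolerance` (in percentage points) is a hypothetical threshold below
    which a difference is treated as "No change".
    """
    change = scores["test"] - scores["narrative"]
    if change > tolerance:
        return "Improves accuracy"
    if change < -tolerance:
        return "Decreases accuracy"
    return "No change"


effects = {morpheme: monitoring_effect(scores) for morpheme, scores in accuracy.items()}
```

Because the sketch ignores the interview column, it labels noun plural “No change” even though accuracy rose to 83 percent in the interview.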
Therefore, the data regarding this form are inconclusive. The most interesting case is object pronoun it. According to Preston (1989, p. 259), the rules for agreement between this form and its antecedent are subtle and cannot be


stated easily. Therefore, monitoring for object pronoun it, like monitoring for articles, might well result in a decrease in accuracy according to monitor theory.

Accommodation, Audience Design, and Self-identification

Accommodation Theory

Since Labov’s classic studies of style shifting, some researchers have taken a very different approach to the study of speaking style. The social psychologist Howard Giles and his colleagues (Giles and Powesland, 1975; Giles, 1984; Giles, Coupland, and Coupland, 1991) proposed that one source of variation is that people adjust their speech in relation to the speech of their interlocutors. Giles called this adjustment accommodation, of which there are two kinds: convergence and divergence. Convergence occurs when speakers alter their speech to be more like that of their interlocutors because they want to establish solidarity with or elicit approval from their audience. An example of convergence is found in Labov’s (1972a) famous department store study, in which working-class department store employees altered their production of post-vocalic /r/ to sound more like their customers. Labov (1972a) sampled the speech of employees at three New York City department stores: Klein’s, Macy’s, and Saks, which cater to customers from the working class, middle class, and upper class, respectively. His research design was ingenious. He hid a tape recorder in a shoulder bag, approached an employee, and asked where to find an item that he already knew was on the fourth floor. The verbal interaction might go like this: “Where are the men’s wallets?” “They’re on the fourth floor.” “Where?” “Fourth floor.” Thus, Labov usually collected four tokens of /r/ words from each interaction. Because the store employees were working-class people, we can assume that their vernacular style would contain very little post-vocalic /r/, as shown in figure 8.1, and therefore that the chances of recording an /r/ as one of the four tokens would be slim.
This proved to be the case at Klein’s, where only 15 percent of the employees Labov approached produced an /r/. However, at Macy’s 55 percent of the employees produced an /r/, and at Saks the ﬁgure was 65 percent. These results suggest that the Macy’s and Saks employees were adjusting their /r/ production in order to converge with the speech of their middle- and upper-class customers. The department store study also illustrates the important point that when speakers do not have the opportunity to sample the speech of their interlocutors, they accommodate to their own mental image of that speech. Because Labov did not engage in conversation with the


employees, they could not get a good idea of how he spoke, so they addressed him in the speaking style of a typical customer. Divergence occurs when speakers alter their speech to be diﬀerent from that of their interlocutor because they want to increase the social distance between them. An example of divergence is provided by Bourhis and Giles (1977), who asked a group of ethnically Welsh English speakers to respond to an English speaker with a Received Pronunciation accent. They found that when the content of the conversation focused on Welsh–English diﬀerences, thereby potentially threatening the Welsh identity of the informants, these speakers diverged from the speech of the Englishman, emphasizing the Welsh features of their own variety of English. Audience Design and Self-identification Accommodation theory comes from the ﬁeld of social psychology, and it includes an elaborate account of speakers’ motivations and personal relationships, which is not directly relevant to mainstream sociolinguistic concerns. Alan Bell (1984, 2001) built upon the insights from accommodation theory to develop a more sociolinguistically centered theory called audience design, which we will now consider. Audience Design The basic mechanism of audience design theory is the same as that of accommodation theory, namely that “Speakers design their style primarily for and in response to their audience” (Bell 1984, p. 143). Bell (1977) presented evidence for this claim by analyzing the speech of four radio announcers, all of whom worked simultaneously for two New Zealand stations broadcasting out of the same studio. One of the stations was the National Public Radio station, which catered to an upper-class audience, and the other was a local station, which catered to a working-class audience. Bell (1977) looked at how the announcers pronounced intervocalic /t/. 
In American English intervocalic /t/ is almost categorically pronounced as an alveolar voiced ﬂap (making writer sound like rider), but in New Zealand English intervocalic /t/ is a true sociolinguistic variable, which stratiﬁes according to social class. Bell noticed that the broadcasters were voicing considerably fewer intervocalic /t/s while broadcasting on the National Public Radio station than on the local station, shifting an average of 20 percent between the two contexts. Because this style shift occurred in the speech of the same individuals speaking in the same physical location, he concluded that the only reason for it was the change in audience; that is, the announcers were converging with the speech of their listeners, producing more voiceless /t/s when addressing the upper-class audience and more ﬂapped /t/s when addressing the working-class audience. As discussed, Labov (1966) had also found a close relationship between speaking style and social class, but Bell’s (1984) audience design


theory describes this relationship more comprehensively than Labov had with the style axiom, which states: Variation on the style dimension within the speech of a single speaker derives from and echoes the variation which exists between speakers on the “social” dimension. (p. 151) Evidence in support of the style axiom can be seen in the study of ing–in’ variation in chapter 2. The social factor measured in that study was gender, not social class (social class was roughly controlled for by the fact that all the informants lived in South Philadelphia, a working-class neighborhood), but gender is one component of the social dimension. Also, in the study in chapter 2, the range of style variation for individuals was not reported, only the range for all speakers lumped together. Nevertheless, it is possible to compare the range of variation according to style for all speakers to the range of variation according to gender. As table 2.2 shows, the Varbrul p values for N (in’) for the factor group style, range from .72 in casual style to .32 in careful style, a diﬀerence of .40. In contrast, the p values for the factor group gender range from .77 for men to .24 for women, a diﬀerence of .53. Thus, the range of variation on the social dimension is indeed larger than the range on the style dimension, as the style axiom predicts. Most studies of language variation, like the study of ing–in’ variation in chapter 2, consider not only the dimensions of style and gender, but also the dimension of linguistic environment. Preston (1991) proposes to account for the relationship between the linguistic dimension and the social dimension with the status axiom, which states that the range of variation on the linguistic dimension is greater than the range of variation on the social dimension. This claim is also supported by the data in table 2.2. 
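The range comparisons behind the style axiom and the status axiom are simple arithmetic on the Varbrul weights, and can be checked with a short Python sketch. The weights below are the extreme values the text reports from table 2.2 (for linguistic environment, 1.00 for future and .13 for prepositions); the names `p_values` and `factor_range` are hypothetical, chosen only for this illustration.

```python
# Highest and lowest Varbrul weights for N (in') in each factor group,
# as reported in the text from table 2.2. Only the extremes are listed,
# since the range depends on nothing else.
p_values = {
    "style": {"casual": 0.72, "careful": 0.32},
    "gender": {"men": 0.77, "women": 0.24},
    "linguistic environment": {"future": 1.00, "prepositions": 0.13},
}


def factor_range(weights):
    """Range of a factor group: highest weight minus lowest weight."""
    return round(max(weights.values()) - min(weights.values()), 2)


ranges = {group: factor_range(weights) for group, weights in p_values.items()}

# Style axiom: social variation (here, gender) exceeds style variation.
style_axiom_holds = ranges["gender"] > ranges["style"]
# Status axiom: linguistic variation exceeds social variation.
status_axiom_holds = ranges["linguistic environment"] > ranges["gender"]
```

With these values the ranges are .40 for style, .53 for gender, and .87 for linguistic environment, so both comparisons come out as the axioms predict. The 1.00 weight for future carries the caveat in note a to table 2.2, so the linguistic range should be read with that note in mind.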
As just mentioned, the range of p values for gender is .53, but the p values for linguistic environment range from 1.00 for future (but see note a to table 2.2) to .13 for prepositions, a range of .87, which, as the status axiom predicts, is greater than .53. The question of how the style axiom applies in nonnative speech will be discussed later in the chapter. Responsive Style Shifting and Initiative Style Shifting Audience design theory posits two main types of style shifting: responsive and initiative, both of which contain subtypes, as shown in ﬁgure 8.3. Responsive style shifting resembles convergence in accommodation theory. In its simplest form it occurs when speakers alter their speaking style to be more like that of their interlocutor. Tracing down the diﬀerent branches of responsive style shifting in the tree diagram in ﬁgure 8.3, we can distinguish the following subtypes. 1. Audience, second person addressee. This kind of style shifting involves accommodating to the speech of an addressee, as was the case of the


Figure 8.3 Bell’s categorization of diﬀerent types of style shifting (from Bell, 2001). Reprinted with permissions.

clerks in Labov’s (1972a) department store study. Recall that, as in that study, speakers sometimes do not have the opportunity to sample their interlocutor’s speech, so they accommodate to their own mental model of how that person would speak. 2. Audience, third persons present. This kind of shifting typically occurs during a conversation within earshot of a third person, or auditor. For example, a teenager might address a friend in a more formal way if an adult were present. 3. Non-audience, setting/topic. Recall from chapter 2 that a number of diﬀerent topics can be distinguished within the sociolinguistic interview, including “language,” “soapbox,” and “kids.” Labov (1972a) claimed that speakers tend to monitor more for some of these topics (such as “language”) than for others (such as “kids”). Bell (1984) agrees that diﬀerent topics can elicit diﬀerent styles, but he claims that monitoring is not involved. Rather, he says (Bell, 2002, p. 146) that diﬀerent topics (as well as diﬀerent physical settings, like a school or a playground) are associated with particular styles of speaking. Therefore, switching topics or settings is a bit like switching audiences because speakers recall the audiences that are typically addressed in discussing these topics or that are typically present in these settings. As Bell (2002, p. 293) notes, this claim is reminiscent of Bakhtin’s (1981, p. 293) idea that “all words have the ‘taste’ of a profession, a genre, . . . a generation . . . Each word tastes of the context and contexts in which it has lived its socially charged life” (quoted in Bell, 2002, p. 143). This claim is also, of course, reminiscent of associationist psychology, which was popular at the time Bakhtin did his work. So, according to audience design theory, we can understand

Labov’s (1972a) informants’ shift to informal speech when discussing kids as their recollection of the “taste” of the language associated with this topic and setting, rather than their becoming less conscious of their speech and therefore reverting to their natural, vernacular style.

The second main type of audience design shown in figure 8.3 is initiative design, of which the most important type (and the only type we will discuss) is referee design. In referee design, speakers do not shift to match the speech of their audience but rather shift in the direction of an absent group of speakers, or referees. Two kinds of referee design are distinguished.

1. Ingroup design occurs when speakers shift in the direction of their own social group while addressing an interlocutor who is not a member of that group in order to emphasize their own identity. This kind of shifting is similar to divergence in accommodation theory.

2. Outgroup design occurs when speakers adopt a style that emphasizes features of a group with whom they wish to identify. This kind of shift can be short term, or temporary, or can be long term, resulting in a change in basic speaking style, as speakers attempt to establish themselves as members of an outside social group.

An early example of outgroup design is provided by Labov’s (1972a) study of vowels on Martha’s Vineyard. The island of Martha’s Vineyard, which is discussed in chapter 1, provides a fascinating laboratory for the study of language change because the population of the island changes every summer when swarms of tourists arrive from the mainland, bringing with them their mainland accents. The year-long residents of the Vineyard, especially those who live on the western part of the island, speak a Yankee dialect that contains relics of an earlier variety of English. Features of this variety include the pronunciation of post-vocalic /r/ (Boston English is r-less) and centralization of the vowels in the diphthongs /aw/ and /ay/, so that “about the house” is pronounced [əbəwt] the [həws]. Labov studied the pronunciation of these diphthongs in the speech of young islanders.
He found that high school students who planned to leave the island for careers on the mainland showed virtually no centralization in reading style, but students who were planning to stay on the island showed a high degree of centralization. Labov (1972a) suggested that vowel centralization has become a marker of identity as a traditional islander, and that those who wish to adopt this identity increase this feature in their speech. He notes, “When a man says [rəyt] or [həws], he is unconsciously establishing the fact that he belongs to the island: that he is one of the natives to whom the island really belongs” (p. 36).


Since Bell’s early work, sociolinguists have placed more emphasis on referee design as a motivation for style shifting because it is a way for speakers to aﬃrm their own social identity or to construct a new one. One might ask how it is possible to determine whether an instance of style shifting is responsive or initiative. Did the New Zealand radio announcers devoice their /t/s on Public Radio because they had gauged the speech of their audience and wished to accommodate to it (responsive, audience design), or was it because they wished to establish a personal identity as members of the upper class (initiative, referee, outgroup design)? Bell (2001) admits that this question can be diﬃcult, and suggests that sociolinguists must engage in a qualitative study of speakers’ social circumstances and identities in order to answer it. Studies of Style Shifting in Interlanguage Perhaps the most carefully designed study of audience design in interlanguage was conducted by Young (1991), who looked at how accurately adult Chinese speakers supplied the plural morpheme -s in obligatory contexts. Young found that a number of factors, including the linguistic environment in which -s appeared, inﬂuenced the variation, but here we will look only at the eﬀect of audience. Young (1991) arranged to interview each informant twice, once by a native English speaker and once by a native Chinese speaker. Young’s original hypothesis was that his informants would accommodate to the speech of their interlocutors according to shared ethnicity. Thus, he predicted that in a Varbrul analysis the factor native English speaker would favor plural marking and the factor nonnative English speaker would disfavor plural marking. Young found, however, that the identity of the interviewer made no signiﬁcant diﬀerence in plural marking. In a second analysis of the data, Young (1989) hypothesized that factors other than ethnicity might aﬀect the informants’ degree of accommodation. 
Therefore, he constructed a “convergence index,” which took into account not only the ethnicity of the informant and interviewer but also their age and sex. Thus, when an informant and an interviewer were of the same sex, ethnicity, and approximate age, the convergence index was high, but when these factors did not match, the convergence index was low. Young (1989) also divided his informants into two English proficiency groups, high and low. This time the Varbrul analysis did produce some significant results, as shown in table 8.2. As table 8.2 shows, only the high proficiency speakers contribute significantly to the variation. This fact suggests that speakers must gain a minimal proficiency in the use of a target language structure before socially conditioned variation appears in their speech. The high proficiency speakers appear to accommodate, but in unexpected ways. When there is high convergence between the informants and their interlocutors, plural nouns are marked more frequently, not less frequently as Young had expected.

Table 8.2 Varbrul p values for plural -s marking by Chinese speakers according to convergence with interlocutors.

                                Low-proficiency    High-proficiency
                                Informants         Informants
High convergence with an NS     n/s                n/s
High convergence with an NNS    n/s                0.59
Low convergence with an NNS     n/s                n/a
Low convergence with an NS      n/s                0.26

n/s = not significant; n/a = insufficient data

A possible explanation for this result is that the Chinese-speaking interviewers marked plural nouns
more accurately than the informants (we are not given information about the relative proﬁciency of the interviewers), and so the informants were, in fact, accommodating to the speech of their interlocutors. Within the audience design framework in ﬁgure 8.3, this type of style shifting appears to be: “responsive, audience, second person addressee” (but recall Bell’s [2001] warning that ethnographic research is required to ﬁrmly identify the motive for a style shift). The second result apparent in table 8.2 is that the high proﬁciency informants diverged from the speech of the NS interviewers with whom they had low convergence. Audience design theory can explain this result by the observation that the informants may have felt somewhat threatened by the test-like interview, conducted by a native English speaker who diﬀered from them in age and sex, and therefore they wished to distance themselves from their interviewers. This type of style shifting would ﬁt in ﬁgure 8.3 as “initiative, outgroup, temporary.” Young’s (1989) reanalysis of the data shows that for his informants, in the context of this study, ethnic identity is not the only factor in whether high proﬁciency speakers converge or diverge from the speech of their interlocutors, but that age and gender are important as well. Adamson and Regan (1991) looked at the pronunciation of -ing in the speech of Vietnamese and Cambodian adults living in Philadelphia and Washington, D.C., and compared it to the pronunciation of native English speakers (the native speaker part of the study is presented in chapter 2). We found that linguistic environment, speaking style, and speaker’s sex constrained -ing pronunciation for both groups. Among the native speakers, both males and females produced the informal variant -in’ more frequently in unmonitored styles, though on the whole males produced much higher percentages of -in’ than females in both monitored and unmonitored styles. 
Among the English learners, the females showed the same basic patterning as the native English-speaking females, where unmonitored style favored -in’ and monitored style favored -ing. Unexpectedly, however, we found that the males used more -in’ in their more monitored style. In audience design theory this kind of style shifting appears to be “initiative, referee, outgroup, long-term”

design. We speculated that this style shift represented our informants’ eﬀort to construct an appropriate male identity in English. Adamson and Regan (1991) also looked at aspects of variation along the vertical continuum. Because of transfer from the native languages of our informants, their initial pronunciation of the -ing morpheme was /iyŋ/. Therefore, movement along the vertical continuum involved acquiring the variant /in/. As noted in chapter 2, Houston (1985) and Labov (2001a) had found that for native speakers the relative frequency of -ing and -in’ was conditioned by grammatical category. For the historical reasons discussed in chapter 2, the nominal categories of noun, pronoun, and gerund favored -ing, whereas the verbal categories of participle, progressive, and going to future favored -in’. Adamson and Regan (1991) found that, in general, nominals and verbals constrained ing–in’ variation in the learners’ speech in the same way. However, there were two exceptions: the pronouns something and nothing and the preposition during, which showed higher frequencies of -in’ than the more verbal categories of participle and adjective. We speculated that the -in’ pronunciation of pronouns and these words involved lexical learning (memorizing individual words), and that this kind of learning is easier than the learning of a variable rule, which is required for learning appropriate variation in open classes of words, such as progressives and gerunds. In terms of the discussion in chapter 3, the change from -ing to -in’ was proceeding in a way analogous to lexical diﬀusion in regard to the words something, nothing, and during, and was proceeding in a way analogous to regular sound change in regard to progressives and gerunds. Major (2004) studied English learners’ production of four variants that allow alternation between a formal and an informal form. 
These variants were: (1) -ing versus -in’; (2) /n/ assimilation in can (e.g., can go versus [kæŋ] go); (3) palatalization (e.g., got you versus [gačuw]); and (4) deletion of /v/ in of (e.g., can of beans versus can o’ beans). The subjects of the study were 16 native speakers of Japanese, 16 native speakers of Spanish, and 16 native speakers of English, all of whom were enrolled in first-year university composition classes. Each language group was equally divided between males and females. The Spanish speakers had been in the United States an average of 5.3 years, and the Japanese speakers for an average of 1.3 years. The informants were asked to perform two verbal tasks, which, according to Major, elicited two different styles. The first task, which was expected to elicit the more informal style, was to read a complete sentence containing one of the target structures. For example, when reading the sentence “Can Betty go or not?” the informants might say “Ca[m] Betty go or not.” The second task, which was expected to elicit the more formal style, was to read only a phrase containing the target structure, such as “can Betty.” Major believed that when reading only a phrase the informants would be more likely to monitor their speech because they would not have to process the meaning

Table 8.3 Percentage of informal variants for sentence reading style versus phrase reading style for three native language groups. N (the total number of sentences + phrases read by each language group) = 1,793.

Language group    Sentence reading style    Phrase reading style
English           37                        7
Japanese          27                        26
Spanish           25                        13

of a complete proposition. Therefore, they would be more likely to say “ca[n] Betty.”

Major (2004) first compared the sentence-reading style to the phrase-reading style for each group of speakers by calculating the overall percentage of informal variants used by each group when the results for all four phonological variables were combined. He found that the native English speakers showed a significant difference between the two styles, as shown in table 8.3. The native Spanish speakers showed a smaller but still statistically significant difference between the two styles, and the Japanese group showed virtually no difference between the two styles. Major concluded that this result was due to the fact that the Spanish speakers had lived in the U.S. long enough to acquire style stratification, but the Japanese speakers had not.

Major (2004) also looked for gender stratification within the three language groups and found the results shown in table 8.4. As expected, the degree of gender difference for the native English speakers was large, with the males producing significantly more informal forms than the females. There were also parallel gender differences for the Japanese speakers and the Spanish speakers, with the males producing significantly more informal forms than females for both of these groups. From these results, Major (2004) concluded that his Spanish-speaking and Japanese-speaking informants had partially internalized the community-based gender norms for the three variables, although their degree of gender stratification did not exactly match that of the native speakers.

Table 8.4 Percentage of informal variants for males and females for three native language groups. N (the total number of sentences + phrases read by each language group) = 1,793.

Language group    Males    Females
English           28       15
Japanese          30       22
Spanish           24       13

Major (2004) also noted that although both groups of English learners had to some extent acquired gender stratification, only the
Spanish speakers, who had been immersed in English longer, had acquired style stratiﬁcation. From this fact, he concluded that gender norms appear to be acquired before style norms. This ﬁnding points to the possibility of a “gender axiom” for adult language learners, along the lines of the style and status axioms for native speakers. The gender axiom would state that style stratiﬁcation is acquired after, and is possibly derived from, gender stratiﬁcation. Major’s (2004) study aligns with the studies of native speaker speech in that it showed that in circumstances that allow more attention to form, second language learners produced more correct and more formal speech. However, the studies by Tarone (1988) and Adamson and Regan (1991) show that this is not always the case. A study in which monitoring produced less correct speech was conducted by Thompson and Brown (2003). These researchers looked at the interlanguage of a single informant, a native Spanish speaker who spoke ﬂuent English. She had studied English in Spain in elementary and secondary school but had no functional ability in English when she arrived in the United States at the age of 19. At that time, she married an American (who spoke Spanish but was English dominant) and studied English at an English language institute. After one year, she received a high enough score on the TOEFL (Test of English as a Foreign Language) test to gain admission to an American university. Thompson and Brown’s (2003) study was conducted three years later, when the informant was described as “an advanced speaker of English but with a notable foreign accent” (p. 41). The researchers looked at the informant’s acquisition of the English phoneme /i/, which does not exist in Spanish, so that many Spanish speakers pronounce ship as sheep and hit as heat. As in the Labovian studies, the informant was asked to produce speech in ﬁve styles: narrative, conversation, reading, word list, and minimal pairs. 
When constructing the three reading tasks, the researchers made sure to include only words that the informant was familiar with. A Varbrul analysis was made to determine the eﬀect of the following factors: style, syllable stress (primary or secondary), preceding sound, following sound, part of speech, number of syllables in the word, and whether the word was part of a minimal pair (e.g., ship forms a minimal pair with sheep, but kit is not part of a minimal pair because there is no word keet). Of these possible independent variables, only style and minimal pair proved to be signiﬁcant. It was expected that minimal pair words would be more accurate because more confusion could arise if they were pronounced like their mates. However, the informant pronounced the minimal pair words less accurately than the nonminimal pair words. This result could be explained by a connectionist model of lexical access in the following way. When a minimal pair word is accessed, its mate is partially activated and can aﬀect the choice of the vowel in the target word. When a non-minimal pair word is accessed, words that sound somewhat like it are also partially activated, but the activation will be less strong; therefore,

phonological interference will be less for these words than for minimal pair words. The effect of style was exactly the opposite of that in Major’s (2004) study and in the Labovian studies. The percentage of accurate pronunciation of /i/ was as follows (Thompson and Brown [2003] do not provide the Varbrul p values):

Style              % /i/    No. of tokens
Narrative          96       238
Conversation       97       300
Reading passage    87       70
Word list          76       33
Minimal pairs      42       24
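
The style effect in these figures can be summarized with a little arithmetic. The following Python sketch is an illustration added here and is not part of Thompson and Brown's (2003) analysis. It pools the two speech styles and the three reading styles, weighting each percentage by its number of tokens. The names styles and pooled_accuracy are hypothetical, and because the published percentages are rounded, the pooled figures are only approximate.

```python
# Percent accurate /i/ and token counts per style, as reported for the informant.
styles = [
    ("Narrative", 96, 238),
    ("Conversation", 97, 300),
    ("Reading passage", 87, 70),
    ("Word list", 76, 33),
    ("Minimal pairs", 42, 24),
]


def pooled_accuracy(rows):
    """Token-weighted percentage of accurate /i/ across the given styles."""
    total_tokens = sum(n for _, _, n in rows)
    # Approximate count of accurate tokens: the percentages are rounded.
    accurate = sum(pct / 100 * n for _, pct, n in rows)
    return 100 * accurate / total_tokens


speech = styles[:2]   # narrative and conversation
reading = styles[2:]  # reading passage, word list, minimal pairs

speech_acc = pooled_accuracy(speech)
reading_acc = pooled_accuracy(reading)
print(f"speech styles:  {speech_acc:.1f}%")
print(f"reading styles: {reading_acc:.1f}%")
```

On these assumptions the speech styles pool to roughly 97 percent accurate and the reading styles to roughly 76 percent, which restates the pattern in the figures: accuracy falls as the task invites more monitoring.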

One explanation for the unexpected effect of style on the informant’s speech is the same as the one given in regard to Tarone’s (1988) informants’ data. That explanation (endorsed by Adamson [1988] and Preston [1989] but not by Tarone [1988]) was that these speakers were attempting to monitor forms that they did not consciously control, and in guessing which form to use they were less accurate than if they had allowed the unconscious speech production mechanism to work on its own. This account may explain Thompson and Brown’s (2003) results because their informant apparently was attempting to monitor her vowel production on the reading tasks. Thompson and Brown (2003) report:

Many times the sound would start as a one vowel /i/ and finish as /iy/ or vice versa reflecting a possible attempt to try to approximate the native-speaker norm. Additionally, when [the informant] attempted to produce the minimal pairs, she produced the same vowel but varied the vowel length or pitch attempting to differentiate the two sounds. This lengthening of vowels and change of pitch did not change the timbre of the sound but reflected the monitoring that was occurring as she tried to distinguish between the two words. She was aware of the difference because of the distinct orthography that was present, but she was unable to determine the correct vowels. (p. 17)

As Thompson and Brown (2003) note, their informant’s behavior was similar to that of Flege’s (1991) informants. In his study, Spanish speakers learning English sometimes produced /i/ for /iy/, for example pronouncing steal as still. Flege (1991) suggested that L2 learners have more difficulty establishing an independent phoneme category when a target sound is very similar to one in their own language than when a target sound has no close native language equivalent. A second and possibly compatible explanation for the pattern of style shifting found in Thompson and Brown’s (2003) study may be found in audience

design theory. Recall from the discussion earlier in the chapter that according to Bell (2001) one kind of style shifting involves a change in topic and/or setting, and that, following Bakhtin (1981), Bell says that certain language forms carry the “flavor” of certain topics and settings. Different kinds of elicitation tasks may evoke particular language forms in this way as well. The informant in this study had studied English in Spain in elementary and secondary school, where English was taught mainly in a reading context, and with almost no exposure to native English speakers. Therefore, in the classroom setting the informant presumably had little input containing /i/ and learned to pronounce written English with a Spanish accent. However, after arriving in the United States, the informant was exposed mainly to spoken English in conversations with native English speakers, and so in this setting she received a great deal of input with /i/. Perhaps, then, the informant’s use of /i/ and /iy/ to some extent reflected the circumstances in which she had been exposed to these sounds.

Reconciling Monitoring and Audience Design

An important difference between Labov’s attention to speech theory of style shifting and Bell’s audience design theory is the notion of cognitive difficulty, which the construct of monitoring implies. The monitoring explanation for style shifting assumes that for native speakers the formal styles are more difficult to produce, requiring more attentional resources than the informal styles. Audience design theory does not make this assumption. It claims only that different styles are appropriate to different speakers, audiences, and topics. It may be that the different research designs that Labov and Bell used are related to this difference in their theories. In Labov’s research design (the sociolinguistic interview) audience was held constant while speaking task was manipulated, thus isolating the differences associated with monitoring.
In Bell’s research design, audience was manipulated while speaking task was held constant, thus isolating the diﬀerences associated with audience. Labov’s sociolinguistic interview and reading tasks are a bit like a psychological experiment. An informant is usually interviewed and recorded by professors or graduate students, often working in pairs. The interviewers try not to identify themselves as linguists, but in the interest of protecting the informants’ rights (and complying with Human Subjects Review Committee requirements), they do identify themselves as scholars. A team of interviewers might explain their purpose to an informant by saying something like the following: “We’re graduate students at the University of Pennsylvania, and we are studying this community. We’d like to ask you about some of your experiences in this neighborhood.” In these circumstances, participants are usually receptive, or at least neutral, to their interviewers and can be expected to converge with their speech. Of course, this speech will be (or will be assumed to be) prestigious; therefore, the informants can be expected to shift in the direction of upper-class, educated speech. In other words, the phenomenon of monitoring in the sociolinguistic interview,

and on the associated reading tasks, involves a particular case of audience design, namely “responsive, audience, second person design,” or convergence. In Labov’s theory this kind of style shift is assumed to be cognitively diﬃcult. The reason, as discussed in chapter 3 (and as pointed out by Tarone [1988] and Preston [2001, 2002]) is that as children native speakers learn a vernacular or informal style of their language, which forms their basic or underlying grammar. As they are exposed to more formal varieties in writing and in the speech of teachers and other adults in formal contexts, children learn new rules and diﬀerent frequencies of application for variable rules. An example of such a new rule is the formal way to express irrealis conditions in an if clause. The rule of the vernacular grammar may produce sentences like “If I would have known that, I would have told you.” The new, formal rule produces “If I had known that, I would have told you,” or even “Had I known that, I would have told you.” An example of adjusting a variable rule is that New York City children must learn to produce higher frequencies of postvocalic /r/ when discussing academic topics. Labov’s monitoring theory assumes that when speakers shift away from the early-learned, vernacular rules, they must employ additional mental resources, and that these resources are more available when a speaker is not focusing on the content of speech, as when telling a story or, to a lesser extent, when reading a story. Unlike Labov, Bell studied speech that was naturally produced in various settings, for example when travel agents consulted with their clients, or when radio announcers addressed their audiences. He did not administer an instrument containing diﬀerent elicitation tasks; therefore, the elicitation context was held constant while the audience changed. 
It may be, then, that monitoring theory and audience design theory are not incompatible—both aﬀect speaking style under certain circumstances, and therefore both theories are correct, although each theory on its own is incomplete. To better understand the relationship between monitoring and audience design, research is needed that systematically manipulates both the audience and the elicitation context.
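
One way to picture such a study is as a fully crossed design, in which every audience is paired with every elicitation task. The Python sketch below is only an illustration: the two audience levels are hypothetical, the task levels borrow the five Labovian styles mentioned in this chapter, and the function name crossed_design is an assumption.

```python
from itertools import product

# Hypothetical factor levels, chosen only for illustration.
audiences = ["native-speaker interviewer", "nonnative-speaker interviewer"]
tasks = ["narrative", "conversation", "reading passage", "word list", "minimal pairs"]


def crossed_design(audience_levels, task_levels):
    """Return every (audience, task) pairing, so that neither factor is held constant."""
    return list(product(audience_levels, task_levels))


cells = crossed_design(audiences, tasks)
for audience, task in cells:
    print(f"{audience:32s} x {task}")
```

Each informant who completed every cell would supply data in which the effect of audience and the effect of task could be compared directly, which neither the sociolinguistic interview nor the study of naturally occurring speech allows on its own.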

9

Teaching Implications

Social Dimensions

Introduction

So far, we have investigated both the social and the psycholinguistic dimensions of linguistic variation. Both of these areas have implications for language teaching, which we will discuss in this chapter. We begin with the implications of socially conditioned variation.

During the days of the grammar translation method, grammar was the focus of foreign and second language teaching, although at the advanced stages learners were assigned readings, ranging from letters to academic articles, that provided them with exposure to authentic discourse. The audio-lingual method, which succeeded the grammar translation method in the 1960s, also focused on grammatical competence. Grammar was not taught explicitly but was learned inductively by drilling strategic grammatical structures. Audio-lingual texts typically did not contain authentic target language readings, but readings constructed to exemplify a particular grammatical structure.

The audio-lingual method was overthrown by the communicative language teaching revolution of the 1970s and 1980s. This revolution had roots in both psycholinguistic and sociolinguistic theory. Krashen (1978, 1982, 1985, 1987), the most influential scholar of communicative language teaching, took a psycholinguistic approach. He advocated making language lessons meaningful, stress-free, and interesting because he believed that such lessons provided the most effective input to the Language Acquisition Device. A different but compatible school of communicative language teaching (e.g., Savignon, 1983; Wilkins, 1976) took a sociolinguistic approach, stressing communicative competence, as advocated by the anthropologist Hymes (1972). Canale and Swain’s (1980) classic model of communicative competence in second language acquisition was reworked by Bachman (1990), who proposed the model of communicative language ability shown in figure 9.1.

Figure 9.1 Bachman’s model of communicative language ability.

Figure 9.1 divides communicative language ability into four basic competencies, which are grouped into two larger competencies. Organizational competence consists of grammatical competence (knowledge of the patterns of sentence organization), and discourse competence (knowledge of how different discourses are organized). Pragmatic competence consists of illocutionary
competence (how to perform speech acts appropriately) and sociolinguistic competence (how to use styles and registers appropriately).¹

A lot has been written about how to teach grammatical competence and illocutionary competence. The teaching of discourse competence has also been extensively addressed in the literature on basic writing and college composition under the rubric of “rhetoric.” However, to date few authors have addressed whether sociolinguistic competence should be taught and, if so, how. In particular, the question of whether to teach nonstandard dialects of the target language has rarely been discussed. Most language teaching programs are similar to the French immersion program in Ontario (reviewed in chapter 4), where, as Mougeon, Rehner, and Nadasdi (2004) point out, sociolinguistic variation is absent from the textbooks and even from the classroom speech of teachers. In Ontario, students mostly study the French of educated Parisians, just as throughout the world students of English, Spanish, Amharic, and other languages study only the standard variety, even if the students live in a place where a vernacular variety is spoken. This situation is similar in some ways to the teaching of English as a second dialect in African American communities in the United States, where textbooks and teachers largely ignore African American English (AAE), the language of the community. In both cases a local variety is suppressed in favor of a prescriptive norm. It is therefore instructive to take a look at proposals to teach AAE in American schools. We will briefly review this question before turning to the equally controversial question of whether to include regional and social varieties in foreign and second language classrooms.

African American English Readers

In the 1960s researchers and educators (e.g., Labov, 1967; Gladney, 1973) noted that a disproportionate number of African American children were not reading at grade level.
Undoubtedly, many factors contributed to this problem, but some scholars (e.g., Fasold and Shuy, 1970; Laﬀey and Shuy, 1973) wondered whether one factor was the diﬀerence between the children’s language and the language of the reading textbooks. An example of how such diﬀerences could cause problems can be seen in the teaching of phonics. Many phonics lessons teach the English vowel sounds by drilling the diﬀerence in “word families,” groups of words that diﬀer by a single sound. Thus, the diﬀerence between the vowels in bet and bit might be taught by having the child read the following groups of words:

pet       pit
letter    litter
set       sit
tent      tint
pen       pin

In this lesson confusion could arise because AAE (like Southern English) does not distinguish /e/ and /i/ when these vowels occur before nasals, as in the last two examples. Problems related to syntax and morphology could also arise. For example, AAE, like Spanish, has a rule of negative agreement, allowing sentences like “We ain’t never had no trouble.” Other examples of AAE patterns that do not appear in the Standard English textbooks include completive done (“I done gone to bed,” meaning that the speaker is in bed for the night), and invariant be (“My father, he be tired,” meaning that he is usually tired).

On the basis of such considerations, a number of linguists in the 1960s and 1970s proposed that reading materials written in AAE should be used in the schools. Several dialect readers were written, the most ambitious of which was the Bridge reading program, published in 1977 (Rickford and Rickford, 1995). These materials included passages in three forms: nonstandard, standard, and an intermediate variety, with standard spelling used throughout. Teachers were encouraged to let students pronounce the Standard English passages in their own way, without correcting nonstandard pronunciation. This technique is common practice in language minority areas around the world; in Stuttgart, for example, school textbooks are written in High German, but students read them with a Swabian accent. The Bridge materials also included exercises and an audiotape featuring spoken AAE. An excerpt from a passage written in the nonstandard variety appears below.

DREAMY MAE

This here little Sister name Mae was most definitely untogether. I mean, like she didn’t act together. She didn’t look together. She was just an untogether Sister. Her teacher was always sounding on her ’bout day dreaming in class. I mean, like, just ’bout every day the teacher would be getting on her case. But it didn’t seem to bother her none. She just kept on keeping on. Like, I guess daydreaming was her groove. And you know what they say: “Don’t knock your Sister’s groove.” But a whole lotta people did knock it. But like I say, she just kept on keeping on. (from Rickford and Rickford, 1995, p. 127)

Despite evidence of their effectiveness, the AAE readers met with resistance from many parents, teachers, and community leaders. As a result, the dialect readers were quietly dropped. More recently, opinions about the value and appropriate place of AAE in the schools have changed, and dialect readers, including the Bridge series, have
come back—this time to a more positive, though still mixed, reception (Perry and Delpit, 1998). The Stanford linguists Angela and John Rickford (1995) have conducted research on the use of the Bridge readers in East Palo Alto, California. One goal of their small-scale study was to gauge the attitudes of students and teachers, both White and Black, toward dialect readers versus Standard English readers. The attitudes were mixed. Teachers of both races rated the Standard English readers as better written and more helpful to students. One African American teacher remarked, “Every Black kid knows that there is language for the playground, and then there is language for the classroom, and if you want anyone to take you seriously, you’d better not mix the two. . . I just don’t think it’s the right approach for teaching Black kids English” (Rickford and Rickford 1995, p. 118). The students, on the other hand, preferred the vernacular materials, but there was a sharp distinction between boys and girls, with boys much preferring the AAE materials. In one of the studies, the African American students were asked whether they preferred the vernacular or Standard English version of a story, and which version was most like the way they talked. To both questions all of the boys answered “vernacular” and all of the girls answered “standard.” This split reﬂects a phenomenon that Labov (1966) noticed in his studies of AAE in Harlem. Adolescent males appear to be the most consistent speakers of the vernacular. It is only when they get older and move in wider social networks that these speakers show more variation between vernacular and standard forms. A problem with the Bridge series, which may be apparent from the excerpt above, is that its specially written passages are not real literature. To address this problem, A. 
Rickford (1999) developed a program written in the vernacular that includes literature, and that could be used alongside introductory materials like the Bridge program. Rickford was educated in the Caribbean, where she read local authors such as V. S. Naipaul. Remembering the excitement and sense of participation she felt reading about her own culture, she decided to create a reading program that would feature traditional African American folk tales, such as the Brer Rabbit stories, and contemporary African American short stories. The language used in these stories varied from formal, Standard English to AAE. Rickford notes, “The vernacular is maintained as an important cultural marker, but idiosyncratic [forms] are avoided” (Rickford, 1999, p. 241), as in this example: Yes, Brer Rabbit had fallen in love, and it was with one of Miz Meadows’ girls. Don’t nobody know why, ’cause he’d been knowing the girl longer than us folks have known hard times, but that’s the way love is. One day you ﬁne and the next day you in love. (from Rickford, 1999, p. 242) As discussed in Adamson (2005), the problem with using vernacular language in the public arena is not a problem with the language; it is a problem

with the public. But teachers cannot (quickly, at least) change social prejudice, whereas they can change their students’ chances of public success by teaching them to read and write Standard English. A popular compromise position is that students should be bi-dialectal. In regard to reading, both standard and vernacular materials should be used, but students should be allowed to pronounce the standard materials according to their own phonological systems. In regard to writing, students should be encouraged to use the vernacular in appropriate contexts, such as writing a journal, a letter to a friend, or a dialogue for a short story. Of course, there should be assignments that elicit Standard English, as well. An especially useful exercise is asking the students to recast a piece of writing for a different audience, requiring that Standard English be rewritten in vernacular, and vice versa. As we will see, these suggestions could apply to the teaching of nonstandard varieties in second and foreign language contexts, as well.

Teaching Sociolinguistic Competence in Foreign Language Contexts

Definitions

The distinction between a foreign language and a second language has been controversial. Oxford (1996) proposes a functional definition: a foreign language is typically learned in a setting where the language is seldom used or experienced, while a second language is learned in a setting where the language is typically used by the majority of individuals for everyday communication. In regard to English, Kachru (1996) proposes a historical criterion. He divides the countries of the world into three groups according to the role English plays in the society. Inner circle countries include Britain, the U.S., and Canada, where English is the native language of the majority. In these countries English is taught to nonnative speakers as a second language. Outer circle countries include India, Kenya, and Malaysia.
Typically, these countries have been colonized by English speakers, and English has become indigenized and is used as an official language or a lingua franca. In these countries also, English is taught as a second language. In expanding circle countries, including Germany, France, and Mexico, English is taught as a foreign language, used for international business, scientific purposes, and cultural enrichment.

Goals of Foreign Language Teaching

Kramsch (2002a) observes that second language instruction and foreign language instruction come from different academic traditions. Second language scholars are mostly trained in linguistics and the social sciences, and the main goal of second language programs is to enable students to speak the target language appropriately and with near-native proficiency. Therefore, second language programs aim to teach survival skills to new immigrants, technical English to new factory employees, English for academic purposes to students,


and so on. Foreign language scholars, on the other hand, are mostly trained in literature and the humanities, and the main goal of foreign language programs is to enable students to read and appreciate the literature of the target culture. Kramsch (2002b) goes on to compare the goals of foreign language study in different countries. For the United States these goals read like the self-help literature that has been popular since Benjamin Franklin. They embody an educational philosophy that tries to improve the adolescent as a whole. Specific goals include enabling the student to “Look beyond . . . customary borders . . . Act with greater awareness of self [and] of other countries . . . [and] gain direct access to additional bodies of knowledge” (1996 National Standards Statement of Philosophy, quoted in Kramsch, 2002a, p. 6). In Germany foreign language education goals also include personal improvement but with a civic emphasis: “Foreign language education contributes eminently to . . . character development of our students . . . [It] should give them insights into Christian and humanistic traditions, to behave according to moral principles, and to respect religious and cultural values” (Rahmenplan Gymnasiale Oberstufe, 1998, quoted in Kramsch, 2002a, p. 8). Kramsch explains: “The German guidelines . . . express the moral obligation of administrators to help school pupils become responsible citizens in a democratic society – the historical legacy of WWII” (Kramsch, 2002a, p. 8). In France foreign language education goals are couched in the language of the Enlightenment, with a focus on rationality and intellectual reflection. They aim at “an increasingly nuanced study of texts of increasing complexity, deepening of a reasoned understanding of culture . . . the deepening of a metalinguistic reflection on the target language and on language in general” (Ministère de l’Education Nationale 2000, p. 19; quoted in Kramsch 2002a, p. 19). As Kramsch (2002a, p. 9) notes, nowhere in any of these goals for foreign language instruction is achieving near-native proficiency mentioned.

Valdman and the Pedagogical Norm

The scholar who is most identified with the question of what forms to teach in the foreign language classroom is Albert Valdman, who addressed the question under the rubric of the pedagogical norm. This construct is defined by Bardovi-Harlig and Gass (2002) as “The immediate language target, or targets, that learners seek to acquire during their language study” (p. 3). Fox’s (2002) definition is similar: “The term ‘pedagogical norm’ is an abstraction that has been used to define a language variety that is simpler and more uniform than that of the native speaker . . . It serves as an immediate target for the language learner and represents a step, or series of steps, that can lead to the eventual acquisition of the full range of native speaker variation” (p. 209). Valdman (1989, p. 21) laid out four principles for choosing the structures that comprised pedagogical norms. These norms should:


1. Reflect the actual speech of target language speakers in authentic communicative situations.
2. Conform to native speakers’ idealized view of their speech use.
3. Conform to expectations of both native speakers and foreign learners concerning the type of linguistic behavior appropriate for foreign learners.
4. Take into account processing and learning factors.

There is considerable leeway and even some contradiction in these principles. For example, principle 1 says that the pedagogical norm should reflect what native speakers actually say, but principle 2 says that the norm should conform to an idealized view of what they say. Both principles 1 and 2 potentially conflict with principle 4 because ease of processing might suggest teaching nonstandard or even ungrammatical forms. So, the devil is in the details, and over the years Valdman’s instructional materials changed as his emphasis shifted among the principles. Throughout his career, however, Valdman seems to have placed the greatest emphasis on principle 4 in order to facilitate learning, a priority that was, no doubt, appreciated by generations of students. An example of Valdman’s emphasis on learnability can be seen in his recommendations for introducing yes/no questions in French. These can be formed in two ways. The first way is to invert the subject and the tense-carrying verb, as in English. Thus, the statement la plume est sur la table (the pen is on the table) can be changed into the question Est la plume sur la table? The second way to form a question is to put the phrase est-ce que in front of the corresponding statement as follows: Est-ce que la plume est sur la table? The latter structure seems easier to learn than the former because a single memorized phrase can change any statement into a question, and Valdman advocated teaching this form before the subject–verb inversion form.
In his earlier writings, Valdman (1961, 1966) emphasized the philosophy expressed in principles 2 and 3, as did virtually every French course before the 1990s. As Kramsch (2002b, p. 59) notes, “The norm in F[oreign] L[anguage] teaching has historically been represented by the standard forms of written language as encountered in canonical works of literature” (emphasis in original). Reﬂecting this philosophy, Valdman stated that the pedagogical norm should be based on “the speech behavior of educated Paris speakers” (1961, p. 1). At the same time, however, he noted, “all varieties of French are equally ‘grammatical’ and acceptable . . . [to] . . . the linguistic analyst” (1961, p. 1). As the 1980s gave way to the 1990s, Krashen’s psycholinguistic approach to communicative language teaching, with its emphasis on comprehensible input, gave way to the sociolinguistic approach to communicative language teaching, with its emphasis on communicative competence. Valdman’s philosophy shifted in this direction as well, with attention at the advanced levels to regional and sociolinguistic variation. Valdman (1989) noted, “As


instruction progresses, and as learners become more capable of processing the more complex syntactic features characteristic of planned formal discourse, the pedagogical norm must increasingly take into account sociolinguistic considerations” (quoted in Bardovi-Harlig and Gass, 2002, p. 28). But how important and, indeed, how practical is it to teach variable forms such as the difference between “They were playing basketball” and “They were playin’ basketball” when students may have difficulty including any form of -ing in progressive tenses? Acknowledging this problem, Valdman advocated teaching only a receptive competence of variable features along with metalinguistic knowledge of what alternating forms signify. Students should learn the significance of G versus N without being asked to produce the forms in appropriate circumstances. This move toward emphasizing receptive variable competence was reflected in Valdman’s (1992) condensing the four principles for constructing pedagogical norms into only three. The new principles are:

1. Linguistic: the actual variable production of targeted native speakers in authentic communicative situations.
2. Sociopsychological: native speakers’ idealized views of their speech and the perceptions both native speakers and foreign learners have regarding expected behavior of foreign users.
3. Psycholinguistic: relative ease of acquisition and use.

In the revised version, teaching the idealized language variety no longer rates a separate principle but is conflated with the less prescriptive and more sociolinguistic notion that the norms should reflect expected language behavior, which would include informal and perhaps even stigmatized forms.

Other Advocates of a Variable Pedagogical Norm

In recent years, other writers have stressed the sociolinguistic dimension of pedagogical norms, advocating that foreign language learners be exposed to varieties other than the standard.
However, as Fox (2002) acknowledges, this proposal, like the proposal to teach AAE, can run into opposition. Fox (2002) notes, for example, that the French of educated Parisians has long been considered to be not only more socially acceptable, but more beautiful and logical than other varieties, a judgment that is held not only by Parisians but by other French speakers. “Belgian francophones manifest an acceptance of a linguistic subjugation with respect to France, the disparagement of ways of speaking they believe illegitimate, and a pessimistic view of the future of French” (p. 205). Nonetheless, Fox says that attitudes are changing. In her estimation the most important development in French studies in the past two decades is the multicultural turn in language studies, which has resulted in the inclusion of the francophone literatures of Africa, Canada, and the Caribbean as legitimate objects of study. In line with the cross-cultural turn in language studies, Fox


(2002) advocates exposing American students of French to “standard Quebec French,” the variety used in Quebec newspapers and radio and television news reports. Although this variety appears to diﬀer from Parisian French mainly in terms of pronunciation and does not include the vernacular forms, or even many mildly marked forms, Fox (2002) believes such exposure will better prepare American students to interact with North American French speakers. Her proposal is thus in line with the idea of teaching French as an international language. However, Fox (2002) acknowledges the practical diﬃculties of this proposal. Most U.S. French teachers are speakers of Parisian French and would need to rely on recorded materials, which are not presently available. Kramsch (2002a, 2002b) also advocates exposing foreign language students to variation, in this case not to diﬀerent regional varieties but to diﬀerent styles and registers. She suggests that pedagogical norms should be based on “one’s unique experience as a nonnative speaker” (p. 61). In cases where foreign language students are likely to encounter native speakers of the target language (as are students of Spanish in the American Southwest, for example), they should be exposed to informal styles and local vernacular forms. Like Valdman, Kramsch recommends that students acquire only a receptive competence in these forms. In terms of register variation, she advocates exposing students to narrative and poetic forms. But, of course, this is already done in American foreign language programs, where upper division courses consist almost entirely of the study of literature. Kramsch’s more innovative suggestion is to also expose students to diﬀerent modes of input, including e-mail messages, handwritten documents, telephone conversations, and even code-switching discourse. Fox (2002) and Kramsch (2002a, 2002b) see the foreign language learner as an active user of the target language in contexts beyond the classroom. 
This may be possible for some foreign language learners in the United States but not for most of them, who do not have much access to native speakers. Furthermore, even students who live in bilingual communities are usually not able to acquire productive competence in any form of the target language because the typical foreign language program consists of studying the language one hour a day for three to ﬁve days a week for three or four semesters. For these learners, Valdman’s pedagogical norm, which includes some exposure to register variation but focuses almost completely on standard forms, and places a higher priority on learnability than on variability, is more realistic. Nevertheless, suggestions such as those made by Fox and Kramsch do address the needs of those foreign language learners who are able to go beyond basic requirements, especially those who participate in increasingly popular study abroad programs. For these learners, the line between foreign language and second language is blurring.


Teaching Sociolinguistic Competence in Second Language Contexts

As mentioned earlier, foreign language teaching and second language teaching come from different academic traditions and embrace different models of the target language. Second language teaching, the more influential of the two traditions, is connected to the social sciences, especially psychology and ethnography. Foreign language teaching comes from the humanities and is connected to literary and cultural studies. Thus, Valdman (1989) properly associates his pedagogical norm for teaching French as a foreign language with a literary standard because his American students will probably encounter French only in the classroom, where they will mostly read and discuss French literature. Second language students usually have very different goals. In the case of English language learners in the United States, students desire to interact and even integrate with native speakers. For this reason, they need to study authentic speech, including informal and perhaps even stigmatized variants. Auger (2002) addresses the question of what variety of French to teach to heritage English learners in Montreal, a matter that has been debated for 40 years. There are three candidates: (1) the continental standard (educated Parisian French); (2) Standard Quebec French; and (3) the vernacular variety that is spoken by the Montreal working class, sometimes referred to as joual (horse), and that is viewed by many of its speakers as an emblem of their identity. According to Auger (2002), this latter variety, which contains many of the stigmatized features referred to by Mougeon, Rehner, and Nadasdi (2004), has never been seriously considered for use in schools for the reasons already mentioned.
Auger (2002) argues that this position is understandable because parents who enroll their children in French immersion programs are as much instrumentally as integratively motivated, and proﬁciency in Standard French is the key to professional advancement in Quebec. Valdman (1989) acknowledged this situation when he stated that the learning of a foreign language “may be viewed as an economic investment whose value increases in direct proportion to the status conferred by variant forms: the higher the social status associated with a variant, the more remunerative the investment” (p. 21). Thus, the goal of the Ministry of Education of Quebec, according to Auger (2002), was twofold: the students should speak Standard Quebec French and should be able to understand vernacular French. The problem is that the latter goal has not been realized. Graduates of the French immersion programs report that they cannot communicate with their French-speaking neighbors. Auger (2002) oﬀers two recommendations for solving this problem. The ﬁrst recommendation parallels that of Rickford and Rickford (1995) in regard to AAE. She suggests using literature that features characters speaking in the vernacular. Auger’s second recommendation is more daring. She proposes teaching some productive control of the vernacular using standard communicative methods, such as roleplays. For example, a student might act as a DJ


for a radio show or might play the role of a college student welcoming a francophone roommate. The exercises that Auger (2002) provides show that she proposes to teach not only the vocabulary and pronunciation of the vernacular, but also some of the syntactic structures, such as ne deletion and the postverbal (not preverbal) placement of object clitics. There is a second parallel between Auger’s (2002) proposal and Rickford and Rickford’s (1995) proposal for using Black English literature. These authors note that during the adolescent years language is closely connected with identity, and if a goal of language education is to foster a bilingual and bicultural identity, this is a critical time to do so.

The Postmodern Turn—Identity

Introduction

In the discussion of style shifting in native speaker speech in chapter 8, we noted the close connection between language use and a speaker’s social identity. This connection has been discussed in recent language acquisition literature, as well. An early definition of identity is provided by Tajfel (1978), who said that social identity is “that part of an individual’s self-concept which derives from his knowledge of his membership of a social group (or groups) together with the value and emotional significance attached to that membership” (quoted in Joseph, 2004, p. 76). More recent scholars, such as Hecht (2002), have emphasized the fluidity of identity, and how it is constructed not entirely by the self but also by others’ perceptions. Joseph (2004) points out that the tension between self-defined identity and other-defined identity is parallel to the debate in literary criticism between whether the “real meaning” of a work is to be found in the author’s intent or in the reader’s interpretation. The problem with looking to authorial intent is that we can never be sure exactly what the author had in mind—if, indeed, the author had a clear intention.
The problem with looking for meaning in the “reader’s response” is that readers might be ill-informed or might bring their own idiosyncrasies or agendas to the interpretation of a work. Joseph (2004) advocates bringing both authorial intent and reader response into the interpretation of a work. It is the same, he says, for social identity construction: “Both self-identity and the identities others construct for us go into making up our ‘real’ identity” (p. 83). We have reviewed a number of studies that deal with native speakers’ motivation for changing the way they speak. Recall from chapter 8 that variationists have found a connection between style shifting and a speaker’s identity. This is most apparent in the kind of style shifting Bell (2001) calls “referee design,” of which there are several types (see ﬁgure 8.3). “Ingroup design” occurs when speakers wish to emphasize their identity by increasing the features of their speech that diﬀerentiate them from other groups, a phenomenon similar to Giles’ (1984) “divergent accommodation.” “Outgroup design” occurs when


speakers change aspects of their basic speaking style in an attempt to establish themselves as members of an outside social group, and this type is most similar to the situation in SLA. As discussed in chapters 1 and 8, Labov (1972a) found that high school students living on Martha’s Vineyard had to choose whether to remain on the island or to move to the mainland in order to pursue greater educational and career possibilities. Those who chose to remain and retain their identity as islanders changed their speech by emphasizing a pronunciation that was associated with island speech. Those who chose to leave did not change their speech. While Labov’s (1972a) subjects revived a linguistic feature used by older islanders, Eckert’s (1999) high school students, as described in chapter 3, advanced an ongoing change, the backing of the vowel /ʌ/ so that busses sounds like bosses. It is no surprise that the speakers who showed the most backing of /ʌ/ were women because, as Labov (2001a, p. 280) observes, “Women have been found to be in advance of men in most of the linguistic changes studied . . . in the past several decades.” But the identity of the linguistic innovators in Eckert’s (1999) study was more specific than gender. Mid-central vowel backing was most advanced among the social group that Eckert (1999) called “burned-out female burnouts.” These were girls who totally rejected the high school culture of academic achievement, along with sports, clubs, and other approved activities.

Identity and Second Language Acquisition

Second language researchers have long been interested in how a speaker’s identity relates to learning a language, but they have investigated the notion under other rubrics. For example, Gardner and Lambert (1972) addressed the question in terms of motivation. They distinguished two kinds of motivation for English speakers learning French in Montreal: instrumental and integrative.
Instrumentally motivated learners wish to use French only for practical purposes, such as business. Integratively motivated learners wish to make French-speaking friends and move socially in French-speaking circles. Gardner (2002) notes, “though self-identity is not explicitly identified in it, the concept of integrativeness involves the willingness to identify with the other language community” (p. 164). Gardner and Lambert’s (1972; Gardner, 2002) studies assumed that people had a more or less fixed identity in regard to language learning. They were either willing or reluctant to move beyond their own culture and take on some of the characteristics of another culture. But more recent studies of identity have stressed its continuously changing nature. As Norton and Toohey (2002) state, “Applied linguistics researchers have been drawn to literature that conceives of identity not as static and one-dimensional but as multiple, changing, and a site of struggle” (p. 116). These ideas are compatible with the work of


Vygotsky (1978, 1986) and Bakhtin (1981), who believed that learning a new language, or even a new discourse style or register, involved taking on aspects of a new identity. As an example of the latter, consider the following conversation between a psychologist and a literate Kazakh peasant, reported by Luria (1976), a disciple of Vygotsky.

Experimenter: It is twenty versts from here to Uch-Kurgan, while Shakhimardan is four times closer. [Actually, the reverse is true.]
Peasant: What! Shakhimardan four times closer?! But it’s farther away!
Experimenter: Yes, we know. But I gave out this problem as an exercise.
Peasant: I’ve never studied, so I can’t solve a problem like that! I don’t understand it! Divide by four? No . . . I can’t . . .
[The experimenter repeats the problem.]
Peasant: If you divide by four, it’ll be . . . five versts . . . if you divide twenty by four, you have five!
(quoted in Frawley, 1997, p. 13)

In this example, the peasant shifts from the discourse of everyday common sense reasoning to the discourse of hypothetical academic reasoning. Vygotskians argue that such a shift involves a shift in identity as well: the peasant adopts an aspect of a recently constructed identity, that of the pupil. We have seen many other examples of identity construction throughout the book, some of which are briefly reviewed below. The notion of identity is central to Wolfram, Carter, and Moriello’s (2004) study of Hispanic immigrants to North Carolina, discussed in chapter 4. A particularly interesting finding was the difference in the speech of an 11-year-old girl and her 13-year-old brother, both of whom had lived in Piedmont North Carolina all of their lives. The girl had not accommodated to local Southern speech norms, producing only 5.9 percent unglided /ay/ in words like pie and sky. The boy, however, produced 62 percent unglided /ay/, indicating that he had acculturated to local norms.
Wolfram, Carter, and Moriello (2004) attribute this diﬀerence to the fact that the boy identiﬁed with the local “jock culture,” whereas his sister identiﬁed more with mainstream American culture. Identity is also central to Auger’s (2002) discussion of teaching French as a second language. One might wonder why English heritage speakers living in Montreal have to rely on immersion programs to learn French when they live in a francophone city and province. The reason is that Quebec is a divided society (LaPonce, 1992). Auger (2002, p. 91) notes that “most immersion students rarely use French outside of school, as most of their friends are Anglophones . . . Anglophones appear not to seek opportunities to speak French.” Not only that, but as immersion school graduates get older they use less and less French. Auger (2002) suggests that the matter of identity is part of this


social segregation. Identity is largely formed during the preteen and adolescent years, but although immersion students spend a good part of each day at school hearing and speaking French, they lack a way to communicate with their French-speaking peers. Therefore, they develop only an English identity. As an example, Auger (2002) quotes Louisa, who despite speaking fluent Parisian French did not integrate with her francophone classmates. “I did not fit in . . . It’s just obvious from my speech [that I’m not French-Canadian]” (p. 97). Auger expresses the hope that “If French-immersion programs can help instill in the many English-speaking children and adolescents enrolled in them a stronger sense of comfort with the French they are learning . . . this will increase their desire to communicate with French-speaking children and teenagers, thus beginning to erode the linguistic boundaries that continue to divide the two linguistic communities” (p. 98). To summarize this section, recently both foreign and second language specialists have recommended broadening the traditional curriculum to include informal styles and vernacular forms of the target language. In foreign language teaching, the most common recommendation is to aim for only a receptive competence, but in second language teaching, where students have access to native speakers, the recommendation is to aim for some productive competence, as well.

Psychological Dimensions

Constructivist Language Teaching

As the discussion throughout the book has shown, variation theory has emphasized the social dimension of language rather than the psychological dimension. To look for the psychological motivations of linguistic variation, we have turned to the fields of psycholinguistics and Cognitive Linguistics. It is therefore natural that this section will draw more from these fields than from Variation Theory proper.
However, Variation Theory does suggest at least one important guideline for teaching, which we will now explore in some detail. The intuitive (but often ignored) idea that teaching should build on what a student already knows, which can be called constructivism, has been emphasized by teachers, linguists, and psychologists. Perhaps the most inﬂuential of the psychologists is Vygotsky (1978, 1986), who proposed the Zone of Proximal Development (ZPD). He deﬁned the ZPD as “the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem solving under adult guidance or in collaboration with more capable peers” (1978, p. 86). Scholars have interpreted this deﬁnition in diﬀerent ways, depending on how they deﬁne “development,” but one common interpretation (Tharp and Gallimore, 1988, 1990) equates development with learning (rather than with cognitive maturation, as proposed by Piaget [1972]). In other words, learning


builds on previous learning. For example, students cannot understand the Periodic Table of the Elements unless they ﬁrst understand the structure of the atom. Thus, a lesson on the Periodic Table would be within the ZPD of a student who understands the basic relationship of protons, neutrons, and electrons, but beyond the ZPD of a student who does not understand this relationship. Another way of looking at this notion is provided by Wood, Bruner, and Ross (1976), who coined the term scaﬀolding. Scaﬀolding is the process by which a teacher helps a student understand new concepts by ﬁlling in necessary background information. Thus, for a student who does not clearly understand the relationship of protons, neutrons, and electrons, a lesson on the Periodic Table would include scaﬀolding in the form of a review of atomic structure. Krashen (1982, 1985, 1987) was perhaps the ﬁrst linguist to propose the related idea that there are natural stages in language acquisition, and that structures should be taught in the sequence in which they are naturally acquired. Krashen couched his discussion in general terms, saying that if the natural sequence for acquiring a series of structures is i, i + 1, i + 2, etc., and that if a student is at stage i, then the best structure to teach is i + 1. As we saw earlier in the chapter, Valdman (1989) endorsed this idea (without speciﬁcally mentioning Krashen) in his principle 4 for constructing a pedagogical grammar, which is: “Take into account processing and learning factors.” Krashen did not provide examples of sequential structures, but other researchers have supplied them. One example of an acquisitional order was discovered by the Zweitspracherwerb Italienischer und Spanischer Arbeiter (ZISA) group in the 1970s (reported, for example, in Meisel, Clahsen, and Pienemann, 1981 and Pienemann, Johnston, and Brindley, 1988). The ZISA group described ﬁve stages in the acquisition of German word order by speakers of Italian and Spanish. 
The first three stages are as follows:

1. Learners begin with SVO word order, as in the native languages, even in incorrect contexts (compare example (1) to example (3)).
   Example: *die kinder muss machen die pause
   the children must make the pause
2. Learners can move a final element to initial position.
   Example: kinder spielen da
   children play there
   AND da kinder spielen
   there children play
3. Learners can move nonfinite verbal elements to the end of the clause, as required in German.
   Example: die kinder muss die pause machen
   the children must the pause make


The ZISA group explained this acquisitional order in terms of difficulty of cognitive processing. Stage 1 requires the least amount of processing because its structures transfer from the native languages. Stage 2 requires more processing because an element must be moved. In this case it is an adjunct element that is not embedded in a lower structure. Stage 3 requires even more cognitive processing because an embedded element must be taken out of a lower structure and moved to a higher structure (in the case of the example, a verb must be moved out of a V’ and embedded under VP). The ZISA group recommended that these structures be taught in the order mentioned above, and, in fact, claimed that they cannot be learned in any other order. A second example of a natural acquisition order was studied by Doughty (1991), who looked at the acquisition of relative clauses. The theoretical basis of Doughty’s research was the Noun Phrase Accessibility Hierarchy proposed by Keenan and Comrie (1977). These scholars studied the distribution of different types of relative clauses in the languages of the world and noticed two important facts. The first is that the different types of relative clauses are not equally distributed among the world’s languages. Subject focus relative clauses are more frequent than direct object focus clauses, which are more frequent than indirect object focus clauses, which are more frequent than object of preposition focus clauses, etc.2 The second fact is that there is an implicational hierarchy among the types of relative clauses in the languages of the world. A simplified version of this hierarchy looks like this:

subject focus > direct object focus > indirect object focus > object of preposition focus

That is, if a particular language has object of preposition focus relative clauses, it will also have all the types to the left of it in the hierarchy.
Likewise, if a language has indirect object focus relative clauses, it will also have all the types to the left of it in the hierarchy. In other words, subject focus relative clauses are less marked, in the distributional sense, than relative clauses of the other types. This fact suggests that subject focus relative clauses may also be less marked in the psychological sense; that is, they may be easier to learn. Several studies suggest that this is the case (e.g., Eckman, Bell, and Nelson, 1988; Gass, 1980; for a dissenting view, see Tarallo and Myhill, 1983). Doughty (1991) taught object of preposition focus relative clauses to college students who were using only subject focus relative clauses in their speech. Her goal was to determine whether the natural order of relative clause acquisition could be “beaten” by offering explicit instruction in a more marked form. She found, first of all, that the students were able to learn the object of preposition relative clauses even though they had not progressed through the natural order represented in the implicational hierarchy. She also found, remarkably, that the students had learned direct object and indirect object focus clauses, even though these types were not taught in the experiment. In other words,
teaching the more marked form somehow imparted knowledge of the less marked forms. This result, at ﬁrst glance, seems to suggest that processing difﬁculty is not involved in learning relative clauses, but that the acquisitional order is caused by other factors, such as saliency of forms or frequency of input. However, Doughty (1991) suggests that a processing constraint is in operation, namely the basic ability to relativize. Because all of her subjects were producing subject focus relative clauses, she believes that they had developed this processing capacity and therefore were ready to learn the other types of relativization. Regardless of whether processing factors are involved in a natural order of acquisition, it would seem reasonable to teach structures in the order in which they are naturally acquired. For whatever reason, that order seems to be the easiest for learners to follow. Doughty’s (1991) experiment does not contradict this recommendation. Although her subjects were able to learn less marked relative clauses by being taught only the most marked type, they had advantages that are not shared by most students. For one thing, they were highly motivated students at an Ivy League university; for another, they had the beneﬁt of tailor-made computerized instruction. But how can a teacher know what structures are involved in natural orders (only a few have been studied) and, if cognitive constraints are involved, how can a teacher know whether the students have developed the cognitive capacity in the new language to acquire the target structure? An obvious answer is that if students are variably producing a structure, they have developed the requisite cognitive capacity and are ready to master the structure. In terms of Variation Theory, if students have a variable rule for a structure, it is an appropriate target for instruction, which may push to completion the change in the internal grammar that is already in progress. 
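The implicational reasoning behind the relative clause hierarchy can be illustrated with a short sketch. The Python fragment below is an illustration only and is not drawn from Keenan and Comrie (1977) or Doughty (1991); the names HIERARCHY, implied_types, and is_consistent are hypothetical. It assumes the simplified four-step hierarchy given earlier and shows which clause types are predicted to be present, given the most marked type attested, and whether a set of attested types forms an unbroken run from the least marked end.

```python
# Simplified hierarchy from the text, least marked type first.
# (Assumption: only the four types named in the simplified version are modeled.)
HIERARCHY = [
    "subject",
    "direct object",
    "indirect object",
    "object of preposition",
]


def implied_types(most_marked: str) -> list[str]:
    """Return the types predicted to be present if `most_marked` is present:
    that type plus every type to its left in the hierarchy."""
    return HIERARCHY[: HIERARCHY.index(most_marked) + 1]


def is_consistent(attested: set[str]) -> bool:
    """True if the attested types form an unbroken run from the left edge,
    which is what the implicational hierarchy predicts."""
    if not attested:
        return True  # nothing attested, so nothing contradicts the hierarchy
    rightmost = max(HIERARCHY.index(t) for t in attested)
    return attested == set(HIERARCHY[: rightmost + 1])
```

In this sketch, implied_types("object of preposition") returns all four types, which mirrors the typological claim. Whether the same implication holds for individual learners is the empirical question that Doughty's study addressed.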
Formulaic Sequences and Constructions

A second area of teaching implications involves cognitive linguistics (CL). Recently, linguists and language teachers have paid a great deal of attention to formulaic sequences (sometimes called collocations). These include idioms like “kick the bucket” and “buy the farm,” as well as less idiomatic but commonly used phrases, like “as well as” and “first of all.” Formulaic sequences also include phrases with slots that can be filled with a variable, like “As far as John/Marsha/it is concerned,” and even more abstract patterns like the way construction discussed in chapter 7 (“Rocky couldn’t punch his way out of a paper bag”). The fact that this last example of a formulaic sequence is also a construction in CL raises the question of whether all formulaic sequences are constructions. They are not: only some formulaic sequences qualify as constructions. We will discuss the difference between these two notions later in the chapter. One reason for the recent interest in formulaic sequences is that they can be identified by computer concordancing programs that look for a particular word in a corpus and retrieve the words around it. So, given the word “majority,” the program would come back with sequences like “in the majority
of cases” and “majority report.” These programs can also identify which formulaic sequences are the most common in a corpus of speech or writing, thus identifying which sequences are the most useful for learners. A second reason for interest in formulaic sequences is the growing belief that in language production and comprehension we do not always encode and decode word by word, as implied in the discussion of the sentence production and comprehension models in chapter 6. Sinclair (1996) claims, to the contrary, that “units of meaning are expected to be largely phrases” (p. 82). Such a possibility is, in fact, allowed by Levelt’s model. As discussed in chapter 6, at level 2 the formulator finds lemmas that match the preverbal message. But it would also be possible for the formulator to match a preverbal message, or part of one, with a formulaic sequence, retrieving both lemmas and some syntactic structure. At level 3 a phrase marker would be constructed that included the syntactic structure handed down by the formulator. It is less clear how Townsend and Bever’s (2001) sentence comprehension model could handle formulaic sequences, but an intriguing possibility is that the templates that constitute pseudosyntax can include formulaic sequences and constructions. Regardless of whether and how formulaic sequences are involved in speech production and comprehension, it is clear they should be taught to ESL students. As Jones and Haywood (2004) observe, “Formulaic sequences enable students to express technical ideas economically and also provide the tone appropriate to a particular academic register” (p. 273). We now turn to the question of how to teach formulaic sequences. To find out how these forms are presently taught, Jones and Haywood (2004) surveyed four college-level English for Academic Purposes writing texts, which were traditionally structured, with a reading passage followed by a list of vocabulary, some of which were formulaic sequences.
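The kind of search a concordancing program performs, as described earlier in this section, can be sketched in a few lines. The following Python fragment is a minimal illustration under simplifying assumptions, not a description of any particular concordancer. The function names concordance and frequent_sequences, the two-word context window, and the crude tokenizer (lowercased runs of letters and apostrophes) are all hypothetical choices made here.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Crude tokenizer (an assumption of this sketch): lowercase the text
    and keep runs of letters and apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(text: str, keyword: str, window: int = 2) -> list[str]:
    """Return each occurrence of `keyword` together with up to `window`
    words of context on either side."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = tokens[max(0, i - window): i]
            right = tokens[i + 1: i + 1 + window]
            hits.append(" ".join(left + [tok] + right))
    return hits


def frequent_sequences(text: str, n: int = 3) -> Counter:
    """Count every run of `n` consecutive words, so that the most common
    sequences in a corpus can be listed with Counter.most_common()."""
    tokens = tokenize(text)
    return Counter(
        " ".join(tokens[i: i + n]) for i in range(len(tokens) - n + 1)
    )
```

Given a passage containing the word “majority,” a call such as concordance(passage, "majority") would return the word with its neighbors on each side, and frequent_sequences(passage, 3).most_common(10) would list the ten most frequent three-word sequences. A real concordancer would need a much larger corpus and a more careful treatment of punctuation and sentence boundaries.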
Jones and Haywood (2004) found that the texts were “not very useful if the aim is for the students to acquire formulaic sequences” (p. 270). For one thing, the vocabulary lists included a large number of phrases, which would likely overwhelm the students. For another, the examples of phrases were often decontextualized. Furthermore, the books did not include exercises in which the students had an opportunity to use the phrases. To find an alternative to this method, Jones and Haywood (2004) conducted a study in which they taught formulaic sequences to English learners from a variety of academic disciplines who were attending a university course in English for Academic Purposes. Their exploratory study suffered from problems common to studies conducted in such programs, namely a small number of subjects (only 21), no random assignment to groups, incomplete data due to students being absent on critical days, and a short time in which to teach (only ten weeks). Nevertheless, the research is interesting for the teaching methods it employed as well as for the tentative findings. The formulaic sequences taught in the experiment were chosen from a list compiled by Biber, Conrad, and
Reppen (1998), which was based on frequently occurring phrases in a corpus of five million words covering a wide range of disciplines. Examples included “the nature of,” “as a result of,” “on the one hand,” and “studies have shown that.” Ten of the students in the study were assigned to a treatment group, which received instruction in the formulaic sequences, and eleven of the students were assigned to a control group, which received the regular lessons. Instruction was presented to the treatment group in a reading class and a writing class. In the reading class, instruction involved three steps. First, a text was studied in the normal way with discussion and scanning exercises. Next, the text was read again, but this time in a format with the target sequences printed in bold. Finally, the students completed exercises involving the sequences. In one exercise, for example, they recast the meanings of the sequences in more colloquial language. In the writing class, the students in the treatment group were asked to write four essays using appropriate sequences and to complete exercises involving the sequences. Finally, the students were asked to use a concordancing program to discover formulaic sequences in selected texts. Pre- and post-tests were administered to see whether the treatment group had become more aware of formulaic sequences than the control group. One test was a modified cloze test, where students had to provide the appropriate formulaic sequence based on context and surface cues. For example, the following sentence (which was part of a larger passage) should elicit the phrase “this kind of”: “Too much of th___ k___ o___ chemical might encourage the immune system to stop the development of the embryo.” The results showed that the treatment group gained 1.5 points from pre-test to post-test, whereas the control group gained no points.
Jones and Haywood (2004) concluded, “These results suggest that the modest improvements in the treatment group are due to increased knowledge due to teaching” (p. 285).

We now turn to the difference between formulaic sequences and constructions, beginning with a phrase that qualifies as both: “Fred had guilt written all over his face.” This expression, like many formulaic sequences and constructions, has a place for variables. An informal representation of the sequence could be written as in (1):

(1) {John} had {guilt} written all over {his} face

where the curly brackets contain a variable. Thus, (2) is another example of the expression.

(2) Marsha had disappointment written all over her face.

This sequence would not be appropriate for Jones and Haywood’s (2004) experiment because it is too colloquial, but it would be appropriate for a conversation, and it could be taught using Jones and Haywood’s methods by substituting dialogues and stories for academic readings. However, there could be a problem with such a lesson. Notice that not just any emotion can fit comfortably in the {guilt} slot, as shown in (3) and (4).

(3) ??Marsha had pleasure written all over her face.
(4) ??John had rage written all over his face.

The problem with (3) and (4) can be understood by treating (1) as a construction rather than as just a formulaic sequence. Unlike formulaic sequences, constructions are often related to prototype schemas of lexical items with which they can be fused. Recall that in chapter 7 we saw such a relationship between the ditransitive construction and the prototype schema for “ditransitive verb” (see figures 7.3 and 7.4). The basic meaning of the ditransitive construction is that someone transfers something to someone else, who then possesses it. This event was represented using the formal notation shown in (32) in chapter 7. A simpler formal notation is shown in (5).

(5) SOMEONE TRANSFERS SOMETHING TO SOMEONE ELSE, WHO THEN POSSESSES IT   [NP1 V NP2 NP3]

In this notation the semantic/pragmatic information is written in capital letters on the left, and the associated syntax is written in square brackets on the right. The ditransitive construction contains only abstract syntactic and semantic units, but as we have seen constructions can also contain specific lexical items, and when this is the case, these items are included in the square brackets. Thus, Queller (2001) represents the “written all over” construction as in (6).

(6) SOMEONE’S ATTEMPT TO PRESENT A FRONT OF COMPOSURE IS BEING “MESSED UP” BY AN UNCONSCIOUS DISPERSAL OF EMOTION OVER HIS/HER FACE   [have {guilt} written all over {one’s} face]

Treating (1) as a construction with this semantic interpretation instead of a formulaic sequence allows us to see why (3) and (4) are odd. Someone who is experiencing pleasure or rage does not usually try to present a composed front. Guilt and disappointment, on the other hand, are emotions that people often try to conceal, sometimes without success. Therefore, these lexical items better match the basic meaning of the construction. The implication for teaching is that a careful examination of a construction’s meaning can make clear why possible surface realizations take the form they do. Therefore, constructions should be taught in more depth than other formulaic sequences. The methods used by Jones and Haywood (2004) would be appropriate for teaching constructions, but the lesson should also provide an analysis of the construction’s meaning and a discussion of why only certain lexical items fit the construction. The goal is to help the students understand why certain lexical items, though not predictable from a construction’s meaning, are “motivated” by it. This advice applies to teaching the ditransitive construction, as well. Recall that in the pilot study of the ditransitive construction, the native English speakers’ intuitions formed a continuum of grammaticality, which is shown in table 7.2. The two ends and the middle point on this continuum are represented in (7) to (9).

(7) Sally got me a ticket.
(8) ?Fred pushed him a beer.
(9) *George operated them the projector.

Recall also that in chapter 7 it was claimed that this continuum of grammaticality resulted from the prototype effects produced by trying to fuse verbs that do not belong to one of the permitted categories (as in (9)) or are outlying members of a permitted category (as in (8); see figure 7.4) with the ditransitive construction.3 This account suggests that when teaching the ditransitive, lessons should include not only fully grammatical examples, like give, but also questionable examples, like push, and ungrammatical examples, like operate, along with an explanation of why these latter words don’t fit. In other words, the prototype lexical schemas associated with a construction should be explored. Teachers are often asked, “Why can’t you say x?” and students are often told, “You just can’t,” but in the case of constructions it is possible to provide a more satisfactory answer.

A Philosophy of Language Teaching

Theories of Understanding

This volume began with a discussion of the philosophical foundations of knowledge, and it is fitting to end with a discussion of the teaching implications of different schools of philosophy. The oldest of the three schools we will discuss is usually called positivism, though cognitive linguists (e.g., Lakoff, 1987; Johnson, 1987) use the term objectivism. Objectivism has been the dominant philosophy of science in the West since Aristotle. This philosophy provides a common-sense answer to two fundamental questions regarding human understanding: the ontological question, What is the nature of reality? and the epistemological question, How do we know what we know?
The objectivist’s answer to the ontological question is that the natural world consists of objects that have certain properties, such as weight and density, and that exist in certain relationships to each other, such as “the rock is in the river” or “the bird is flying over the tree.”4 The objectivist’s answer to the epistemological question is that the mind constructs models of reality (that is, schemas), which reflect the objects, properties, and relationships that exist independently
in the world. The central assumption of objectivist psychology is that these schemas accurately represent the world—that the mind is a mirror of nature. However, according to objectivism, it is important not to confuse external reality with mental models of reality. Therefore, objectivism endorses the “independence assumption,” which Lakoﬀ (1987, p. 164) states as follows: “no true fact can depend on people’s believing it, on their knowledge of it, on their conceptualization of it, or on any other aspect of human cognition.” Thus, objectivism posits a “God’s eye” view of the universe, independent of human perception, in which all objects, properties, and relationships are correctly characterized. In other words, to the question “If a tree falls in the forest and no one hears it, does it make a sound?” the objectivist answers, “Yes.” Objectivism has been challenged by the philosophy of social constructionism (Geertz, 1983; Kuhn, 1970, 1977; Rorty, 1979, 1989). Social constructionists disagree with a central tenet of objectivism: the claim that there are two diﬀerent kinds of “facts,” which Lakoﬀ (1987, p. 170) calls “brute facts” and “institutional facts.” Brute facts involve the objects, properties, and relationships in the physical world, such as the speed of light and the relative density of gold and water. According to the independence assumption, such facts are true regardless of any human institution. Objectivists believe that scientiﬁc theories are grounded in brute facts, and therefore that these theories can be objectively evaluated. Theories that correspond to and predict the actual brute facts of nature are true; other theories are false. Institutional facts, on the other hand, are set up by human beings. They include customs, like leaving a tip in a restaurant, and agreed-upon states of aﬀairs, like the fact that George W. Bush was elected President of the United States (or was he? You can see that there is room for disagreement here). 
Institutional facts include all of a society’s laws, beliefs, customs, and myths. Social constructionists contend that there is no absolute dichotomy between brute facts and institutional facts. Obviously, institutional facts violate the independence principle: they depend entirely on human understanding, and they can be changed. For example, in Eastern Europe institutional facts involving national boundaries and political alliances have changed fairly recently. But, as Kuhn (1970) has pointed out, this is the case with brute facts as well. Many scientiﬁc “facts” have been discredited. For example, scientists no longer believe that matter consists of four elements, that the sun revolves around the earth, or that space is ﬁlled with an undetectable substance called “ether.” Social constructionists say that brute facts, like social facts, are constructed by societies, in this case a society of scientists. Therefore, they deny that there is an essential diﬀerence between brute facts and institutional facts and that any form of knowledge is ﬁrmly grounded in reality. There is no “God’s eye” view of nature. Social constructionism proposes a relativistic theory of knowledge. Rorty (1989), for example, claims that scientiﬁc theories can only be judged as “true” in relation to a particular group of people at a particular time.

Actually, not all social constructionists hold this strongly relativistic position. Kuhn, perhaps the best-known social constructionist, seems to place a higher value on sensory experience in evaluating scientific theories than does Rorty. Kuhn (1970, p. 126) asks, “[Is] sensory experience fixed and neutral? The [objectivist] viewpoint . . . dictates an immediate and unequivocal Yes! But in the absence of a developed alternative, I find it impossible to relinquish entirely that viewpoint.” Kuhn’s (1970) reservations about the absolute relativity of knowledge foreshadowed the third school of philosophy we will look at, which is called experiential realism and is endorsed by many cognitive linguists. But first, let us consider some of the teaching implications of the two philosophies of science we have discussed so far.

Teaching Implications of Objectivism and Social Constructionism

According to Bruffee (1986), both the objectivist and the social constructionist views of knowledge have analogs in teaching. The objectivist believes that knowledge about brute facts, and by extension “proven” scientific theories, is authoritative: certain claims are true and others are false. The ultimate authority is nature, but next in authority is the scientist who understands nature; thus, there is no point in discussing or debating scientific facts. The classroom analog of this view is the lecture course, where an authoritative teacher stands at the front of the room and supplies facts to the students. The flow of information is from the authority to the neophyte. The social constructionist believes that all knowledge, including scientific knowledge, is collaboratively constructed. Authority resides in a society of experts who agree that certain assumptions and approaches are fruitful. Members of this society interact in conversations and by writing books, articles, letters, and e-mail messages.
The process of expounding, criticizing, and revising ideas within a scholarly community is called the “hermeneutic circle.” In science, the hermeneutic circle includes reports of experiments, but experimental results are suggestive rather than conclusive. As Kuhn (1970) points out, no scientific theory is without exceptions and problems, and scholars must interpret and assess how new data affect a dominant theory. Sometimes when experimental results call a theory into question, scholars consider the results to be a special case or a convenient fiction that does not change the basic theory. For example, for at least 50 years after Copernicus proposed the heliocentric universe, astronomers accepted the utility of his model for calculating the location of the planets, but they did not believe that the model was literally true (Kuhn, 1959). Furthermore, sometimes experiments cannot decide between competing theories, and the scientist must hold several theories in mind, using whichever theory is most helpful for dealing with a particular phenomenon. As Feynman (1965) observes,

Every theoretical physicist who is any good knows six or seven different
theoretical representations for exactly the same physics. He knows that . . . nobody is ever going to be able to decide which one is right at that level, but he keeps them in his head, hoping that they will give him different ideas for guessing. (p. 168)

Bruffee (1984) says that the classroom analog of the social constructionists’ model of knowledge is a circle of scholars constructing the network of schemas for a particular area of knowledge. The job of the teacher is to engage students in the ongoing conversation of an academic discipline, that is, to introduce them into the hermeneutic circle. He states, “Our task must involve engaging students in conversation among themselves at as many points in both the writing and reading process as possible, and that we should continue to ensure that students’ conversations about what they read and write are similar in as many ways as possible to the way we would like them eventually to read and write” (Bruffee 1984, p. 642).

Experiential Realism

Experiential realism is the philosophy of science associated with cognitive linguistics (Lakoff, 1987; Johnson, 1987). It endorses the social constructionist claim that mental models of institutional facts are entirely socially constructed, but it rejects the claim of some social constructionists that mental models of physical reality can differ radically in different societies.5 Rather, it holds that such models are constructed, to some extent, by an interaction between the human perceptual and cognitive apparatus and the physical world. This construction is constrained by universals of perception and cognition and by the universality of certain basic aspects of human experience, and not just by social beliefs. Experiential realists say that human beings have a “concept-making capacity” that allows us to learn about (“construct”) reality directly as well as by means of language and teaching.
Because this cognitive apparatus is universal in the species and because basic experience with physical objects is similar in all societies, “directly known” knowledge is similar as well. Such knowledge provides a grounding for schemas of brute facts. Evidence for these claims comes from studies of universals in language and language acquisition. Color categorization is an example of how the human concept-making capacity interacts with socially constructed mental models. The number of basic color terms in the languages of the world varies from two to eleven, but it does not vary without limit. There are many languages that have a term for a color that corresponds to both green and blue in the English system. But there are no languages that have a term that corresponds to both green and red. This is because the human optical apparatus perceives green and blue as similar and green and red as maximally diﬀerent. Furthermore, as explained in the appendix, when languages expand their basic color terms, they all take the same route, ﬁrst distinguishing colors that maximally contrast in our perception.

Cognitive Linguistics claims that our capacity to construct concepts in other domains is similarly constrained by species-speciﬁc perceptual and cognitive mechanisms. These allow us to construct two kinds of mental representations: image schemas and schemas for basic level concepts. Lakoﬀ (1987) notes, “Since image schemas are common to all human beings, as are the principles that determine basic-level concepts, total relativism is ruled out, though limited relativism is permitted” (p. 268). It is beyond the scope of this volume to present the theory of image schemas and basic level objects in full. Instead I will attempt brieﬂy to outline the theory in connection with some of the supporting evidence and refer the reader to more comprehensive discussions. Image schemas are highly abstract mental representations of basic physical relationships, such as an agent exerting force on an object. Johnson (1987) points out that this concept is ubiquitous in human experience. We push a door and it opens; we throw a baseball and it ﬂies. According to Johnson, such universal experiences give rise to the “force” image schema. This schema is an emergent Gestalt concept that “exists for us prelinguistically, though [it] can be considerably reﬁned and elaborated as a result of the acquisition of language” (1987, p. 48). A drawing of the force image schema appears in ﬁgure 9.2, which indicates that when an object, or trajector, is impacted by a force (represented by the solid line), it moves along a path (represented by the dotted line). How the force image schema emerges in children and how language is mapped onto it are suggested in studies by Slobin (1973, 1985), who found that children from diﬀerent language backgrounds employ certain kinds of linguistic structures before others. He claimed that these structures reﬂect “prototypical scenes” in a child’s experience. 
One such scene is the manipulative activity scene, which Slobin (1985) describes as follows:

Manipulative activities involve a cluster of interrelated notions, including: the concepts representing the physical objects themselves, along with sensorimotor concepts of physical agency involving the hands and perceptual-cognitive changes of state and change of location, along with some . . . notions of . . . causality, embedded in interactional formats of requesting, giving and taking. (p. 1175)

Figure 9.2 The force image schema.

Certain parts of this scene can be marked grammatically. For example, in many languages direct objects are marked for the accusative case. But Slobin claims that children learn to mark “prototypical” objects, that is, objects that are physically affected (such as the object of the action of giving) before they
learn to mark objects that are not physically aﬀected (such as the object of the act of seeing). A similar situation occurs in the acquisition of Kaluli (Schieﬀelin, 1985), an ergative language which marks the subject of verbs that take a direct object. Here children ﬁrst learn to use the ergative marker with the subjects of verbs that involve direct physical manipulation. According to Slobin, these facts suggest that at ﬁrst children do not mark grammatical classes, like direct object, or subject of ergative verb, but rather semantic classes, like the object or agent in the manipulative activity scene. Slobin suggests that children understand the manipulative activity scene directly (perhaps in the form of a force image schema) based on their interaction with objects in the world as mediated by human perception and basic cognition, and that they map language onto this emergent Gestalt concept. This kind of understanding, he says, is universal and accounts for “basic child grammar,” a grammatical system shared by all children in the beginning stages of language learning. Further evidence that knowledge of basic physical relationships can be understood directly is provided by Talmy (1985a, 1985b), who claims that these relationships form a privileged class of knowledge that all languages tend to represent by grammatical rather than by lexical devices. One example is English prepositions. Prepositions are grammatical rather than lexical because they belong to a closed class of words that is relatively small and cannot be added to easily. Talmy (1985a) points out that prepositions assume image schemas that contain two basic elements, which he called “ﬁgure” and “ground” (more recently the terms trajector and landmark have been used; see the discussion of ﬁgure 9.3 below). A landmark is an entity that can be construed as a reference point and a trajector is an entity that is located with respect to a landmark. 
For example, the preposition across in “The man swam across the pool” assumes an image schema where the pool is the landmark and the man is the trajector. Talmy claims that there are universal constraints on how image schemas can characterize the possible relationships between trajector and landmark. The trajector is typically smaller than the landmark, so “The towel lay across the body” sounds more natural than “The body lay across the towel” (unless it’s a very big towel). Furthermore, the absolute sizes of the trajector and landmark do not seem to matter. Thus, “The ant walked across the paper” and “The bus drove across the country” sound equally natural. One aspect of the landmark that languages typically encode grammatically is its state or constituent structure. Through refers to an image schema in which the ground is some medium, such as water or trees, rather than a flat plane. “The man walked through the field” cannot refer to a plowed field but only to a field covered with some substance such as wheat.

Figure 9.3 The transfer of possession schema.

Cognitive linguists claim that image schemas also underlie grammatical constructions. For example, the “transfer of possession” image schema, shown in figure 9.3, underlies the ditransitive construction. Recall that the ditransitive
construction has the prototypical meaning “someone gives something to someone else, who then possesses it.” Figure 9.3 says that a trajector (object) moves from a landmark that is the source of the transfer (the circle on the left) to a landmark that is the goal of the transfer (the circle on the right) and remains there. Talmy (1985a) characterizes the cognitive process of creating image schemas as a “boiling down of objects, in all their bulk and physicality” (p. 232) to idealized and abstract images. Like Lakoﬀ (1987) and Johnson (1987), he attributes the fact that such schematization appears to work in similar ways in all languages to the universal nature of the human perceptual and cognitive apparatus interacting with physical reality. “The explanation [for similarities] can be found in our very mode—in large part presumably innate—of conceiving, perceiving, and interacting with the contents of space” (Talmy 1985a, p. 233). According to Johnson (1987), image schemas underlie and permeate our language-based network of concepts for objects and events, and thus make possible our understanding of the world. They “provide a basis for and can connect up with our . . . networks or webs of meaning. Without them, we cannot explain the connections and relationships that obtain in our semantic networks” (p.189). Johnson does not claim that all languages are built on exactly the same image schemas; only that image schemas are substantially similar in all languages. A second foundational notion of experiential realism is the notion of basic level objects, the evidence for which comes mostly from studies of categorization. All languages categorize objects at various levels of abstraction. The least abstract category is one that contains a particular object. For example, the category “Montserrat” (my cat) contains one member. 
In English there are a number of more abstract categories to which Montserrat belongs: “Siamese cat,” “cat,” “animal,” and “living thing.” According to Lakoﬀ (1987), languages may diﬀer considerably in their systems of categorization, but most languages will have a word denoting objects at the basic level, which is at an intermediate level of abstractness. Thus, most languages will have a word that corresponds to “cat,” but not necessarily to “Siamese cat” or “animal” or “living thing.” Languages tend to have words for basic level categories because it is at this level of abstraction that the properties shared by members of categories are most salient to human beings. The basic level of categorization for objects is

distinguished from more and less abstract levels by three characteristics: (1) Gestalt perception of the object’s overall shape; (2) the human capacity for physical interaction with the object; and (3) the ability to form a rich mental image of the object (it is possible to form a mental image of “Montserrat,” “Siamese cat,” or “cat,” but not of “animal”). Brown (1965), who ﬁrst proposed the basic level theory, noticed that basic level categories are among the ﬁrst to be named by children. This is so because the overall shape and manner of bodily interaction with objects at this level are salient to children. Brown observed that this latter characteristic is particularly important. He noted: When something is categorized, it is regarded as equivalent to certain other things. For what purposes equivalent? . . . Flowers are equivalent in that they are agreeable to smell and are pickable. Cats are equivalent in that they are to be petted, but gently, so that they do not claw. (Brown, 1965, pp. 318–319) Notice that the prelinguistic ability to conceptualize basic level objects is also assumed in Slobin’s (1973, 1985) account of the manipulative activity scene. In order for children to understand that a force is acting on an object, they must ﬁrst understand what an object is. The claim that knowledge in the form of image schemas and schemas for basic level objects is “preconceptual” (known directly, and not by means of language) explains how human beings who do not have language, such as infants and feral children, can understand the world. This claim is also consistent with the experience of Helen Keller (1988), who reported that before she learned her ﬁrst word, “water,” she knew perfectly well what the cool liquid was, and did not confuse it with milk or bread. However, Johnson (1987) acknowledges that directly known knowledge can be extended and built upon considerably by diﬀerent linguistic and conceptual systems, and thus most knowledge is relative. 
Experiential realism stakes out a position between radical relativism and objectivism. It claims that all societies are “plugged in” to the physical world by means of image schemas and schemas for basic level objects. To summarize, according to experiential realism schemas for social facts are entirely socially constructed and therefore can differ radically from culture to culture, but schemas for brute facts involving basic level objects and physical relationships are grounded in human perception and cognition as well as universal experience. Therefore, schemas for these facts, like schemas for color systems, cannot differ without limit. Notice that experiential realism does not endorse the objectivist claim that schemas are “true” reflections of reality. Rather, it claims that these representations are “true” in relation to human beings.

Teaching Implications of Experiential Realism

Experiential realism endorses the social constructionist model of knowledge in most respects, and therefore has similar implications for teaching. The

metaphor of education as an ongoing conversation in which students take a greater and greater part seems particularly appropriate for language teaching. This metaphor, after all, has guided the communicative competence approach to language teaching (Savignon, 1983). The metaphor is also appropriate for content-based instruction (Adamson, 1993, 2005; Brinton and Master, 1997; Snow and Brinton, 1997), where the goal is to introduce students to an academic culture. We now consider the ways in which experiential realism differs from social constructionism, and therefore has additional implications for language teaching. The theory that we can directly know certain concepts helps to explain why some methods of beginning language instruction are successful and provides an answer to the common question: How can you teach English if you don’t speak the students’ language? Most communicative language teaching methods introduce the target language in connection with basic level objects and simple topological relationships. For example, in the Total Physical Response method (Asher, 1969), the teacher gives commands: “Walk to the door,” “Turn around,” “Put the pen on the book,” while demonstrating what the commands mean. The students then perform the actions themselves. No translation is necessary here. These actions and objects are understood directly. A second method that involves the manipulation of basic level objects and topological relationships is The Silent Way (Gattegno, 1972). In this method the objects and relationships are modeled by using colored rods while they are described in the target language: “The red rod is above the green rod,” etc. The Silent Way provides a good example of what is meant by direct or prelinguistic understanding because it is not necessary for the students’ native language to mark the distinctions that can be taught with the rods. 
For example, as we have seen, many languages do not have separate words for blue and green, but as Rosch (1973) found, human beings can readily learn to mark these colors lexically, even though their language does not, because the two hues correspond to natural divisions imposed on the color spectrum by human optical physiology (Kay and McDaniel, 1978). Thus, students have no diﬃculty learning to name the two colors. Similarly, Spanish-speaking students quickly learn the diﬀerence between “the rod is on the glass” and “the rod is in the glass” when it is demonstrated, even though the Spanish preposition en would be used in both cases. In regard to higher level teaching, and in particular to content-based instruction, the theory of image schemas and basic level objects helps to explain why teaching based on demonstration is more eﬀective than expository teaching, which is based entirely on language. Adamson (1993) describes the case of Ceyong, a seventh-grade student from Korea, who was unable to understand scientiﬁc concepts such as speciﬁc gravity on the basis of her teacher’s lectures. However, when Ceyong’s tutor took her to a laboratory where she was able to measure the speciﬁc gravity of various minerals using

scales and beakers of water, the concept became clear to her. Experiential teaching was endorsed long ago by John Dewey, who said: When education . . . fails to recognize that the primary or initial subject matter always exists as matter of an active doing, involving the use of the body and the handling of material, the subject matter of instruction is isolated from the needs and purposes of the learner, and so becomes just something to be memorized and reproduced upon demand. (1916, p. 184) It is important to emphasize that content-based teaching must prepare students to beneﬁt from expository teaching as well as experiential teaching. Much of the academic conversation in which we desire students to participate is expository in nature. The point is that experiential teaching can provide an entry into this conversation. The goal of the content-based course should be to provide the most eﬀective mixture of expository and experiential instruction. In general, second language students will require considerably more experiential instruction than will native speakers, as recommended by many experts, including Valdez (2001) and Rigg and Enright (1986). To summarize, in this section I have attempted to show that the philosophy of experiential realism can provide an epistemology that is compatible with eﬀective approaches to language teaching, both at the introductory and advanced levels. According to this philosophy, most kinds of knowledge are socially constructed, and authority resides not in an objective standard of truth, but in a community of experts who share a paradigm that provides a particular vocabulary and a set of assumptions about what counts as a good argument. Scholarship is viewed as a continuing conversation among members of this community, which continually results in a reﬁnement of vocabulary and occasionally in a shift of paradigms. Education is viewed as increasing participation in this conversation. 
This model of knowledge implies that language instruction should be interactive and collaborative. Experiential realism also claims that some knowledge can be constructed by direct sensory experience, and therefore is independent of language. Thus, experiential teaching that involves the observation and manipulation of objects and relationships can provide an entry into the academic conversation of a second language.

Appendix Variation and Change in Color Semantics

Introduction

Scholars who study color category systems have recognized that their work is similar to the study of variation and change in other linguistic systems. Berlin and Kay (1969) pointed out the similarities between their theory of the development of color systems and Jakobson’s (1962) theory of children’s phonemic development. Kay (1975) pointed out similarities with Weinreich, Labov, and Herzog’s (1968) theory of sound change, noting that in both cases diachronic change is accompanied by synchronic variability and heterogeneity. Investigations of sound systems described in Labov (1982, 1994, 2001a), and of color systems described in MacLaury (1986, 1991), point up both similarities and differences in how change takes place in these two cognitive systems. In this appendix, the theory of color category change is discussed within the framework of sound change outlined in chapter 3.

Background: The Evolution of Color Categories

Berlin and Kay (1969) studied systems of color categorization in 98 languages. They noted that all languages have an infinite number of terms for describing colors, ranging from “red,” to “reddish-brown,” to “the color of the rust on my aunt’s Chevrolet.” However, among this infinite number of terms there are a maximum of 11 basic color terms. A color term is basic if, among other things, it is a single morpheme, is not derived from another term (like reddish-brown), and uniquely names a region of the color spectrum. The set of basic color terms exhaustively covers the color spectrum. For each individual, the basic color terms form a cognitive system that is familiar and easily accessible. The individual will know and use many secondary color terms that are not part of the basic system, but the basic terms will be used most frequently. The 11 basic color terms in English are: white, black, red, green, yellow, blue, brown, purple, pink, orange, and gray. 
Many languages have fewer than 11 basic terms, but these languages can be arranged in a fairly strict implicational relationship. This relationship, as modified by Kay (1975), is shown in figure A.1, which says that if a language has only two color terms, they will be light-warm and dark-cool. If a language has only three color terms, they will be white, warm, and dark-cool, and so on. Berlin and Kay (1969) claimed that their synchronic study of color categorization systems has diachronic implications. That is, the implicational

Figure A.1 Stages of evolution in color categories.

relationships in ﬁgure A.1 apply not only to the description of present-day systems, but also to the historical development of color category systems. Thus, early systems contained the terms in the early stages of ﬁgure A.1, and as they developed they added the terms in the later stages. Such evolution can be observed today in the changing color systems of developing societies, as described below. What is the motivating force that causes a society to expand its stock of basic color terms? Berlin and Kay (1969, p. 16) observe: There appears to be a positive correlation between general cultural complexity (and/or level of technological development) and complexity of color vocabulary . . . Increase in the number of basic color terms may be seen as part of a general increase in vocabulary, a response to an informationally richer cultural environment about which speakers must communicate eﬀectively.

Recall from chapter 3 that Labov’s theory of sound change includes social, functional, and physiological factors. Social factors provide the strongest motivation for sound change. The functional principle of maximum contrast is a motivating force behind chain shifts, where, for example, the overcrowding of phonemes in one area of phonetic space results in a phoneme being pushed into a diﬀerent area. Physiological factors include the fact that vowels tend to move toward the front of the mouth because it has more articulatory space. Labov (2001a) believes that although physiological factors have a strong inﬂuence on sound change, they interact with and can be overridden by social factors, “just as a boat may tack into the wind” (p. 499). In contrast, Berlin and Kay’s (1969) theory of change in color systems is a functional theory narrowly constrained by physiological universals that limit the direction that change can take. Their theory is functional because it claims that if the informational load on a color term becomes too great, a new color term is added. The new term will name the hue that maximally contrasts with the hues already named. Maximum contrast is determined by the physiological nature of the human optical apparatus. Kay (1975) and Dougherty (1977) add a social component to the theory of color category development, noting that how far an individual has progressed along the hierarchy shown in ﬁgure A.1 is related to social factors. Age is the factor most strongly correlated with color category development. Kay (1975) reports that in studies of color categorization in Aguaruna (Berlin and Berlin, 1975), Futunese (Dougherty, 1975), and Binumarien (Hage and Hawkes, 1975), younger speakers progressed signiﬁcantly further in color category development than older speakers. He concludes, “All the signiﬁcant diﬀerences point in the same direction—that younger speakers have more advanced basic color systems than older speakers” (Kay, 1975, p. 269). 
Another social constraint on the development of color category systems is how much exposure an individual has to more advanced color systems. Berlin and Berlin (1975) report that, for Aguaruna speakers, contact with Hispanic culture and proficiency in Spanish correlate with advanced color term development. Similarly, Hage and Hawkes (1975) report that for Binumarien, speakers with more exposure to the Neo-Melanesian language are more advanced. On the other hand, Dougherty (1975, 1977) reports that for Futunese, the development of color terms appears to be negatively correlated with exposure to Neo-Melanesian.

Change in Color Category Systems

The theory of change in color category systems outlined above can be compared to the theory of change in linguistic systems outlined by Weinreich, Labov, and Herzog (1968) and Labov (1982, 1994), which was reviewed in chapter 3. These scholars provide a useful framework for the study of linguistic change by specifying five problems with which such a theory must deal. These

are the problems of constraints, transition, embedding, evaluation, and actuation. All of these except the constraints problem (which concerns only ease of articulation) have both a linguistic aspect and a social aspect.

The Constraints Problem

The explanation of the physiological constraints on color development is a robust area of cognitive theory. The following discussion is based on the theory presented in MacLaury (1986, 1991), which contains three physiological principles: (1) due to the structure of the optical system people perceive six purest colors (red, yellow, green, blue, white, and black), which can be arranged into exactly fifteen different pairs: red-yellow, red-green, red-blue, yellow-green, yellow-blue, yellow-white, etc.; (2) the two colors in each pair are perceived to be similar to some extent and different to some extent; (3) the strength of this similarity and difference (or contrast) differs for each of the fifteen pairs of colors. For example, the pair blue/green has a high degree of similarity and a low degree of contrast, whereas the pair red/yellow has a low degree of similarity and a high degree of contrast. As mentioned earlier, a color category system adds a basic color by adding a hue that maximally contrasts with the existing basic colors. MacLaury’s final principle is a cognitive principle that explains gradual change in systems as a whole: individuals show variation in their attention to the similarity and difference in color pairs. This principle is discussed below as part of the actuation problem. In sum, physiological constraints do not totally determine sound change or color category change; however, they are much stronger in regard to color change.

The Transition Problem

The transition problem in color change concerns the route by which a new basic color term is added. 
The most common route involves five phases: (1) near-synonymy with an older term; (2) coextensivity; (3) inclusion; (4) overlapping (a phase that does not always occur); and (5) complementation. In phase 1, when a new color term appears, it has nearly the same range (refers to the same hues) and the same focal point (best example) as an existing color term. Thus, the two terms are virtually synonymous. In phase 2, the focal point of the new term shifts, although the range remains the same. In this phase, the old term and the new term are no longer synonymous. However, they are coextensive because all the hues referred to by the new term could, in a stretch, also be referred to by the old term. An English example of coextensive terms might be rose and pink (although these are not basic color terms). An individual might name exactly the same hues both rose and pink; however, pink would be more appropriate for certain hues and rose more appropriate for others. In phase 3, inclusion, the range of the new term retreats so that it refers exclusively to the area around its focal point. An English example might be the term scarlet, which is completely included within red. The older term can still refer variably

to the focal point of the new term, though this would be less likely. In phase 4, overlapping, the older term can no longer refer to the focal area around the new term; however, the fuzzy area in between the two focal points can still be named by each term. In phase 5, complementation, the two terms are in complementary distribution. As mentioned, phase 4, overlapping, is not attested in all color studies, so apparently the system can go directly from inclusion to complementary distribution. Thus, in color change, the range of the older term retreats so that overlapping is first reduced and then eliminated as the old and new terms become mutually exclusive. At that point the information load is equally distributed between the two terms. Phases 4 and 5 are crucial, for in these phases a unique form–meaning relationship is created, and a new basic color term enters the language. The route of color change is similar in some ways to the route of regular sound change described in chapter 3 because in the early stages of both cases a new term can refer to all members of an existing class. In phase 1 of color change, the new term applies variably to all of the hues named by the old term. In sound change, the new phone can be used variably in all the words in the word class, although, typically, there is environmental conditioning. For example, the centralization of /aw/ on Martha’s Vineyard was more advanced before voiceless consonants. Thus, incipient sound change normally corresponds to phase 2 of color category change. However, regular sound change is different from color category change in that it does not produce a new basic category, but merely alters the pronunciation of an existing category. The route of color change is similar to the route of morphological change in child language acquisition and in decreolization because in these cases a developing system expands to create new basic units. In child language acquisition, Slobin (1973, p. 
184) observes, “New forms first express old functions” (emphasis in original), a description that exactly fits phase 1 of color change, near-synonymy. Similarly, in decreolization, according to Bickerton (1975), a characteristic strategy is for new morphemes to be slotted into place in creole structures, semantic as well as syntactic (p. 70). An example from Guyanese Creole is the replacement of bin by di/did. Bin is an anterior time marker that is used most frequently with stative events as in (1), where it indicates a simple past.

(1) dem bin gat wan lil haus
    “They had a little house.” (Bickerton, 1975, p. 35)

For nonstative events, simple past is most frequently indicated by the verb stem alone. The change from basilectal bin to mesolectal di/did appears to occur in a way that is similar to the color category change. In phase 1, near-synonymy, di/did is in free variation with bin, so that there is no distributional diﬀerence between (1) and (2).

(2) dem di/did gat wan lil haus
    “They had a little house.”
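The five phases of the transition route described above (near-synonymy, coextensivity, inclusion, overlapping, and complementation) can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a model proposed by MacLaury or by the studies cited: the `ColorTerm` class, the integer hue indices, and the `transition_phase` function are all hypothetical, and "nearly the same range" is simplified to "exactly the same set of hues."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColorTerm:
    """Hypothetical representation of a color term (an assumption for illustration)."""
    name: str
    hues: frozenset  # indices of the color chips the term can name (its range)
    focus: int       # index of the best example (its focal point)


def transition_phase(old: ColorTerm, new: ColorTerm) -> str:
    """Classify the relation between an older and a newer term into one of the five phases."""
    if new.hues == old.hues:
        # Same range: the terms differ, if at all, only in focal point.
        return "near-synonymy" if new.focus == old.focus else "coextensivity"
    if new.hues < old.hues:
        # The new term's range has retreated inside the old term's range.
        return "inclusion"
    if new.hues & old.hues:
        # Only a fuzzy area between the focal points is shared.
        return "overlapping"
    # No shared hues: the terms are in complementary distribution.
    return "complementation"


old = ColorTerm("old", frozenset(range(10)), focus=2)
print(transition_phase(old, ColorTerm("new", frozenset(range(10)), focus=2)))
print(transition_phase(old, ColorTerm("new", frozenset(range(10)), focus=7)))
print(transition_phase(old, ColorTerm("new", frozenset(range(5, 10)), focus=7)))
```

The same set relations could in principle be applied to the environments in which two competing morphemes occur, although the percentages reported for such cases are gradient rather than categorical.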

In phases 2 and 3, coextensivity and inclusion, di/did still alternates in all semantic environments with bin, but it appears to develop a focal point in a particular semantic ﬁeld, namely use with nonstative verbs. Bickerton (1975) notes that at the low mesolectal level, bin occurs with 75 percent of past statives and 25 percent of past nonstatives. The ﬁgures for di/did exactly reverse this distribution. An equivalent to phases 4 and 5 of color change is not attested in the replacement of bin by di/did. However, it seems reasonable that these phases could occur, if only for brief periods of time. In phase 4, overlapping, bin would occur with prototypical cases of statives, and di/did would occur with prototypical cases of nonstatives. In a system that has reached phase 5, complementarity, bin would occur only with prototypical statives and did would occur in all other cases. In decreolization, of course, there is a further phase where did completely replaces bin, and the stative/nonstative marking of past time disappears. The social aspect of the transition problem in both sound change and color category change involves the question of the locus of change. Does signiﬁcant change occur in individuals over the course of their lifetimes, or does change occur mainly from generation to generation? As we saw in chapter 3, Labov (2001a) views the locus of major systemic sound change as generational, not individual. He believes that once an individual’s sound system has been set, it can be modiﬁed only in respect to low level rules, such as raising or lowering rules, which produce a diﬀerent allophone of a phoneme. However, tensing rules, which are likely to produce a new phoneme, can be learned only by children. The situation with color change may be similar. Perhaps individuals do not change the basic color systems that they ﬁrst learn, but rather add new secondary colors to the system, which are used with increasing frequency. 
When the children of these speakers construct their own color systems, they incorporate these secondary colors as basic colors. This hypothesis is supported by Kay (1975), who finds that in general younger speakers have more basic color categories than older speakers. In addition, MacLaury (1986, 1991) shows that older speakers use secondary color terms that younger speakers adopt as basic. The absence of longitudinal data precludes the conclusion that older speakers innovated the terms when they were young, but this seems likely. In research conducted in southern Mexico (MacLaury, 1986, pp. 320–324; Burgess, Kempton, and MacLaury, 1983, fig. 7), one Tarahumara speaker was interviewed twice, once in his village and again two years later in Oaxaca, where he was far from home and had just finished two months of intensive linguistic training. The later data showed shrinkage and polarization of categories and the use of more qualifiers, signs of stronger emphasis on distinctiveness. But

the speaker had not added new basic color categories. Thus, some evidence suggests that individual adults make secondary changes but not basic changes in both color and sound and, therefore, that in both systems the locus of basic change is normally between generations (however, see the discussion in regard to the embedding problem).

The Evaluation Problem

A surprising feature of color category research in Mesoamerica is the extreme variation that is seen between members of the same speech community. As a rule, people who interact daily differ vastly in the organization and complexity of their color category systems. For example, speaker A might have three basic color terms, and speaker B might have ten. Neither is aware of their difference, and both are surprised when the difference becomes apparent during the course of elicitation. Berlin and Berlin (1975, p. 86, note 9) report young Aguaruna speakers who register surprise and even laugh at their senior relatives as the elders label color chips for an investigator. MacLaury (1991) reports that, in spite of this extreme difference in color category systems, apparently communication does not break down. However, closer investigation may reveal that misunderstandings are more frequent than presently supposed, as was the case with the speakers with different phonological systems described in chapter 3.

The Actuation Problem and the Embedding Problem

Recall from chapter 3 that social factors are a major motivation for sound change. MacLaury (1991) claims that social factors can also affect color change. An example is the case of Tzeltal and Tzotzil, two Mayan languages of southern Mexico. Tzeltal is proceeding along the expected path of color development with some speakers at stage III of figure A.1 and other speakers at stages V and even VI. 
Tzotzil speakers exhibit a similar range from stage III through stage VI, but they also show a remarkable phenomenon: the various stages of color terms can coexist within the same speaker. For example, one subject had a term for green and a term for blue, but also a term for green-blue. In fact, this subject had preserved all the terms of the older stage III system with their original meanings while adding new terms to create a stage V system. In the normal course of development, when younger speakers add a new basic color term, the meaning of an older term is modified to denote a smaller range. That is, the new system is built out of the old system, not created alongside it. The explanation for the unusual development of color terms in Tzotzil can be found in the social circumstances of the community. Tzotzil is spoken in the village of Navenchauc, and Tzeltal is spoken in the village of Tenejapa. Tenejapa is an isolated community located at the end of a dirt road, which is exposed to little external influence. Navenchauc, on the other hand, has been exposed to massive external influence ever since the Pan-American highway was built

through the town. In reaction to this threat to tradition, the village has become extremely conservative, as seen in the inhabitants’ traditional patterns of dress and in their guarded relationships with outsiders. MacLaury (1991) hypothesizes that the older system of color terms in Tzotzil has become emblematic of the culture and thus has resisted extinction despite the emergence of the new color terms.

Cognitive Aspects of Sound Change and Color Change

It is apparent from the discussion so far that sociolinguists find the motivation for sound change mainly within the social realm whereas cognitive anthropologists find the motivation for color change mainly within the cognitive realm. This difference undoubtedly stems, in part, from the academic orientation of the two disciplines. Nevertheless, Labov (1979, 1994) has acknowledged that a full explanation of linguistic change must include cognitive as well as social considerations. For example, cognitive factors seem to be necessary to explain the very beginnings of sound change. In the search for innovators, individual differences within the same social network must be explained. In chapter 3 we saw that Labov (2001a) identified innovators as individuals who are leaders within their social network and have contacts outside the network. But some earlier research by Labov (1979) suggests that cognitive differences between individuals may play a role, as well. Labov (1979) reported that repetition tests administered to adolescent African American English (AAE) speakers produced highly variable results. In their spontaneous speech, all of the subjects showed equivalent proficiency in the AAE vernacular, yet some subjects were unable to repeat Standard English constructions, whereas others, whom Labov calls “verbal leaders,” had little difficulty. 
Labov (1979) characterizes the ﬁrst group as “dialect bound.” The distinction between dialect bound and non-dialect bound individuals corresponds to Day’s (1979) distinction between language bound and language optional individuals. Language bound individuals tend to perceive stimuli in terms of existing mental schemas, ignoring non-categorical diﬀerences between stimuli. The language bound/language optional distinction between individuals recalls MacLaury’s (1991) distinction discussed above between individuals who focus on similarity (corresponding to language bound speakers) and individuals who focus on distinctiveness (corresponding to language optional speakers). Thus, the same cognitive principle may help to explain individual innovation in color category development and in phonological change. Within a society exposed to novelty, individuals who are language optional will be innovators of color category development. Within a middle level social group, individuals who are language optional and who have many contacts outside the group (and are thus exposed to linguistic novelty) may be innovators of sound change. If these innovators have high local status, the change may spread and become emblematic of the group.

Conclusions

Color category change has been studied largely within a cognitive paradigm, and sound change has been studied largely within a sociological paradigm, but both cognitive and sociological factors are necessary to explain both types of change fully.

Notes

Chapter 2 1. This chapter is based on Adamson and Regan (1991). 2. This section is based closely on Houston’s (1985) discussion.

Chapter 3 1. The parentheses indicate that a form is variably realized. This notation is used when the discussion speciﬁcally focuses on language change. Otherwise, the traditional slashes and brackets are used.

Chapter 4 1. This percentage has dramatically increased in recent years (Regan, V. personal communication, December 9, 2007). 2. Montreal has a similar French immersion program (but not identical: the Montreal program involves 100 percent French immersion in grades 1 to 3). For a review, see Adamson (2005). Montreal’s program is often cited as the source of the instructional method called structured immersion, which is designated by constitutional amendment as the only legal method allowed for instructing English language learners in the states of Arizona and California.

Chapter 5

1. Though younger children might still be in the process of acquiring Chinese, their continued acquisition would be maintained by their primary caregivers (i.e., their parents) in the home, thus limiting the confounding possibility that English was being acquired as a “second” ﬁrst language. In any case, the children had already begun to develop a ﬁrst language matrix upon which to build.
2. The original data were collected by Dr Muriel Saville-Troike and her research assistant Junlin Pan. Their focus was on the children’s Chinese language use, but they also transcribed the English narratives, which they graciously shared with us.
3. Replacive verbs ultimately had to be omitted from the analysis because there were too few tokens in the data.

Chapter 6

1. In the interest of continuity I will use variation theory terminology when discussing this article; however, the authors used diﬀerent terms.
2. Feldman (2006) presents a compelling case that the relationship between connectionist computer models and neural computation in the brain is closer than metaphorical. However, he notes that others disagree, quoting Chomsky’s recent statement, “The belief that neurophysiology is even relevant to the functioning of the mind is just a hypothesis. Who knows if we’re looking at the right aspects of the brain at all?” (quoted in Feldman, 2006, p. 80).


Chapter 7

1. Variationists have learned to question purported cases of free variation and to look for factors that favor one form or the other. It is likely that formal discourse would favor dove, and informal discourse would favor dived. In regard to this point Levelt (1989) remarks, “Certain so-called registers . . . seem to select for lexical items with particular connotational properties. [This area] is a matter of much dispute” (p. 183).
2. Notice that the formalism used in (5) does not distinguish between the necessary and the optional features of the prototype because it does not use a convention like angled brackets.
3. The ﬁgures in table 7.2 were arrived at in the following way. The subjects rated the grammaticality of each test sentence on a scale of 1 to 5, where 1 meant completely ungrammatical and 5 meant completely grammatical. Then the mean score for each sentence was calculated. Some sentences received a mean score of 5; some sentences received a mean score of 1; and some sentences received a mean score in between these numbers. For ease of display, the mean scores were converted to a grammaticality index score ranging from 1.0 (completely grammatical) to −1.0 (completely ungrammatical), using the formula $(\text{mean score} - 3)/2$, with the result rounded to the nearest tenth.
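The conversion described in note 3 for Chapter 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the sample ratings are hypothetical, and the note does not say how exact ties were rounded, so the sketch simply relies on Python's built-in `round`. It subtracts the scale midpoint (3) from a sentence's mean rating and divides by 2, so that a mean of 5 maps to 1.0 and a mean of 1 maps to −1.0.

```python
from statistics import mean


def grammaticality_index(ratings):
    """Convert 1-to-5 ratings for one sentence to an index between -1.0 and 1.0."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-to-5 scale")
    # (mean score - 3) / 2 maps a mean of 5 to 1.0, 3 to 0.0, and 1 to -1.0.
    # Assumption: the note does not specify tie handling; Python's round()
    # resolves exact ties to the even digit.
    return round((mean(ratings) - 3) / 2, 1)


# Hypothetical ratings from three subjects for a single test sentence.
print(grammaticality_index([4, 4, 5]))
```

Centering on the midpoint of the scale means that positive index values indicate sentences judged more grammatical than not, and negative values the reverse.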

Chapter 9

1. Actually, there is more in Bachman’s (1990) deﬁnition of sociolinguistic competence, including knowledge of literary registers, but this is the most important and frequently discussed part of the construct.
2. The focus of a relative clause is determined by what function the relative pronoun (including a deleted relative pronoun) serves in the clause. Here are some examples:
   a. The forest [which burned down] was beautiful. This relative clause is subject focus because which serves as the subject of the clause.
   b. The forest [which the careless camper burned down] was beautiful. This relative clause is direct object focus because which serves as the direct object of the clause.
   c. The forest [the careless camper burned down] was beautiful. This relative clause is also direct object focus because which could be inserted in the direct object position, as in b.
3. Notice, by the way, that the same thing is happening with (3) and (4). These sentences are not totally unacceptable. Rather, guilt and embarrassment are central members of the prototype category “emotions that can be concealed,” and pleasure and rage are outlying members. The latter verbs can be fused with the construction, but they don’t sound quite right.
4. As Dowty, Wall, and Peters (1981, p. 7) put it: “As a ﬁrst approximation let us simply assume that the world contains various sorts of objects—call them ‘entities’—and that in a particular state-of-aﬀairs these entities have certain properties and stand in certain relations to each other.”


5. It is not easy to identify “radical relativists.” According to Smith (1989, p. 218), even Rorty is “more positive than he acknowledges.” Smith names herself, Feyerabend (1978), Goodman (1968), and Barnes and Bloor (1982) as self-identiﬁed “radical relativists.”

References

Adamson, H. D. (1980). A study of variable syntactic rules in the interlanguage of Spanish-speaking adults. Unpublished doctoral dissertation. Georgetown University. Adamson, H. D. (1988). Variation theory and second language acquisition. Washington, DC: Georgetown University Press. Adamson, H. D. (1993). Academic competence: Theory and classroom practice. New York: Longman. Adamson, H.D. (2005). Language minority students in American schools: An education in English. Mahwah, NJ: Lawrence Erlbaum Associates. Adamson, H. D., & Elliott, O. P. (1997). Sources of variation in interlanguage. International Review of Applied Linguistics, 35, 87–98. Adamson, H. D., Fonseca-Greber, B., Kataoka, K., Scardino, V., & Takano, S. (1996). Tense marking in the English of Spanish-speaking adolescents. In R. Bayley & D. R. Preston (Eds.), Second language acquisition and linguistic variation (pp. 121–134). Amsterdam: John Benjamins. Adamson, H. D., & Kovac, C. (1981). Variation theory and second language acquisition: An analysis of Schumann’s data. In D. Sankoﬀ & H. Cedergren (Eds.), Variation Omnibus (pp. 285–292). Carbondale, IL and Edmonton, AL: Linguistic Research, Inc. Adamson, H. D., & Regan, V. (1991). The acquisition of community speech norms by Asian immigrants learning ESL: A preliminary study. Studies in Second Language Acquisition, 13, 1–22. Ahrens, K. (1995). The mental representation of verbs. Unpublished doctoral dissertation. University of California, San Diego. Aksu-Koç, A. A., & von Stutterheim, C. (1994). Temporal relations in narrative: Simultaneity. In R. A. Berman & D. I. Slobin (Eds.), Relating events in narrative: A crosslinguistic developmental study (pp. 393–455). Hillsdale, NJ: Lawrence Erlbaum. Andersen, R. (1993). Four operating principles and input distribution as explanations for underdeveloped and mature morphological systems. In K. Hyltenstam & A. Viborg (Eds.), Progression and regression in language (pp. 309–339). 
Cambridge, UK: Cambridge University Press. Anderson, J. R. (1980). Cognitive psychology and its implications. San Francisco: Freeman. Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press. Anshen, F. (1969). Speech variation among Negroes in a small southern community. Unpublished doctoral dissertation. New York University. Anshen, F. (1975). Varied objections to various variable rules. In R. W. Fasold & R. W. Shuy (Eds.), Analyzing variation in language (pp. 1–10). Washington, DC: Georgetown University Press. Ashby, W. J. (1981). The loss of the negative particle “ne” in French: A syntactic change in progress. Lingua, 39, 119–137. Ashby, W. J. (1996). A syntactic change in progress. Language, 57, 674–678. Asher, J. J. (1969). The total physical response approach to second language learning. The Modern Language Journal, 53, 1–17. Auger, J. (2002). French immersion in Montreal: Pedagogical norm and functional competence. In S. Gass, K. Bardovi-Harlig, S. S. Magnan, & J. Walz (Eds.), Pedagogical norms for second and foreign language learning and teaching: Studies in honor of Albert Valdman (pp. 81–101). Amsterdam: John Benjamins. Bachman, L. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press. Bailey, N., Madden, C., & Krashen, S. (1974). Is there a “natural sequence” in adult second language learning? Language Learning, 21, 235–243. Baillargeon, R. (1995). Physical reasoning in infancy. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 181–204). Cambridge, MA: MIT Press. Baker, C. L. (1979). Syntactic theory and the projection problem. Linguistic Inquiry, 10, 533–581.


Bakhtin, M. M. (1981). The dialogic imagination (C. Emerson & M. Holquist, Trans.; M. Holquist. Ed.). Austin: University of Texas Press. Bardovi-Harlig, K. (1995). A narrative perspective on the development of the tense/aspect system in second language acquisition. Studies in Second Language Acquisition, 17, 263–291. Bardovi-Harlig, K., & Gass, S. (2002). Introduction. In: S. Gass, K. Bardovi-Harlig, S. S. Magnan, & J. Walz (Eds.), Pedagogical norms for second and foreign language learning and teaching: Studies in honor of Albert Valdman (pp. 1–12). Amsterdam: John Benjamins. Barlow, M. (1994). Corpora for theory and practice. Houston, Texas: Rice University, ms. Barnes, B., & Bloor, D. (1982). Relativism, rationalism and the sociology of knowledge. In M. Hollis & S. Lukes (Eds.), Rationality and relativism (pp. 57–65). Cambridge, MA: Harvard University Press. Barsalou, L. W. (1992). Cognitive psychology: An overview for cognitive scientists. Hillsdale, NJ: Lawrence Erlbaum. Bayley, R. (1994). Interlanguage variation and the quantitative paradigm: Past tense marking in Chinese-English. In E. Tarone, S. M. Gass, & A. Cohen (Eds.), Research methodology in second-language acquisition (pp. 157–181). Hillsdale, NJ: Lawrence Erlbaum. Bayley, R. (1996). Competing constraints on variation in the speech of adult Chinese learners of English. In R. Bayley & D. Preston (Eds.), Second language acquisition and linguistic variation (pp. 97–130). Philadelphia: John Benjamins. Bayley, R., & Regan, V. (2004). The acquisition of sociolinguistic competence. Journal of Sociolinguistics, 8, 323–338. Bell, A. (1977). The language of radio news in Auckland: A sociolinguistic study of style, audience and subediting variation. Unpublished doctoral dissertation. University of Auckland. Bell, A. (1984). Language style as audience design. Language in Society, 13 (2), 145–204. Bell, A. (1991). Audience accommodation in the mass media. In H. Giles, J. Coupland, & N. 
Coupland (Eds.), Contexts of accommodation: Developments in applied sociolinguistics (pp. 69–102). Cambridge, UK: Cambridge University Press. Bell, A. (2001). Back in style: Reworking audience design. In P. Eckert & J. R. Rickford (Eds.), Style and sociolinguistic variation (pp. 139–169). New York: Cambridge University Press. Bencini, G., & Goldberg, A. (2000). The contribution of argument structure constructions to sentence meaning. Journal of Memory and Language, 43, 640–651. Berdan, R. (1975). The necessity of variable rules. In R. W. Fasold & R. Shuy (Eds.), Analyzing variation in language (pp. 11–26). Washington, DC: Georgetown University Press. Berdan, R. (1996). Disentangling language acquisition from language variation. In R. Bayley & D. Preston (Eds.), Second language acquisition and linguistic variation (pp. 203–244). Philadelphia: John Benjamins. Berlin, B., & Berlin, E. A. (1975). Aguaruna color categories. American Ethnologist, 2, 61–87. Berlin, B., & Kay, P. (1969). Basic color terms: Their universality and evolution. Berkeley and Los Angeles: University of California Press. Berman, R. A., & Slobin, D. I. (Eds.) (1994). Relating events in narrative: A crosslinguistic developmental study. Hillsdale, NJ: Lawrence Erlbaum. Bialystok, E. (1999). Cognitive complexity and attentional control in the bilingual mind. Child Development, 7 (3), 636–644. Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge, UK: Cambridge University Press. Bickerton, D. (1971). Inherent variability and variable rules. Foundations of Language, 7, 457–492. Bickerton, D. (1975). Dynamics of a creole system. New York: Cambridge University Press. Bickerton, D. (1981). The roots of language. Ann Arbor: Karoma. Bley-Vroman, R. (1990). The logical problem of foreign language learning. Linguistic Analysis, 20, 3–49. Bloom, P. (1993). Grammatical continuity in language development: The case of subjectless sentences.
Linguistic Inquiry, 17, 721–734. Bourhis, R. Y., & Giles, H. (1977). The language of intergroup distinctiveness. In H. Giles (Ed.), Language, ethnicity and intergroup relations (pp. 19–135). London: Academic Press. Bowerman, M. (1988). The “no negative evidence” problem: How do children avoid constructing an overly general grammar? In J. A. Hawkins (Ed.), Explaining language universals. Malden, MA: Blackwell. Bowerman, M., & Bresnan, J. (1982). The mental representation of grammatical relations. Cambridge, MA: MIT Press.


Bresnan, J. (Ed.) (1982). The mental representation of grammatical relations. Cambridge, MA: MIT Press. Brinton, D. M., & Master, P. (Eds.) (1997). New ways in content-based instruction. Alexandria, VA: TESOL. Brown, R. (1965). Social psychology. New York: Free Press. Bruﬀee, K. A. (1984). Collaborative learning and the “conversation of mankind.” College English, 46, 635–652. Bruﬀee, K. A. (1986). Social construction, language, and the authority of knowledge: A bibliographical essay. College English, 48, 773–790. Bull, W. (1965). Spanish for teachers: Applied linguistics. New York: Ronald Press. Burgess, D., Kempton, W., & MacLaury, R. (1983). Tarahumara color modiﬁers: Category structure presaging evolutionary change. American Ethnologist, 10, 133–149. Bybee, J., & Moder, L. (1983). Morphological classes as natural categories. Language, 59, 252–269. Bybee, J., & Slobin, D. (1982). Rules and schemas in the development and use of the English past tense. Language, 58, 265–289. Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1, 1–47. Cedergren, H. (1973). The interplay of social and linguistic factors in Panama. Unpublished doctoral dissertation. Cornell University. Cedergren, H., & Sankoﬀ, D. (1974). Variable rules: Performance as a statistical reﬂection of competence. Language, 50, 333–355. Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press. Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris. Chomsky, N. (1986). Knowledge of language: Its nature, origin and use. New York: Praeger. Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press. Chomsky, N. (1999). On the nature, use and acquisition of language. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of child language (pp. 33–54). San Diego: Academic Press. Clark, H. H., & Clark, E. V. (1977). Psychology and language: An introduction to psycholinguistics. 
New York: Harcourt. Clifton, C., Kurcz, I., & Jenkins, J. J. (1965). Grammatical relations as determinants of sentence similarity. Journal of Verbal Learning and Verbal Behavior, 4, 112–117. Clifton, C., & Odom, P. (1966). Similarity relations among certain English sentence constructions. Psychological Monographs, 80 (5), 1–35. Cofer, T. (1972). Linguistic variability in a Philadelphia speech community. Unpublished doctoral dissertation. University of Pennsylvania. Corder, S. P. (1981). Formal simplicity and functional simpliﬁcation. In R. Anderson (Ed.), New Dimensions in second language acquisition research (pp. 156–152). Rowley, MA: Newbury House. Crain, S., & McKee, C. (1986). Acquisition of structural restrictions on anaphora. In S. Berman, J.-W. Choe, & J. M. McDonough (Eds.), Proceedings of the North Eastern Linguistics Society 16 (pp. 94–110). Amherst, MA: GLSA. Darwin, C. (1859/1998). The origin of species. New York: Modern Library. Day, R. (1979). Verbal ﬂuency and the language bound eﬀect. In C. J. Fillmore, D. Kempler, & W. S.-Y. Wang (Eds.), Individual diﬀerences in language ability and language behavior (pp. 57–84). New York: Academic Press. Dell, G., Chang, F., & Griﬃn, Z. (2001). Connectionist models of language production: Lexical access and grammatical encoding. In M. Christiansen & N. Chater (Eds.), Connectionist Psycholinguistics (pp. 112–143). Westport, CT: Ablex. Dewey, J. (1916). Democracy and education. New York: Macmillan. Dickerson, L. B. (1974). Internal and external patterning of phonological variability in the speech of Japanese learners of English: Toward a theory of second language acquisition. Unpublished doctoral dissertation. University of Illinois. Dickerson, L. B. (1975). The learner’s interlanguage as a system of variable rules. TESOL Quarterly, 9, 401–407. Dougherty, J. W. D. (1975). A universalist analysis of variation and change in color semantics. Unpublished doctoral dissertation. University of California, Berkeley. 
Dougherty, J. W. D. (1977). Color categorization in West Futunese: Variability and change. In B. G. Blount & M. Sanchez (Eds.), Sociocultural dimensions of language change (pp. 103–118). New York: Academic Press.


Doughty, K. (1991). Second language instruction does make a diﬀerence: Evidence from an empirical study of SL relativization. Studies in Second Language Acquisition, 13 (4), 431–469. Dowty, D., Wall, R., & Peters, S. (1981). Introduction to Montague semantics. Dordrecht: Reidel. Dulay, H., & Burt, M. K. (1974). Natural sequences in child second language acquisition. Language Learning, 24, 37–53. Eckert, P. (1988). Adolescent social structure and the spread of linguistic change. Language in Society, 17, 183–208. Eckert, P. (1999). Linguistic variation as social practice. Oxford: Blackwell. Eckert, P., & McConnell-Ginet, S. (1992). Think practically and look locally: Language and gender as community-based practice. Annual Review of Anthropology, 21, 461–490. Eckert, P., & Rickford, J. R. (Eds.) (2001). Style and sociolinguistic variation. New York: Cambridge University Press. Eckman, F., Bell, L., & Nelson, D. (1988). On the generalization of relative clause instruction in the acquisition of English as a second language. Applied Linguistics, 9, 1–20. Elliott, O. P. (1995). A glance at the syntactic and semantic principles underlying the Spanish clitic se: A study in second language acquisition. Unpublished doctoral dissertation. University of Arizona. Ellis, R. (1990). Reply to Gregg. Applied Linguistics, 11, 384–391. Ellis, R. (1994). The study of second language acquisition. Oxford: Oxford University Press. Fasold, R. (1972). Tense marking in Black English: A linguistic and social analysis. Washington, DC: Center for Applied Linguistics. Fasold, R. (1985). Perspectives on sociolinguistic variation. Language in Society, 14, 515–526. Fasold, R. W., & Shuy, R. W. (Eds.) (1970). Teaching Standard English in the inner city. Washington, DC: Center for Applied Linguistics. Fasold, R. W., & Shuy, R. W. (Eds.) (1975). Analyzing variation in language. Washington, DC: Georgetown University Press. Feldman, J. (2006). From molecule to metaphor. Cambridge, MA: MIT Press.
Feyerabend, P. (1978). Against method: Outline of an anarchistic theory of knowledge. London: Verso. Feynman, R. (1965). The character of physical law. Cambridge, MA: MIT Press. Fillmore, C. J. (1988). The mechanisms of a construction grammar. BLS, 14, 35–55. Fischer, J. (1958). Social inﬂuence on the choice of linguistic variant. Word, 14, 47–56. Flege, J. (1991). Perception and production: The relevance of phonetic input to L2 phonological learning. In T. Huebner & C. Ferguson (Eds.), Crosscurrents in second language acquisition and linguistic theory. Philadelphia: John Benjamins. Fodor, J. A. (1975). The language of thought. New York: Thomas Y. Crowell. Fodor, J. A., & Garrett, M. F. (1967). Some syntactic determinants of sentential complexity. Perception and Psychophysics, 2, 289–296. Fox, C. A. (2002). Incorporating variation in the French classroom. In S. Gass, K. Bardovi-Harlig, S. S. Magnan, & J. Walz (Eds.), Pedagogical norms for second and foreign language learning and teaching: Studies in honor of Albert Valdman (pp. 201–211). Amsterdam: John Benjamins. Frawley, W. (1997). Vygotsky and cognitive science. Cambridge, MA: MIT Press. Gardner, R. C. (2002). Social psychological perspective on second language acquisition. In R. Kaplan (Ed.), The Oxford handbook of applied linguistics (pp. 160–169). Oxford: Oxford University Press. Gardner, R. C., & Lambert, W. E. (1972). Attitudes and motivations in second language learning. Rowley, MA: Newbury House. Garrett, M. (1975). The analysis of sentence production. In G. H. Bower (Ed.), The psychology of learning and motivation (pp. 133–175). San Diego: Academic Press. Gass, S. (1980). An investigation of syntactic transfer in adult second language learners. In R. Scarcella & S. Krashen (Eds.), Research in second language acquisition. Rowley, MA: Newbury House. Gass, S., Bardovi-Harlig, K., Magnan, S. S., & Walz, J. (Eds.) (2002).
Pedagogical norms for second and foreign language learning and teaching: Studies in honor of Albert Valdman. Amsterdam: John Benjamins. Gattegno, C. (1972). Teaching foreign languages in schools: The silent way. New York: Educational Solutions. Geertz, C. (1983). Local knowledge: Further essays in interpretive anthropology. New York: Basic Books.


Giles, H. (Ed.) (1984). The dynamics of speech accommodation. Amsterdam: Mouton. Giles, H., Coupland, J., & Coupland, N. (Eds.) (1991). Contexts of accommodation: Developments in applied sociolinguistics. Cambridge, UK: Cambridge University Press. Giles, H., & Powesland, P. F. (1975). Speech style and social evaluation. London: Academic Press. Gladney, M. R. (1973). Problems in teaching children with nonstandard dialects. In J. L. Laﬀey & R. W. Shuy (Eds.), Language diﬀerences: Do they interfere? (pp. 40–46). Newark, DE: International Reading Association. Goldberg, A. (1995). Constructions: A construction grammar approach to argument structure. Chicago: University of Chicago Press. Goldberg, A. (2006). Constructions at work: The nature of generalization in language. Oxford: Oxford University Press. Goodman, K. (1968). The psycholinguistic nature of the reading process. Detroit, MI: Wayne State University Press. Gregg, K. R. (1990). The variable competence model and why it isn’t. Applied Linguistics, 11, 364–383. Gregg, K. R. (1996). The logical and developmental problems of second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 49–81). San Diego: Academic Press. Gropen, J. (1989). Learning locative verbs: How universal linking rules constrain productivity. Unpublished doctoral dissertation. MIT. Gropen, J., Pinker, S., Hollander, M., & Goldberg, R. (1991). Aﬀectedness and direct objects: The role of lexical semantics in the acquisition of verb argument structure. Cognition, 41, 153–195. Hage, B., & Hawkes, R. (1975). Binumarien color categories. Ethnology, 24, 287–300. Hansen, J. (2001). Linguistic constraints on the acquisition of English syllable codas by native speakers of Mandarin Chinese. Applied Linguistics, 22 (3), 338–365. Hare, M., & Goldberg, A. (1999). Structural priming: Purely syntactic? Paper presented at the Proceedings of the Cognitive Science Society.
Proceedings of the 22nd annual cognitive science society (pp. 208–211). Mahwah, NJ: Lawrence Erlbaum. Hecht, M. L. (2002). A research odyssey toward the development of a communication theory of identity. Communication Monographs, 60, 76–82. Houston, S. (1985). Continuity and change in English morphology: The variable ING. Unpublished doctoral dissertation. University of Pennsylvania. Howard, M., Lemee, I., & Regan, V. (2006). The L2 acquisition of a phonological variable: The case of /l/ deletion in French. Journal of French Language Studies, 16, 1–24. Hudson Kam, C. L., & Newport, E. L. (2005). Regularizing unpredictable variation: The roles of adult and child learners in language formation and change. Language Learning and Development, 1 (2), 151–195. Huebner, T. (1985). System and variability in interlanguage syntax. Language Learning, 35, 141–163. Hymes, D. (1972). On communicative competence. Philadelphia: University of Pennsylvania Press. Jacobson, R. (1962). Selected Writings I. The Hague: Mouton. Johnson, M. (1987). The body in the mind. Chicago: University of Chicago Press. Jones, M., & Haywood, S. (2004). Facilitating the acquisition of formulaic sequences: An exploratory study in an EAP context. In N. Schmitt (Ed.), Formulaic sequences: Acquisition processing and use (pp. 269–300). Philadelphia: John Benjamins. Joseph, J. E. (2004). Language and identity: National, ethnic, religious. New York: Palgrave. Kachru, B. (1996). The paradigm of marginality. World Englishes, 15 (3), 241–255. Kaschak, M. P., & Glenberg, A. M. (2000). Constructing meaning: The role of aﬀordances and grammatical constructions in sentence comprehension. Journal of Memory and Language, 43, 508–529. Kay, P. (1975). Synchronic variability and diachronic change in basic color terms. Journal of Language in Society, 4, 257–270. Kay, P. (1978). Variable rules, community grammar, and linguistic change. In D. Sankoﬀ (Ed.), Linguistic variation: Models and methods (pp. 71–83).
New York: Academic Press. Kay, P. (1990). Even. Linguistics and philosophy, 13 (1), 59–112. Kay, P., & McDaniel, C. (1978). The linguistic signiﬁcance of the meaning of basic color terms. Language, 54, 610–646. Kay, P., & McDaniel, C. (1979). On the logic of variable rules. Language in Society, 8 (3), 151–187. Keller, H. (1988). The story of my life. Norwalk, CT: Easton Press.


Kemmer, S. (1993). The middle voice. Philadelphia: John Benjamins. Keenan, E. L., & Comrie, B. (1977). Noun phrase accessibility and universal grammar. Linguistic Inquiry 8, 63–99. Kovac, C., & Adamson, H.D. (1981). Variation theory and ﬁrst language acquisition. In D. Sankoﬀ & H. Cedergren (Eds.), Variation omnibus (pp. 403–410). Carbondale, IL and Edmonton, AL: Linguistic Research, Inc. Kramsch, C. (2002a). Beyond the second vs. foreign language dichotomy. In K. S. Miller & P. Thompson (Eds.), Unity and diversity in language use (pp. 1–21). London: Continuum. Kramsch, C. (2002b). Standard, norm, and variability in language learning: A view from foreign language research. In S. Gass, K. Bardovi-Harlig, S. S. Magnan, & J. Walz (Eds.), Pedagogical norms for second and foreign language learning and teaching: Studies in honor of Albert Valdman (pp. 60–79). Amsterdam: John Benjamins. Krashen, S. (1978). The monitor model of second language acquisition. In R. Gingras (Ed.), Second language acquisition and foreign language teaching. Arlington, VA: Center for Applied Linguistics. Krashen, S. (1982). Principles and practices of second language acquisition. Oxford: Pergamon. Krashen, S. (1985). The input hypothesis: Issues and implementations. London: Longman. Krashen, S. (1987). Principles and practice in second language acquisition. Englewood Cliﬀs, NJ: Prentice-Hall. Kuhn, T. S. (1959). The Copernican revolution: Planetary astronomy in the development of western thought. New York: Vintage. Kuhn, T. S. (1970). The structure of scientiﬁc revolutions. Chicago: University of Chicago Press. Kuhn, T. S. (1977). The essential tension. Chicago: University of Chicago Press. Labov, W. (1966). The social stratiﬁcation of English in New York City. Washington, DC: Center for Applied Linguistics. Labov, W. (1967). Some sources of reading problems for Negro speakers of nonstandard English. In A. Frazier (Ed.), New directions in elementary English (pp. 140–167). Champaign, IL: NCTE. 
Labov, W. (1969). Contraction, deletion and inherent variability of the English copula. Language, 45, 715–762. Labov, W. (1972a). Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press. Labov, W. (1972b). Language in the inner city. Philadelphia: University of Pennsylvania Press. Labov, W. (1973). The boundaries of words and their meanings. In C.-J. Bailey & R. Shuy (Eds.), New ways of analyzing variation in English (pp. 340–373). Washington, DC: Georgetown University Press. Labov, W. (1979). Locating the frontier between social and psychological factors in linguistic variation. In C. J. Fillmore, D. Kempler, & W. S.-Y. Wang (Eds.), Individual diﬀerences in language ability and language behavior (pp. 324–340). New York: Academic Press. Labov, W. (1982). Building on empirical foundations. In W. P. Lehmann & Y. Malkiel (Eds.), Perspectives on historical linguistics (pp. 17–92). Amsterdam: John Benjamins. Labov, W. (1984). Field methods of the project on linguistic change and variation. In J. Baugh & J. Sherzer (Eds.), Language in use: Readings in socio-linguistics (pp. 28–53). Englewood Cliﬀs, NJ: Prentice-Hall. Labov, W. (1994). Principles of language change Vol. 1: Internal factors. Oxford: Blackwell. Labov, W. (2001a). Principles of language change Vol. 2: Social factors. Oxford: Blackwell. Labov, W. (2001b). The anatomy of style-shifting. In P. Eckert & J. R. Rickford (Eds.), Style and sociolinguistic variation (pp. 85–108). New York: Cambridge University Press. Labov, W., Cohen, P., Robins, C., & Lewis, J. (1968). A study of the non-standard English of Negro and Puerto Rican speakers in New York City. USOE Final Report, Research Project No. 3288. Labov, W., & Labov, T. (1976). Learning the syntax of questions. Paper delivered at the Conference on the Psychology of Language, Stirling, Scotland. Ladefoged, P. (1975). A course in phonetics. New York: Harcourt. Laﬀey, J. L., & Shuy, R. (1973). Language diﬀerences: Do they interfere?
Newark, DE: International Reading Association. Lakoﬀ, G. (1987). Women, ﬁre, and dangerous things. Chicago: University of Chicago Press. Langacker, R. (1991). Foundations of cognitive grammar, Vol. 2. Chicago: University of Chicago Press. LaPonce, J. A. (1992). Reducing the tensions resulting from language contacts: Personal or territorial


solutions? In D. Bonin (Ed.), Reconciliation: The language issue in Canada in the 1990s (pp. 125–132). Kingston, Ontario: Queens University Press. Larsen-Freeman, D. (1975). The acquisition of grammatical morphemes by adult ESL students. TESOL Quarterly, 9, 409–420. Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press. Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1–75. Luria, A. R. (1976). Cognitive development. Cambridge, MA: Harvard University Press. MacLaury, R. (1986). Color in Mesoamerica, vol. 1: A theory of composite categorization. Unpublished doctoral dissertation. University of California, Berkeley. MacLaury, R. (1987). Color category evolution and Shuswap yellow-green. American Anthropologist, 89, 107–124. MacLaury, R. (1991). Social and cognitive motivations of change: Measuring variability in color semantics. Language, 67 (1), 34–62. Major, R. (2004). Gender and stylistic variation in second language phonology. Language Variation and Change, 16, 169–188. McLaughlin, B. (1980). Theory and research in second language learning: An emerging paradigm. Language Learning, 30, 331–350. McLaughlin, B. (1987). Theories of second language learning. London: Edward Arnold. Meisel, J., Clahsen, H., & Pienemann, M. (1981). On determining developmental stages in natural second language acquisition. Studies in Second Language Acquisition, 3, 109–35. Miller, G. A., & McKean, K. O. (1964). A chronometric study of some relations between sentences. Quarterly Journal of Experimental Psychology, 16, 297–308. Milroy, J. (1982). Probing under the tip of the iceberg: Phonological “normalization” and the shape of speech communities. In S. Romaine (Ed.), Sociolinguistic variation in speech communities (pp. 35–48). London: Edward Arnold. Mitchell, R., & Myles, F. (1998). Second language learning theories. New York: Arnold. Mitchell, R., & Myles, F.
(2004). Second language learning theories (2nd ed.). New York: Arnold. Mougeon, R., Nadasdi, T., & Rehner, K., (2002). Etat de la recherche sur l’appropriation de la variation par les apprenants avances du FL2 ou FLE. AILE (Acquisition et Interaction en Langue Etrangère). Special issue: L’Acquisition de la variation par les apprenants du français langue seconde, 17, 17–50. Mougeon, R., Rehner, K., & Nadasdi, T. (2004). The learning of spoken French variation by immersion students from Toronto, Canada. Journal of Sociolinguistics, 8, 408–432. Muller, M. (1861). Lectures on the science of language. Delivered at the Royal Institution of Great Britain in April, May, and June, 1886. First Series, London. Norton, B., & Toohey, K. (2002). Identity and language learning. In R. Kaplan (Ed.), The Oxford handbook of applied linguistics (pp. 115–123). New York: Oxford University Press. Oxford, R. L. (1996). Language learning motivation: Pathways to the new century. Honolulu: University Press of Hawaii. Payne, A. (1980). Factors controlling the acquisition of the Philadelphia dialect by out-of-state children. In W. Labov (Ed.), Locating language in time and space (pp. 329–345). New York: Academic Press. Perry, T., & Delpit, L. (Eds.) (1998). The real Ebonics debate. Boston: Beacon. Piaget, J. (1972). The child and reality. New York: Wiley. Pienemann, M., Johnston, M., & Brindley, G. (1988). Constructing an acquisition-based procedure for assessing second language acquisition. Studies in second language acquisition, 10, 217–243. Pinker, S. (1989). Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press. Pinker, S. (1999). Words and rules. New York: HarperCollins. Pinker, S. (2002). The blank slate: The modern denial of human nature. New York: Viking. Pinker, S., & Prince, A. (1994). Regular and irregular morphology and the psychological status of rules of grammar. In S. D. Lima, R. L. Corrigan, & G. K. 
Iverson (Eds.), The reality of linguistic rules (pp. 321–352). Amsterdam: John Benjamins. Poplack, S., & Walker, D. (1986). Going through /l/ in Canadian French. In D. Sankoﬀ (Ed.), Diversity and diachrony (pp. 173–198). Amsterdam: John Benjamins. Preston, D. (1989). Sociolinguistics and second language acquisition. Oxford: Blackwell.


Preston, D. (1991). Style, status, and change: Three sociolinguistic axioms. In F. Byrne & T. Huebner (Eds.), Development and structures of creole languages: Essays in honor of Derek Bickerton (Creole Language Library, Vol. 9, pp. 43–59). Amsterdam: John Benjamins.
Preston, D. (1996). Variationist perspectives on second language acquisition. In R. Bayley & D. Preston (Eds.), Second language acquisition and linguistic variation (pp. 1–45). Philadelphia: John Benjamins.
Preston, D. (2001). Style and the psycholinguistics of sociolinguistics: The logical problem of language variation. In P. Eckert & J. R. Rickford (Eds.), Style and sociolinguistic variation (pp. 279–304). New York: Cambridge University Press.
Preston, D. (2002). A variationist perspective on second language acquisition: Psycholinguistic concerns. In R. Kaplan (Ed.), The Oxford handbook of applied linguistics (pp. 141–159). New York: Oxford University Press.
Queller, K. (2001). A usage-based approach to teaching the phrasal lexicon. In M. Pütz, S. Niemeier, & R. Dirven (Eds.), Applied Cognitive Linguistics II: Language Pedagogy (pp. 55–83). Berlin: Mouton.
Rand, D., & Sankoff, D. (1990). GoldVarb Version 2: A variable rule application for the Macintosh. Montreal: Université de Montréal, Centre de recherches mathématiques.
Regan, V. (1996). Variation in French interlanguage: A longitudinal study of sociolinguistic competence. In R. Bayley & D. Preston (Eds.), Second language acquisition and linguistic variation (pp. 177–201). Philadelphia: John Benjamins.
Rehner, K., Mougeon, R., & Nadasdi, T. (2003). The learning of sociolinguistic variation by advanced FSL learners: The case of nous versus on in immersion French. Studies in Second Language Acquisition, 25, 127–156.
Rickford, A. E. (1999). I can fly. Lanham, MD: University Press of America.
Rickford, A. E., & Rickford, J. (1995). Dialect readers revisited. Linguistics and Education, 7, 107–128.
Rickford, J. R., & Eckert, P. (2001). Introduction. In P. Eckert & J. R. Rickford (Eds.), Style and sociolinguistic variation (pp. 1–18). New York: Cambridge University Press.
Rigg, P., & Enright, S. (1986). Children and ESL: Integrating perspectives. Washington, DC: TESOL.
Rips, L. J. (1994). The psychology of proof: Deductive reasoning in human thinking. Cambridge, MA: MIT Press.
Roberts, J. (1993). The acquisition of variable rules: t/d deletion and -ing production in preschool children. Unpublished doctoral dissertation. University of Pennsylvania.
Robinson, G. H. (1964). Continuous estimation of a time-varying probability. Ergonomics, 7, 7–21.
Robinson, J. S., Lawrence, H. R., & Tagliamonte, S. A. (2001). GOLDVARB 2001: A multivariate analysis application for Windows. Department of Language and Linguistic Science, York, Canada: University of York.
Romaine, S. (1982). Sociolinguistic variation in speech communities. London: Arnold.
Rorty, R. (1979). Philosophy and the mirror of nature. Princeton, NJ: Princeton University Press.
Rorty, R. (1989). Contingency, irony, and solidarity. Cambridge, UK: Cambridge University Press.
Rosch, E. (1973). Natural categories. Cognitive Psychology, 4, 328–350.
Sankoff, G. (1974). A quantitative paradigm for the study of communicative competence. In R. Bauman & J. Sherzer (Eds.), Explorations in the ethnography of speaking (pp. 18–49). Cambridge, UK: Cambridge University Press.
Sankoff, D., & Labov, W. (1979). On the uses of variable rules. Language in Society, 8 (3), 189–222.
Sankoff, G., & Vincent, D. (1977). L’emploi productif de “ne” dans le français parlé de Montréal. Le Français Moderne, 45 (3), 243–256.
Saussure, F. [1915] (1974). Course in general linguistics (W. Baskin, Trans.). London: Fontana/Collins.
Savignon, S. J. (1983). Communicative competence: Theory and classroom practice: Texts and contexts in second language learning. Reading, MA: Addison-Wesley.
Schieffelin, B. B. (1985). The acquisition of Kaluli. In D. I. Slobin (Ed.), The crosslinguistic study of language acquisition, Vol. 1: The data (pp. 525–594). Hillsdale, NJ: Lawrence Erlbaum.
Schiffrin, D. (1981). Tense variation in narrative. Language, 57 (1), 45–62.
Schumann, J. (1978). The pidginization hypothesis. Rowley, MA: Newbury House.
Schwartz, B. D., & Sprouse, R. A. (1996). L2 cognitive states and the full transfer/access model. Second Language Research, 12, 40–72.
Shi, E. (2003). Second language grammar and secondary predication. Unpublished doctoral dissertation. University of Arizona.


Shibatani, M. (1996). Applicatives and benefactives: A cognitive account. In S. Thompson & M. Shibatani (Eds.), Grammatical constructions: Their form and meaning (pp. 245–263). Oxford: Clarendon Press.
Shirai, Y., & Andersen, R. W. (1995). The acquisition of tense-aspect morphology: A prototype account. Language, 71 (4), 743–762.
Shuy, R., Wolfram, W., & Riley, W. (1968). Social stratification in Detroit speech. Washington, DC: Center for Applied Linguistics.
Sinclair, J. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Sinclair, J. (1996). The search for units of meaning. Textus, IX, 75–106.
Slobin, D. I. (1973). Cognitive prerequisites for the development of grammar. In C. Ferguson & D. Slobin (Eds.), Studies of child language development (pp. 175–208). New York: Holt.
Slobin, D. I. (1985). Crosslinguistic evidence for the language-making capacity. In D. I. Slobin (Ed.), The crosslinguistic study of language acquisition, Vol. 2: Theoretical issues (pp. 1157–1256). Hillsdale, NJ: Lawrence Erlbaum.
Smith, B. H. (1989). Contingencies of value. Cambridge, MA: Harvard University Press.
Smith, F. (1982). Writing and the writer. New York: Holt.
Smolensky, M. (2001). Grammar-based connectionist approaches to language. In M. H. Christiansen & N. Chater (Eds.), Connectionism and psycholinguistics (pp. 319–347). Westport, CT: Ablex.
Snow, C., & Brinton, D. M. (1997). The content-based classroom: Perspectives on integrating language and content. White Plains: Longman.
Spelke, E., Vishton, P., & von Hofsten, C. (1995). Object perception, object-directed action, and physical knowledge in infancy. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 165–179). Cambridge, MA: MIT Press.
Spivey, M. J., & Tanenhaus, M. K. (1998). Syntactic ambiguity resolution in discourse: Modeling the effects of referential context and lexical frequency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1521–1543.
Spolsky, B. (1985). Formulating a theory of second language learning. Studies in Second Language Acquisition, 7 (3), 269–288.
Stabler, E. (1984). Berwick and Weinberg on linguistics and computational psychology. Cognition, 17, 155–179.
Swain, M., & Lapkin, S. (1989). Canadian immersion and adult second language teaching: What’s the connection? Modern Language Journal, 73 (2), 150–159.
Tajfel, H. (1978). Social categorization, social identity, and social comparison. In H. Tajfel (Ed.), Differentiation between social groups: Studies in the social psychology of intergroup relations (pp. 61–76). London: Academic Press.
Talmy, L. (1985a). Lexicalization patterns: Semantic structure in lexical forms. In T. Shopen (Ed.), Language typology and syntactic description, Vol. 3. Cambridge, UK: Cambridge University Press.
Talmy, L. (1985b). Force dynamics in language and thought. In W. H. Eilfort, P. D. Kroeber, & K. L. Peterson (Eds.), CLS 21 Part 2: Papers from the parasession on causatives and agentivity (pp. 293–337). Chicago: Chicago Linguistic Society.
Tarone, E. (1979). Interlanguage as chameleon. Language Learning, 29, 181–191.
Tarone, E. (1982). Systematicity and attention in interlanguage. Language Learning, 32, 9–84.
Tarone, E. (1985). Variability in interlanguage use: A study of style-shifting in morphology and syntax. Language Learning, 35, 373–403.
Tarone, E. (1988). Variation in interlanguage. London: Edward Arnold.
Tarone, E. (1990). On variation in interlanguage: A response to Gregg. Applied Linguistics, 11, 392–400.
Tarrallo, F., & Myhill, J. (1983). Interference and natural language in second language acquisition. Language Learning, 33, 55–76.
Tharp, R., & Gallimore, R. (1988). Rousing minds to life: Teaching, learning and schooling in social context. Cambridge, UK: Cambridge University Press.
Tharp, R., & Gallimore, R. (1990). Teaching, schooling and literate discourse. In L. Moll (Ed.), Vygotsky and education (pp. 175–205). Cambridge, UK: Cambridge University Press.
Thompson, G. L., & Brown, A. V. (2003). Interlanguage variation: The influence of contextualized language on L2 phonological production. Arizona Working Papers in Second Language Acquisition and Teaching (SLAT), 10, 35–50.


Towell, R., & Hawkins, R. (1994). Approaches to second language acquisition. Clevedon, UK: Multilingual Matters.
Townsend, D. J., & Bever, T. G. (2001). Sentence comprehension: The integration of habits and rules. Cambridge, MA: MIT Press.
Trudgill, P. (1974). The social differentiation of English in Norwich. Cambridge, UK: Cambridge University Press.
Valdez, G. (2001). Learning and not learning in English: Latino students in American schools. New York: Teachers College Press.
Valdman, A. (1961). Applied French – A guide for teachers. Boston: Heath.
Valdman, A. (1966). Programmed instruction and foreign language teaching. In A. Valdman (Ed.), Trends in language teaching (pp. 133–158). New York: McGraw-Hill.
Valdman, A. (1989). The elaboration of pedagogical norms for second language learners in a conflictual diglossia situation. In S. Gass, C. Madden, D. Preston, & L. Selinker (Eds.), Variation in second language acquisition, Vol. I: Discourse and pragmatics (pp. 5–34). Clevedon, UK: Multilingual Matters.
Valdman, A. (1992). Authenticity, variation and communication in the foreign language classroom. In C. Kramsch & S. McConnell-Ginet (Eds.), Text and context: Cross-disciplinary perspectives in language study (pp. 79–97). Lexington, MA: Heath.
Valian, V. (1979). The wherefores and therefores of the competence–performance distinction. In W. E. Cooper & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 1–26). Hillsdale, NJ: Lawrence Erlbaum.
Vendler, Z. (1967). Linguistics in philosophy. Ithaca, NY: Cornell University Press.
Vygotsky, L. S. (1978). Mind in society. Cambridge, MA: Harvard University Press.
Vygotsky, L. S. (1986). Thought and language. Cambridge, MA: MIT Press.
Weinreich, U., Labov, W., & Herzog, M. (1968). Empirical foundations for a theory of language change. In W. W. Lehmann & Y. Malkiel (Eds.), Directions for historical linguistics (pp. 95–188). Austin: University of Texas Press.
Wilkins, D. A. (1976). Notional syllabuses. Oxford: Oxford University Press.
Wittgenstein, L. (1953). Philosophical investigations. New York: Macmillan.
Wolfram, W. (1969). A sociolinguistic description of Detroit Negro speech. Washington, DC: Center for Applied Linguistics.
Wolfram, W. (1974). Sociolinguistic aspects of assimilation: Puerto Rican English in New York City. Arlington, VA: Center for Applied Linguistics.
Wolfram, W. (1975). Variable constraints and rule relations. In R. W. Fasold & R. W. Shuy (Eds.), Analyzing variation in language (pp. 70–88). Washington, DC: Georgetown University Press.
Wolfram, W. (1985). Variability in tense marking: A case for the obvious. Language Learning, 35, 229–253.
Wolfram, W., Carter, P., & Moriello, B. (2004). Emerging Hispanic English: New dialect formation in the American South. In R. Bayley & D. Preston (Eds.), Second language acquisition and linguistic variation (pp. 339–358). Philadelphia: John Benjamins.
Wolfram, W., & Fasold, R. (1974). The study of social dialects in American English. Englewood Cliffs, NJ: Prentice-Hall.
Wolfson, N. (1982). On tense alternation and the need for analysis of native speaker usage in second language acquisition. Language Learning, 32, 53–68.
Wood, D., Bruner, J., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child Psychology and Psychiatry, 17, 89–100.
Young, R. (1989). Ends and means: Methods for the study of interlanguage variation. In S. Gass, C. Madden, D. Preston, & L. Selinker (Eds.), Variation in second language acquisition: Psycholinguistic issues (pp. 65–92). Clevedon, UK: Multilingual Matters.
Young, R. (1991). Variation in interlanguage morphology. New York: Peter Lang.

Index

Note: page numbers in italics refer to figures, tables and diagrams

A
Adamson, H.D., 49, 50, 56, 66, 109, 137, 139, 146–7, 150, 156, 181
accommodation theory, 54
actuation problem, 34, 140–1, 144–5
acquisitional orders, 167–9
African American English, 7, 12, 13, 16, 17, 20, 24, 43, 45, 52, 154, 156
analysis-by-synthesis, 86
Andersen, R., 63, 68, 72–3
Anderson, J.R., 137
artificial language: learning of, 77–9
Ashby, W.J., 55–6
audience design, 140–6, 151–2
Auger, J., 162–3, 165–6

B
Bachman, L., 15, 49, 153–4, 193n
Baker, C.L., 111
Bakhtin, M.M., 143, 165
Bardovi-Harlig, K., 64–5, 69, 71–2, 158
Bayley, R., 49, 51–3, 59, 61–2, 66–8, 71–3
Bell, A., 141–5, 151–2, 163
Berdan, R., 21, 51
Berlin, B., xvi, 4, 5, 183–5, 189
Bever, T.G., xv, 11, 12, 86, 92, 170
Biber, D., 170
Bickerton, D., 19, 21, 22, 77, 99, 101, 187–8
Binding Principle B, 5, 7, 8, 10
Belfast speech, 18, 21
Black English literature, 163
blank slate theory (of mind), 3
Bley-Vroman, R., 6
Bowerman, M., 112
Bresnan, J., 112
Brinton, D.M., 181
broad-range semantic constraint (on dativization), 114–8, 21, 125–6, 137
Brown, A. V., 149–50
Brown, R., 180
Bruffee, K. A., 175–6
Bruner, J., 167
Burgess, D., 199
Bybee, J., 107–9, 112, 126–7

C
Canale, M., 153
Cedergren, H., 17, 20, 26, 66
Chomsky, N., xv, 3, 5, 7, 9, 10, 12, 20, 82, 87, 136, 192n
Cofer, T., 12, 25–6, 31
Cognitive Linguistics: account of reflexive, 94; introduction to, 101; prototype categories in, 102–3; teaching implications of, 169–73
collocations (see formulaic sequences)
color category systems, change in, 5, 34, 37, 129, 176, 180, 183–91
communicative competence, 15, 153, 159, 181
competence (versus performance), 9–11
comprehension model (of sentences), 11, 14, 15, 86–92
concordancing programs, 169, 171
connectionist networks, xv, 20, 96–9, 85–92, 96–9, 109–10, 126–9, 149, 192n
construction grammar, 116, 124; teaching implications of, 169–72
constructions: ditransitive, 112–26, 172; way, 116–7, 169; written all over, 172
constructivist language teaching, 166–9
Cookie Monster, 5
Corder, S.P., 49
Crain, S., 5

D
Dani tribe, 4
Darwin, C., 33
Day, R., 190
Dell, G., 84–6
derivational theory of complexity, 11, 12, 80, 92
Descartes, R., 3, 4
developmental problem (of language acquisition), 6, 7
Dewey, J., 182
dialect readers, 155–6
Dickerson, L. B.
discourse constraints (in narrative), 64–5
Dougherty, J.W.D., 185
Doughty, K., 168–9

E
Eckert, P., 41–2, 135, 164
Eckman, F., 168
Elliott, O. P., 92–9, 109–10, 126
Ellis, R., 13, 19
epistemological question (of knowledge), 3, 173
error back propagation, 90, 98

F
Fasold, R., 12, 20, 32, 154
Feldman, J., xv, 98, 109, 116, 192n
Feyerabend, P., 194n
Feynman, R., 175
Fillmore, C. J., 116
Fischer, J., 24
Flege, J., 150
Fodor, J.A., 9, 12
formal learners, studies of, 54–6
formulaic sequences, 169–72
Fox, C. A., 159–61
Frawley, W., 165
French: articles in, 44; acquisition of, 54–8; immersion programs, 57, 154, 162, 165–6, 192n; null subject parameter in, 6
fundamental difference hypothesis, 6, 7

G
Gallimore, R., 166
Gardner, R. C., 164
Garrett, M., xv, 12, 81
Gass, S., 159–60, 168
Gattegno, C., 181
Geertz, C., 174
gender axiom, 149
Giles, H., 140–1, 163
Gladney, M.R.
Goldberg, A., 108, 112, 116, 118–25
government and binding theory, 7, 12, 92
Gregg, K. R., 6, 7, 19, 21
Gropen, J., 112, 121–4

H
Hansen, J., 61, 73
Hare, M., 120
Hecht, M.L., 163
Herzog, M., 34, 36, 45, 183, 185
horizontal variation (in interlanguage), 49–58
Houston, S., 12, 25–6, 31, 147, 192n
Howard, M., 56, 59
Hudson Kam, C.L., 77, 79, 80, 90, 98, 124
Huebner, T., 49
Hume, D., 3
Hymes, D., 153

I
identity, social, xv, 42, 45, 133, 141, 144–7, 162–6
indicator (sociolinguistic), 44
innate ideas, 4

J
Jacobson, R., 183
Jones, M., 170–2
Joseph, J.E., 163

K
Kant, E., 4
Kay, P., xvi, 4, 5, 17, 20–1, 116, 181, 183–5, 189
Keenan, E. L., 168
knockout factor, 29, 30, 50
Krachu, B., 157
Kramsch, C., 157–9, 161
Krashen, S., xv, 98, 133, 135–9, 153, 159
Kuhn, T. S., 174–5

L
Labov, W., xiii–xv, 7, 14, 15, 17, 19, 20–6, 29–39, 41–6, 49–51, 55, 64, 102, 133–5, 137–8, 140–4, 147, 149–52, 154, 156, 164, 183, 188, 190
Ladefoged, P., 34
Lakoff, G., xvi, 116, 173–4, 176–7, 179
Lambert, W., 164
Langacker, R., 104, 106
language acquisition device, 136–7
language bound individuals, 190
language change: actuation problem of, 45; constraints problem of, 36–37; embedding problem of, 40–43; evaluation problem of, 43–45; transition problem of, 34, 37–40, 138
language optional individuals, 90
LaPonce, J. A., 165
Larsen-Freeman, D., 135
Levelt, W. J. M., xv, 13n, 81, 84, 190
Leibniz, G. W., 4
lexical diffusion, 38, 40, 54, 147
Lexical Rule Hypothesis (of learning argument structure), 112–118
linking rules, 82, 113, 117
Locke, J., 3
logical problem: of learning, 3–5, 6; of language acquisition, 5–6

M
MacLaury, R., 183, 186, 188–90
Major, R., 56, 59, 97, 147–50
Martha’s Vineyard, 17, 43, 45, 144, 164, 187
marker (sociolinguistic), 44–5
McKee, C., 5
McLaughlin, B., 137
Meisel, J., 167
middle experiencer verbs (in Spanish), 93–7, 99, 109–10
Miller, G.A., 11
Milroy, J., 18, 21
Mitchell, R., 13, 15
monitor model, 133, 135–9
monitoring: in first language, 4, 16, 18, 27–9, 44, 52, 55, 133–40, 143, 146, 149–52; in second language, 52, 55, 59, 133, 135, 139–40, 150–1
morphophonological constraint (on dativization), 113–14, 116, 121, 123–4
motivation: integrative, 62, 65; instrumental, 162, 164
Mougeon, R., 49, 56, 58, 154, 162
Myles, F., 13, 15

N
Nadasdi, T., 49, 56, 58, 154, 162
narrative style, 14, 28–9, 133, 138–9, 149–50, 161
narratives (of Chinese-speaking children), 64–73
naturalistic learners, 51–4
ne deletion, 55–57, 163
Northern Cities Vowel Shift, 41, 43, 45
Noun Phrase Acquisition Hierarchy, 168

O
ontological question (of knowledge), 3, 173
output hypothesis, 98
overgeneralization, 55, 58–9, 95–7, 109–11
Oxford, R. L., 157

P
Panama Spanish, 17, 20
past tense marking: by Chinese-speaking adults, 52–3; by Chinese-speaking children, 61–73; in AAE, 13, 32; in creole languages, 187–8; prototype schemas for, 106–8, 126–9
Payne, A., 38, 49
pedagogical norm, 158–161
Perry, T., 156
Philadelphia English, 23–32, 35–8, 40, 45, 142
Piaget, J., 166
Pienemann, M., 167
Pinker, S., 4, 107, 111–18, 122, 124–5, 128, 129
Poplack, S., 56
Preston, D., 16, 21, 28, 30, 70, 77, 90, 92, 99, 139, 142, 150, 152
probability matching, 39, 77
production model (of sentences), 12, 81–4, 87, 92, 98, 106–7, 136
property theory (of language acquisition), 6, 7
prototype categories, 102–4, 108, 193n
prototype schemas, 106–8; compatibility with connectionist networks, 109–10, 126–9; in syntax/semantics, 111–26

Q
Quebec French, 56–8, 161–2, 165
Queller, K., 172

R
r deletion, 14, 18, 134
Rand, D., 15
Regan, V., 49, 54–59, 146–7, 149
regular sound change, 37–8, 147, 187
Rehner, K., 49, 56, 58, 154, 162
relative clauses, 88–9, 168–9, 193n
resultative construction, 116
Rickford, A. E., 155–6, 162–3
Rickford, J. R., 135, 155–6, 162–3
Rips, L.J., 103
Robinson, G. H., 77
Robinson, J. S., 66
Romaine, S., 18, 20–1
Rosch, E., 102, 181
Russell, B., 3

S
saliency, principle of, 62, 67, 68, 71, 73
Sankoff, D., 15, 17, 20, 22
Sankoff, G., 17, 26, 55, 66
Savignon, S. J., 153, 181
Schieffelin, B.B., 158
Schiffrin, D., 64, 66
Schumann, J., xiii, 51
Schwartz, B. D., 6
semantic domains, 94–6
Shi, E., 124
Shibatani, M., 116
Shirai, Y., 63, 68, 72
Shuy, R., 25, 31, 54
Slobin, D. I., 64, 107, 177–8, 180, 187
Smith, B. H., 194n
Smith, F., 137
Smolensky, M., 9
Snow, C., 181
sociolinguistic competence, xv, 154, 157, 162, 193n
sound change (see language change)
Southern English, 24, 33, 44, 54, 155, 165
speech community, xiv, 12, 17, 18, 32–3, 40–3, 134–5
Spelke, E., 4
Spivey, M.J., 88–91, 127–8
Spolsky, B., 137
Stabler, E., 10
stereotype (linguistic), 44–5
Strict Constructivism Hypothesis, 112
style: of speaking, 13, 14, 16, 18, 24–32, 35, 39, 51–2, 55–6, 59, 90, 133–5, 138–52, 154, 161, 164, 166; vernacular, 14, 134–5, 138, 140, 144, 152, 157
style axiom, 142
style shifting, 25, 30, 39, 139, 140–51, 163
style tree, 27–8
Swain, M., 98, 153
symbolic structures, 104–6
syntactic templates, 88, 170

T
Tajfel, H., 163
Talmy, L., 178–9
Tarrallo, F., 168
Tharp, R., 166
Thompson, G. L., 149–50
Townsend, D.J., xv, 11, 12, 86, 92, 170
transition problem of language acquisition (see language change)
Trudgill, P., 24–5
Tzeltal, 189
Tzotzil, 189–90

U
uniform constraints assumption, 17, 18, 20
Universal Grammar, xiii, 5, 39, 128

V
Valdez, G., 182
Valian, V., 10
Valdman, A., 158–62, 167
Varbrul program, xiii, xiv, 15–17, 23, 26, 28–30, 41, 50, 51, 55, 61, 66–72, 78, 86, 91–5, 98, 142, 145, 146, 149–50
variable rules, 12–16, 50, 86, 92, 98, 147, 152, 168; as prototype schemas, 110, 126–9; in artificial language, 77–81; objections to, 17–20; logical status of, 21–2; scope of, 17
variation theory, xiv–xvi; history of, 12–21; related to cognitive linguistics, 102, 110, 122, 126–9; related to connectionism, 86, 99, 126–9
Vendler, Z., 62–3
verbs, class II irregular, 108, 126–7
vernacular language, 57, 58, 154, 156–7, 161–3, 166, 190
vertical variation (in interlanguage), 49–51, 53, 61, 135, 138, 147
vowel formants, 34, 36
Vygotsky, L. S., 165, 166

W
Weinreich, U., 34, 36, 45, 183, 185
wh- questions, acquisition of, 50
Wilkins, D. A., 153
Wittgenstein, L., 102
Wolfram, W., 12, 17, 25, 31, 32, 49, 51–4, 56, 59, 61, 62, 66, 67, 68, 71, 73, 165
Wood, D., 167

Y
Young, R., 51, 61, 72, 145–6

Z
zone of proximal development, 166–7