The Growth and Maintenance of Linguistic Complexity
Studies in Language Companion Series (SLCS)

The SLCS series has been established as a companion series to Studies in Language, International Journal, sponsored by the Foundation "Foundations of Language".

Series Editors
Werner Abraham, University of Vienna
Michael Noonan, University of Wisconsin, Milwaukee

Editorial Board
Joan Bybee, University of New Mexico
Ulrike Claudi, University of Cologne
Bernard Comrie, Max Planck Institute for Evolutionary Anthropology, Leipzig
William Croft, University of Manchester
Östen Dahl, University of Stockholm
Gerrit Dimmendaal, University of Cologne
Ekkehard König, Free University of Berlin
Christian Lehmann, University of Erfurt
Robert Longacre, University of Texas, Arlington
Brian MacWhinney, Carnegie-Mellon University
Marianne Mithun, University of California, Santa Barbara
Edith Moravcsik, University of Wisconsin, Milwaukee
Masayoshi Shibatani, Rice University and Kobe University
Russell Tomlin, University of Oregon

Volume 71

The Growth and Maintenance of Linguistic Complexity
by Östen Dahl
The Growth and Maintenance of Linguistic Complexity
Östen Dahl
Stockholm University

John Benjamins Publishing Company
Amsterdam/Philadelphia
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data

Dahl, Östen
The growth and maintenance of linguistic complexity / Östen Dahl.
p. cm. (Studies in Language Companion Series, ISSN 0165-7763; v. 71)
Includes bibliographical references and index.
1. Complexity (Linguistics) 2. Language and languages. I. Title. II. Series.
P128.C664 D34 2004
401'.43-dc22    2004048607
ISBN 90 272 3081 1 (Eur.) / 1 58811 554 2 (US) (Hb; alk. paper)

© 2004 – John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.
John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands
John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA
Table of contents

Preface  ix

Chapter 1. Introduction  1

Chapter 2. Information and redundancy  5
  2.1 Why information is important  5
  2.2 Information theory for dummies  5
  2.3 Redundancy management  9
  2.4 Prominence management  11
  2.5 Inflationary phenomena  15

Chapter 3. Complexity, order, and structure  19
  3.1 Introduction  19
  3.2 Order  19
  3.3 Compression, complexity and patterns  21
  3.4 Attunement  26
  3.5 Emergence, reductionism and self-organization  27
  3.6 Emergence and emergentism in linguistics  33
  3.7 Complexity vs. cost and difficulty  39
  3.8 Complexity of languages  40
  3.9 Conceptual complexity  45
  3.10 Choice structure  46
  3.11 Linguistic patterns  50
  3.12 Linearity  51

Chapter 4. Languages as non-genetically inherited systems  57
  4.1 Introduction  57
  4.2 Memetics and linguistics  57
  4.3 Organisms, groups and ecosystems  62
  4.4 Genotypes, phenotypes and replication  65
  4.5 Life cycles  69

Chapter 5. Aspects of linguistic knowledge  75
  5.1 Introduction  75
  5.2 Functions and intentions  75
    5.2.1 Functions vs. conditions of use  80
  5.3 Ritualization and conventions  86
  5.4 Entrenchment  89
  5.5 Piecemeal learning  96

Chapter 6. Maturation processes  103
  6.1 The notion of maturity  103
  6.2 Identifying mature phenomena in languages  106
  6.3 Naturalness, markedness, and Universal Grammar  115

Chapter 7. Grammatical maturation  119
  7.1 The notions of grammaticalization and grammatical maturation  119
  7.2 Pattern spread  121
  7.3 Pattern competition and pattern regulation  128
  7.4 The cyclical theory of grammaticalization  134
  7.5 Unidirectionality, directionality and problems of identity  140
  7.6 The rise and fall of semantic redundancy  147
    7.6.1 Alienability and obligatory possessive marking  148
    7.6.2 Locational constructions  153

Chapter 8. Pattern adaptation  157
  8.1 Introduction  157
  8.2 Reductive change  157
  8.3 The concerted scales model of grammaticalization  164
  8.4 Preservation of structural complexity  168
  8.5 Reanalysis and structural change  170
  8.6 Tightness and condensation  178

Chapter 9. Featurization  181
  9.1 Introduction  181
  9.2 Abstract features in grammar  181
    9.2.1 General  181
    9.2.2 Models of morphological structure  182
  9.3 The inflectional model  192
  9.4 Agreement: Where syntax, morphology and lexicon meet  196
  9.5 Can we do without abstract features?  203
  9.6 Parallels in phonology  205

Chapter 10. Incorporating patterns  209
  10.1 Introduction  209
  10.2 "Classical noun incorporation"  210
  10.3 Quasi-incorporation  216
  10.4 Lexical affixes  219
  10.5 Overview of NP-internal incorporation and quasi-incorporation  221
    10.5.1 Compound nouns  222
    10.5.2 Adjective + noun compounding  225
    10.5.3 Possessive NP constructions  236
    10.5.4 Co-compounds  239
    10.5.5 Titles and other proprial classifiers  242
  10.6 Incorporation of locational and directional adverbs/particles  244
  10.7 Referentiality constraints  246
  10.8 Incorporating patterns in the making?  248
  10.9 Explaining incorporation  252

Chapter 11. Stability and change  261
  11.1 Introduction  261
  11.2 Measuring stability  261
  11.3 Do languages become more complex over time?  276
  11.4 The dependence of language change on external factors  280
  11.5 Who is responsible for maturational changes — adults or children?  285

Chapter 12. Final discussion  289

Appendix A  297
References  303
List of abbreviations used in glosses  315
Language index  317
Author index  323
Subject index  327
Preface
Doctoral students are routinely told by their supervisors that they should delimit their topic, realize that they need not state their opinion on every important issue in linguistics and that they should not go on revising their manuscript indefinitely. When writing this book, I gradually came to see that I was myself violating most of the principles I had been preaching to my students. Finally, after deciding that I had already spent too much time on the project, I managed to accept that I had to produce a final version.

Almost two years ago, a not quite so final version was circulated. Several people were kind enough to read it, and I have benefitted greatly from their comments. My thanks go to Alvar Ellegård, Hans-Olav Enger, Maria Koptjevskaja-Tamm, David Minugh, and Mikael Parkvall. Later versions were scrutinized by Anne Markowski and David Minugh, who pointed out infelicities in my English and many other errors of various kinds. Thanks are thus due to them and to Gunnar Eriksson and an anonymous referee for valuable last-minute comments. At this point, the phrase "all remaining errors are my own" is expected; however, in the context of this book, it should be noticed that it has been ritualized to the point of almost entirely losing its informational value (see further Sections 2.5 and 5.3). I also think that some responsibility for remaining errors should be taken by the well-known software company whose programs sometimes pretend they are more intelligent than they in fact are.

An important source of inspiration is the daily contact with one's colleagues in one's own university and in other places. My home base, the Department of Linguistics at Stockholm University, provides a friendly and harmonious environment with a truly interdisciplinary spirit: for the work presented here, it has been particularly important to be able to communicate not only with people from my own group but also with colleagues in phonetics, sign language studies, and computational linguistics. At a relatively early stage of the work on this book, I spent two months as a visitor at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, which offered ideal working conditions and many valuable contacts with colleagues. The research project "Meaning and Interpretation", sponsored by the Bank of Sweden Tercentenary Foundation, provided possibilities of creative interaction with scholars from several disciplines. A personal grant from the same foundation, for the study of grammaticalization processes in
Scandinavian vernaculars, has given me more time for research than I would otherwise have had. For the final stage of the preparation of the manuscript, I have received financial support from the Swedish Research Council.

The preface is usually the last thing you write, and the hardest decision is whom you should thank. I have chosen a restrictive option, mentioning by name only those who have been directly involved in the writing process, and acknowledging specific information in relevant places in the book. To all others, without whom this book would not be possible, a collective thanks according to the Scandinavian principle: "ingen nämnd och ingen glömd", that is, "nobody mentioned and nobody forgotten".
Chapter 1
Introduction
The following everyday English sentence is a suitable point of departure for illustrating the kind of linguistic complexity that I shall discuss in this book:

(1) I called her yesterday

The finite verb in (1), called, consists of two morphemes, the verb stem call and the inflectional suffix -ed, pronounced [d], which, as grammars tell us, is a marker of past tense. Grammars may also tell us that the function of past tense is to indicate that the event related in the sentence took place in the past. However, one may ask what the point is of including such a morpheme in (1), given that the past time reference follows from the time adverb yesterday. Indeed, in many languages, there would be no counterpart to the English past tense marker in a similar sentence. Consider e.g. the following Kammu sentence,1 where the verb is not inflected, and there is no other morpheme that could be identified as a tense marker:

(2) Kammu
    Ò pìp k`66 knc6 neèy.
    I meet him yesterday
    'I met him yesterday'

In English, deleting the tense marker in (1) yields an ungrammatical sentence — English grammar thus forces the speaker to use a longer expression than would seem to be called for, from the point of view of languages such as Kammu, even if the addition of one segment to the verb stem does not look like a very costly embellishment of the sentence. No doubt, the existence of obligatory past tense marking also makes English more complex than it would otherwise be. However, it now turns out that English often manages to mark past tense without adding any extra segment at all. Consider the following sentence pair:

(3) a. I see him now
    b. I saw him yesterday

In (3), the present and past tenses are differentiated through an alternation of the stem vowel of the verb, which means that the two verb forms are the same length.
1. The sentence originates from the questionnaire material collected for Dahl (1985).
Here, the obligatory past tense marking makes English more complex, but not more "verbose" — the term I shall employ to describe grammatical constructions whose expression is longer than appears necessary from a cross-linguistic point of view.

Both the regular (historically: "weak") and irregular (historically: "strong") English verb conjugations exemplified here are of a venerable age, going back at least two millennia, to the earliest attested stages of Germanic languages, and in the case of the strong verbs, probably at least twice as far back. According to one widely-accepted hypothesis, the weak conjugation originates in an auxiliary construction. The story of the vowel alternation found in the strong conjugation is more complicated and partly obscure. What is important here is the fact that grammatical patterns of this type are generally the result of long developmental chains that may stretch over millennia, and may thus be used as illustrations of the general thesis that linguistic phenomena have life cycles in the sense that they pass through a number of successive stages, during which they "mature", that is, acquire properties that would not otherwise be possible. Such developmental chains have been studied under the rubric of grammaticalization, commonly defined as a process by which lexical items become grammatical markers. In this book, I look at grammaticalization in the perspective of what I call maturity — mature linguistic phenomena being those that presuppose a non-trivial prehistory: that is, they can only exist in a language which has passed through specific earlier stages. Grammatical maturation — processes that give rise to phenomena that are mature in this sense — in general adds to the complexity of a language: hence this book is about the genesis and growth of linguistic complexity.

Human languages are complex systems, not least because they need to express complex thoughts. The complexity I am interested in here, however, does not so much concern what one can say in a language — the expressive power of the language — as how it is said. Complexity is here seen, not as synonymous with "difficulty" but as an objective property of a system — a measure of the amount of information needed to describe or reconstruct it. But it is not enough for complexity to arise in a language — it must also be preserved over time in the transmission of the language from one generation to the next: hence this book is also about the maintenance of complexity, or the stability of mature phenomena.

Structure of the book. A brief "road map" for this book looks as follows. In the first five chapters, I attempt to set the stage for the treatment of linguistic complexity and maturational processes in grammar. Thus, in Chapters 2 and 3, I introduce and discuss some general mathematical and/or philosophical notions of relevance to the topic of the book. The most salient among those is, of course, complexity, but it turns out to be easier to start with another notion, namely that of information, to which Chapter 2 is devoted. After introducing the mathematical notion of information, and the closely related notion of redundancy, I turn to their application to
linguistics, under the headings of "redundancy management" and "prominence management". I also draw some parallels between linguistic and economic phenomena. In Chapter 3, I turn to notions such as complexity, order, and patterns, discussing them both from a more general point of view and from a linguistic perspective. I devote a significant part of the chapter to the term "emergence", in order to show that its use in linguistics differs in crucial ways from that found in other disciplines. I also make a number of conceptual distinctions relating to complexity as applied to linguistics, trying both to keep it apart from other notions such as "difficulty" and "cost" and to sort out what dimensions of complexity apply to language. In the last part of the chapter, I look at linguistic structure and introduce the notions of linearity and verbosity as a way of speaking of complexity (or the lack of it) in grammar.

In Chapter 4, I look at language from an evolutionary perspective — trying to see what unites and distinguishes a "non-genetically inherited system" such as a language from the genetically inherited systems that are the object of study of evolutionary biologists. Parallels between linguistics and biology are found not only in phylogeny but also in ontogeny — as already noted, the idea of "maturational processes" in grammar implies that grammatical entities pass through series of stages similar to those that an organism goes through in its life cycle. This notion is therefore discussed at some length in the final section of the chapter.

According to the view advocated in this book, languages are abstract "information-theoretic objects", which are, however, acquired and used by concrete human beings. In Chapter 5, I discuss notions such as function and intention in their relation to language, and some problems concerning the acquisition and storage of linguistic knowledge.

Chapter 6 introduces the notion of maturity. After considering what linguistic phenomena can be regarded as displaying maturity and how to identify less mature synchronic language states, the notion of maturity is compared to concepts such as naturalness and markedness, which played an important role in much of 20th century linguistics.

Chapter 7 deals with the components of grammatical maturation processes (i.e., basically what is usually called "grammaticalization") and looks in detail at pattern competition and pattern regulation. Some general properties commonly associated with grammaticalization are also discussed, such as cyclicity and unidirectionality. The other main component of grammatical maturation, pattern adaptation, is treated in Chapter 8, together with notions such as reanalysis and condensation.

Chapter 9 brings up a particular aspect of grammatical maturation, the genesis of abstract grammatical features and inflectional systems. The non-linear character of those systems is compared to non-linear aspects of phonology. Chapter 10 discusses the phenomena of compounding and incorporation, in particular their diachronic aspects.
In Chapter 11, I consider grammatical maturation in an “ecological” perspective, discussing the stability of grammatical systems and the factors underlying changes of different kinds. Finally, in Chapter 12, I try to tie the loose ends together.
Chapter 2
Information and redundancy
2.1 Why information is important

Although the mathematical notion of information, first formulated by Claude Shannon in 1947 (Shannon (1949: Ch. 2)), may seem alien to linguists, it has obtained a central status in many branches of science, culminating in the view that information is the stuff of which the universe is made (Fredkin (1992a, 1992b)). In this book, I shall try to show that notions from information theory are of direct relevance to linguistics, as well, particularly to the issues of language complexity and processes of linguistic maturation. This chapter provides the background notions necessary for the ensuing discussion, with a particular focus on the role of redundancy.
2.2 Information theory for dummies

In this section, I shall explain as much of the mathematical notion of information as is possible with a minimum of mathematics, for the benefit of those among my readers who are as formula-challenged as I am myself. Those who are already familiar with the notion may quite happily skip this section.

Consider a simple combination lock like the ones that are sometimes found on suitcases:

[Figure: a three-dial combination lock showing the digits 3-7-4]
Such a lock with three digits has 10 × 10 × 10 = 1000 possible combinations. This is not particularly safe. Someone who really wants to break into the suitcase can go through the different combinations fairly quickly. A four-digit lock is obviously safer, since there are ten times as many combinations, i.e. 10 × 10 × 10 × 10 = 10,000:

[Figure: a four-dial combination lock showing the digits 3-4-7-5]
But you could also construct a safer combination lock by permitting not only digits but also letters in the combination, yielding 36 × 36 × 36 × 36 = 1,679,616 combinations:

[Figure: a four-dial combination lock with digits and letters, showing 3-B-7-Z]
We can see a combination lock as an example of a system that can be in a number of different states — it has a state space. The properties of the state space are determined by the choices that are inherent in the system — in the case of the combination lock, each position defines one choice point, and at each choice point there are a certain number of choices. Systems of this kind — of which there are of course an innumerable variety — form the basis for information theory.

Each combination in the lock example is essentially a string of symbols chosen from some alphabet. Sets of such symbol strings, or "string sets" for short, are convenient examples of the kind of systems just mentioned, not least in a linguistic setting. We may abstract from concrete manifestations in the physical world (such as combination locks) and speak of properties of string sets in general. But, as in the examples above, we may put different restrictions on the members of a string set; for instance, we would normally choose a specific alphabet (the set of symbols allowed in the strings) and determine a maximal string length. The alphabet that the symbols are chosen from and the maximal length of strings together determine the number of possible strings that the set can contain. In the simplest case, represented by combination locks of the kind described above, this number corresponds to the size of the state space of the system and also determines one of its fundamental properties — what is often referred to as "uncertainty," but could perhaps better be called "guessing difficulty" — that is, the probability that a random guess will match a given member of the set. This is also one of the several senses of the term "entropy". The guessing difficulty is easily calculated as 1/n, where n is the number of strings in the set, provided that the strings are equally probable. (If they are not, we first have to calculate their average probability.)

Imagine now a simple "guessing game" of the following kind. There are two players, and one of them — Alice — chooses a 5-digit string of 0's and 1's that the other player — Bob — has to guess. Examples of such strings are 00100 and 10110. From the start, there are 2 × 2 × 2 × 2 × 2 = 32 possible strings and the probability of each of them is 1/32. Whenever Bob makes a false guess, Alice gives him a tip: she identifies one of the digits in the string. At each step, then, Bob's task becomes less difficult: the number of digits that he has to guess decreases. Denoting the unknown letters in the string as *, we can illustrate what happens as follows:

***** → 1**** → 1*0** → 1*0*0 → 1*010 → 11010
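The game is easy to mimic in a few lines of code. The following is a minimal sketch (my own illustration, under the assumption stated in the text that all 32 strings are equally probable): each tip fixes one digit and thereby halves the set of candidate strings, so each tip reduces the guessing difficulty by the same factor.

    # Each of Alice's tips fixes one digit, halving the set of candidates.
    possibilities = 2 ** 5                      # 32 five-digit binary strings
    for tips in range(6):
        remaining = possibilities // (2 ** tips)
        print(f"after {tips} tips: {remaining} candidate strings, "
              f"each with probability 1/{remaining}")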
What Alice gives Bob is information, which information theorists like to define as "reduction of uncertainty". One advantage of this perspective on information is that it provides us with a possible way of measuring it. When Alice gives Bob the first tip, she essentially reduces the five-digit string to be guessed to a four-digit one. The information contained in her tip then equals the difference in uncertainty between a five-letter and a four-letter string — the number of possibilities is reduced from 32 to 16 and the probability of each possibility increases from 1/32 to 1/16.

But there is a more convenient way of counting here, using the same technique as the one used in the old game "Twenty Questions", which is based on the fact that any set of choices can be "recoded" as a sequence of binary choices, expressible as yes-no questions. Thus, to identify a number between 1 and 8, it is always enough to ask three questions along the following pattern: "Is it larger than 4?", "Is it larger than 6?", "Is it 7 or 8?". Instead of stating the number of combinations, we can then simply give the number of binary choices (yes-no questions) necessary to identify an arbitrary member of the set.1 One binary choice — under the name of bit — is in fact the basic unit in which information is measured in information theory.
1. Mathematically, this is of course the power to which you have to raise 2 to get the number of strings in the set, or else the 2-base logarithm of that number.
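The recoding into binary choices can also be sketched in code (again my own illustration, not anything in the original): halving the candidate range at each question identifies any number from 1 to 8 in exactly three yes-no questions, i.e. log2 8 = 3 bits.

    def questions_needed(target: int, lo: int = 1, hi: int = 8) -> int:
        """Identify target by repeated halving; return the number of questions."""
        count = 0
        while lo < hi:
            mid = (lo + hi) // 2
            count += 1                 # one yes-no question: "Is it larger than mid?"
            if target > mid:
                lo = mid + 1
            else:
                hi = mid
        return count

    assert all(questions_needed(n) == 3 for n in range(1, 9))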
Information theory does not care about the content of information that is conveyed, only the amount by which uncertainty is reduced. To avoid possible misunderstandings, I shall therefore use the phrase informational value (of a message or an expression) to denote this amount.

In the example above, Alice's messages to Bob conveyed information about an object (a string of symbols). For most of us, this would probably be the canonical use of the term "information" — information is always about something. However, in information theory as developed by Shannon, the primary object of study is really how signals (e.g. radio signals) are transmitted in a communication channel, and the focus is on how easy or difficult it is to predict what is going to be transmitted. Shannon thus identifies the information in a signal with its "uncertainty" or "guessing difficulty", as defined above. The less probable a signal is, the more information is transmitted by it. This is sometimes called "syntactic information", as opposed to "semantic information", which would be "information about" something. Again, the terminology is quite confusing — most people (including most linguists) probably wouldn't see "syntactic information" as information at all. However, Shannon's "syntactic information", or rather its inverse, shows up quite frequently in linguistic literature under the guise of "predictability": linguistic items are said to have "high predictability" when the amount of syntactic information is small. But the term "predictability" also allows for an equivocation between the two kinds of information. Consider the well-known saying that man bites dog is a piece of news whereas dog bites man is not. The principle that the saying illustrates is that events that happen often are not newsworthy — that is, to be worth publishing, a piece of news should not have too high a probability, that is, too low an informational value. (I assume that this would fall under Grice's Maxim of Relevance.) But this leads to an apparent paradox: the heading man bites dog may well have a higher frequency in newspapers — and hence be more predictable — than the heading dog bites man, although the latter represents a more frequent event. A message with a low semantic information value may thus be less probable than one with a high semantic information value.

When a linguistic item increases its frequency, this involves both syntactic and semantic information, but in different ways. A precondition for the rise in frequency is often that such relevance considerations as were mentioned in the preceding paragraph are ignored — one uses the item whether it conveys any new (and interesting) information or not. The predictability of the item increases, but that is rather an automatic consequence of the rise in frequency. These two phenomena are sometimes conflated in the discussion.

As we all know, objective probabilities are one thing, subjective expectations another. I may know that it rains two days out of three in my home town and still be convinced that the weather will be nice tomorrow. The reasons for my conviction may be rational — I have read the weather forecast for tomorrow — or irrational — I'm organizing a garden party and don't want to have it spoiled. Obviously, in linguistic communication, subjective expectations may be more important than objective probabilities, but the latter are easier to measure and often easier to speak about, so linguists tend to pretend there is really no difference.

In Douglas Adams' The Hitchhiker's Guide to the Galaxy, one major theme is the quest for the answer to "the great Question of Life, the Universe and Everything". Finally, the answer is revealed to be "Forty-two". At this point, however, it is realized that nobody knows what the question is! This already famous answer without a question can be used to illustrate one rather drastic limitation of the notion of information in information theory. If you visit a foreign country, it is rather essential for your survival to know if they drive on the left or on the right in that country. This is a binary choice, that is, one bit of information. But you will not be able to put that knowledge to any use if you cannot distinguish right and left or if you don't understand the notion of a traffic rule. In other words, the single bit of information presupposes other knowledge, and exactly how much information that involves is a question that is not so easy to answer. But the problem turns out to be even wider. In fact, most stored information in the world is pretty useless as long as you do not have a "key" that tells you how to apply it to the world. And since the key itself may be seen as information, you would seem to need a key to apply the key — a potential infinite regress. An
essential difference between the storage of information or knowledge in living beings like ourselves, on the one hand, and non-living “information containers” such as books, on the other, is that a living being normally carries both the information and the key to it — that is, can apply the knowledge directly.
2.3 Redundancy management

To transfer a certain amount of information, one must spend a certain amount of communicative resources. This amount may be distributed over time in different ways depending on the capacity of the communication channel — its bandwidth, as it is now fashionable to say. That is, in a broadband channel, a message may be more compressed in time, since a larger amount of information can be transferred in a given time interval.

The term "noise" is used in information theory for anything that interferes with the transfer of information, including of course noise in its everyday sense. Most information channels are noisy in one way or another, resulting in loss of information on the way. A message may thus arrive in more or less garbled form, but may still be understood if it contains enough redundancy for the message to be reconstructed by the receiver. A message is redundant if there is a less complex message that could — in principle — transfer the same amount of information, that is, if more communicative resources are spent on it than are theoretically necessary for its successful delivery. A sender can manipulate the redundancy level by spending more or less energy and/or time on the message. This is what we can call redundancy management (the term is used by Keller (1994) as a translation of the term Redundanzsteuerung, taken from Lüdtke (1980)). By increasing redundancy — for instance by repeating the message — the sender can reduce the risk of faulty delivery but will simultaneously raise her own costs for sending it. A cost-benefit analysis of a prospective information transfer will therefore have to involve relating the expected noise level to the marginal costs for adding extra communicative resources, as well as extraneous factors such as the expected negative consequences of a delivery failure — that is, in more everyday terms, how important it is that the message is actually delivered.

Developing techniques for safeguarding information transmission against errors with a minimal increase in transmission cost has been one of the major motivations for developing information theory. One of the everyday results of these efforts can be seen in the "checksum digits" that are added to bank account and credit card numbers and other numerical codes. The checksum digit is calculated from the other digits by some more or less complicated algorithm, for instance by adding all the digits together and taking the last digit of the result. Given a number such as 1945 you obtain 1 + 9 + 4 + 5 = 19, the last digit of which is 9 — your
checksum digit, so the new number will be 19459. Suppose now that this number is sent by cable and there is an error of transmission so that you receive 29459 instead. By performing the checksum calculation on the first four digits in this string you will immediately see that there is an error, since 2 + 9 + 4 + 5 = 20. Obviously, some errors will still pass unnoticed. For instance, if there are errors in more than one digit, they may cancel each other out (say, if you receive the string 28459). But the addition of the checksum digit makes it possible to discover about nine out of ten transmission errors. More sophisticated algorithms make it possible not only to tell that an error has been made but also to reconstruct what the original message was. I will not try to go into them here, but rather point to a general principle which is of importance for linguistic redundancy management, as well.
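The simple digit-sum scheme just described takes only a few lines of code. The sketch below is my own illustration of the text's algorithm (the function names are hypothetical); it reproduces the 1945 → 19459 example, the detected error 29459, and the undetected 28459:

    def checksum_digit(digits: str) -> str:
        """Last digit of the sum of the digits."""
        return str(sum(int(d) for d in digits) % 10)

    def with_checksum(number: str) -> str:
        return number + checksum_digit(number)

    def looks_valid(received: str) -> bool:
        return checksum_digit(received[:-1]) == received[-1]

    assert with_checksum("1945") == "19459"   # 1 + 9 + 4 + 5 = 19, last digit 9
    assert not looks_valid("29459")           # single-digit error is detected
    assert looks_valid("28459")               # two errors cancel out: passes unnoticed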
Suppose that you want to transmit a string of symbols. In the simplest possible code, each input symbol corresponds to one symbol in the output. But such a code is very sensitive to noise. If one of the symbols is somehow incorrectly transmitted, there is no way of reconstructing it. If, on the other hand, the information about an input symbol is spread over a longer stretch of the output, that is, one input symbol is allowed to influence several output symbols, a local disturbance may not be fatal. The principle, then, is that spreading out information is beneficial to safe transmission.

As an example of how this applies to spoken language, consider the phenomenon of co-articulation in phonetics. Co-articulation means that one phoneme influences the way another phoneme in its neighbourhood is articulated. For instance, the syllables [siː] and [suː] differ (among other things) in that the second vowel is pronounced with lip-rounding. But if you try to pronounce them in sequence, you will notice that the [s] is also different: the lip-rounding in [suː] starts as soon as the initial segment begins, rather than after it. Co-articulation is probably most readily thought of as having to do with ease of pronunciation or inertia of the speech production apparatus. However, it may in fact also have beneficial effects on the reception of the message in that cues identifying a certain segment are spread out over a longer stretch of time, which makes the identification of the segment more immune to local disturbances. I shall argue later that a similar kind of information spread-out is important in linguistic maturation processes.

An essential point to be signalled even at this preliminary stage is that the spread-out of information from a segment of the signal to its neighbours means that the mapping from input to output — and thus the system as such — becomes more complex. Ease of articulation and understanding is thus bought at the expense of the complexity of the system.

Notice that it is not the case that just any way of increasing the redundancy of an expression will do. For instance, merely adding an arbitrary digit to a bank account number would not contribute to error detection in the way a checksum digit does. The reason, I think, lies in the fact that the checksum digit system creates a patterned and therefore "smart" kind of redundancy, whereas adding a random
digit does not introduce any patterning and is thus "dumb". (The notion of pattern will be discussed in detail in Chapter 3.)

Although the two examples just given differ in "benefit" but not in "cost", error detection systems may also give the same "benefit" at different "costs". The optimal system does not add to the length of the signal but rather exploits the redundancy that is already there. In co-articulation, for instance, spreading the realization of a feature over several segments does not make the utterance any longer, it merely enhances the difference between different messages.

There is a further distinction of some importance for what follows, namely that between user-level redundancy management and system-level redundancy management. I noted above that the sender may reduce the risk of faulty delivery by adding redundancy. However, it is not the case that bank customers make a decision whether to add the checksum digit to their account number or not every time they fill out a form at the bank or send money over the Internet; rather, the bank requires them to do so. Thus, the decision is taken at the level of the system rather than on-line by the user. Analogously, much of the redundancy in the linguistic utterances we produce is required by the grammar of the language. For instance, in the English sentence These five books are interesting the plurality of the books is signalled grammatically no less than three times in spite of being clear from the numeral five — by choosing plural forms of the demonstrative this, the noun book, and the copula. This may seem dysfunctional, but need not be. In the case of bank account numbers, it should be fairly clear that the decision whether to use checksum digits cannot be left to the bank clients, who are likely to systematically underestimate the risk of errors. In a communication situation, the introduction of redundancy by a system-level rule may be in the interest of the receiver, as a safe-guard against a bias towards excessive economy on the part of the sender. Furthermore, an on-line cost-benefit calculation is in itself costly. It may in fact be more efficient to use an automatic redundancy-increasing principle that does not burden the processing system as much. All this restricts the applicability of "maxims of action" (Keller (1994: 90–95)) for explaining linguistic behaviour and language change.
2.4 Prominence management

Redundancy management, as outlined in the preceding section, can be applied not only to messages as wholes but also to their parts. A sender who wants to maximize the chances of safe delivery and minimize costs will try to distribute the communicative resources spent on a message over its parts in a way that corresponds to their information content (in the above sense) and their extraneous importance. This means assigning a larger part of the resources at hand to components that either contain large amounts of information or whose delivery is critical for one reason or other.
At this point a complicating factor comes into play. In human communication not only the sender but also the receiver is a rational agent. As a consequence, the sender may influence the receiver's activities in various ways, in particular the receiver's attentional and inferential processes. Of these, attention may be thought of in terms of resources: the time and energy a receiver is willing to spend on the message or a specific part of it. The receiver's inferential processes, on the other hand, are crucial in that they help in recovering the information in the message, using not only the incoming signal but also the receiver's knowledge about the world and assignment of subjective probabilities to different possible messages — her expectations, in short. The sender, in turn, tries to form a hypothesis about the receiver's expectations in order to optimize the message with respect to preservation of energy and ease of comprehension. But in doing so, the sender also guides the receiver by signalling what parts of the message to pay special attention to. Basically, what this means is that those parts of the message which are more difficult or important should be enhanced and those parts that are easily recoverable or whose delivery is less critical can be reduced. To introduce a term that is parallel to "redundancy management", I will call this prominence management.

In actual practice, it is often difficult to keep redundancy and prominence management apart. Highly unexpected news items also tend to be those that are worth paying attention to. Redundancy and prominence management thus have one major factor in common: they both operate on listeners' expectations.

Let us consider an everyday example of what I have just said, a simple shopping-list:
milk
bread
cheese
orange juice

Suppose Alice writes this list for Bob but knows that he is somewhat careless in reading instructions. She may for instance be afraid that he will just buy the things that he buys every day without really checking what is new. Or she may know that he tends to forget the cheese. Or, if they have Aunt Agatha in the house, and she will be angry if she does not get cheese for breakfast, it is particularly important to remember that item. There are a number of ways in which Alice may draw attention to an item on the list:
[Four versions of the list, with "cheese" highlighted in turn by capitalization (CHEESE), by underlining, by larger type, and by an exclamation mark (cheese!)]
Capitalization, underlining, larger characters, and exclamation marks are all devices that enhance the probability that the receiver will notice the item in question. However, they differ in the ways in which the goal is attained. The fact that the size of an item contributes to its chances of being noticed is grounded in basic principles of information theory and perceptual psychology. The exclamation mark, on the other hand, is a conventional device whose actual manifestation is at least partly arbitrary or language-specific — the corresponding Spanish list would for instance say ¡queso!, with an upside-down exclamation mark at the beginning. In other words, the users of a language may have learnt certain conventional means of signalling that a message or a certain part of it is worth extra attention. The person who primarily benefits from this fact is the sender, who does not have to actually spend the resources that would in themselves guarantee that the message goes through; the only requirement is to give the appropriate signal — one which gets the receiver to do the work. In most cases, at least in spoken language, devices used for drawing attention to or highlighting a certain part of an utterance — what is traditionally called "emphasis" — are seldom entirely conventional. Most of them — notably prosodic prominence, repetition, and salient placement in the utterance2 — rely on "natural" redundancy-increasing mechanisms.

As mentioned above, prominence management involves not only enhancing certain parts of the signal but also reducing those parts that are easily recoverable or whose delivery is less critical. Suppose that there are a number of items that Bob buys every day, such as milk and bread. Alice may then choose to abbreviate the words milk and bread to, say, m&b. She may also write "the usual stuff" or leave these items out altogether, if she relies on Bob remembering them.

Perhaps of greater importance to the topics to be discussed in this book is the possibility that not all the elements in a message are informationally autonomous. The simplest way of explaining what I mean by that is to say that informationally autonomous units express information that constitutes an independent and non-default choice between different possibilities. They are the ones that are worth paying attention to
2. Languages may differ in whether for instance the first or the last position in a sentence is used for highlighting, but that it is these positions rather than any others that tend to be exploited for these purposes is certainly no accident — it is also in line with the results of research on perception in general.
at all. An expression may fail to qualify as informationally autonomous for several reasons. One is that the information it expresses is derivable from some other expression or from the context. In the expression pregnant female, the word female does not contain any autonomous information that is not derivable from the word pregnant. Another is that the expression makes up a conceptual unit together with another expression. For instance, in the combination Tom and Jerry, neither of the constituents is understood as being informationally autonomous, since the characters in question always appear together. Finally, the information expressed may be the default value of some parameter. In the expression drink coffee, drink is not informationally autonomous since it is the activity that one would expect in connection with coffee — other alternatives are certainly possible but less likely.

At least in languages like English and Swedish, informationally non-autonomous items are normally de-accented. Consider the following pair of sentences (where capitals represent sentence stress) to see how this works:

(4) Marie speaks fluent DANISH because she grew up in Denmark.
(5) Marie speaks fluent Danish because she grew up in ÅRHUS.

Denmark is the expected place for learning Danish, so Denmark is de-accented. Although Århus is situated in Denmark, we still couldn't guess that this is where Marie lived, so it retains the accent. Thus, an accented item signals that here is something that you shouldn't just allow to go in through one ear and out the other, because it contains non-retrievable information. Informational autonomy is important for the processes discussed in this book because it is a reason for an expression not to be integrated with its surrounding context.

At this point, I want to introduce the notion of rhetorical value, a term which may have to remain slightly vague. If it made sense to quantify and measure the amount of attention a person gives to a certain item, rhetorical value could be defined as the amount of attention that the listener pays to an utterance or one of its components. However, seen from the point of view of the speaker, we also have the intended rhetorical value, that is, the amount of attention that the speaker wants the listener to pay to an utterance or a component of it. What is particularly interesting for the themes discussed in this book is what we can call the conventional or system-level rhetorical value of a certain linguistic pattern. For instance, an exclamation mark is a signal of a higher-than-normal rhetorical value of an expression. By default, a complex utterance consists of a list of items of equal rhetorical value. In this section, we have seen that prominence management may be used to signal deviations from this normal level in either direction. We may thus divide utterance components into three broad categories with respect to their rhetorical value: high, normal and low. Later on, we shall see how the system-level rhetorical value of expressions decreases in the course of processes of language change.
2.5 Inflationary phenomena

The similarities and historical ties between language, both spoken and written, and monetary economies would appear to be obvious but have been surprisingly little discussed. In Bloomfield (1933), the function of linguistic communication is explicated through the (perhaps no longer wholly politically correct) story about Jack and Jill. Jill sees an apple on a tree and wants to eat it. Bloomfield points out that the existence of language relieves Jill from the necessity of climbing the tree: instead she asks Jack to do it. Thus, language is seen as a powerful tool for creating a division of labour in society. However, Bloomfield forgets to tell us what is in it for Jack in this deal. In fact, in another story, Jill might well have had to pay Jack for performing the service. In other words: language and money are both instrumental in creating a situation in which people do things for each other.

In history, the appearance of monetary economies and the appearance of systems of writing are usually closely connected. Moreover, the original function of writing is considered to have been bookkeeping. Thus, like money, the written signs "represented" property in a fairly concrete sense. Money and language are both powerful tools for handling the transfer of information and manipulating rights and obligations. After all, the traditional text on banknotes is no less than an explicit performative:

(6) I promise to pay the bearer on demand …

If I give my neighbours ten potatoes in exchange for five herrings, what is exchanged has a direct and practical use for the persons involved. If, instead, I give them two dollars, they will accept the money only because they rely on being able to exchange it for something that they need later on. Money only has a value as long as everybody thinks it does. The fact that a piece of paper with some text on it is worth a certain amount of money is thus conventional and arbitrary in much the same sense as language is. Moreover, as we know, this fact is a very shaky one: the value may vanish at any time — money is a "symbolic commodity".

One notion from economics that has a wider significance is inflation, a well-known phenomenon to most of us. Together with unemployment, inflation is one of the typical diseases of modern economies. However, inflationary processes are not restricted to the economic sphere alone. Consider for instance the English words gentleman and lady, which in their original meaning denoted persons from the nobility, but today are often used as synonyms of man and woman. Similar stories can be told about titles in many languages. In Swedish, a number of different words have been used for unmarried women, such as jungfru, fröken, mamsell; they all seem to have initially been used for women of high status, but have later become general titles for unmarried women and in some cases, they finally even acquired a
derogatory character. Intuitively, we may say that titles tend to lose their "value" over time, but exactly what is the parallel with money here? Many titles such as lord or professor are connected with a certain status in society; they guarantee the bearer certain rights and privileges and the respect of others. If, for instance, a king confers a title on one of his subjects, the effects are similar to those that would obtain if the king gave him or her a piece of land or a sum of money. But there is a crucial difference between the piece of land on the one hand and the title or the money on the other: the value connected with the title and the money is purely conventional. That is, there must be something in the world that corresponds to the title or to the sum of money, but what that is depends on a convention.

In some cases, the lack of a real-world counterpart to an object with a conventional value will lead to an immediate crisis. If I try to sell two hundred tickets to a theatre with one hundred seats I will quite soon be in serious trouble. When the relationship between the object and what it "buys" in the world is less direct, however, there is always a temptation to multiply the conventionally-valued objects to obtain a short-term gain. A king may thus buy the loyalty of a number of people by making them into, say, "Grand Dukes". But if the number of Grand Dukes in the country doubles, the value of that title is bound to decrease. Dilbert's boss in the strip above got the point.

Conferring a title, or doing other similar things, such as giving medals or bestowing orders, is usually "cheap" for the person who does it. Similarly, it is always tempting for someone who controls the issuing of banknotes in a country to get short-term advantages by printing more money. Such actions, however, are basically self-destructive in that the increase in the number of bearers of a title, or in the amount of money in circulation, influences the value of the "symbolic commodity", resulting in inflation.

Similar things occur in everyday communication. Thus, titles are not necessarily conferred by kings but are used by people all the time in talking to and about each other. Although the use of titles is normally governed to a large extent by conventions, there is often leeway for the choice between different ways of addressing or referring to people. Also, there is usually a
"penalty" for using a title that is too low, but less frequently a "penalty" for using a title which is too high. On the contrary, you may sometimes "buy" a positive reaction from someone by over-titling him or her. In fact, such over-titling is sometimes conventionalized. When academic titles were more commonly used in Sweden than they are today, it was customary to "promote" academics when addressing them. Thus, a person with the lower "licentiate" degree would quite regularly be called "Doctor". In the long run, however, such policies inevitably lead to the depreciation of titles and thus to the introduction of new ones.

The use of evaluative expressions like excellent and good may work in a similar way. A teacher may want to give her students positive feedback and so tells them their work is "excellent". But if such an expression is used indiscriminately, that is, if everyone is told their work is excellent, it loses its informational value, and if the teacher really wants to single out somebody as outstanding she has to use another expression. But the loss in informational value that the over-use of an expression causes also has consequences for redundancy management: the speaker will be less motivated to spend as much energy on producing it as before, which may lead to phonetic reduction. This is something that we shall be returning to quite frequently in this book.
Chapter 3
Complexity, order, and structure
3.1 Introduction
In this chapter, I shall discuss complexity and other notions connected to it, both in general and in their application to languages and linguistic structures. I shall start out by considering another notion, namely that of order. After looking at the contentious issues of emergence and reductionism, I turn to distinctions between different kinds of linguistic complexity, and then to questions about structure and patterns in linguistics. Finally, I introduce the notion of linearity.
3.2 Order

Consider the difference between an ant trail between two anthills and a human highway between two cities. In both cases, an observer who sees the trail or road from above will be able to see a multitude of objects moving along it, but in the case of the ant trail, the ants will be spread all over it, whereas the location of each car on the highway depends on what direction it is moving in. An extraterrestrial being that observes the highway from its spaceship would most probably draw the conclusion that such a situation cannot arise by chance — there must be something that constrains the movements of the cars.

This is an elementary illustration of the notion of order. If you have a set of toy cars and a race track, there are a large number of possible ways of placing those cars on the track. Only some of these ways conform to the principle, adhered to in all countries of the world, that vehicles should keep to one specific side of the road. These ways, then, constitute a set of "orderly" states.

Notice the direct relationship here to information understood as "reduction of uncertainty". Imposing an order on a system reduces its state space, and thus also the difficulty of guessing what state the system happens to be in at a given moment. In this sense, order equals information, and disorder equals entropy. If we think of a language as a set of pairs of meanings and forms, a maximally ordered language is one where every form has only one meaning and vice versa, and a maximally disordered language is one where any form can have any meaning. Obviously, real languages are somewhere in the middle, and to function properly as
a means of communication a certain degree of order is necessary. Even if we cannot give a global measure of the degree to which a language is ordered or disordered, in studying language change it may still be useful to consider whether the diachronic processes we are looking at increase or decrease order.

In general, order is less likely than disorder. The reason for this is simply that there are usually fewer ways of organizing things in an orderly fashion than in a disorderly one — there are fewer possible orderly states than disorderly ones. Lev Tolstoy expressed this principle in the famous opening sentence of Anna Karenina: "Happy families are all alike; every unhappy family is unhappy in its own way." Consider, for instance, linear ordering. If you think of the possible ways of ordering books in a library, only a tiny fraction of them would satisfy a librarian's idea of order. But the principle of the improbability of ordered states is very generally applicable and has, among other things, the important consequence that if things are, as it were, left to themselves, they are likely to change into a less ordered state. A librarian returning from a vacation may well find the books in a mess. In physics, what is essentially the same principle goes under the name of the Second Law of Thermodynamics ("entropy must increase in any closed system"), which seemingly predicts that we — together with the rest of the universe — are heading for total chaos. Happily, the principle does not exclude local "islands" of order and stability. In fact, those apparent counterexamples are what most branches of science are interested in, and linguistics is no exception, languages being prime examples of such islands.

The claim that orderly states are less likely than disorderly ones does not exhaust the notion of order, however. Remember that Tolstoy said that happy families are all alike. Likewise, all highways in a country are alike in the sense that cars are (almost always) driving on one and the same side of the road. Thus, we expect that the ordered states should be identifiable by conforming to some pattern. But the notion of pattern is tightly linked up with that of complexity, which will be treated in the next section.

First, however, a final observation on order: One important collateral principle of the improbability of orderly states is that order usually comes at a cost — as we all know from everyday experience. An orderly system has to be actively maintained; otherwise, it will tend to deteriorate, that is, return to disorder. This may look as if the system "prefers" to be in a disorderly state, suggesting an active striving away from order, but no such assumption is necessary, since it is an automatic consequence of the fact that a change is more likely to result in a less ordered state, these being more frequent to begin with. The principle applies also to complex systems like biological organisms, in which case the active maintenance is taken care of by natural selection, which thus serves not only to increase the fitness of organisms but also to uphold it.
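The point that orderly states form a small minority is easy to check by brute force. The following minimal sketch (the toy-car model and all names are my own illustration, simplifying the keep-to-one-side rule to "all cars on the same side"): with ten cars that may each be on the left or the right, only two of the 1,024 possible placements count as orderly.

    from itertools import product

    N_CARS = 10
    states = list(product("LR", repeat=N_CARS))     # every possible placement

    # "Orderly" states: all cars keep to one and the same side of the road.
    orderly = [s for s in states if len(set(s)) == 1]

    print(f"{len(orderly)} orderly states out of {len(states)}")   # 2 out of 1024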
3.3 Compression, complexity and patterns

Most of us are familiar with zipping or compressing utilities, which make it possible to squeeze voluminous computer files into more compact ones that, if we are lucky, fit onto a single diskette (a less acute problem nowadays than it used to be when diskettes were the only means of transporting files). How do such programs work? Without going into any technical details, it is fairly easy to see that most files, in particular text files, are more or less repetitious, and that in principle, if two parts of a file are identical, one could get rid of the second one by just making a cross-reference to the first. Other tricks that could help compress a file are also conceivable. It is reasonable to assume, however, that there is a limit to how much a file can be compressed. It is a relatively safe prediction that there will never be a program that compresses Encyclopaedia Britannica onto a 1.44 megabyte diskette.

Consider from this point of view the following two strings of digits, which are iterations of the shorter strings ‘185’ and ‘18578’, respectively:

(7) 185185185185185185185185185185185185
(8) 18578185781857818578185781857818578

Being iterations of strings of different length — three and five digits respectively — (7) and (8) clearly differ in compressibility: any compressed version of (8) would have to be longer than a string obtained from (7) by the same method. We might now claim that this reflects the fact that (8) is more complex than (7). Likewise, Encyclopaedia Britannica might be said to be more complex than this book, since compressing the former would yield a longer zip file than compressing the latter.

Notice now that we could see a zipped version of a file as a specification of the uncompressed file. Generalizing, the complexity of an object would then be measured by the length of the shortest possible specification or description of it. In mathematics, one tends to think in terms of algorithms rather than descriptions, and this measure is therefore often called algorithmic information content, i.e. the length of the shortest possible algorithm that can generate an object. An obvious problem is that there is no way of knowing that you have really found the shortest description of something, which means that discussions of complexity in this sense will have to remain rather abstract.

A second objection will lead us to some interesting conclusions. In particular, it brings us to a consideration of the notion of randomness, and in turn, to that of pattern. If we say that we measure the complexity of a string S1 in terms of the shortest possible string S2 that specifies it, we assume that there is such a shortest string, although we may not be able to identify it. However, what is the shortest possible string that specifies S2? The problem here is that we have already compressed S2 as much as possible. If there were a shorter string that specified S2, it would also specify S1, and that is counter to the assumption. It follows that the shortest possible specification of S2 is itself. In other words, there will be strings that cannot be further compressed and are their own shortest specifications. Such strings are random in the specific sense that they have no part that makes any other part more likely. In fact, mathematicians have proposed the property of non-compressibility as a definition of randomness. But the consequence is that random strings are always more complex than non-random strings of the same length, and it seems counter-intuitive to ascribe maximum complexity to a string like the following, which was generated by a random-number generator:

(9) 38856427659891533144853529007192205

The idea of a non-compressible string can still be useful, though — it allows us to get a better understanding of what a pattern is. We saw that the reason (7) and (8) were compressible was because they contained iterations of shorter strings — in effect, they are patterned. Conversely, (9) is a random — non-compressible — string because it does not contain any pattern. A pattern is thus something that makes it possible to obtain a specification of a string that is shorter than the string itself. More generally: “some object O has a pattern P — O has a pattern “represented”, “described”, “captured”, and so on by P — if and only if we can use P to predict or compress O” (Shalizi (2001: 12)). The same idea is expressed by Goertzel (1994: Chapter 2): “I define a pattern, very simply, as a representation as something simpler.” The efficiency of a pattern is then definable as the gain in simplicity that it produces.
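This can be tried out directly with a general-purpose compressor. The following sketch is mine, not the author’s; zlib is of course only a crude approximation of the shortest possible description, and the exact byte counts will vary with library version. It compresses longer versions of the three kinds of strings: compressed size grows from the three-digit pattern to the five-digit pattern to the patternless string.

```python
import random
import zlib

s7 = "185" * 200                                               # cf. (7)
s8 = "18578" * 120                                             # cf. (8)
s9 = "".join(random.choice("0123456789") for _ in range(600))  # cf. (9)

# All three strings are 600 characters long; only their patterning differs.
for label, s in [("(7)-type", s7), ("(8)-type", s8), ("random", s9)]:
    print(label, len(zlib.compress(s.encode())))
```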
Let us introduce a pattern in the random string (9) by replacing every fifth digit by 0, starting with the fourth:

(10) 38806427059890533104853029000192205

As it stands, the pattern in (10) is not very salient, so let us display it as a matrix instead:

(11) 38806
     42705
     98905
     33104
     85302
     90001
     92205

Now, we can clearly see that the matrix consists of a patterned part — the fourth column — and an unpatterned (random) part — the rest. Now compare (11) to (12), where every second 0 in the fourth column has been replaced by 1:

(12) 38806
     42715
     98905
     33114
     85302
     90011
     92205

Let us now make the patterns more conspicuous by replacing the numbers by different shades of grey:

[(11) and (12) repeated as grey-scale matrices, in which the fourth column stands out as a vertical band; figures not reproduced]
Note that we are inclined to perceive the fourth columns in the matrices as objects — as vertical bands of white, or white and grey, respectively. Yet in (10), we can only register zeros recurring at regular intervals. We may also note that physically, the white band in (11) is “nothing” — that is, it is an uncoloured part of the book page/computer screen that happens to be surrounded by coloured areas. (In fact, even this is not totally necessary: if one adjacent square happens to be white, we may extrapolate the band by filling out the sides of the rectangle.)

I think that this is generalizable. What we see as objects in the world is in the end nothing but patterns — islands of stability in what would otherwise just be chaos. Spatial contiguity and temporal persistence are factors that make it easier to accept these patterns as objects, but are not always necessary factors. Given this view, even living organisms, of which human beings are a particular case, can be thought of just as extremely complex patterns.

Patterns that are defined negatively, as the absence of something, are very common, and we often tend to see them as objects, forgetting that they really are “nothing”. For instance, Grand Canyon is in a sense nothing but the
absence of rock. This absence being patterned in an interesting way, however, we see Grand Canyon as “something” rather than “nothing”. We may thus formulate the slogan “to exist is to be a pattern”,1 which, if taken seriously, has rather interesting consequences. Rephrasing the definitions quoted above, a pattern is a way to simplify, or if we like, organize our picture of the world. A common argument in ontological discussions is to say that something doesn’t really exist because it can be regarded as an “abbreviatory device”. But in a sense patterns are just “abbreviatory devices”, and thus to exist is to be an abbreviatory device, or if we prefer, to be characterizable by an abbreviatory device.

Now, with the help of the notion of pattern we can obtain an alternative measure of complexity — what Gell-Mann (1994) calls “effective complexity”. That is, we measure the length of the specification not of the object as a whole, but of the totality of patterns it contains (Gell-Mann’s term is the “set of regularities”). This corresponds better to an intuitive understanding of the notion of complexity. We also come close to a notion which may feel more familiar to linguists: the set of patterns that an object contains can be said to equal its structure, so the complexity of an object is really a measure of the complexity of its structure.

A pattern in a system may consist in there being a non-random relationship between two or more of the elements of a system. For instance, in the following matrix, the numbers in the second column consistently equal the numbers in the first column plus one.

(13) 34188
     67642
     56198
     56433
     45585
     23790
     12092

If you know this fact, you can predict the numbers in the second column from the numbers in the first, and vice versa. In this sense, the first and second columns carry information about each other.

Information does not imply total predictability, though. In the following matrix, there is still a connection between the first and the second columns: in each row, the second digit is greater than the first one, but the exact difference between the digits varies. This means that knowledge of one number “reduces the uncertainty” about the other.
1.Obviously calqued on Quine’s “to be is to be the value of a variable”.
(14) 35188
     69642
     57198
     58433
     45585
     29790
     13092

Clearly, when we say that one number carries information about another, this is a very weak sense of “carry information” — it does not mean anything more than “makes easier to guess”. It does not, in particular, mean that the information was intentionally put there. In real life, on the other hand, as soon as we observe a pattern, we tend to assume that it was either caused by some natural process or was the result of intentional action.

I shall use the term “pattern” extensively throughout this book, in particular referring to behavioural patterns — patterns that are observable in the behaviour of agents in the broad sense, including sentient beings like you and me but also any entity that perceives and acts on its environment (Flake (1998: 444)). Such patterns are typically adaptations or, as I shall prefer to call them, attunements — notions that will be discussed in 3.4. Behavioural patterns tend to nest into each other, forming complex hierarchical systems — see further 3.10. The behavioural patterns that interest us most here are, of course, linguistic patterns — a notion to be considered in some more detail in 3.11 — which make up the complex hierarchical systems we call “languages”.

A system, or an object, may be part of a larger system, although we may not always be able to observe the latter in its entirety. It is thus often the case that an observed object has a property that appears random if we look at that object alone, but one which is in fact part of a larger pattern. For instance, if we study a linguistic utterance, we may observe “internal” patterns that consist of relations between the elements of the utterance itself, and “external” patterns that go beyond the utterance. The latter may involve relations both to other elements of the language and to non-linguistic elements. Having meaning, in the broadest sense of the word, always involves being part of an external relationship or pattern.

Absolute and relative complexity. The question of how long a description we need to characterize something obviously depends on whether we can rely on information that we already have. For instance, if I want to describe a person, I do not need to say that she has two legs, two arms and one head. This is because we have a general idea of what human beings are like. It is thus possible to speak of relative complexity. An entity E would have a certain complexity relative to a description or theory T, measured by the length of the additional description necessary to characterize E provided that T is already given. Obviously, one and the
same entity can vary in complexity depending on which theory is chosen. However, if we consider the total length of the theory and the descriptions of the individual entities, it will presumably still have a minimal value.

A theory of a class of entities may specify (or predict) the properties that are common to all the members of the class. However, it may go beyond that and also specify properties that are typical of or “normal” for the members of the class — what holds in a default or prototypical case. The description of each member may then be considerably simplified, given that only deviations from the normal case have to be specified. An interesting consequence is that an entity which deviates from the default case in more respects will tend to be more complex, in this sense. In linguistics, if a theory of Universal Grammar specifies a set of parameters with default or unmarked values, we would obtain a complexity ranking of languages in terms of how many marked parameter values they need — although this is not the way generativists usually describe it (see further discussion in 6.3).
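The idea of relative complexity also has a direct computational analogue: compressors can be primed with a “preset dictionary” that plays the role of the theory T. In the sketch below (my illustration, not the author’s; exact byte counts depend on the zlib version), describing an entity is much cheaper once a closely related background text is presupposed.

```python
import zlib

theory = b"the quick brown fox jumps over the lazy dog"   # background knowledge T
entity = b"the quick brown fox jumps over the lazy cat"   # the entity E

absolute = zlib.compress(entity)            # description of E from scratch
c = zlib.compressobj(zdict=theory)          # description of E given T
relative = c.compress(entity) + c.flush()

print(len(absolute), len(relative))  # the additional description is much shorter
```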
3.4 Attunement

As suggested in the preceding section, we may single out adaptation as a special case of pattern. Following Plotkin (1994: 246), adaptation can be simply defined as “some feature or attribute of an organism that helps it survive and reproduce”. Clearly, this entails that the organism “carries information” about the environment in which it lives, whether this information is genetic or acquired. According to Plotkin, “all adaptations are knowledge”:

“The fleshy water-conserving cactus stem constitutes a form of knowledge of the scarcity of water in the world of the cactus, and the elongated slender beak of the humming-bird is a manifestation of the knowledge of the structure of the flowers from which the bird draws nectar.”
This extended use of the word “knowledge” may not appeal to everyone, but it is useful in drawing attention to the similarities between adaptations that arise through processes of natural selection and knowledge acquired by learning. In terms of patterns, the organism and its environment are involved in non-random relationships with each other. As a simple example, consider the fur colours or patternings of carnivores such as lions, tigers, and polar bears — it is not so difficult to guess which animal lives in which kind of habitat. In everyday language, “adapt” may also be a transitive, agentive verb, in which case reproduction becomes irrelevant. I may adapt a tool in such a way that it becomes more suitable for some task, but that does not necessarily help the tool survive or reproduce. There are also other, non-intentional cases of adaptation where survival or reproduction seems less important. If I buy a pair of shoes and
wear them for a few days, the chances are that they will “adapt” to my feet in such a way that they become more comfortable. And a footpath across a lawn will “adapt” to the preferred walking-routes of the people using it. Plotkin’s definition of adaptation thus seems a bit narrow.

On the other hand, it is also too wide, in that it covers phenomena where the term seems less natural. The most important of these is mutual coordination. Consider again the principles of left-hand and right-hand driving — a typical example of coordination in the sense that what is important is that everyone should behave in the same way, but which way is chosen is arbitrary. These principles very clearly increase the chances of survival on highways — one need only imagine what would happen if people suddenly stopped following them. Such coordination may well arise through natural selection. Suppose for instance that you have a car-track with a number of toy cars programmed randomly to keep to the left or the right, respectively. Each time two cars meet, they will either happily pass each other or crash, depending on whether they are programmed in the same way or not. For any given car, the chance of survival will be a function of the proportion of cars that are programmed to keep to the same side of the road. In the end, only cars of one kind will remain — which kind depends partly on chance, partly on the distribution of left-hand and right-hand driving cars in the initial population.

What I find remarkable here is that there is a direct connection between coordination and a concept that is regarded as central in linguistics, viz. arbitrariness. This is in contrast to biological adaptation in general, which is non-arbitrary, insofar as it is asymmetrical, that is, an organism adapts to an environment that is already given. In mutual coordination, there is no pre-existing target of the adaptation process; rather, the important thing is that the choices made fit each other. In linguistics, we are used to associating arbitrariness with conventions, which are acquired through learning, but as we see here, there is actually no reason why genetically-based coordination could not also involve arbitrary elements.

I propose attunement as a more general term for all the situations discussed here. As a provisional definition, attunement is simply any pattern in which there is a systematic dependence (symmetric or asymmetric) between two or more elements in, or parts of, a system. Coordination, then, is mutual attunement, which may take place both within and across populations, and may be either genetic or acquired through learning.
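The toy-car scenario above is easy to simulate. In the sketch below (my reconstruction of the thought experiment, with arbitrary parameters), cars meet pairwise and mismatched pairs crash; the surviving population converges on a single, arbitrary convention, namely whichever side happened to predominate in the initial population.

```python
import random

cars = [random.choice("LR") for _ in range(100)]  # random initial programming
while len(cars) >= 2 and len(set(cars)) == 2:     # both conventions still around
    a, b = random.sample(range(len(cars)), 2)     # two cars meet
    if cars[a] != cars[b]:                        # opposite sides: they crash
        for i in sorted((a, b), reverse=True):
            del cars[i]

print(len(cars), set(cars))  # the survivors all share one arbitrary convention
```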
3.5 Emergence, reductionism and self-organization

The adjective “emergent” is given the following definition in Merriam-Webster’s dictionary (quoted from the web version):
(15) 1. a: arising unexpectedly
        b: calling for prompt action: urgent
     2. rising out of or as if out of a fluid
     3. arising as a natural or logical consequence
     4. newly formed or prominent
Of the five readings listed here, two are of particular interest, (1a) ‘arising unexpectedly’ and (3), ‘arising as a natural or logical consequence’. Taken together, they are strikingly incoherent, almost contradictory. A similar ambiguity is found in the verb emerge — cf. examples such as A number of problems suddenly emerged and The crisis emerged from the combination of several factors. Notice that while emergent and emerge in the first reading can be predicated of an object in abstraction from its context, the second reading is necessarily relational — a natural or logical consequence has to be a consequence of something — it is x emerges from y rather than x emerges.

Terms such as “emergent”, “emergence”, and “emergentism” have been used in science and philosophy for quite a long time and are more popular today than ever. There are serious discrepancies between the ways the terms are understood by different scholars. In fact, it appears that some have in mind reading (1a) and others reading (3). The scientific use of the term “emergent” goes back to the 19th century philosopher and literary critic George Henry Lewes’ distinction between resultants and emergents: “phenomena that are predictable from their constituent parts and those that are not” (Lewes (1874)). An example of a resultant would be a physical mixture of sand and talcum powder, while a chemical compound such as salt, which looks nothing like sodium or chlorine, would exemplify an emergent. Lewes’ ideas, which were inspired by earlier work by John Stuart Mill, gave rise to the school of British Emergentism, with names such as Samuel Alexander, Lloyd Morgan and C. D. Broad, whose notion of an emergent property McLaughlin (1999: 267) formulates as “a property of the whole that is not an additive resultant of, or even linear function of, properties of the parts”. The choice of the term “emergent” was apparently motivated by the idea that emergent properties, as it were, “emerged” in an unexpected fashion (that is, reading (1a) above).

The original idea that emergence is about “the whole being more than the sum of its parts” is somewhat narrowly conceived — the notion of emergence can be applied in general to any hierarchical structure, where it is possible to discern at least one higher and one lower level. A higher-level property would be said to be emergent when it cannot be predicted by, derived from, or reduced to properties of a lower level. Thus, emergentism can be seen as the denial of reductionism — the view that higher-level properties can always be reduced to lower-level ones.2 But
“predicted by”, “derived from”, and “reduced to” can all be interpreted in many ways, which has given rise to rather varying views of what emergence really is. The use of expressions such as “novel” and “unexpected” in connection with emergent phenomena means that the notion of emergence runs the risk of being transposed to the eye of the beholder and thus losing its objectivity. The notions of complexity and pattern can help us find a more satisfactory definition. Recall that we defined a pattern as something that allows us to obtain a simpler description of a system, and the efficiency/strength of a pattern as the gain in simplicity it yields. Given a system with a higher and a lower level, where the higher level depends on the lower one, a pattern at the higher level can be said to be emergent if it is more efficient than any lower-level pattern that it depends on.3 Good illustrations of what such a notion of emergence means can be found in the area of “Artificial Life”, that is, computer simulations of life-like phenomena.

[Figure: six successive states of Conway’s Game of Life; image not reproduced.] When played as an animation, the sequence is perceived as the movement of an object across the screen, referred to as a “glider”. From the reductionist point of view, a “glider” is really nothing but various pixels that turn on and off — a true epiphenomenon, but our brains are programmed in such a way that we cannot help “constructing” an emergent object from the patterns we see. Whether gliders “exist” or are just figments of our imagination is a question about which opinions may differ, not merely about the correct answer but also about the meaningfulness of asking it in the first place.
2.In the linguistic discussion, “reductionism” is sometimes used in a different sense. For instance, Langacker (2000: 2) says that “the issue of reductionism pertains to the relation between general statements and more specific statements that amount to special cases of them”. The reductionist account here would be the one according to which the specific instantiations of the rule are excluded from the description on grounds of economy. Assuming a hierarchical relation between rules and instances, reductionism would thus mean that the lower level is eliminated in favour of the higher one, that is, the inverse of the traditional definition of reductionism. But probably Langacker understands reductionism very broadly, as the thesis that things of one sort can be understood completely in terms of things of another sort.

3.This formulation is essentially an adaptation of the definition in Shalizi (2001): “One set of variables, A, emerges from another, B if (1) A is a function of B, i.e., at a higher level of abstraction, and (2) the higher-level variables can be predicted more efficiently than the lower-level ones, where “efficiency of prediction” is defined using information theory.”
Let us look at a classic example, Conway’s “Game of Life”, which is essentially a two-dimensional matrix of cells that each may be in either an “on” or an “off” state, and where they change states according to very simple rules:

– if you are on but less than two or more than three of your immediate neighbours are on, switch off;
– if you are off and exactly three of your immediate neighbours are on, switch on.
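The two rules fit in a few lines of code. The sketch below is my own minimal implementation (not from the book), using a set of live-cell coordinates on an unbounded grid; it steps the famous “glider” four generations, after which the same five-cell shape reappears shifted one cell diagonally.

```python
from collections import Counter

def step(live):
    """Apply Conway's two rules once; `live` is a set of (x, y) cells that are on."""
    neighbours = Counter((x + dx, y + dy)
                         for (x, y) in live
                         for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                         if (dx, dy) != (0, 0))
    # A cell is on in the next generation if it has exactly three live
    # neighbours, or if it is on now and has exactly two.
    return {cell for cell, n in neighbours.items()
            if n == 3 or (n == 2 and cell in live)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
for _ in range(4):
    glider = step(glider)
print(sorted(glider))  # the same shape, translated one cell diagonally
```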
When Conway’s “Game of Life” is implemented on a computer, the cells that switch on and off form patterns that we see as objects that move across the screen, sometimes in a seemingly very orderly fashion (see the boxed figure above). Obviously, since this is a computer program, there is no “mystical” element in it: the “gliders” that we see on the screen are wholly determined by the rules that govern the behaviour of the individual cells, but we cannot speak of them if we are not allowed a “higher-level language” in which concepts such as “glider” are defined. The following diagram shows the general picture.

(16) Higher-level patterns: gliders
        ↑
     Lower-level patterns: local behaviour of pixels
We may thus say that in a way, an emergent phenomenon, in the sense the word has come to be used, combines the two seemingly contradictory readings quoted above: “arising unexpectedly” and “arising as a natural or logical consequence”.

An important element in emergent phenomena like Conway’s “Game of Life” is the character of the rules. Each cell is “myopic” in that it can only “see” its immediate neighbours — that is, the rules refer only to cells that are next to each other. This can also be expressed by saying that the rules contain only “local information”. The same is true of the next example that we shall look at, Craig Reynolds’ “boids” model (Reynolds (1995)), which simulates the flocking behaviour of birds. Reynolds shows that the following three rules are sufficient to generate the movements of flocks, if they are assumed to govern the behaviour of each individual member of a flock:

1. Collision Avoidance: avoid collisions with nearby flockmates
2. Velocity Matching: attempt to match velocity with nearby flockmates
3. Flock Centering: attempt to stay close to nearby flockmates
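As a rough indication of how little machinery the three rules require, here is a toy update step (my sketch, with arbitrary weights and radii; Reynolds’ actual model is more refined). Each boid consults only flockmates within a fixed radius, so the rules use strictly local information.

```python
import random

def dist2(a, b):
    """Squared distance between two boids."""
    return sum((a["pos"][i] - b["pos"][i]) ** 2 for i in (0, 1))

def step(boids, radius=10.0, too_close=2.0):
    """One update; each boid is a dict with 'pos' and 'vel' two-element lists."""
    for b in boids:
        near = [o for o in boids if o is not b and dist2(b, o) < radius ** 2]
        if not near:
            continue
        for i in (0, 1):
            centre = sum(o["pos"][i] for o in near) / len(near)
            avg_vel = sum(o["vel"][i] for o in near) / len(near)
            b["vel"][i] += 0.01 * (centre - b["pos"][i])   # 3. flock centering
            b["vel"][i] += 0.05 * (avg_vel - b["vel"][i])  # 2. velocity matching
            for o in near:                                 # 1. collision avoidance
                if dist2(b, o) < too_close ** 2:
                    b["vel"][i] -= 0.1 * (o["pos"][i] - b["pos"][i])
    for b in boids:
        for i in (0, 1):
            b["pos"][i] += b["vel"][i]

boids = [{"pos": [random.uniform(0, 50), random.uniform(0, 50)],
          "vel": [random.uniform(-1, 1), random.uniform(-1, 1)]}
         for _ in range(30)]
for _ in range(100):
    step(boids)
```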
While Conway’s “Game of Life”, in spite of its name, has little resemblance to real biological phenomena, Reynolds’ “boids” are intended to show how actual flocks of birds and other animals can be assumed to work. If it is sufficient for each member of the flock to follow a small number of simple rules, there is no need for a “flock captain” who would organize the movements of the flock. The idea that group behaviour may be emergent in this way turns out to be extremely fruitful and is being applied to a growing number of biological phenomena, most often under the name of self-organization, which has been defined as follows (Camazine et al. (2001: 8)):

“Self-organization is a process in which pattern at the global level of a system emerges solely from numerous interactions among the lower-level components of the system. Moreover, the rules specifying interactions among the system’s components are executed using only local information, without reference to the global pattern.”
Self-organization is particularly salient in the behaviour of social insects such as ants, bees, and termites. A celebrated example is nest-building in termites, known to linguists from the work of Lindblom et al. (1984). Termite nests or mounds are architectural wonders, with air-conditioning systems to be envied by any human house-owner. Still, the worker termites who build them are guided by simple mechanisms such as pheromones,4 whose function is to lure termites to deposit their contributions in places where others have already done so.

If lower-level behaviour is governed by simple rules, where do these rules come from? A computer simulation like the ones discussed above constitutes a complete closed world of its own, created and designed by “God”, that is, the programmer, who, in principle, has total control over the world, in the sense of choosing what the program should be like. On the other hand, these examples show that an almighty being is not necessarily omniscient: the programmer may not at all foresee what is going to happen — that is the “novelty” aspect of emergence. A biological system, on the other hand, might of course also be created by God, if the adherents of “intelligent design” are right, but the mainstream opinion is that it has developed through evolution, which implies that the lower-level rules have arisen through natural selection. What is crucial here, though, is that the lower-level rules by themselves do not increase the individual’s chances of survival and reproduction. Rather, it is their effect on the higher, emergent level that constitutes the selectional advantage. The following diagram gives an idea.
4.“Pheromone” is not the name of a specific substance but rather a generic name for substances used in chemical communication between animals.
(17) Enhanced fitness of termite population
        ↑
     Higher-level patterns: termite mounds
        ↑
     Lower-level patterns: behaviour of individual termites
Computer simulations of biological systems such as Reynolds’ “boids” abstract from the evolutionary dimension and see the system as if it were a closed one. Such an abstraction may be an advantage if one wants to show precisely the emergent or self-organized character of the system. In fact, for the notion of “self-organization” to make sense, it has to be interpreted as referring strictly to the internal workings of a system.5 That is, there may well be a “God” outside the system who sets the rules, but there is no “boss” inside who gives orders (Camazine et al. (2001: 7)). Obviously, the choice between the two alternatives is sometimes a matter of description. Consider for instance a company whose director writes up some general instructions and then goes on holiday, leaving the employees to self-organize.

Related to the notion of self-organization is the idea of “the invisible hand”, formulated by the 18th century economist Adam Smith, who used it to describe the workings of a market economy where (at least in theory) each individual behaves in his or her own interest but the result is still prosperity for everyone. “The invisible hand” is sometimes defined as “the unintended result of intentional actions”. In linguistics, the notion of “the invisible hand”6 was applied to the general theory of language change by Keller (1994). Notice, however, that a market economy does not fulfil the criteria for self-organization formulated by Camazine et al., in that agents in a market are not necessarily “myopic”, that is, they are not restricted to local information. Even if the assumption in economic science that every such agent has access to perfect information is unrealistic, it is not the case, for instance, that customers never know any prices outside of their local supermarket. In fact, a stock-broker
5.The term “self-organization” is also used outside of biology, frequently in the wider sense of “spontaneously arising islands of order”. For instance, Ball (2001: 217) quotes the creation of sand dunes as an example of self-organization arising from “a subtle conspiracy between sand and wind, rather than being imposed by any external agency”. (This view of things is not entirely self-evident — the wind might of course be seen as external to the sand dunes.) 6.As noted by Keller, Smith’s choice of term was somewhat unfortunate in that it suggests a God-like agent behind everything, which is in fact exactly the opposite of what he really wanted to say.
who checks the NASDAQ index before deciding whether to buy or sell can truly be said to be influenced by a global pattern. In other words, information in a market economy goes both upwards and downwards. This is actually characteristic of many social phenomena, including language. Commonly, it is not possible to determine precisely the “horizon” of the agents in a social system, that is, the information that their behaviour is based on may range from strictly local to global. For instance, consider the emergence of norms and conventions, such as those that govern the ways people dress. We tend to dress like those around us, but how large that group is varies with the circumstances. Likewise, in a small speech community, a speaker may be regularly exposed to all others, but in a large community, he or she will only hear a subset of them speak. We may therefore make a distinction between (prototypical) self-organization, in which only local information is used, and invisible hand phenomena, where the information may range from local to global.

The notion of emergence is fairly often illustrated by the checkout lines in a supermarket, which tend to be of equal length, not as a result of conscious planning, but as an emergent result of individual customers’ attempts to minimize the time they have to spend waiting.7 It can be noted that this is more like an “invisible hand” phenomenon than a case of self-organization, in that customers use information about all the checkout lines at once when seeking the shortest one. It is also different from the biological examples in that (presumably) there is no selectional pressure that favours the survival of those customers that behave in accordance with the principle “choose the shortest line”. I shall return to this example below, when discussing Brian MacWhinney’s notion of emergence.
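The checkout example can likewise be simulated in a few lines (a toy sketch of my own, with arbitrary arrival and service rates). Note that the customer’s choice rule inspects every line at once, which is exactly what makes this an “invisible hand” case rather than self-organization in the strict, local-information sense.

```python
import random

lines = [0] * 5                            # five checkout lines, initially empty
for _ in range(1000):
    lines[lines.index(min(lines))] += 1    # a new arrival joins the shortest line
    cashier = random.randrange(5)          # one randomly chosen cashier
    if lines[cashier] > 0:
        lines[cashier] -= 1                # ... finishes serving a customer

print(lines)  # the lengths end up roughly equal, with no central planning
```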
3.6 Emergence and emergentism in linguistics

“Emergence” has recently also become a popular term in the language sciences. However, although the examples given to illustrate the concept are sometimes the same or similar to those presented in the previous section, a closer look reveals that the term is in fact interpreted in a rather different and not always uniform way. Explicit definitions are seldom given.
7.I had occasion to observe a slightly more complex variation on this theme in a cafeteria on a university campus, where there were two cashiers, one of which was known to be much more efficient than the other. Many customers therefore preferred to line up for the former although her line was longer. If customer behaviour were perfectly rational, one would expect the ratio between the lengths of the lines to be equal to the ratio between the speed with which the two cashiers handled the customers. I have not been able to test this hypothesis empirically, however.
Brian MacWhinney discusses what he calls “emergentism” in several papers. In MacWhinney (2002), he opposes it to “stipulationism”, in which complexities of human behaviour are assumed to derive from “stipulative rule systems”, promoting “the articulation of enormous cognitive architectures of seemingly impossible complexity”. By contrast, the emergentist approach relies not on stipulated rules but on “the interaction of general mechanisms”. MacWhinney gives two initial illustrations. The first is the supermarket example already mentioned in 3.5, about which MacWhinney says: “There is no socially articulated rule governing this pattern. Instead, the uniformity of this simple social “structure” emerges from other basic facts about the goals and behavior of shoppers and supermarket managers.”
The other example concerns the hexagonal shape of the cells in bees’ honeycombs, which is “an emergent consequence of the application of packing rules to a collection of honey balls of roughly the same size”. Although these examples are indeed very similar to the ones we have already seen, MacWhinney’s perspective is subtly different. It is not the derivability of higher-level patterns from lower-level ones that is important for his understanding of emergence but the absence of “stipulated rules” and the possibility of deriving the whole system from “general mechanisms”, preferably in another domain. In another paper (MacWhinney (2001)), he says: “… we need to be able to see how linguistic behavior in a target domain emerges from constraints derived from some related external domain. For example, an emergentist account may show how phonological structures emerge from physiological constraints on the vocal tract.”
Consider also MacWhinney’s discussion of a concrete neurolinguistic problem (2002), the much-disputed case of a genetically inherited language impairment in one British family, involving problems with inflectional morphology (Gopnik & Crago (1990)). The non-emergentist hypothesis, says MacWhinney, postulates a “specific mutation on a specific gene that somehow controls the process of regular suffixation and perhaps other aspects of linking”. The emergentist solution, on the other hand, attributes the deficit to “a general motor impairment that impacts regular morphology”.8
8.Evidence has now accumulated that this particular disorder is indeed caused by a deficiency in one single gene referred to as FOXP2, which has been promoted to “the language gene” by the media, on quite shaky grounds. That the presence of a certain gene is necessary for the adequate functioning of a system does not of course imply that it is responsible for that system in general. (Cf. a situation when a bug in a single line of code crashes a computer program.) See Marcus & Fisher (2003) for a clarifying discussion.
In other words, MacWhinney’s emergentism appears to boil down to a seemingly rather uncontroversial methodological principle (“Never invoke a specific explanation if there is a general one”), which is probably derivable from an even more general principle, Occam’s Razor, and thus is in itself emergent, in MacWhinney’s sense. The relation to the original concept of emergence is somewhat indirect, however. There are no myopic lower-level agents here that give rise to higher-order patterns by acting on local information, nor do we find the “invisible hand” of economics; rather, MacWhinney has replaced the hierarchical higher-lower dimension with a specific-general one.

It appears that MacWhinney’s use of “emergence” is based on the relational reading of “emergent”, ‘x arises as a logical or natural consequence of y’, rather than the reading ‘x arises unexpectedly’, as was the original concept in British emergentism. The following formulation supports this interpretation:

“In order to begin to organize our thinking about emergent processes in language, the first question that we need to ask is “Emergence from what?” In other words, we need to be able to see how linguistic behavior in a target domain emerges from constraints derived from some related external domain.” (2001: 449)
Consider again the supermarket example, represented in the following diagram:

(18) Lines tend to be of equal length
        ↑
     Customers join the shortest line
        ↑
     Customers want to minimize waiting time

(In the original figure, a shaded arrow additionally leads directly from the bottom statement to the top one.)

As emergence is traditionally understood, it applies to the supermarket example in that the equal length of lines “arises unexpectedly” from the behaviour of individual customers. Why they behave in that way is of lesser importance. But that — the relationship shown by the shaded arrow in the diagram — is precisely what interests MacWhinney. The equal length of lines “arises as a logical or natural consequence” from the general and external principle that customers strive to minimize waiting time. From this perspective, the relationship between the higher and lower levels in the system loses its prominence, leading to the extension of the notion of emergence to cases where it is no longer possible to speak of higher-level patterns emerging from lower-level ones, as when phonological or morphological structures are derived from physiological constraints. In fact, since general statements can be seen as hierarchically higher than their specific instantiations, this shift of focus may in
the worst case lead to an inversion of the original picture of emergence: it is the lower level that emerges from the higher, rather than the other way around.

Let us now look at another way of understanding the term “emergence”, that promoted by Paul Hopper. The fact that MacWhinney’s paper “Emergentist approaches to language” was published in a volume with “emergence” in its title and co-edited by Hopper (Bybee & Hopper (2001)) would certainly suggest that MacWhinney and Hopper have a shared understanding of what the term means. Similarly, in the introduction to Barlow & Kemmer (2000), Hopper and MacWhinney are mentioned next to each other as scholars who have studied “emergence as a property of linguistic systems” and the implications of the notion of emergence for language and mind. In fact, however, Hopper’s use of the term “emergence” is quite different from MacWhinney’s, let alone the other uses discussed above. In the original paper Hopper wrote on the subject (Hopper (1987)), “emergence” is characterized as follows:

“The notion of emergence is a pregnant one. It is not intended to be a standard sense of origins or genealogy, not a historical question of “how” the grammar came to be the way it “is”, but instead it takes the adjective emergent seriously as a continual movement towards structure, a postponement or “deferral” of structure, a view of structure as always provisional, always negotiable, and in fact as epiphenomenal, that is, at least as much an effect as a cause” (142).
The only reference to earlier mentions of emergence in the paper is a brief quotation from the cultural anthropologist James Clifford, who has said of culture that it is “temporal, emergent, and disputed”. In Bybee & Hopper (2001), emergence is said to be understood “as an ongoing process of structuration”, the latter term being taken from the sociologist Anthony Giddens, whose definition is “the conditions which govern the continuity and dissolution of structures or types of structures”. In a review of grammaticalization research, Hopper refers to his own work in the following way: “… Hopper … suggested that the study of grammaticalization tended to undermine the assumption of a preexistent a priori grammatical component that stood as a prerequisite for discourse and a precondition for communication, and he proposed instead that grammar was an emergent property of texts. “Structure” would then be an epiphenomenal by-product of discourse.” (Hopper (1996: 231–232))
Whereas most of the quotations indicate that for Hopper, the essence of emergence lies in properties such as provisionality, negotiability and mobility, it is possible, in particular in the last quotation, to discern an implicit link to the traditional use of the term. Texts could perhaps be seen as constituting the lower level from which the
higher-level notions of grammar and structure emerge.9 However, the last sentence suggests a reductionist attitude — an attempt to explain away the higher level as an “epiphenomenon”. It must be admitted that this is not the only place where such a reductionist bent is discernible in the discussion of emergent phenomena. This may seem rather curious, given that the traditional understanding of emergence (“the whole is more than its parts”) is anti-reductionist. In fact, the conflict is probably built into the original idea of emergence, given that an emergent pattern is on the one hand derivable from lower-level patterns, and on the other, exhibits new and interesting properties. Depending on whether one is more fascinated by the first or the second element here, one may come to see different and seemingly contradictory aspects of “emergence” as criterial.

One important component of Hopper’s understanding of emergent phenomena is that they are unstable: “… emergent structures are unstable and are manifested stochastically.” (Bybee & Hopper (2001: 2))
However, there is no necessary connection between any of the more traditional concepts of emergence, lack of stability and stochasticity.10 Thus, the paradigm example of emergence, the “Game of Life”, is wholly deterministic, a perfect “Laplacean” world. Furthermore, stochasticity does not entail lack of stability: middle-sized solid objects such as tables and chairs are stable enough for our practical purposes, even if they are ultimately made up of elements that behave stochastically, as modern physics reminds us. In fact, this is one of the points of the notion of emergence. A complex system is often more stable than its components, since it is less vulnerable to random fluctuations. The individual customers in the supermarket may behave erratically — I may choose a longer line because I see a friend there, but such acts do not usually disturb the general pattern.

Note also that stability is not irreconcilable with variation. In fact, social phenomena are often both more variable and more stable at the group level than at the individual level. For instance, it is likely that there are more alternative pronunciations of words in my speech community than are found in my own speech; but
9.However, I must admit that it would make more sense to me if the words “grammar” and “structure” swapped places in the quotation — structure would be an emergent property of texts and grammar a by-product. 10.Uriagereka (1998: 597) even includes stability in his (not particularly clear) definition of emergence: “Property of a system, whereby a process stabilizes into some global state that cannot be produced at a more local level.”
cultural inertia makes it less probable that the whole community will change its habits than that I will do so (see further 4.1, 5.1).

Epiphenomena — the babies in the bath-water?

In the quotation from Hopper (1996), the term “epiphenomenal” is used almost as a synonym of “emergent”. Dennett (1991: 401–402) notes that the term “epiphenomenon”, without anyone noticing it, has been used in two very different ways:

– a non-functional property or by-product (most dictionaries say “a secondary phenomenon”);
– an effect of something that itself has no effects whatsoever.

Of these, the second is most common among philosophers, and Dennett points out that it is much stronger than the first, so strong that according to him it “yields a concept of no utility whatsoever” (402), since most phenomena have at least some effects on the world.

It is quite common for linguists to use the “epiphenomenon” label, generally as a way of declaring a notion as not really necessary since it can be reduced to something else. In an interview a couple of years ago (Chomsky et al., 2002), Chomsky applied the label of epiphenomenon both to “the use of language for communication” and to “the utterances generated [by the generative system]”, and while functionalists like to suggest that grammar and structure are epiphenomenal, both generativists and adherents of more traditional historical linguistics happily apply the same label to grammaticalization. There seems to be a reductionist hiding in all of us, although many tend to claim otherwise. It may well be that the readiness of the “other side” to define away the notions that we ourselves find useful should make us wary of reductionist tactics.

A more positive attitude towards epiphenomena is taken by Keller (1994: 132), who finds that “in the domain of culture, epiphenomena are often most interesting”. It turns out, however, that what he means by “epiphenomena” is in fact exactly what others have called “emergent phenomena” (inflation, traffic jams, spontaneously arising footpaths etc.).

Similar reasoning can be applied to different levels within a language as a system. Although in a sense higher or more abstract patterns can exist only as long as they have concrete manifestations at lower levels, the higher level patterns may exhibit a greater stability. Consider the “strong verbs” in Germanic. In languages such as German, English, and Swedish, we find groups of verbs which are conjugated according to ablaut patterns inherited from Proto-Germanic and originating even earlier. Seven such basic ablaut patterns are distinguished. One of the more salient of these, “Class I”, can be exemplified by the English verb bite-bit-bitten and its cognates in German (beißen-biß-gebissen) and Swedish (bita-bet-bitit). In English, only a few such verbs remain; in German, on the other hand, there are 35 simplex
Class I verbs and in Swedish there are 30 or 31. As it turns out, however, of the German and Swedish verbs, there are only 15 or 16 cognate pairs (and even among these, not all are inherited from Proto-Germanic, some being later loans). Likewise, of the six or seven classes of strong verbs postulated for Proto-Germanic, all are still operative with at most one exception in German and Swedish. I think this is evidence that the strong verb classes are in fact more stable than the individual verbs that comprise them (see also 11.2 and Appendix A).
3.7 Complexity vs. cost and difficulty

Irrespective of how we want to measure the complexity of a system, I think it is essential, not least in linguistics, to keep complexity as an information-theoretic notion that is at least in principle “objective”, in the sense of being independent of the use to which we put the system. This means that we should keep complexity apart from other notions such as “cost” and “difficulty”, which must always be related to a user or an agent. On this point, I seem to be in agreement with McWhorter (2001a: 134), and in disagreement with his critics such as Kusters & Muysken (2001: 184–185), who criticize McWhorter’s notion of complexity because “it does not tell us whether a language will be difficult for an L1 learner or an L2 learner”, and DeGraff (2001: 268ff), who labels it “strictly a-theoretical” (see further discussion in 3.8).

Cost is essentially the amount of resources — in terms of energy, money or anything else — that an agent spends in order to achieve some goal. What we can call cost-benefit considerations are certainly of central importance in explaining many aspects of communicative behaviour. The relationship of cost to complexity is often indirect, although it is sometimes tempting to conflate the two notions, as we shall see in the next section.

Difficulty is a notion that primarily applies to tasks, and, as already noted, is always relative to an agent: it is easy or difficult for someone. Difficulty can of course be understood and measured in many ways. One measure of the difficulty of a task is in terms of “risk of failure” — that is, if a large proportion of all agents fail, or one agent fails more often than he or she succeeds, the task is difficult for that agent or group of agents. There is an indirect relationship here to variation — if results vary, it means that the task is neither maximally easy nor maximally difficult. There is of course also a relationship to cost — tasks that demand large expenditure of resources, or in particular those that force the agent to or beyond the limits of his or her capacity, are experienced as difficult.

Language obviously involves many different tasks and many different types of agents. This alone is a good reason for not identifying the complexity of a language with difficulty, since there is a priori no reason for giving priority to any particular
kind of difficulty. Factors of particular importance are, on the one hand, difficulty of processing, and on the other, learning or acquisition difficulty. In the latter case, the distinction between first and second language acquisition is evident, and will be of primary importance in the discussion of maturation processes later in the book. When complexity is identified as (or related to) acquisition difficulty, it is more often than not second-language learning that people are thinking of, which is probably natural in view of the fact that it is much harder to identify any variation in the success of first-language acquisition. It is indeed important to know what is easy and what is difficult for learners of second languages, but nevertheless second-language learning difficulty should be labelled as such and not confused with complexity in an information-theoretical sense. Then we can formulate and hopefully find the answer to the empirical question of the relationship between second-language learning difficulty and various independently defined notions of complexity.

We might want to introduce a further notion that, for lack of a better term, could be called “demandingness”. By this I mean that a task puts certain requirements on its performers: for instance, if you want to study physics, you have to have a certain knowledge of mathematics. Demandingness is different from difficulty in that the task is difficult only if you do not fulfil the requirements — if you do, it may be very easy. For instance, acquiring a human language natively is certainly demanding (only human children seem to fulfil the requirements), but it does not necessarily follow that children find it difficult.
3.8 Complexity of languages

Retaining the general idea of the complexity of an object being measured by the length of the shortest description of that object, with the reservations formulated above, we shall now discuss how the notion of complexity can be applied to languages and linguistic objects in general. There are of course manifold ways in which this may be done. I shall start by introducing a rather important distinction.

Resources and regulations. Langacker (1999: 98) has characterized a language as a “structured inventory of conventional linguistic units” and linguistic knowledge as “an extensive collection of semantic, phonological, and symbolic resources that can be brought to bear in language processing”. The terms “inventory” and “resources” suggest that language is simply a set of tools that we can use for communication, but this is only half the story. Consider what happens when you register as a user of a library. A library is essentially an inventory of books, and registered users can choose between them, read them, possibly take them out etc. The books are thus the resources of the library. However, when registering as a user you also sign a document obliging you to follow the rules for using these resources.
You are supposed to return the books on time, not to make notes in them, not to take them out of town etc. Since “rule” is a very loaded word in linguistics, I shall use regulations as a general term. Regulations may be general or apply only to a subset of the resources (e.g. a certain subset of the books of a library). Many societal institutions, including language, are analyzable in terms of resources and regulations. Intuitively, resources determine what is possible or permitted, regulations what is obligatory. Applied to language, the distinction is reminiscent of that between grammar and lexicon but does not coincide with it.

It holds both of libraries and of languages that the regulations and the obligations that they entail become relevant only when I have decided to use a certain resource, which means that if I don’t want to use the resource, no obligations apply. This is different from many other kinds of obligations in society, such as compulsory military service in the countries that still have this institution — any male citizen is supposed to serve for a certain time, whether he likes it or not. As for the library, on the other hand, if you don’t like the idea that you have to return the books on time, you may simply refrain from borrowing them. Similarly in language: an obligatory marker is obligatory only within a certain construction (or set of constructions); you don’t need to use the marker as long as you don’t use the construction(s) — which may of course be rather difficult in some cases. Obligatoriness in language is thus generally a consequence of choices that we make — choices that in themselves are free but which force us to do certain other things. This holds not only for grammar — even if it is most obvious there — but also for the lexicon. Having chosen to use a certain lexical item, I have committed myself to following the regulations associated with that word, for instance to pronounce it in a certain way, to use the right endings, the right gender etc. In other words, lexical knowledge also involves regulations.

As Putnam (1975: 144) noted, we often use words without knowing their full meaning. I may know that Jones got the Nobel Prize for discovering the zilthron without having the faintest idea what zilthron means. However, it does not bother me, since I trust that there are other people who do know. Putnam referred to this kind of phenomenon as the division of linguistic labour. More generally, it is of course the case that speakers’ knowledge of a language varies considerably, in particular with respect to the lexicon.

Suppose that two speakers have exactly the same knowledge of their native language, except for one of them knowing a word which is unknown to the other. It seems that this would hardly prevent us from saying that the two speakers have the same language. Even if the difference between the two is much larger — say, one of them knows several thousand words that the other one does not know — we would probably still think of them as using the same language. Compare this to a situation where two speakers’ active competence differs in that one of them consistently applies a rule of grammar that is absent from the other one’s speech. In this case, it is much harder to think of their languages as
being identical, and it becomes increasingly difficult as the number of differing rules increases. Every speaker of a language has access to a subset of its total resources; speakers may differ in the identity of this subset and still be said to speak the same language. Differences with respect to regulations, on the other hand, are much more likely to have a bearing on the question of identity of language. One reason for this is that a difference in regulations will normally give rise to acceptability conflicts; some utterances will be accepted by some and rejected by other speakers. When encountering a sentence that contains a word that is unknown to them, speakers will normally not judge it as unacceptable but rather assume that there is a lacuna in their lexicon.

System complexity. Given that a language as a system can be seen as involving both resources and regulations, it follows that a language could be characterized as more or less complex with respect to both these notions. If we want to characterize a language with respect to its resources, the parameter that comes first to mind is probably “richness”. For instance, a language with a large vocabulary would ceteris paribus be richer than a language with a small vocabulary and this could be interpreted to mean that it can be used to say more things — it is more expressive. In grammar, as well, it would seem that having recourse to a larger number of constructions makes the language “richer” and more expressive. A strict application of the idea that complexity is measured in terms of the length of the shortest description of an object would also imply that a language with a large vocabulary is more complex than one with a small vocabulary — it undeniably demands a longer description, as noted by DeGraff (2001: 265–269).

DeGraff criticizes the interpretation of complexity in terms of length of description, referring to it as “bit complexity” and claiming that it “bears no relation to any theory where linguistic phenomena are independently identified and analyzed”. One bizarre logical consequence of bit complexity, he says, is that “the languages with the biggest lexica would be the most complex”. This is correct, but what it shows is that it does not make much sense to apply the most simplistic notion of algorithmic complexity to a language as a whole.11 The distinction between resources and regulations is crucial here. The size of the lexicon is mainly a question of the richness of linguistic resources and thus concerns the expressive power of a language. It does appear desirable to keep the question of what can be expressed in a language separate from the complexity
11. DeGraff’s criticism being valid only for attempts to measure the complexity of both grammar and lexicon at once, it does not really seem to apply to his main target, McWhorter (2001a), whose title indicates that it concerns grammatical complexity only. Admittedly, in his text, McWhorter sometimes speaks of one language A being more complex than another language B — apparently, this is just a somewhat sloppy shorthand for saying that A has a more complex grammar than B.
of the system of regulations that determines how to express that which can be expressed. It is the latter that is of prime importance in this book and that we shall refer to as system complexity. More specifically, we regard the set of messages that can be expressed in the language under study as given and consider the complexity of the language seen as a system which maps these messages to expressions, or if we like, meanings to forms. It is of course not impossible (actually, it is rather probable) that languages differ in terms of their expressive power,12 but this is something that will be disregarded here. We could also ask for the complexity of the expressions in the language, rather than that of the system. This necessitates a further distinction.

Phonetic weight. One notion that has figured in the literature is that of “signal simplicity”, introduced by Langacker (1977: 102). In a discussion of what he calls “categories of linguistic optimality” he defines “signal simplicity” as “economy in regard to the production of the physical speech signal” and claims that there is a tendency toward signal simplicity behind many kinds of changes in language, e.g. sound changes such as assimilations, vowel mergers such as aw > o, reductions of unstressed vowels etc. A natural step would be to define “signal complexity” as the inverse of “signal simplicity”. However, Langacker’s use of the term “economy”, like his reference to the “principle of least effort”, suggests that we are dealing here with “cost of transmission” rather than “signal complexity”. Admittedly, in many cases, it will matter little which of these alternatives we choose. A two-segment sequence /aw/ will be both more complex and more costly to transmit than a one-segment /o/. (Remember old-fashioned telegrams, which were paid for by the number of words.) But if one speaks louder, one spends more energy without it necessarily being a more complex action. In principle, then, cost and complexity should be kept apart here, although it may not always be motivated to do so in practice. A convenient way of speaking of the parameter that Langacker primarily had in mind is in terms of the phonetic weight of an expression (and correspondingly, of expressions as being phonetically heavy or light).

Structural complexity. Consider now the words maid and paid. With respect to phonetic weight, these words are equal, and the same holds for the complexity
12. As noted above, differences in expressive power may depend on the richness both of the lexicon and of the set of available grammatical distinctions. From the point of view of complexity, lexical richness is indeed somewhat trivial in a way rather similar to the difference in complexity between two unpatterned strings of different lengths. The set of available grammatical distinctions is more interesting. An intriguing question is whether processes of grammatical change similar to those discussed in this book may also influence expressive power. It could be argued that the rise of tight structures, as discussed in 8.6, at least makes it easier to express more within a given time-frame.
measured in numbers of phonemes. On the morphological level, on the other hand, paid is more complex, consisting of two morphemes. I shall use the term structural complexity as a general term for complexity measures that pertain to the structure of expressions, at some level of description.13 Structural complexity then is distinct both from system complexity on the one hand and cost of transmission on the other.

A possible further notion is length of derivational history, that is, the number of steps necessary to generate the expression in a formal system. Length of derivational history is related to structural complexity in a rather interesting way. Consider a “classical” transformational model of grammar, with a phrase structure and a transformational component. In the case of phrase structure, structural complexity, measured in the complexity of the tree, will always correspond to the number of steps in the derivation. This is not true of the transformational component, where the application of a rule does not necessarily make the structure more complex. Irrespective of whether one believes in this model or not, the distinction between structure-building and structure-changing rules may be essential in speaking of length of derivational history. It may be noted that precisely the notion of length of derivational history in terms of the number of transformations applied played a crucial role in discussions of the psychological reality of early transformational grammar.

Length of derivational history could be said to be a type of structural complexity, in a wide sense of the latter term. However, one might also suggest measuring the complexity of a grammar not by the length of the grammar itself but rather by the length of a derivation necessary to generate a given expression, or perhaps by the average length of derivations. Such a measure would be intermediate between system and structural complexity. Outside of linguistics, the notion of length of derivational history is applicable for instance to deductive systems, in which the relevant measure would be the number of steps in the proof of a theorem. A potentially important extension of the notion is obtained by assuming that the derivation takes place in real time. We shall see later (6.1) how this relates to the notion of maturity of linguistic structures.

The interrelationships between the different types of complexity are in themselves complex. What is important to see is that there may well be trade-offs between the types. For instance, a more sophisticated (that is, more complex) mapping between messages and expressions may allow us to reduce signal complexity. In Langacker (1977: 110) this is touched upon, in that he notes the conflict
13. It may be noted here that in his 1977 article Langacker actually subsumes phenomena such as the downgrading of words to affixes and the loss of word and morpheme boundaries under the tendency towards signal simplicity. To me, this seems to confound economy and structure. In any case, it leads to the somewhat paradoxical result that the evolution of morphological structure is treated as a case of simplification.
between signal simplicity and what he calls “transparency”, or “the extent that [languages] show a one-to-one correspondence between units of expression and units of form”, that is, for most cases, the inverse of our concept of system complexity. It follows that a language change may increase one of the types of complexity mentioned above and decrease another.14 For each type of change, then, it is important to consider how it influences the different kinds of complexity — it is not possible to classify changes simply as “complicating” or “simplifying”.
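The idea of measuring complexity by the length of the shortest description can be made concrete, in an admittedly rough way, with off-the-shelf compression. The following sketch (in Python; the example strings are invented) uses compressed size as a crude stand-in for description length — true algorithmic (Kolmogorov) complexity is uncomputable, so this only approximates the “bit complexity” discussed above:

    import random
    import zlib

    def description_length(text: str) -> int:
        # Compressed size in bytes, used as a crude proxy for the length
        # of the shortest description of the string.
        return len(zlib.compress(text.encode("utf-8")))

    random.seed(0)
    patterned = "ab" * 500                                    # highly regular
    irregular = "".join(random.choice("abcdefghij") for _ in range(1000))

    print(description_length(patterned))   # small: the pattern compresses well
    print(description_length(irregular))   # much larger: no short description

On this measure, a long word list automatically counts as “complex” simply because it is long — which is exactly DeGraff’s point, and the reason why this book applies the measure to the system of regulations rather than to the resources.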
3.9 Conceptual complexity

Grammatical regulations (what are usually called “rules”) often involve conceptual distinctions. For instance, there may be a rule to the effect that a noun phrase obtains a certain marking provided that the referent has a certain property (for instance, case marking may apply only to NPs with animate reference). As a consequence, the complexity of a grammar (system complexity) sometimes depends on the complexity of concepts. This motivates a brief discussion of the notion of conceptual complexity.

In accordance with what has been said earlier in this chapter, the complexity of a concept should be directly correlated to the length of its definition. Correspondingly, in feature-based theories of semantics, a complex meaning would be one that corresponds to a larger number of semantic features. There are some pitfalls here, though, that are of importance to the study of language change. It is sometimes said that there is an inverse relationship between the “intension” and the “extension” of a concept, in that a very general concept (say, ‘man’) will have a large extension but a very meagre content or intension, while a very specific concept (say, ‘bachelor’ or, even better, ‘pope’) has a small extension but a rich intension. Indeed, if you have a definition such as ‘adult male’ and make it longer by adding a further condition, such as ‘unmarried’, the extension is bound to shrink — and, conversely, to grow if you subtract ‘adult’. But it is important to see that this presupposes that the terms of the definition are combined by the logical operation of conjunction, that is, we construe an ‘and’ between them. If they are not, the inverse correlation between extension and intension may not hold. For instance, lecture is defined in one dictionary as

(19) a formal talk on a serious or specialist subject given to a group of people
14. This claim is not equivalent to the idea that a change in complexity on one level of a linguistic system is always compensated by a change in the other direction on another.
This, taken literally, means that the subject of a lecture may be either serious or specialist, or both. The set of serious topics overlaps with the set of specialist topics, but they do not coincide, nor is one a subset of the other. Consider now

(20) a formal talk on a serious subject given to a group of people

(20) is shorter, and arguably less complex, than (19), but it is also narrower, since it demands that the topic should be serious, rather than serious or specialist. In this case, then, when the terms in a definition are joined by a disjunction (‘or’) rather than by a conjunction, the inverse correlation between intension and extension does not hold — a simpler definition yields a smaller extension. Likewise, if we add a further term to the disjunction, as in (21), the extension grows.

(21) a formal talk on a serious, specialist or interesting subject given to a group of people

The relevance to linguistic change lies in the fact that the expansion of the domain of use of a category does not always lead to a decrease in semantic complexity. For instance, a marker used for animate direct objects may be extended to definite direct objects; this, however, means that the characterization of the contexts in which the marker is used goes from ‘animate’ to ‘animate OR definite’, which is, in fact, a more complex description. Similarly, the progressive in English has been extended from agentive constructions to stative ones, but in the latter, it has obtained the nuance of “contingency”, as in She is (temporarily) living in London, necessitating a more complex description.

Hopper & Traugott (1993) use the extended use of the progressive as an example of “generalization”, a notion which they take to include “increase in the polysemies of a form”. In my view, the term ‘generalization’ should not be used here, but rather be restricted to cases where the result can be defined as a natural class which includes the original domain — say, if the restriction “animate” on the use of an object marker is simply dropped. This means that the target of the change must not be conceptually more complex than the source. Similarly, in changes that go in the other direction, we may distinguish the addition of a condition, which simultaneously decreases the domain and complicates the description, and the dropping of an alternative interpretation, which decreases the domain but simplifies the description. Both these are labelled “narrowing” by Hopper & Traugott (1993).
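The asymmetry between conjunctive and disjunctive definitions can be verified mechanically. In the following toy illustration (the inventory of talks and their properties is invented for the purpose), adding a conjunct shrinks the extension while adding a disjunct grows it:

    # Each 'talk' is represented by the set of properties that hold of it.
    talks = [
        {"formal", "serious"},
        {"formal", "specialist"},
        {"formal", "serious", "specialist"},
        {"formal", "interesting"},
        {"informal", "serious"},
    ]

    def conjunctive(items, *terms):
        # Definition read with 'and': every term must hold.
        return [x for x in items if all(t in x for t in terms)]

    def disjunctive(items, *terms):
        # Definition read as 'formal AND (t1 OR t2 OR ...)', as in (19).
        return [x for x in items if "formal" in x and any(t in x for t in terms)]

    print(len(conjunctive(talks, "formal", "serious")))                     # 2, cf. (20)
    print(len(disjunctive(talks, "serious", "specialist")))                 # 3, cf. (19)
    print(len(disjunctive(talks, "serious", "specialist", "interesting")))  # 4, cf. (21)

The shorter definition (20) thus picks out fewer talks than the longer, disjunctive (19), reversing the usual intension–extension trade-off.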
3.10 Choice structure

As I said in 3.3, behavioural patterns tend to nest into one another, forming complex hierarchical systems. A pattern may involve one or more choice points (a term introduced in 2.2), that is, possibilities for an agent to make a choice (“degrees of freedom”
in statistical terms). We are then dealing with a schematic pattern or schema for short. Consider, as a simple example, a standard three-course dinner, consisting of appetizer, main course, and dessert (disregarding beverages for the time being) — each of the courses constitutes a choice point. The three-course dinner schema is a pattern which has an enormous number of possible and actual manifestations in restaurants all over the world (although it is of course far from being a cultural universal). Each restaurant will have its own set of possibilities for the customers to choose from. We thus have at least three levels of abstraction:

– the general pattern of a three-course dinner as already described;
– the set of choices offered by a particular restaurant on a particular day;
– the particular dinner chosen by an individual customer.
When customers enter a restaurant, they are normally shown a menu, which is essentially a specification of the second level of abstraction — the set of choices the restaurant has to offer. They then indicate their preferences and (if everything works out) are served accordingly. The menu and the customer’s order then represent specifications of the two lowest levels of abstraction. Each of them may be said to put restrictions on the abstract schema represented by the general idea of a three-course dinner. But they also depend on this schema. What they must specify is essentially the choice or set of possible choices for each of the three dishes. The number of items in a customer’s order must thus correspond to the number of choice points (“degrees of freedom” in statistical terms) in the three-course dinner pattern.

The situation is made more complicated by the fact that the menu may impose a further structure on the choices, introducing dependent choices. If you choose a certain dish as your main course, you may have a further choice between, say, rice and potatoes. If you choose a salad, you may have a choice among a number of different dressings. (The proliferation of dependent choices in American restaurants is a cause of bewilderment for European customers.) Conversely, we may introduce a level above the individual dishes, regarding them all as dependent on the choice of the three-course dinner pattern, among all possible meal patterns. This gives the customer’s order a tree-like structure:
(22) three-course dinner
         salad
             Italian dressing
         steak
             well-done
             with French fries
         ice-cream
The most economical way of representing the choices made is to list them in the order in which the items are served. It should be noted, however, that there is no real necessity for that: in fact, any way of unambiguous specification will do. Like customers’ orders in restaurants, structural representations of utterances in linguistics can, by and large, be seen as specifications of structured sets of choices made by speakers relative to the language system. This idea is far from new (and has been taken to its logical extreme by linguists in the Hallidayan tradition), but confusion sometimes arises about what a linguistic representation is supposed to do, and what ontological status it should be taken to have. When speaking of the representation of a pattern, two aspects should be kept apart:

– the specification of the choice points: how many they are (e.g. how many courses a dinner consists of), and from what domains the choices are made (e.g. what an appetizer may be like);
– the actual realization of the pattern, e.g. in what order the courses are served, how they are served etc.
This yields two kinds of structure: choice structure and output structure. Sometimes these coincide, sometimes they don’t. Consider again a three-course dinner. The instructions to the waiter might look like this:

– Serve the appetizer.
– Serve the main course.
– Serve the dessert.
This would correspond to the output structure: three actions in sequence. However, the customer’s order — representing the choice structure — may look quite different. For instance, the restaurant may offer two different set menus, in which case the customer would simply order “Menu I” or “Menu II”. Or consider a
somewhat more complex situation: the two set menus contain exactly one option, that between rice and potatoes with the main course. An example of an order would then be “Menu II, with rice”. The choice structure thus consists of two elements, which do not stand in any direct relationship to the three elements of the output structure.

Similarly, a word like passed is usually described as consisting of two morphemes: the verb stem pass and the past tense suffix -ed. Past tense forms such as put, went, was, sang etc. are not analyzable in the same way. We can account for this by saying that the choice structure differs from the output structure: only the choice structure contains two elements, the output structure does not.

The distinction between choice structure and output structure is closely related to that introduced by the logician Haskell B. Curry (1961), between two “levels of grammar”, tectogrammatics and phenogrammatics:

– tectogrammatics — “the study of grammatical structure in itself”, i.e. as “something independent of the way it is represented in terms of expressions”;
– phenogrammatics — how grammatical structure is represented in terms of expressions.
For instance, the expressions two pound butter, two pounds butter and two pounds of butter might be claimed to be different ways of realizing the same grammatical construction and thus differ only with respect to their phenogrammatics. Likewise, the phrase structure rules S → NP VP and S → VP NP would not differ tectogrammatically but only phenogrammatically. In his paper, Curry criticizes Chomsky’s early work for confounding these two aspects of grammar. Likewise, he is critical of those versions of categorial grammar which attempt to build word order into the system, rather than keeping strictly to tectogrammatics. He further predicts that tectogrammatical structure “will vary less from language to language than does the phenogrammatics” (Curry (1961: 66)). Tectogrammatics, then, would be the component of grammar that describes choice structure “in itself”, whereas phenogrammatics specifies how choice structure is realized in terms of output structure.

A choice point should correspond to a decision made by the speaker. Choice points may be free or bound. This is somewhat analogous to free and bound variables in logic in the following sense: A choice point may be free in a construction (or an instance of a construction) when we look at it in isolation but bound when the instance of the construction enters into a slot in a larger construction. Bound choice points are part of phenogrammatics rather than tectogrammatics. For instance, in a language with morphological case, a noun phrase may theoretically take any of the cases provided by the grammar of the language, but when it acts as the filler of a slot in a construction, the case is usually fixed by that construction. The choice point corresponding to the case is therefore bound in the larger context and does not correspond to a free decision by the speaker. In general,
the choice of values of grammatical categories is bound either by the syntactic context or by factors in the speech situation.
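The divergence between choice structure and output structure can be made concrete with a minimal sketch. Here, in Python, the choice structure of an English past tense is uniformly a two-element pair (verb, PAST), while the output may or may not mirror that structure; the tiny lexicon is invented and no spelling rules are included, so this is purely illustrative:

    # Suppletive and other non-analyzable past tense forms: one output
    # element realizing a two-element choice structure.
    IRREGULAR_PAST = {"go": "went", "be": "was", "put": "put", "sing": "sang"}

    def realize_past(verb: str) -> str:
        # Phenogrammatics: map the choice structure (verb, PAST) to a form.
        if verb in IRREGULAR_PAST:
            return IRREGULAR_PAST[verb]   # output structure: one element
        return verb + "ed"                # output structure: stem + suffix

    for v in ["pass", "go", "sing"]:
        print((v, "PAST"), "->", realize_past(v))
    # ('pass', 'PAST') -> passed
    # ('go', 'PAST') -> went
    # ('sing', 'PAST') -> sang

The tectogrammatical description is the same for all three verbs; only the phenogrammatic mapping differs, which is exactly the sense in which put or went are “not analyzable” while passed is.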
3.11 Linguistic patterns

In this book, I am deliberately non-committal with regard to formal grammatical models and the general architecture of grammar. In line with this, I use the maximally general notion of linguistic pattern for the elements that build up languages as systems of communication, whether lexical or grammatical. Linguistic patterns, obviously, are a special case of the even more general notions of pattern and behavioural pattern as introduced in 3.3. In usual linguistic terminology, patterns are types. I shall refer to the instances of patterns as linguistic objects.

Patterns may be simple or complex. Simple patterns would be words or morphemes. Complex patterns may consist of fixed parts only, e.g. set phrases such as I beg your pardon, or be schematic, that is, contain free slots, where other linguistic objects have to be inserted, for instance a pattern such as Down with NP!, where NP may be any noun phrase with definite reference, whether George W. Bush or Osama Bin Laden. A slot in a pattern is really a choice point — it means that the user is free to insert one of several different objects. As we see in the example Down with NP!, it is normally the case that not just any linguistic object may be felicitously inserted in a slot, but only those belonging to a certain category (here, noun phrases). In fact, the differences between linguistic objects with regard to whether they can be inserted into one slot or another — their syntactic distribution — constitute one of the major motivations for postulating abstract entities such as parts of speech and phrasal categories. On the other hand, certain slots also allow for the insertion of non-conventional items. For instance, in an expression of the form X is called Y, basically any symbolic object is allowed in the Y slot. Many languages have special constructions for the inclusion of borrowed items. Thus, the use of non-conventional elements may also be governed by convention.

The term construction may be used for schematic patterns, including both such constructions as are traditionally treated as lexical and listed in dictionaries, e.g. a verb frame such as give NP short shrift, or those more abstract patterns that are usually seen as belonging to grammar, e.g. the ditransitive construction V NP NP. In the grammatical framework called construction grammar developed by Fillmore, Kay, Goldberg and others (see e.g. Fillmore et al. (1988), Goldberg (1995)), the notion of ‘construction’ is basic, comprising not only grammatical constructions in the traditional sense but also what is usually thought of as belonging to phraseology. Following this terminological practice, I shall use the term “construction” in this book in a somewhat vague manner as an alternative to “pattern”, primarily having
in mind grammatical constructions proper but also allowing for the inclusion of other patterns.

The expression of a construction may involve smaller parts which are not freely chosen but rather come as obligatory elements. For instance, the English periphrastic comparative construction more Adj than NP contains the words more and than as obligatory parts. Being used as an automatic consequence of choosing the periphrastic comparative construction, these elements do not have independent communicative effects. In fact, the word than hardly appears anywhere else than in comparative constructions (and after a few other words, as this very sentence illustrates). Still, we might want to attribute some kind of autonomous status to such elements. In particular, a word like than could be argued to form a constituent together with the following noun phrase. Insofar as it also influences the form of that noun phrase, for instance with regard to choice of case (than I vs. than me), it looks very much like a grammatical operator.

We might therefore be tempted to introduce the notion of an auxiliary pattern — that is, an element that helps build up the expression of another pattern but does not constitute an independent communicative choice. In the example I gave, the pattern than NP shows up as a proper subpart of the comparative construction more Adj than NP. But this is not the only possibility: in many cases, we have a choice between two patterns, where the choice is not free, but rather dependent on extraneous conditions of use, as will be discussed in 5.2.1. In such cases, it is natural to analyze the situation in terms of one primary pattern which is realized as one of two auxiliary ones. Such auxiliary patterns may develop an identity of their own, by appearing in many different constructions. They would still belong to phenogrammatics, however, representing bound rather than free choices on the part of the speaker. The identification of what is an auxiliary pattern and what is not is complicated by the fact that one and the same element may appear in an independent role in some contexts and as an auxiliary element in others.
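A schematic pattern with a slot, and an auxiliary pattern contributing fixed material, can both be rendered as simple templates. The sketch below (Python; the category labels and checks are invented stand-ins for a real grammar) makes the two notions concrete:

    def down_with(np: str, category: str = "NP") -> str:
        # Schematic pattern with one choice point; only objects of the
        # right category may fill the slot.
        if category != "NP":
            raise ValueError("the slot in 'Down with ...!' only accepts NPs")
        return f"Down with {np}!"

    def periphrastic_comparative(adj: str, np: str) -> str:
        # 'more' and 'than' are obligatory parts of the construction, not
        # independent communicative choices: the auxiliary pattern 'than NP'
        # comes built in.
        return f"more {adj} than {np}"

    print(down_with("the proposal"))          # Down with the proposal!
    print(periphrastic_comparative("useful", "the alternative"))
    # more useful than the alternative

Note that the fixed words more and than appear in the output without corresponding to any parameter of the functions — the programmatic analogue of elements that belong to phenogrammatics rather than to the speaker’s free choices.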
3.12 Linearity

Speaking of the complexity of languages, we could have in mind either tectogrammatical or phenogrammatical complexity — this is orthogonal to the distinctions introduced in 3.8. For instance, the existence of embedded sentences would make a language more complex from the tectogrammatical point of view. On the other hand, the existence of a word order difference between embedded and nonembedded sentences would rather count as adding to the complexity of phenogrammatics. Since the topic of this book has already been said to be complexity in the mapping between content and expression, it is primarily phenogrammatical complexity that interests us.
Could we imagine a language with zero phenogrammatics, that is, with an empty phenogrammatical component in its grammar? It would be useful if we could, since complexity could then be measured in terms of the deviation from that zero point. To be interesting, such a language must contain at least some way of making complex expressions. In the terminology used in the preceding section, this will be done through schematic patterns, and the role of the phenogrammatical component of a grammar is to specify how the expression of the pattern can be obtained if you know the input elements (the fillers of the open slots in the pattern).

The closest one can get to the zero phenogrammatics ideal is probably something along the following lines. Suppose you have two sets of “fridge magnets”: one with numbers and another with suitable count nouns, which can be combined into simple messages. For instance, if you put ‘5’ and ‘apple’ on the fridge door, that means ‘Buy 5 apples’. As long as we do not allow more than one message at a time, we need no restriction on where or in which order the magnets are placed on the fridge door — in this sense the language has no phenogrammatics. On the other hand, it has a non-trivial tectogrammatics: not any combination of magnets but only those which contain one magnet from each set yield well-formed messages.

In contrast to this constructed code, most forms of spoken and written human language are traditionally seen as being built up of elements that are arranged in a sequence. We treat this as a design feature of the system and not as something that adds to tectogrammatical complexity. Minimal phenogrammatics would then amount to a system of “unrestricted concatenation”, that is, every schematic pattern is realized according to the principle “Concatenate the input expressions in any order.” Another way of expressing the same thing is to say that all constructions in the language are purely juxtapositional. For instance, if the language contains the expressions three and fish, both three fish and fish three would be well-formed complex expressions. As was the case in the fridge magnet language, this does not imply that the tectogrammatical component is empty.

The phenogrammatical complexity of a language is then the extent to which it (or rather its grammar) deviates from a system of “unrestricted concatenation”. Such deviations may be of different kinds, which may well have to be treated separately in a study of grammatical complexity.

Restrictions on element order. This is the most “benign” type of deviation in that restrictions on element order do not define a new set of well-formed expressions relative to that of the unrestricted system, but merely single out a subset of it. Thus, in English a numeral has to precede its head noun, so three fish is well-formed but fish three is not. It is probably more common for grammatical patterns in languages to obey such restrictions than having totally free element order.

Verbosity. Another relatively “benign” deviation from unrestricted concatenation is found in the cases discussed in the preceding section where the expression
deviates from simple juxtaposition by the addition of some fixed element. For instance, in English, quantifier words and mass nouns can be joined by simple juxtaposition, as in much snow or little snow. In French, on the other hand, a word like beaucoup ‘much’ cannot simply be juxtaposed with the noun neige ‘snow’; rather, you must insert a preposition de in between: beaucoup de neige ‘much snow’. The deviation is benign in the sense that the output can still be described in terms of a concatenation operation including the two input elements and the preposition. From the point of view of English, however, the element de appears redundant. We may say that French quantifier constructions are characterized by verbosity, defined as having a larger phonetic weight (containing more material) than would be minimally necessary (thus a special case of redundancy).

The patterns that contain no other deviations from unrestricted concatenation than restrictions on element order and verbosity more or less coincide with those that can be accounted for by context-free phrase structure rules without complex symbols (this means excluding not only transformations but also e.g. the morphophonematic rules postulated in Chomsky (1957)). To be able to refer to them in a simple way, I shall call them linear, in awareness that this term has various technical and non-technical uses.

Insofar as the expression of a pattern goes beyond restrictions on element order and verbosity, it is thus non-linear. This can take place in a variety of ways. Notice to start with that a word order rule may violate linearity. This happens in particular when the input elements are themselves complex, that is, they contain schematic patterns on a lower level. Any linear word order rule has to respect the integrity of the input elements and cannot break it up. An “infixing” rule which puts something in the middle of an input element violates linearity. Linearity means, among other things, that the input elements are realized independently of each other in the output. Linearity thus excludes such phenomena as grammatical agreement, in which the form of one element depends on the identity of another. But also any form of integration between the components of the output expression can be seen as producing non-linearity.

In practice, this means that almost any linguistic pattern in which expressions are combined deviates from linearity to some extent. Speaking of “juxtaposition”, as I did in the beginning of this section, is strictly speaking misleading, in that the parts of complex expressions in spoken language are not simply juxtaposed but integrated into a prosodic pattern. Taking “juxtaposition” literally, each component expression should retain its own intonational contour and stress pattern. But this, as we know, does not happen in practice. Even in written language, complex expressions are usually not strict concatenations of their components. If we join two words such as sheep and bleat to form an English sentence, it comes out not as (23a) but rather as the formatted version (23b), with initial capitalization and proper punctuation:
(23) a. sheep bleat
     b. Sheep bleat.

The degree of integration of components, or else the “tightness” of the construction, may thus be seen as one dimension of non-linearity. As integration gets tighter, components lose their prosodic independence and prominence, become more dependent on each other and on extraneous factors with respect to their form, and fuse with each other. In language, such phenomena are typically associated with complex word structures (see 8.6 for further discussion). Along another dimension, non-linearity increases with the unpredictability of the output, with suppletive processes as the extreme. A further type of non-linearity is when the output is not wholly predictable from the input but depends on extraneous factors. For instance, as will be discussed in more detail below (5.2.1), if the Russian first person singular pronoun ja is combined with a verb in the past tense, the form of the verb is different, depending on whether the speaker is male or female. Such dependence on extraneous factors is very common in language. We may also include here syntactic long-distance dependencies, in which the form of an element depends on a property of some expression which is not part of the constituent being built (for instance, anaphoric agreement).

The problem we now face is how to formulate these principles theoretically so that they can be applied to concrete languages. Given that there are usually many ways of describing a language, very different characterizations of what is phenogrammatical are possible. The problem is not so much one of non-linearity as of verbosity. How do we know what is minimally necessary for a construction? In particular, how do we know if an element should be seen as an input expression for a construction or as introduced only as part of its output?15

The solution I would propose is to turn the question into an empirical one: what is minimally necessary for a construction is the minimal expression of an equivalent construction found in a human language. Verbosity is thus operationally defined as cross-linguistic dispensability.16 It may be objected that this opens another can of worms: when are two constructions equivalent? I submit, however, that there are enough cases where pairs of constructions in different languages can be judged translationally equivalent for it to be possible to make interesting generalizations. For instance, the fact that a large part of the languages of the world would translate the sentence Peter is a teacher
15. In logical semantics, the terms “categorematic” and “syncategorematic” are sometimes used for this distinction, although this is subtly different from the traditional understanding of the terms, where “syncategorematic” means “an expression that has a meaning only together with other expressions”.

16. For a comment on a similar notion used by John McWhorter, see fn. 26.
simply as the name ‘Peter’ followed by the noun ‘teacher’ indicates that both the copula is and the indefinite article a should be regarded as manifestations of verbosity in English, since they are cross-linguistically dispensable. At this point, it may be objected that these words are meaningful in that they express grammatical categories such as present tense and singular number. But notice that these categories are in themselves cross-linguistically dispensable: grammatical tense and grammatical number are simply absent from many languages. Any marker of these categories that adds to the phonetic weight of the expression is thus in itself an example of verbosity and non-trivial phenogrammatics.

We should thus distinguish two kinds of cross-linguistic dispensability: that of morphemes or markers and that of categories. Obviously, a category may sometimes lack an overt manifestation: thus, the category of number is normally zero-marked in the singular in English, and sometimes zero-marked in the plural, as well.

The discussion above concerned linearity in grammar. It is obvious that phonological phenomena may also be of a linear or a non-linear kind, with an analogous understanding of the notion. Without going too deeply into this issue, I want to point to one type of non-linearity that will turn out to be of direct importance in the diachronic processes to be discussed here, namely that of the phonological representation of words. A linear system would imply that the phonological make-up of every word or word form in a language can be exhaustively represented as a sequence of phonemes. Any prosodic or suprasegmental features are thus non-linear.
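The fridge-magnet code discussed earlier in this section, and the most “benign” deviation from it, can be stated in a few lines. In the sketch below (Python; the magnet inventories are invented), well-formedness in the zero-phenogrammatics language is checked on an unordered set, while an English-like variant adds a single phenogrammatic restriction on element order:

    NUMERALS = {"1", "2", "3", "5"}
    NOUNS = {"apple", "fish", "egg"}

    def magnet_well_formed(magnets):
        # Pure tectogrammatics: exactly one magnet from each set; since
        # the input is an unordered set, order cannot matter.
        ms = set(magnets)
        return len(ms & NUMERALS) == 1 and len(ms & NOUNS) == 1 and len(ms) == 2

    def english_like_well_formed(tokens):
        # One phenogrammatic restriction added: the numeral must precede
        # the noun ('three fish', not 'fish three').
        return len(tokens) == 2 and tokens[0] in NUMERALS and tokens[1] in NOUNS

    print(magnet_well_formed({"5", "apple"}))        # True: 'Buy 5 apples'
    print(magnet_well_formed({"5", "2"}))            # False: no noun magnet
    print(english_like_well_formed(["3", "fish"]))   # True
    print(english_like_well_formed(["fish", "3"]))   # False: order regulated

Verbosity would amount to a further rule inserting a fixed extra element into the output (as with French de) without adding any choice point — the input elements stay the same; only the realization grows heavier.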
Chapter 4
Languages as non-genetically inherited systems
4.1 Introduction

The languages we speak are ultimately conditioned by our genetic make-up. But biological notions may also be relevant to linguistics through the analogies between linguistic and biological entities. The 19th-century linguist August Schleicher’s thesis that languages are comparable to living organisms is nowadays looked upon with scepticism, but this is not the only way the connection can be made. A more fruitful analogy is that between a person’s genome and his or her native language as examples of bodies of inherited information. Furthermore, the notion of a life cycle is applicable not primarily to languages as systems, but rather, as I shall argue, to the patterns that are elements of those systems, more specifically, with respect to the different stages that those patterns undergo in maturational processes.
4.2 Memetics and linguistics

There has been considerable interest lately in the similarities between genetic and cultural evolution, and between genetic and cultural transmission of information. In particular, the concept of a meme, introduced by Dawkins (1976) and defined as “a cognitive or behavioural pattern that can be transmitted from one individual to another one”, has become popular in wide circles. Memes have been assumed to partake in processes of natural selection, motivating a Darwinian perspective of cultural phenomena. It would seem that linguistics would be one of the most natural candidates for a memetic approach. The direction that “memetics” has taken, however, has so far been of rather limited interest to linguists. In order to fully appreciate the potential of a perspective inspired by biology on linguistics, we have to shift the emphasis in a number of ways.

To start with, an important feature of culturally inherited systems is precisely that they are systems rather than sets of unconnected elements. In linguistics, it is more or less self-evident that a language is not simply an inventory of items, but also consists of
grammatical constructions and other entities that are hierarchically superordinate to items in the lexicon. For the notion of a meme to be applicable, we would therefore have to assume that memes can be organized into hierarchical systems.

The focus within memetics has been on horizontal transmission of information — that is, between individuals belonging to the same generation — rather than vertical transmission of information — that is, between different generations. The biological analogue of a meme is therefore a virus rather than a gene, and the general phenomenon is contagion rather than inheritance. Languages, on the other hand, are essentially transmitted vertically rather than horizontally — they are non-genetically inherited systems. This is certainly true of many other components of culture as well, for instance religion: although a popular theme in memetics has been the virus-like spread of religions, most humans preserve the religious beliefs and practices (or lack of them) that they learnt from their parents or teachers in childhood. But language differs from most other parts of culture in being a likely candidate for having a specific mechanism of transmission that may be at least partly biologically pre-programmed, or (to express basically the same idea in a less controversial form): language offers us a unique opportunity to study the interaction between genetic and non-genetic inheritance.

Another important point is that the study of culturally inherited systems and language in particular presupposes a partial shift of emphasis from the individual to the group level. Communication, of course, always involves at least two individuals, and a communication system is therefore social by definition. Much evolutionary thinking focuses on the transmission and selection of individual properties, often even on principle excluding phenomena such as group selection. In culture, it is fairly obvious both that information may not even physically reside in an individual (cf. e.g. written documents) and that many traits are not reducible to properties of individuals but have to be seen as emergent at the group level (e.g. the property of having a monarchical system of government). At least the latter claim is also arguably true of languages. Although the Chomskyan tradition treats language essentially as something that resides in an individual, arguments like the ones provided by Putnam (1975) in the discussion of the “linguistic division of labour” suggest that a language can only be adequately treated as a socially, rather than an individually defined system (see also 3.8).

Along another dimension, memetics, like much thinking in the Darwinian vein, focuses on change rather than on persistence: “evolution” is the catchword. But when we consider inheritance, whether genetic or cultural, fidelity of transmission is essential, and we cannot really understand evolution and change otherwise than against this background. This is true of communication systems more than of anything else — they presuppose that there is some kind of “shared knowledge” between the communicators, and ceteris paribus, communication efficacy will increase with uniformity.
Arguments for the innateness of language have usually (at least within the Chomskyan tradition) taken the form that a large part of the knowledge that a speaker has of a language must be innate because there is too much to learn in such a short time and from such limited input. However, what I personally find most impressive in first language acquisition is rather the fidelity with which the child copies the language of the speech community where she grows up, down to the minutest phonetic details that distinguish the language variety of that community from that of its closest neighbours, details that are often totally imperceptible to outsiders but are readily recognizable to the speakers themselves, making possible an immediate identification of members and non-members of the community. What this suggests is that if anything is genetically pre-programmed, it is as much the ability to acquire the features that distinguish one language from another as the knowledge of the features that are common to all languages. Remember that the innate device that supposedly enables us to acquire our native language has figured under at least two different names: UG (“Universal Grammar”) and LAD (“Language Acquisition Device”). These names can be understood as representing two rather different ways of looking at the role of this component, where UG would stand for the static knowledge and LAD for the dynamics of acquisition.

The obvious point of enhancing the organism’s ability to learn from the experience of others is that this is a much faster way of adapting to the environment than genetic evolution. However, there is a potential conflict between fidelity of transmission and quick adaptation to environmental changes. Since the former acts as a conservative force, the system may be adapted not to the current situation but to one that held one or more generations ago. This cultural inertia is readily observable in all human societies (and is the source of many of humankind’s current problems), but there is a limit to what it can account for (see also 3.6, 5.1).

Like any other piece of behaviour, transmitting a communicative signal has a cost for the sending agent in terms of energy and time expenditure. Some signals also, hopefully, have a benefit for their senders: this obviously means, in the normal case, that they contribute to conveying some information, and that this information transfer is beneficial to the sender. In the long run, agents will tend to employ only such signals as have a positive cost-benefit balance. Signals, however, may be complex, and human linguistic utterances typically are. For each part of the signal, a separate cost-benefit calculation can be made, at least in principle. In a certain communication situation, some signal element may well turn out to be dispensable without any detrimental effect on the information conveyed — its cost-benefit balance is negative. Agents will tend to spend less time and energy on such redundant parts of a signal, which puts the transmission of the element to new generations at risk, whether the transmission mechanism is cultural or genetic. In genetically transmitted systems, any feature that has a negative cost-benefit balance will tend to be eliminated by natural selection. In culturally transmitted
systems, the mechanism is slightly different, but has the same effect. When the average amount of energy spent on a certain element falls below a certain level, it will no longer be strong enough to be perceived and internalized by the individuals that acquire the system. It is perhaps no great discovery that redundant elements tend to be reduced and finally eliminated by language change. But this means that there is a limit to cultural inertia as an explanatory mechanism. The longer an element survives, and the more costly it is for the language users, the less probable is an explanation in terms of cultural inertia or “historical junk”.

Generally, it is rather unlikely that pervasive phenomena, that is, phenomena that occur in many different languages and seem to be stable for millennia once they have arisen, are entirely due to chance, and we are well advised to look for their raison d’être — their function(s). This function, or these functions, are not necessarily part of the knowledge of individual language users but are instead likely to be an emergent phenomenon. More precisely, the individual language user may not know, whether consciously or unconsciously, what the exact contribution of an element is to successful communication.

It is sometimes suggested that language change is due at least partially to forces that are similar to mutations of genes. That is, in the intergenerational (vertical) transmission of languages, there would be occasional “errors” in individual children’s acquisition of their mother tongue, which might eventually trigger a change at the societal level. It may be noted that without any selectional advantages of the new language variety or other driving forces of change, it is statistically very unlikely that individual errors will ever spread to a whole community, in view of the pressure towards uniformity.

When there is a conflict between the language of a child’s parents and the language spoken in the rest of the child’s environment, it seems that the normal choice is for the child to opt for the latter. More specifically, children speak like their peers rather than like their parents. Many societies have extensive exogamy, meaning that one of the parents may speak another language or another dialect than the other members of the group. In some places such as the Vaupés area in the Amazon (Jackson (1983)), marriage is explicitly only allowed with partners that speak another language — so-called “linguistic exogamy”. It may be noted that if children were more influenced by their parents than they are, this would be a threat to the stability of languages, in particular in places where exogamy is common. It would thus make sense for the preference for peers’ speech to be genetically determined. But in the absence of hard evidence it is probably better not to speculate about this too much.

For the understanding of language change, on the other hand, the paths by which languages are transmitted in a society are of prime importance. Traditional historical linguistic research has largely been concentrated on describing individual historical processes and on finding causal explanations for them.
There has been a clear tendency to look for specific, single causes of changes. That is, even if the possibility of multiple causation is not denied, explanations in terms of single causes are still seen as the default case, and when they are not possible, one still expects to find a short and exhaustive list of causes. This characterization applies even to relatively recent works such as Thomason & Kaufman (1988). For instance, it is often discussed whether a specific change should be seen as being motivated by external or internal factors. However, scientific explanations are increasingly formulated not in terms of one or a few causes but rather in terms of a set of factors that enhance or diminish the probability that a certain kind of event will take place. Thus, when it is said that smoking causes cancer, what is meant is that smoking enhances the risk of getting cancer, but so does for instance a factor such as living in polluted areas. And even if we know these facts, we still cannot predict whether a specific person who smokes and lives in a polluted area will develop cancer or not. We can thus search for generalizations about linguistic change, but they will not necessarily enable us to predict what changes will take place in a given situation.

Chaotic systems. Another aspect of the same story is that linguistic phenomena are “chaotic” in the sense that languages, and even more so language communities, are extremely complex systems and that in such systems, “small causes can have big effects” — a phenomenon also referred to as the “butterfly effect”, since the standard example is the butterfly in the Amazon which by flapping its wings causes a hurricane in Kansas. The linguistic analogue would be a linguistic change triggered by a single speech event.1 Obviously, this would not mean that the change would be caused by this speech event alone: just that it starts a chain of events which is also determined by many other factors — compare the shot in Sarajevo in 1914 that started World War One.

Notice, however, that it is usually easier to formulate general laws for the causal relationships that connect the triggering event with what follows it than to predict when the triggering event itself will take place. For instance, people’s reactions to assassinations of political leaders are much more predictable than those events
1. This probably happens relatively frequently in the case of lexical change. For instance, the use of the expression Iron Curtain for the line separating the West and the East during the Cold War is said to go back to a sentence in a speech by Winston Churchill in 1946. One may wonder if Churchill’s intention was really to coin a new expression — what he actually said was “an iron curtain has descended across the Continent”. A clear case of a lexical change arising from a single speech act, but through a misunderstanding, is the genesis of the expression flying saucer in 1947. At what has been taken to be the first sighting of a flying saucer, the U.S. pilot Kenneth Arnold was reported to have said that he had seen some unknown, boomerang-shaped objects fly “like a saucer if you skip it across water”. Only the saucer part was remembered and soon people were seeing flying saucers all over the place.
themselves. In linguistics, as well, it may be useful to distinguish the triggering event from what follows it, especially with regard to the maturation processes that are the theme of this book, which are often long chains of causally intertwined events. A somewhat similar distinction is that made by Croft (2000: 4) between

– innovation — “the creation of novel forms in the language”, and
– propagation — the “diffusion (or, conversely, loss) of those forms in the language”.
The distinction is essential for Croft’s view of the mechanisms of language change, according to which innovation is functional in nature but propagation is social. However, while the distinction may appear logically unassailable, it is questionable whether it can be upheld in practice. Consider some problematic, but typical cases:

– A change may consist in one form becoming more frequent or obligatory, or conversely, less frequent or obsolete. The form itself may have existed in the language for centuries before the change takes place.
– When a form is borrowed from one language (or language variety) into another, it looks like an innovation from the point of view of the borrowing language, but as propagation if we look at both of them. In fact, the original innovation may have taken place in a language very distant in space and time.
It is also questionable if the claim about the functional-social divide can be upheld. In cultural change in general, social and functional motivations tend to mix. My reasons for buying the latest high-tech gadget may include both my desire to be as well-equipped as my neighbours and my belief that it will make my life more comfortable. Similarly, I may choose a new linguistic pattern both because I have heard other people use it and because I find it convenient.

It may be noted that the innovation-propagation distinction would seem relatively clear in genetics, where innovation could be identified with mutations and propagation would be the spread of the mutated genes. However, mutations of genes differ from cultural changes, including linguistic ones, by being random (according to current theory), which precludes “inspiration” from the outside.
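Returning to the cost-benefit mechanism discussed earlier in this section, the threshold dynamics — speakers spend ever less energy on a redundant element until new learners can no longer perceive and internalize it — can be made concrete with a toy calculation. The decay rate and threshold below are invented numbers, not an empirical model of any language:

    def generations_until_loss(energy: float, decay: float, threshold: float) -> int:
        # Each generation, speakers economize further on a redundant
        # element; return the first generation in which its average
        # energy falls below the perception threshold.
        g = 0
        while energy >= threshold:
            energy *= decay
            g += 1
        return g

    print(generations_until_loss(energy=1.0, decay=0.8, threshold=0.3))  # 6

In any such model, only an element whose cost-benefit balance is negative loses energy at all; an element that keeps paying its way simply persists, which is why the long-term survival of a costly element speaks against an explanation in terms of cultural inertia or “historical junk”.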
4.3 Organisms, groups and ecosystems

There are certain structural similarities between an organism such as a human being or a wolf and a social grouping such as a society of humans or a pack of wolves. Both organisms and societies consist of simpler entities — an organism is built up of cells and a society consists of individuals, i.e. organisms. There are also similarities in that there is a division of labour or functional differentiation between the member entities. A crucial difference, however, is that the members in a social
grouping have their own individual offspring and therefore compete with each other in the Darwinian selection process. The cells in an organism, on the other hand, do not propagate on their own. The distinction is less clear than might be thought. There are various social animals such as bees, ants and naked mole rats where only some individuals are involved in the process of reproduction. Even in human societies some sub-groups (e.g. monks) may be prevented from having their own offspring. They can therefore only indirectly influence Darwinian selection. Groupings of organisms of the same species may of course differ in the extent to which the functions of the members are differentiated. The more developed social groupings tend to resemble a third kind of complex biological entity, the ecosystem, which, however, normally involves several different species that propagate independently of each other. Some human societies (e.g. South Africa during the apartheid period) consist of several different groupings that do not intermarry to any greater extent. They are thus similar to ecosystems in this respect.

We think of sexual reproduction as the normal way for a human society or a social grouping of animals to obtain new members. In principle, however, a human society could exist without sexual reproduction, provided that new members can be “imported” from the outside. The real-life example that comes closest is, I think, the Orthodox monastic republic of Athos, which apparently is a quite independent society inhabited only by men. It is easily seen, however, that the “reproductive process” needed here is in principle no different from that used by most societies in the wider sense of that word, that is, associations, clubs etc. of various kinds. We could call such groupings “secondary societies”.

Secondary societies are of some interest for the understanding of the relationship between different kinds of inherited information. The mechanism of genetic inheritance ensures that the members of a species fit the “blueprint” for that species. Such a “blueprint” can also be said to exist in the case of human secondary societies — that is, each society puts certain demands on its members. There are several different ways of ensuring that members fit the blueprint. The first way is by filtering — you see to it that the candidates already meet the demands when they enter. Consider the ways in which members are chosen for organizations such as scientific academies, elitist political parties etc. Such a process is of course in many ways different from the process of natural selection. However, the efficiency of genetic information transmission is also sometimes enhanced by selective actions, e.g. when a female animal kills or refuses to feed offspring that somehow deviate from the norm. The second way of making members of a grouping fit the group’s “blueprint” is of course by training. Among humans, training is fundamental both in primary and secondary societies, but higher animals may also train their offspring in various skills, as a complement to genetic inheritance. As a third way, one may mention all the diverse kinds of social control that keep members of groupings within bounds, with the maximal penalties being expulsion from the group or death.
Two types of selection

When Darwin introduced the term “natural selection” in The Origin of Species, his point of departure was “selection” in the sense of “artificial selection” or breeding, and the main claim of Darwinism is that the same results that are strived for by human breeders are also attained by purely natural processes. By combining the words “natural” (suggesting “blind” causation) and “selection” (suggesting intentionality), Darwin probably evoked a certain feeling of paradox in his contemporary readers, which served his rhetorical purpose. This paradox is lost on us, since, ironically, ‘natural selection’ now tends to be understood as the primary meaning of the word “selection”. What, then, is natural selection? In evolution, the distribution of properties among the individuals of a population changes over time. Saying that some properties are “selected” simply means that there will be more of them at time t2 than at time t1. But there are at least two ways in which this may happen:
– through selection by differential survival — individuals with a certain property have a greater chance of surviving from one point in time to another;
– through selection by differential reproduction — individuals with a certain property will on average have more offspring.
There has been considerable confusion about these two alternatives. The phrase “survival of the fittest”, used as a characterization of Darwin’s theory, gives the impression that evolution is all about selection by differential survival. The theory of “social Darwinism” seems to have largely built on this misunderstanding — in its worst variants it was even taken as an excuse for the allegedly fittest to kill off their less fit competitors. But in biological evolution the point is not to survive but to have offspring. Of course, to do so you need to survive at least until you can reproduce, but that is only a necessary, not a sufficient, condition.

However, selection by differential reproduction presupposes reproduction. There are of course many instances of systems where selection takes place without individuals being reproduced — that is, it depends solely on the fact that some are better at surviving than others. A typical case is competition between companies on a market, starting out with a large number of small entities and ending up with only a few large ones. Companies do not reproduce, they just live on or go bankrupt. Company mergers are sometimes described as “marriages” and one could possibly see them as producing new individuals. It is a bit difficult to make this analogous to biological reproduction, however. If Microsoft were to clone itself into “baby Microsofts”, this would be seen not as a case of reproductive success but rather as the ultimate failure.

Selection of languages, if it takes place, would have to be a case of “selection by differential survival”. The same seems true of group selection, whether genetic or cultural.
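The difference between the two modes of selection can be made concrete in a small simulation (a toy sketch in Python; the survival probabilities and offspring numbers are invented for illustration, not drawn from any real population):

    import random

    random.seed(0)

    def survival_only(pop, p_survive, steps):
        # Selection by differential survival: no new individuals are created;
        # a trait spreads only because its bearers die more slowly.
        for _ in range(steps):
            pop = [t for t in pop if random.random() < p_survive[t]]
        return pop

    def differential_reproduction(pop, offspring, steps, cap=10000):
        # Selection by differential reproduction: bearers of some traits
        # leave more offspring; a resource cap keeps the population bounded.
        for _ in range(steps):
            new = [t for parent in pop for t in [parent] * offspring[parent]]
            pop = random.sample(new, min(cap, len(new)))
        return pop

    pop = ["a"] * 500 + ["b"] * 500
    survivors = survival_only(pop, {"a": 0.95, "b": 0.85}, steps=20)
    growers = differential_reproduction(pop, {"a": 3, "b": 2}, steps=10)

    # In both runs the share of "a" rises, but only the second run can
    # sustain or grow the population; the first can only shrink it.
    print(len(survivors), survivors.count("a") / len(survivors))
    print(len(growers), growers.count("a") / len(growers))

In both runs the favoured trait increases its share, but under differential survival alone the population can only dwindle, which is the situation of competing companies or languages, whereas differential reproduction lets a trait spread through an expanding population.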
Languages are a central part of the “blueprint” of any human society. The transmission of spoken language from one generation to another largely takes place without any conscious planning, although the role of parents in providing input to children’s language acquisition is of course crucial. Later on in life, however, the
societal pressure on individuals to conform to the linguistic “blueprint” may be quite significant.
4.4 Genotypes, phenotypes and replication

Genotypes vs. phenotypes. Consider a cultural phenomenon — a meme, if we like — such as pumpkin pie. Pumpkin pies are produced by people; the phenomenon thus minimally involves an agent or producer and a product. Pumpkin pies are produced by the hundreds and the thousands, but what unites them is a basic idea — a design, recipe, plan or “blueprint”. The distinction between a recipe or blueprint and the visible output produced by following it is of course a fundamental one that shows up in many different disguises, not least in linguistics. Cf. the well-known conceptual pairs langue : parole; competence : performance and, in recent versions of Chomskyan theory, I-language (internal language) : E-language (external language). (In spite of the similarity in terminology, the distinction genotype : phenotype is not really commensurable with Curry’s distinction tectogrammatics : phenogrammatics (see 3.10, however).)

Biologists speak about genotypes — “the detailed description or classification of the genetic information constituting a particular organism by heredity” — and phenotypes — “the overt features of living organisms including the interactions between them and their environments as identifiable by an observer” (Heylighen (2000)). This is similar to the distinction between recipe and output but there are certain complications. The genotype-phenotype distinction, as used in biology, is in practice between what is genetically determined and what varies according to environmental factors. This makes the application to non-genetic systems problematic, especially if we want to apply the notions to language, since it excludes from the genotype everything that depends on learning. This means that the genotype cannot be identified with ‘I-language’, the latter presumably being a combination of innate and acquired elements.

Furthermore, it is important to see that both genotypes and phenotypes are abstract types rather than concrete individual entities. The genotype is not the DNA in the organism’s cells but rather the content it encodes, and the phenotype is not the organism itself but rather its character. Here, obviously, there is a parallel between ‘genotype’ and more abstract interpretations of notions such as ‘competence’. These interpretations have been the object of criticism, as in the following quotation from Langacker (1999: 91):

“If one aims for psychological reality, it cannot be maintained on purely methodological grounds that the most parsimonious grammar is the best one. Should it prove that the cognitive representation of language is in fact massive and highly redundant, the most accurate description of it (as a psychological entity) will reflect that size and redundancy.”
However, it is not obvious that a grammar has to correspond directly to a speaker’s cognitive representations. It does seem to make sense to speak of a language as a “Platonic” or if we like, an information-theoretic object — a set of patterns, independent of its mode of representation. The thought that such “Platonic” objects exist may be repugnant to many people, and I should perhaps choose a less controversial example than language to argue for it. Songs often contain refrains, that is, lines that are repeated after every verse. When the lyrics of songs are represented in song books, the refrains of all verses except the first one are often left out, to save space. Obviously, whether we repeat the refrain or not does not make any difference for the identity of the song, or for its complexity — both are the same in both cases. The information-theoretic object corresponding to a song is thus what is constant in all the representations, whether redundant or not. Similarly, consider the game of chess. The Russian chess champion Kasparov and the computer “Deep Blue” no doubt have very different ways of representing the rules of chess mentally, or rather, internally, but we would still say that they know the same game — how could they otherwise play against each other?2 And there is a clear and obvious sense in which chess is a more complex game than e.g. checkers, regardless of the way these games are represented.

2. Probably this is what Saussure would call “form”.

In fact, the notion of redundancy is relational: an object cannot be redundant in itself but only insofar as it is regarded as a representation of something else. Thus, when Langacker suggests that cognitive representations of language are redundant, that presupposes that those representations are logically separable from language.3

3. It is actually not clear what Langacker means by “reflect that size and redundancy”. Does he mean that an adequate description of the language also has to be massive and redundant? Or does he just mean that the size and redundancy should be pointed out somewhere? Obviously, there is no reason why a large object must have a large description.

‘Representation’ is by itself a problematic notion. Usually linguists think of it as involving some kind of isomorphism, but it is questionable if ‘x represents y’ can be given a stronger interpretation than ‘x carries information about y’ in the sense discussed in 3.3. In recent connectionist models of language processing, there is a very indirect relation between the architecture of the system and traditional grammar. Whatever the viability of these models, I think it can safely be said that it is far from clear how to apply the criterion of psychological reality to knowledge that is largely unconscious and partly inaccessible to introspection.

We thus really need three notions rather than the customary two. If we let ‘E-language’ continue to stand for observable language behaviour, and ‘I-language’ is made more precise by letting it denote the “cognitive representation of language”, we can use ‘P-language’ (P for ‘Plato’ or ‘Popper’) for language as an emergent,
information-theoretic object (for earlier proposals to the same effect, cf. Lass (1980: 130), Katz (1981)).

Replication. A further notion that raises problems in the transfer from genetic to non-genetic systems is that of replication. In genetic systems, replication takes place by copying the genome from one organism to another. This is analogous to downloading a piece of software from one computer to another. But a non-genetic system such as language is transferred by a process which is analogous to what software engineers call “reverse engineering”, that is, by reconstructing the genotype from the phenotype, or the software from the output. As has repeatedly been pointed out in the study of language acquisition, every child has to reconstruct the language anew. (Just imagine if you could simply download it from your parents…)

Another important difference is that in a genetic system, one cannot really distinguish the replication of the genome from reproduction, i.e. the creation of new individuals, in general. There is no way of making a baby without transferring genetic material from the parents. By contrast, the transmission of a language from parents to children is, at least in principle, a separate process from producing these children, and it is much easier to think of the language as an object (albeit abstract) that is distinct from the individual.

Whenever an organism or a machine is “programmed” for a certain behaviour, we may also think of each “run” of the program, or each execution of the behaviour, as a separate object. For instance, if I shave every morning, there will be as many runs of my “shaving” script as there are mornings. In a sense, this is also “replication”: each morning I replicate, or copy, what I did the day before. This is independent of whether the programming is genetic or not. But in speaking of nongenetic systems, it becomes tempting to see this as the proper analogue of replication in biology. It does make a certain sense to speak of “selection” here. Consider the classical notion of “trial and error” learning. Facing a problem that has to be solved, I try various solutions. Some of them work, some don’t. The next time I encounter a similar problem I “replicate” those solutions that worked last time.

But this means that there are really two kinds of replication. Consider again the making of pumpkin pies. One kind of replication — phenotypal replication — takes place every time someone produces a pumpkin pie. A rather different type of replication — genotypal replication — is obtained when one person communicates the recipe of pumpkin pie to another — orally or in writing, with the result that the knowledge of the recipe is “copied” from the sender’s mind to the receiver’s. It is genotypal replication that most people who have been speaking of the propagation of memes have had in mind, and the level where selection processes might apply. But sometimes thinking in terms of phenotypal replication makes sense, too.

As an example, let us look at jokes — for instance the political anecdotes that used to flower in Eastern Europe during the communist period. With jokes, like similar cultural phenomena such as symphonies, hymns, theatre plays, one may also
distinguish two levels — the design and the performance. When I tell a joke, the performance may or may not be successful, that is, the listener may or may not find the joke funny. In the first case, he or she may memorize it and tell it to someone else. Here, selection takes place at the genotypal level; it is the good jokes that are remembered and carried on. But a positive reaction from the listener will also make it more probable that I myself will repeat the joke later on. Thus, successful jokes will be performed more often — which is phenotypal replication. (In passing, we may note that the concept of population also becomes applicable to at least two levels: the population of producers and the population of products (pumpkin pies, performances of jokes).) To complicate things, a performance by a stand-up comedian will contain a fairly large number of different jokes, where the chances of survival from one performance to another will naturally depend on the success of each individual joke.

According to the terminology introduced by Dawkins, genes and memes are called “replicators”4 and the organisms that carry them “vehicles”. Hull (1988) — and Croft (2000) following him — uses the term “interactor” for the latter, defining it as “an entity that interacts as a cohesive whole with its environment in such a way that this interaction causes replication to be differential” (Hull (1988: 408–409)).

Croft (1996, 2000) suggests alternative ways of applying these notions to language. The interactor would still be the speaker, but the replicator is suggested in Croft (1996) to be the utterance, with “language” understood as a population of utterances in a speech community — “the set of actual utterances produced and comprehended in a particular speech community”. In Croft (2000) the role of the replicator is delegated to the “lingueme”, i.e. “a unit of linguistic structure, as embodied in particular utterances, that can be inherited in replication” — basically, a linguistic pattern (see 3.11). The process of replication, then, occurs every time a speaker produces an utterance embodying a set of linguemes. An utterance is now said to be a “structured set of replicators”, corresponding to a string of DNA in biology (Croft (2000: 38)). It is evident that no perfect matches can be found between linguistics and biology. It appears to me, however, that Croft’s way of drawing the analogy obscures the fundamental distinction represented by pairs of concepts such as genotype : phenotype and competence : performance.
4. I had some initial problems understanding this term until I realized that it is formed from the intransitive rather than the transitive use of the verb replicate. That is, a replicator is something that undergoes replication rather than an agent that replicates something.
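In code, the two kinds of replication might be caricatured as follows (a toy sketch; the Cook class and the recipe string are invented for illustration and make no claim about how recipes are mentally represented):

    from dataclasses import dataclass, field

    @dataclass
    class Cook:
        recipe: str                  # the "genotype": a design held in a mind
        pies: list = field(default_factory=list)

        def bake(self):
            # Phenotypal replication: one more product from the same design.
            self.pies.append("pie from: " + self.recipe)

        def teach(self):
            # Genotypal replication: the design itself is copied to another mind.
            return Cook(recipe=self.recipe)

    alice = Cook("pumpkin, eggs, sugar, crust")
    alice.bake()
    alice.bake()                     # two phenotypal replications
    bob = alice.teach()              # one genotypal replication
    bob.bake()
    print(len(alice.pies), len(bob.pies))   # 2 1

Selection can then act at either level: a recipe may spread from mind to mind (genotypal replication), or a cook may simply bake a successful pie more often (phenotypal replication).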
4.5 Life cycles

Cyclicity. Many phenomena in the world have life cycles in the sense that they
– tend to exist for a restricted time;
– pass through a number of developmental stages with distinguishable properties.
Perhaps human beings are the example that first comes to our minds, but life cycles in some form are found in virtually all biological life-forms. The most striking case is that of insects such as butterflies, which pass through several different stages (egg, larva, cocoon, adult) between which the external appearance of the individual changes totally.

Are life cycles necessarily cyclic? This may well seem like a foolish question. However, the answer turns out to be rather problematic. Being cyclic is usually understood to involve repetition, the natural metaphor (as also suggested by the etymology of the word) being a wheel. For instance, according to the Hindu idea of reincarnation, life is cyclic in a very concrete sense — an individual goes through life an indefinite number of times. But if we do not believe in reincarnation, each of us passes through the different stages of our lives only once. At the level of a population, on the other hand, we can speak of repetition in that many individuals go through analogous developmental sequences at different points in time, and it thus seems that when one individual dies, others take over. Here, cyclicity is emergent at the group level. An intermediate case between reincarnation and group level cyclicity is when there is some kind of link between the disappearance of an individual and the appearance of a new one. For instance, when my car gets too old, I will scrap it and buy a replacement. At any point in time, I have exactly one car that is at a certain point in its life cycle. Thus, we may discern different kinds of cyclicity, some of which are stronger than others.

There have been linguistic theories that imply cyclicity in the strong sense of a repeated alternation between a set of different states. In its most common form, the claim is that languages pass from isolating to agglutinating to inflecting and back to isolating, where a new cycle starts. Similar proposals have been made for the development of grammatical categories, and my use of the term “life cycle” may suggest an analogous picture. However, there are two assumptions here that must be questioned. One is that of “programmed death”: that the end of the life cycle is somehow predetermined to take place at a certain point, or at least that the demise of the entity becomes more probable the older it becomes. The other is that of “conditioned iteration”: that the end of one cycle somehow triggers the beginning of the next one.

As for the first assumption, it is often thought that there are natural processes of “erosion” or “attrition” that will cause any linguistic item to go to zero after a while. I shall argue here that in this respect, languages are more like ecosystems such
Parkinson’s Cycle

The British historian and satirist C. Northcote Parkinson is most well-known for “Parkinson’s Law”, which says that there is always enough work to occupy the time available (Parkinson (1957)). There are, however, many other striking insights in his writings. Thus, in another chapter in the same book, he describes the life cycles of boards and committees. As Parkinson notes, there are at least two important forces that influence the number of members of a committee: the need to keep the committee small and efficient, and the need to have a sufficient number of specialists in different fields represented on the committee. (A variant of the latter is frequently observed in the academic world: the need to have at least one representative for every category of people or every organization that has any relationship whatsoever to the committee’s domain of responsibility.) According to Parkinson, there is a tendency for committees to grow over time. As their efficiency decreases, more and more decisions will tend to be taken by a smaller core group of members, who in due time will take over the role of the old committee. Parkinson claims that this process has taken place in a cyclic fashion in the history of British government, repeating itself no less than five times, starting with the Council of the Crown, the present-day House of Lords. If we leave out this first cycle, where the numbers and dates are a bit dubious, the remaining ones are as follows:
– Lords of the King’s Council, beginning with less than ten members in the 13th century, rising to 20 in 1433, to 41 in 1504 and finally to 172, whereupon the Council ceased to meet, although long before that it had lost its power to…
– the Privy Council, whose original nine members grew to 20 in 1540, 29 in 1547 and 44 in 1558; in 1723 it had 67 members, in 1900 200 and in 1951 300, but already around 1700 the real power had passed to…
– the Cabinet Council, with 17 members, who soon became 25, and were very quickly replaced by…
– the Cabinet, which went from 5 members in 1740 to 20 in 1900.
as forests. The life cycle of a forest usually ends with a forest fire. However, given that forest fires usually have external causes (such as thunderbolts), they can, in principle, take place at any point in the life cycle of a forest. Statistically, of course, they are likely to happen within a certain period, but there is no “programming” involved in the timing. Likewise, the catastrophic type of change that typically causes global breakdowns in grammatical systems, I will argue in 11.4, is due to external factors that can arise at any point in time.

The second assumption, that the end of a cycle triggers the beginning of a new one, cannot be upheld generally in linguistics. Most specific grammatical phenomena exist only in a subset of all languages. For instance, a substantial proportion of all languages manage without definite articles, and have done so for millennia. The loss of an article system in a language thus does not create a gap that must be filled and
there is nothing that necessitates the development of a new article system.5

5. This is not to deny that areal pressure may sometimes contribute to the repeated introduction of some grammatical category or construction in a language, giving the impression of a communicative need for the phenomenon in question.

Factors determining life cycles. How do life cycles come into being? For cases such as our own life cycles the answer is quite complex, since there are obviously several different underlying processes behind them. Let’s therefore start with some relatively simple examples.
The Greek temple in the picture illustrates the process that is commonly referred to as erosion or attrition and which basically any solid physical object is subject to. By direct action of such forces as wind and rain and chemical processes resulting from e.g. air pollution, the material gets worn away. In the end, this leads to the total disappearance of the object. Although the process of erosion will hit some parts of the object earlier than others, it is usually random at more fine-grained levels — that is, there is no way of predicting exactly which particles will disappear. In this sense, it is analogous to sending a message via a noisy channel — compare what happens to the photo of Stonehenge below if we successively add to it more and more “white noise” (“white” in the sense of being undifferentiated):
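The effect can be simulated on any digital representation of a message. In the following minimal sketch, each character of a string is overwritten, with growing probability, by a randomly chosen one (the message and the noise levels are purely illustrative):

    import random

    random.seed(1)

    def add_white_noise(text, p):
        # With probability p, replace each character by one drawn uniformly
        # from the alphabet -- "white" in the sense of being undifferentiated.
        alphabet = "abcdefghijklmnopqrstuvwxyz "
        return "".join(random.choice(alphabet) if random.random() < p else c
                       for c in text)

    message = "the stones stand in a circle on the plain"
    for p in (0.1, 0.3, 0.6):
        print(f"{p:.0%} noise: {add_white_noise(message, p)}")

As with the photographs, exactly which parts of the message survive is unpredictable, but the overall statistical outcome, a gradual loss of information, is not.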
In the case of man-made artefacts such as tools and pieces of clothing, the process of erosion is partly determined by the extent to which the object is used. A shirt can only be washed a certain number of times before becoming unusable. We all know that erosion leads to a loss of order and structure. Analogously, the application of noise to the photos above clearly leads to the loss of information. However, in one sense, a (partially) eroded object may be more complex to
describe, since, as I said, the results on a fine-grained level are unpredictable, but also unpatterned — which in its turn means that they are basically uninteresting.

Erosion should be distinguished from other processes that also lead to a loss of matter. A pocket calculator produced today weighs only a fraction of what one of its bulky ancestors did thirty years ago. Like many other electronic appliances, calculators have successively been reduced in size as more efficient and light-weight components have been developed. We may call this process trimming. It is fairly obvious that it is different from the erosion process that would have taken place e.g. if a calculator had been left outdoors for thirty years. In contrast to erosion, trimming takes place under constraints that preserve the functionality of the object. I shall return to the distinction between trimming and erosion later when discussing language change.

Even ordinary physical objects thus have life cycles in the sense that they are sooner or later eroded and dissipated in the surrounding environment — all in accordance with the Second Law of Thermodynamics. But these life cycles of hammers and roofs are relatively uninteresting since they lack one of the most fundamental properties of biological life-forms, viz. the capacity for growth. Processes of growth and maturation account for most of the changes that occur during the early stages of the development of an organism, while the other major mechanism — aging — takes over gradually at later stages. Aging resembles the wearing-out or erosion process we have already discussed for inanimate objects, but it may also be conditioned by factors that are internal to the organism. Thus, cells in sexually reproductive organisms usually only divide a restricted number of times. It may be too strong to say that they are pre-programmed to divide only this many times; rather, if there is pre-programming, it works toward keeping the process going for a minimal number of times. It does seem that it is sometimes more rational to replace a worn-out item with a new one than to try to repair it. Thus, the restricted life cycle of individuals has the effect that the society where they live can guarantee itself longevity by systematically replacing its members.

It is striking that life cycles are much less pronounced on higher levels of organization than on lower ones. Thus, in spite of repeated claims to the contrary, human societies do not have anything corresponding to the well-defined life cycles of their members — a human society may, in principle, go on existing for ever. In medicine, infections tend to go through definite stages of development, both as applying to individuals, and as a group level phenomenon, that is, as epidemics. An epidemic may spread quickly through a population and then fade out because everyone has already had it or (which is partly the same thing) the members of the population have developed a resistance towards it. However, the resistance disappears after some time, and the epidemic comes back, creating a true cycle. Many, or most, of the phenomena used to exemplify the concept of memes in popular discussion have life cycles in the sense that they are subject to the laws of fashion.
“The tip cycle”
1. tips are given for special services rendered
2. tips are often given without any special reason
3. tips are always expected
4. a certain percentage is added to the bill
5. net prices are applied
(and back to 1)
Consider a ‘fad’ such as the wearing of baseball caps in reverse (already on the wane). It starts as the idea of one or a few individuals, then spreads quickly, perhaps all over the world, then enters a more stable phase for a couple of years, to end up being regarded as a sign of not really keeping up. The development of fashion phenomena is obviously related to the need for novelty but other factors are also important, such as the need to conform to the norms of the group you identify with and simultaneously distance yourself from other groups. Some fashions are cyclic in the proper sense in that they return at regular intervals, much like Halley’s Comet, or like outbreaks of the flu. Thus, first names have been observed to recur every three generations or so: in Sweden, the names that were popular at the beginning of the 20th century are coming back again.

But cultural phenomena may show even more complex life cycles. Consider the development of elites in society, such as nobility. In medieval Sweden, anyone could become a nobleman (which essentially meant being relieved of taxes) simply by putting a horse and an armed horseman at the king’s disposal. As time went by, noblemen secured themselves privileges and a stable share of the power in society. Part of this was that nobility became hereditary and remained so even when the privileges and the power were lost. In the end, then, the nobility in Sweden became totally fossilized as an elite. It appears that such a development from ‘functional’ to ‘formal’ criteria is an important component of the life cycles of many cultural phenomena, including linguistic ones. The development of tipping is an interesting illustration (see figure).

The processes that underlie life cycles are in general unidirectional in the sense that
we do not become young again after becoming old. However, confusion may arise if we look at the development of individual parameters, such as size and strength. Thus, after becoming big you may become smaller again. But make no mistake: this does not mean that the process of aging is equivalent to maturation in reverse.

Younger and older individuals often compete for the same ‘niche’. A leader wolf can only keep his place as long as he is stronger than his younger competitors. Thus, sooner or later, some younger individual will take over. It is important to see how this relates to Darwinian selection. When the old leader wolf is defeated by the young one, he has normally already passed on his genes to his offspring. Also, the fact that he is defeated depends on his age rather than on his genes. Thus, it is misleading to think of the outcome of the fight between the old and the young wolf as ‘the survival of the fittest’.
Chapter 5
Aspects of linguistic knowledge
5.1 Introduction
The recognition that languages may be seen as information-theoretic objects or “P-languages” does not detract from the importance of also studying the ways in which humans acquire and use languages. In this chapter, I shall consider several issues that will prove to be relevant to the main topic of the book.

5.2 Functions and intentions

Separating functions and intentions. Intentions tend to play a central role in theories of linguistic communication. Consider, for example, Grice’s paraphrase of “A meant something by x”1 as “A intended the utterance of x to produce some effect in an audience by means of the recognition of this intention” (Grice (1957)).

1. In the original, there is a subscript “NN” on “meant”, indicating “non-natural meaning”.

However, intention is not essential to communication:

“Fireflies flash, moths spray pheromones, bees dance, fish emit electric pulses, lizards drop dewlaps, frogs croak, birds sing, bats chirp, lions roar, monkeys grunt, apes grimace, and humans speak.”
These are examples of “systems of communication” enumerated in Hauser & Marler (1999: 22). It is obvious that not all these systems involve intention on the part of the communicators — in particular, only the last items in the list are likely to involve an intention like the one described in the quotation from Grice. Still, Hauser and Marler make the generalization that all the systems “are designed to mediate a flow of information between sender and receiver”. “Are designed to” must here be understood in the evolutionary sense: these systems evolved because they contributed to the reproductive success of their bearers. Such a claim involves a functional explanation in the sense current in biology and defined e.g. by Tinbergen (1953). Functional explanations, like the one quoted from Hauser and Marler, are often couched in quasi-teleological terms, which means that they certainly sound to the uninitiated as if they referred to intentions. However, such
formulations should only be understood as a convenient way of referring to the reproductive advantages of certain behaviours. Notice also that it is only the behaviour itself that needs to be programmed into the individual, not its function. Rather, the function is an emergent quality of the behaviour — emergent in the sense that it makes sense only at a higher level of description, the one where we speak of the role it plays in evolution.

A well-known feature of the behaviour of squirrels and other rodents is food-hoarding. This has the obvious function of ensuring that there is enough food available even in the off-season. However, biologists inform us that rodents do not hoard food because they foresee that it will be needed during the winter, they just do it because they are programmed to. In contrast, hoarding behaviour in humans, when it is not due to a pathological condition, is normally evoked by expectations about future periods of dearth. For instance, rumours of a truck strike might cause people to hoard food products because they are afraid that food will not be delivered to the stores. In such cases, we describe the behaviour as intentional and directed towards a goal. On the other hand, one would hardly say that the squirrel’s goal is to have a sufficient amount of nuts during the winter. This understanding of the word “goal”, in which it presupposes expectations about the future, finds support in the WordNet definition of “goal” as “the state of affairs that a plan is intended to achieve and that (when achieved) terminates behavior intended to achieve it”. Notice that this will make the two notions “goal” and “function” clearly distinct from each other.

The philosopher David Lewis, among others, has argued that our understanding of causality depends on the consideration of non-actualized possibilities — “alternative possible worlds” (Lewis (1973)). Saying that A causes B means that ceteris paribus, if A had not happened, B would not have happened either. For instance, Hitler’s invasion of Poland caused England and France to declare war; if Hitler had not invaded Poland, there would have been no declaration of war. Functions, as defined above, are causal notions, and consequently involve counterfactual considerations. That a property “contributes to reproductive success”, in more explicit causal terms, means that if the organism did not have that property, it would be less successful in reproducing, that is, it would have fewer offspring. This is empirically testable only to the extent that we can observe other organisms in analogous environments that do lack the property and are less successful in reproducing.

Goals involve causation indirectly, in that the agent intends the action to cause the goal to come about. In the prototypical case, the agent believes that if the action is performed, the goal will come about, and that if it isn’t, the goal will not come about. We may still speak of a goal if the agent is not sure of the success of the action; however, it seems that the agent at least has to believe that the performing of an action increases the likelihood that the goal will be realized.
Suppose the British government had already decided to declare war on Hitler when he invaded Poland. In that case, we would not say that the invasion caused the declaration of war. Similarly, it seems strange to say that I do something with a goal in mind if I am convinced that the goal state will come about whether the action is performed or not. If I say on a Wednesday that I intend to bring it about that tomorrow is Thursday, nobody will be particularly impressed.

The notion of function in linguistics. In principle, the biological concept of a function should be transferable to cultural phenomena without any major changes (as argued by Harder (1996)). As noted above, we may assume that cultural practices that tend to be stable and to survive for many generations do so for a reason. At some level, there must be a positive cost-benefit calculation that motivates the continuation of the practice. Such a calculation need not reside at the level of the individual act, however — that is, it may be independent of the intentions and desires of the agent at the moment the decision to apply the practice is taken. In such cases, the cost-benefit calculation is emergent — it can only be made on a higher level of analysis.

Identifying the function of a cultural pattern is not always unproblematic. To start with, it is obvious that we cannot always speak of the function of a phenomenon — most cultural patterns, at least potentially, have several functions. Clothes are worn to keep wearers warm, dry, or cool, depending on the weather; they may identify the wearer as a member of a group or as having a certain rank; they are used with the intention of impressing other people; they hide parts of one’s body that one does not want to be seen etc. Functions also change over time, with or without the pattern itself changing (Keller (1994: 14)). Many behaviours and artefacts that once had a practical function may later on be used for decorative or ceremonial purposes. Moreover, patterns may survive for a long time not because they are useful in any way but because of cultural inertia, more commonly called tradition or routine (cf. 3.6, 4.1). Especially in such cases, it is easy to confuse the function of the pattern with the original motivation for introducing it, which may be entirely different.

The concept of a niche. The term “niche” is well-known from ecology, and the term is nowadays often used in a generalized sense. Most definitions refer to the physical position of an organism in an ecosystem, or to its function in the same, or both. A deviant formulation is found in Flake (1998), who says that a niche is “a way for an animal to make a living in an ecosystem”. It turns out that there is an interesting difference in perspective here. Saying that A’s niche in the ecosystem S equals A’s function in S amounts to identifying the niche with the contribution A makes to the continued existence of S. But Flake’s definition suggests that we are looking for the way A manages to survive in S, irrespective of whether this is good or bad for S as a whole. The difference stands out clearly in the case of parasites. The rats in your basement are certainly quite content with their place in the “ecosystem”
to which you and they belong, but it is questionable whether they make any positive contribution to it. In my opinion, Flake’s definition makes more sense than the ones that refer to functions, especially when speaking of different organisms competing for a niche — the competition concerns their own survival rather than that of the whole ecosystem. The notion of function, then, is strictly speaking irrelevant to that of a niche.

Does the concept of a niche make sense in a linguistic context? There is a striking similarity between the ecological principle that no niche can be occupied by two different species at the same time and the claim that absolute synonymy does not exist in language, that is, that two different linguistic elements cannot have exactly the same meaning. (Croft (2000: 176) calls this the “First Law of Propagation”, without, however, drawing a parallel to ecology.) It is indeed tempting to think of the semantic and/or pragmatic aspects of linguistic elements as their “niches”. In particular, the niche metaphor is close at hand when speaking of linguistic elements expanding their domains of use, “conquering new niches” as it were. Given that “niche” does not imply “function”, as we have just seen, the term is useful when we want to speak of the place of an element in a linguistic system without making any claims as to its functionality.

Indirect intentionality. Dennett (1987) makes a distinction between original and derived intentionality. My alarm clock wakes me up at seven in the morning. It does so because I have intentionally set it to do so. The alarm clock, obviously, has no intentions, but what it does derives from my intentions. Dennett argues that what seems to be original intentionality in human action is derived in that all functionality ultimately derives from the process of natural selection. The difference between the nut-hoarding squirrel and a human involved in a similar activity would thus not be as great as it might seem. Whether this view is reasonable or not is an issue that I won’t go into. Instead, I shall look at some possible applications of the distinction made by Dennett. For convenience, I shall use the term indirect intentionality for the whole chain that involves both original and derived intentionality.

One pervasive trait of indirect intentionality is that there tends to be a difference in specificity between the levels of original and derived intentionality. When I receive an electronic message from the university library asking me to return my overdue books, no living being has had the specific intention of informing me about those particular books, rather, someone has — intentionally — installed an automatic system that informs any negligent library patrons of their books being overdue. The original intention is thus less specific and more abstract than the derived one.
(24) intention → general routine → specific action
But a similar relationship may often be found in our everyday actions. When I leave my house in the morning, I lock the door. Why? If someone asks me, I will probably answer that I do not want other persons to be able to enter the house in my absence. However, it is unlikely that I am consciously aware of this every time I lock the door, rather, doing so belongs to the set of routines that I perform (quite possibly sometimes half asleep) every morning. So if I say that my intention is to keep strangers out of the house, the intentionality involved is indirect in that the causal link is primarily between this intention and the door-locking routine in general, rather than a specific action performed on a particular morning. A habit is something that is performed more or less routinely or automatically, given certain circumstances. This necessarily creates a separation, or if we like, emancipation of the action from the (original) intention, and decreases the relevance of intentionality (see figure above).

Consider now a slightly different case of indirect intentionality. Suppose I want to attain a goal of some kind. It is then quite normal that I have to perform certain actions that are necessary pre-conditions for getting what I want. We would have no problem in regarding these actions as intentional. But sometimes they have negative side-effects, go against my moral principles, or are generally unpleasant to perform. Only if I want to attain my primary goal badly enough will I put up with those things. In a cost-benefit analysis, they are part of the cost I have to pay in order to get what I want. In such cases, it feels less natural to use the word “intentional”. To take an everyday example: you are staying at a cheap hotel, and the only bathroom is at the other end of the corridor. Do you intentionally pass through the corridor when you want to go to the bathroom? Sometimes such questions do not seem to make very much sense, and very much the same applies to communicative behaviour, to which we now turn.

As was noted in the beginning of the chapter, most cases of animal communication can hardly be explained in intentional terms. The same obviously goes for much of the information transfer that takes place between humans: I blush and immediately reveal that I am embarrassed. A certain piece of communicative behaviour may involve both intended and non-intended information. If I ask for a beer in a pub, my goal is certainly to make the bartender understand that I want a beer and moreover, to understand that I intend to make him understand that (in accordance with the Gricean concept of communication as formulated at the beginning of this chapter). But my unsteady voice may reveal that I am too
intoxicated to be served more beer, directly counteracting my intentions. However, this seemingly straightforward dichotomy between intended and non-intended information meets with difficulties due to the existence of indirect intentionality. In particular, an agent may be forced to reveal certain information as a necessary condition to attain her goal. I may want to download a file from the Internet. Before I can do so, however, the website in question demands that I reveal my email address, my age, gender etc. This is not like blushing or speaking with a drunken voice, but full-blown intentionality is also lacking. (See Allwood (1976) for a taxonomy of communicative situations.)

Let us now see how the distinction between direct and indirect intentionality links up with the notion of structure in behaviour in general and in language in particular. The notion that provides the bridge between them is choice. Full-blown intentionality presupposes freedom — the agent has more than one option to choose from: in a one-party state, the intentions of voters are irrelevant. In a democracy, you may even have the option not to make a choice, by abstaining from voting. But in many situations you have several alternatives and must choose one of them. Such forced choices are often cases of indirect intentionality: a freely made decision may entail further choices. Choices may thus interact with each other, forming complex patterns, becoming more or less institutionalized in a society. In the next section, we shall discuss the relationship between functions, intentions, and the meaning of grammatical items.
5.2.1 Functions vs. conditions of use
Seuren (1996: 334, 2001) suggests that “each language has associated with it a so-called semantic questionnaire that has to be “filled in” by any speaker of a language before any sentence can be formulated”: in a language like English, it must thus be specified whether the situation spoken of in a sentence is located in the past or not. Classical formulations of what appears to be the same idea2 are found in some papers by Roman Jakobson, in the context of discussions of the differences between grammatical and lexical meaning:
“Languages differ essentially in what they must convey and not in what they may convey.” (Jakobson (1959a: 236))

“… the true difference between languages is not in what may or may not be expressed but in what must or must not be conveyed by the speakers.” (Jakobson (1959b: 492))

2. McWhorter’s notion of “overspecification” is also relevant here. That a language is “overspecified” for him means that it “gives overt and grammaticalized expression to the more fine-grained semantic and/or pragmatic distinctions than other [languages]” (136). This may be taken as a formulation of the idea of cross-linguistic dispensability (see 3.12), but what I find slightly problematic is the assumption that seems to underlie his way of reasoning that these distinctions can be characterized as “unnecessary to basic communication”, and that there presumably are others which are necessary to it. Whether a certain distinction is necessary or not must depend on what the speaker wants to say; the point is that a grammar may force us to express it quite independently of what we want to convey.
Although these quotations undoubtedly express a fundamental insight, some caution is in place here. Seuren’s idea of a “semantic questionnaire” may be interpreted to the effect that what is important is that a certain piece of information be expressed somewhere. However, what is typical of obligatory categories such as tense in English is that the grammatical markings in question are used irrespective of whether the information they express is indicated elsewhere in the sentence or not. Thus, the presence of a temporal adverb does not relieve us from the necessity of using the past tense. On the contrary, it is precisely when a deictic adverb such as yesterday is used that the choice of the past rather than the present is wholly unavoidable. Conversely, some English verbs have identical forms for the present and the past; for instance, a sentence such as you put it there is temporally wholly unspecified, but it is still not felt to be in any way incomplete in actual use, as the semantic questionnaire idea would imply.

It may thus be argued that the obligatoriness of the semantic information is secondary to the obligatoriness of the grammatical category, or as Jakobson also says, referring to Franz Boas: “the grammatical pattern of a language (as opposed to its lexical stock) determines those aspects of each experience that must be expressed in the given language” (Jakobson (1959a: 236)).3 Cf. the discussion of obligatoriness in 3.8.

3. Interestingly enough, at least one of Jakobson’s two examples is a borderline case which is less well suited to illustrate the difference between lexical and grammatical material. He says “In order to translate accurately the English sentence “I hired a worker,” a Russian needs supplementary information, whether this action was completed or not and whether the worker was a man or a woman”. I assume that what Jakobson had in mind in the latter case was the forced choice between the nouns rabotnik ‘(male) worker’ and rabotnica ‘female worker’. But although these words are related through a derivational process, one could still argue that the difference between them is lexical rather than grammatical.

Let us have a look at an actual example to see how this works in practice. In Russian, a person who speaks about him/herself in the past tense has to reveal by the choice of verb ending whether he or she is a man or a woman. For instance, when someone says Ja zabolel-a ‘I have become ill’ the feminine ending -a conveys the information that the speaker is a woman. Sometimes, of course, this may be part of the speaker’s intentions, but the point is that this has no bearing on the choice of expression: an appeal to intentions does not help us to explain how the -a ending works. The irrelevance of the speaker’s intentions does not mean that we cannot
speak of the functions of an element such as the Russian a-ending, but we have to describe them in non-intentional, “emergent” terms: they concern the reasons why linguistic elements of these kinds survive in languages. We may now describe the facts about Russian in a number of ways. Consider the following two statements:

(25) a. If the subject of the sentence refers to a female individual, the verb gets the ending -a in the past tense.
     b. The function of the ending -a is to convey the information that the referent of the subject is female.
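Statement (25a) can be rendered as a bare production rule (a toy sketch; the transliterated stem and the gender flag are my own illustrative choices, not a claim about speakers’ mental representations):

    def past_tense(stem: str, subject_is_female: bool) -> str:
        # (25a) as a condition of use: the rule fixes the form, but says
        # nothing about what the ending is "for".
        return stem + ("a" if subject_is_female else "")

    print(past_tense("zabolel", True))   # zabolela -- female subject
    print(past_tense("zabolel", False))  # zabolel  -- male subject

Nothing in the rule itself encodes the functional claim in (25b): a hearer can compute the speaker’s sex from the output, but that is a consequence of the rule rather than part of it.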
The formulations in (a) and (b) differ in that (a) states a sufficient condition for the use of the -a ending while (b) ascribes a function to it. In the light of the preceding discussion of functions, it should be clear that (b) must be seen as making a stronger claim than (a), and, in addition, a claim on a higher and more explanatory level. When describing grammatical systems, linguists use a multitude of formulations to state generalizations like those in (25), such as:

(26) a. X expresses/marks/signals/codes/encodes Y.
     b. The speaker uses X to express/mark/signal/code/encode Y.

Are these locutions like (25a) or (25b)? Most linguists who use them have probably never worried about this question, and would not think of it as representing a problem at all. Backed into a corner, however, they would probably in most cases acknowledge that the formulations in (26) reflect a belief on their part that this is in some sense what X is good for, that is, it says something about its function.

As I argued above, the identification of the function(s) of a cultural pattern is often problematic because the functions are (i) non-unique, (ii) changing, (iii) easy to confuse with other things. These considerations certainly apply to the functions of grammatical elements. In particular, there is no reason to assume that a grammatical element has one function only and that, in addition, the function is constant over time. On the contrary, in order to understand what is going on in grammaticalization, we have to postulate multiple functions whose relative weights shift during the grammaticalization process. Furthermore, we may have to resist the temptation to identify the function of a grammatical item with the information that is computable from its conditions of use.

Notice that as soon as a cultural pattern is connected with a set of conditions for its use, there are conclusions to be drawn from the fact that the pattern is observed. If I know that Friday is “casual dress day” at my office, the fact that none of my colleagues is wearing a tie tells me that today is Friday. But it does not follow that the function of casual dress is to signal that it is Friday. This line of reasoning is relatively easy to accept for non-linguistic cultural patterns; it is much harder to free oneself from the thought that elements of a language always wear their function on their sleeve: that, for instance, a definite
We must reject such an identification of functions and conditions of use, however, in order to understand how an element may come to be used in contexts that are precisely the opposite of what its assumed function seems to imply. For instance, at advanced stages of the grammaticalization of definite articles, they show up in what would seem to be prototypically indefinite uses of noun phrases. One such context corresponds to noun phrases with the so-called partitive article in French, where Swedish normally has a bare noun. Examples are from Moroccan Arabic and Elfdalian.

(27) Moroccan Arabic
     Kāin ǝl-ḫobz.
     there-is def-bread
     'There is bread.' (Caubet (1983: 235))

(28) Elfdalian
     An drikk mjotsję.
     he drink.prs milk.def.acc
     'He's drinking milk.'

It is difficult to see how definite articles may expand from their "normal" uses to these ones if we assume that their main function is to signal the definiteness of the noun phrase, however we understand the essence of "definiteness". Rather, the condition that the noun phrase be semantically definite is a backgrounded relict of its original content.

Another example is the frequent development from progressives to imperfectives. The progressive in a language such as English creates an "opposition" between e.g. "specific" She is smoking and "generic" or "habitual" She smokes. Yet, in language after language we may see that progressives take on generic and habitual uses, thus apparently denying their own essence.4

Summing up: grammatical categories such as gender, definiteness and tense have conditions on use that a speaker must know in order to speak the language idiomatically; the relationship between these conditions of use and the function of the elements in question may not be a direct one, however. In particular, one should not identify the function of the grammatical categories with the information about the world obtainable by drawing conclusions from the assumption that the speaker applies the rules correctly, such as information about the sex of some referent.
4. Examples are Scots Gaelic, Welsh, Yoruba (Comrie (1976: 39)); Hindi-Urdu and Punjabi (Lienhard (1961: 46–49), Dahl (1985: 93)); various Turkic languages, Armenian, Persian (Johanson (2000: 100)); Lezgian (Haspelmath (1993: 276)).
Neither should one describe such information in terms of the intentions of the speaker: at most, we are dealing here with indirect intentionality in the sense defined in the previous section. For a further discussion of possible functions of grammatical items, see 9.4.

Further observations on the semantics of grammatical morphemes. I noted above that the speaker's intentions are irrelevant for the use of grammatical marking like Russian gender. Discussing these matters in Dahl (1985), I claimed that the information conveyed by a morpheme such as the Russian feminine ending -a is not part of the speaker's "intended message" and noted that the usual Gricean communicative principles, in particular the principle of relevance, are simply inoperative for the use of such morphemes. It is possible to strengthen this statement: the information conveyed by the gender morpheme cannot be manipulated by the speaker in such a way that it becomes part of the "intended message". Consider the following Russian sentences:

(29) Russian
     a. Moj professor ne mužčina, a ženščina.
        my professor neg man but woman
        'My professor is not a man but a woman.'
     b. Moj professor ne šved, a švedka.
        my professor neg Swede but Swedish_woman
        'My professor is not a Swedish man but a Swedish woman.'
     c. Moj professor ne russkij, a russkaja.
        my professor neg Russian.m.sg.nom but Russian.f.sg.nom
        'My professor is not a Russian man but a Russian woman.'
     d. Moj professor ne veselyj, a veselaja.
        my professor neg happy.m.sg.nom but happy.f.sg.nom
        'My professor is not cheerful-male but cheerful-female.'
     e. Moj professor ne p'jan, a p'jana.
        my professor neg drunk.m.sg but drunk.f.sg
        'My professor is not drunk-male but drunk-female.'
(29a) contrasts two nouns meaning ‘man’ and ‘woman’. In (b), there is a masculine and a feminine noun for ‘Swede’ distinguished by the derivative ending -ka. In (c), there are masculine and feminine forms of the nominalized adjective russkij ‘Russian’. In (d), there are masculine and feminine “long” forms of the adjective veselyj ‘cheerful’, and in (e), there are the corresponding “short” forms of the adjective p’jan ‘drunk’. (See 7.3 for information about Russian “long” and “short” adjective forms.)
The sentences in (29) represent a decreasing scale of acceptability.5 (a) is a normal way of stating that a person is female rather than male. (b) and (c) suggest "meta-comments", natural primarily as a correction to what someone has said. (d) and (e) do not make sense at all. On a simplistic componential analysis, all the feminine nouns and adjectives in (29) would contain a semantic component 'female', but as we see, this component has very different roles to play depending on whether it occurs as part of the meaning of a simple lexical item, a derivational morpheme6 or an inflectional marker. As we move downwards in (29), the purported semantic component [female] becomes less and less accessible to manipulation. This is consonant with the fact that grammatical morphemes, in particular inflectional ones, are not usually possible targets of highlighting operations such as focusing or marking by contrastive stress.

An observation made by James McCawley is of some interest here. Discussing Greenberg's notion of markedness in morphology, McCawley (1968: 568) noted that the plural is used in English when the cardinality of the set referred to is not known: "application forms give headings like schools attended and children". This can be made a little stronger: if you have one child only, you may feel that you have to qualify a positive answer to the question Do you have any children?, e.g. Yes, but only one, but a negative answer (No, I don't) is simply not possible. In other words, the plural morpheme in children has no effect on the truth-conditional interpretation of the question — quite unlike what would happen if any were replaced by several.

Newmeyer (1998: 254) questions the assumption that grammatical ("functional") elements convey less information than lexical ones: "In some conversational contexts, for example, the lexical material coding tense and aspect [I assume this means the information conveyed by the tense and aspect markings, Ö.D.] is wholly extractable from the context of the discourse; in others conveying clearly that the proposition expressed is in the past tense rather than in the present tense can make all the difference in the world."
He provides no example of the latter situation, however. In fact, it is not so easy to use the past tense in English to highlight the past character of some event or state of affairs. To quote James McCawley once more, the sentence The farmer killed the duckling "is odd unless the prior context provides a time for the past tense to refer to" (McCawley (1968: 110)) — that is, the fact that the time-point is in the past is backgrounded.
5. I am grateful to Maria Koptjevskaja-Tamm for the intuitions.
6. The ending -aja in russkaja 'Russian woman' is obviously formally an inflectional adjectival ending, but since russkij 'Russian (man)' and russkaja 'Russian woman' are independently lexicalized as nouns, it could be argued that it has obtained the function of a derivational suffix. In other words, this is a possible route from inflection to derivation — the opposite direction of what is common in grammaticalization.
At least in some varieties of English, you may say I lived there to mean 'I once lived there', but as soon as you try to contrast such a statement to one in the present tense, it starts to sound strange:

(30) ?I don't live there, but I lived there. (preferred: …but I used to live there)

Future will is quite different — (31) is wholly normal:

(31) I don't live there, but I will.

It thus appears that the inaccessibility of the "meaning" is a characteristic of inflectional elements.
5.3 Ritualization and conventions

In the study of animal communication, the term ritualization has been used of behaviours disconnected or emancipated from their original function, in particular for the development of "display behaviour", such as when an agent signals its readiness to perform an action (e.g. an attack) by making the initial movements of that action: e.g. shaking one's fist at enemies in order to make them go away rather than hitting them, to take a human example. Since the point is no longer to perform the action but just to display an intention as it were symbolically, the cost in terms of physical effort and possible damage may be reduced to a minimum. Ritualization in the proper sense can, in principle, only be applied to communicative behaviour. You could not, for instance, ritualize your eating behaviour without rather serious consequences to your health.

Recall at this point the Gricean view of meaning (see p. 75). In the first stage of development, shaking one's fist will be understood "literally", as a preparation for actually hitting the other. But after a while, the receiver's understanding of the act will change, so that what is communicated is rather something weaker, with the function of a warning — "I intend to attack you if you don't go away". Ritualization is thus a case of what I above (3.4) called mutual attunement: it involves changes in the mental states of both the sender and the receiver. These changes may either be genetic or depend on learning — the cases described in animal communication are of course mainly of the former kind.

Human languages, on the other hand, are supposed to be conventional systems. The concept of convention appears by definition to involve acquired behaviour or knowledge, and it seems hard to apply it to genetically transmitted information. But the knowledge that is necessary for ritualization to develop beyond the initial stage shares two essential properties with convention and other cases of mutual attunement: (i) it is distributed over at least two individuals, and (ii) it has a component of arbitrariness in that it is emancipated from its original or "natural" function.
Croft (2000: 7) says that "convention is a property of the mutual knowledge or common ground of the speech community". However, convention does not really require that everyone have the same knowledge, only that each individual's knowledge fit his or her role in the social interaction — consider for instance dances such as tango and foxtrot, where the participants have different tasks to perform but must be perfectly coordinated. Such interactions Croft, following Clark (1996), calls "joint actions", and perhaps "joint knowledge" would therefore be a better term than "mutual knowledge", at least if "mutual" is taken to be synonymous with "shared".

It should also be noted that shared knowledge is not enough to constitute a convention. A convention, like any agreement between individuals, also involves mutual expectations. Suppose Alice and Bob are going to a party, and both know the other is going. This obviously still does not mean that they have an agreement to go, because that would mean that each of them had to somehow commit him- or herself to do so. Similarly, conventions are not just common patterns of behaviour but principles that give us the right to take certain things for granted. For these reasons, conventions are irreducible social phenomena, i.e. emergent entities in the original, non-reductionist sense of the word "emergent".

The term "convention" has the drawback of suggesting something like an explicit agreement. In the case of language, it is of course the exception rather than the rule that speakers are aware of an explicit decision as the basis for a convention. Perhaps such conventions should be called something else, for instance "traditions".

Ritualization and symbolic acts. The statement that ritualization can only apply to communicative behaviour has to be modified somewhat. Consider the following example. Once upon a time, alcoholic beverages could not be served in Swedish restaurants if they were not part of a meal; that is, you had to order some food with your drink. The natural strategy on the part of a thirsty guest was of course to minimize the meal that had to be ordered. It is said that special "token sandwiches" were introduced for this purpose. You can imagine that these were not exactly culinary wonders.

This example illustrates what happens when a rule of some sort interferes with an agent's cost-benefit calculations. I go to the restaurant because I am thirsty; I am prepared to pay the price that is demanded for the drink I order. However, the state forces me to also pay for some food that I do not really want. From my point of view, this regulation is tantamount to taxation: I simply have to pay more for the drink than if the regulation did not exist. My reaction, as we have seen, is to reduce the extra cost as much as possible — I do not care if the sandwich I get is edible or not, I don't want it anyway.

Is this ritualization? The purchase of the "token sandwich" is indeed emancipated from the normal function of buying food, but it may be harder to accept that it involves communicative behaviour. The point of the story, however, is that a certain norm in society — in this case an explicit law — keeps me from obtaining my goal as long as I do not perform another action, that of ordering and paying for a sandwich (although I don't have to eat it).
Such acts are commonly called "symbolic", and can be seen as a special case of indirect intentionality, as discussed above. It is characteristic of symbolic acts that an agent has to engage in a certain behaviour with focus on the formal aspects and partial or total emancipation from function — and this is likely to lead to a reduction in the energy spent on it. Symbolic acts need not be communicative in the usual sense, although they often are. However, in most cases, they do not make sense if they are not perceived and acknowledged by others. They thus have an essentially social character.

The word "ritual" is usually primarily associated with religion and religious ceremonies. "Rituals" in this sense are typically performed at pre-specified intervals, e.g. on a particular day every year, and tend to lose their original functions over time, becoming "empty rituals". Human societies abound with such "symbolic" acts, religious or secular. In linguistic communication, there are many elements which have a similar character. For instance, letters may be expected to start with salutation phrases such as "Dear X" and end with phrases such as "Remaining your humble servant". Such phenomena are to a large part subsumable under the notion of politeness. In recent linguistics, politeness has largely been treated in terms of speaker strategies aiming at the "preservation of face" (Brown & Levinson (1987)); however, many politeness devices are required by convention, and failure to use them leads to serious consequences. They are thus linguistic counterparts of token sandwiches and are predictably subject to the same tendency to be reduced in form (see further 7.2).

Ritualization vs. habituation. In the discussions in the linguistic literature, ritualization has been associated with concepts such as repetition and habituation. Thus, Haiman (1994) proposes to use "ritualization" as a cover term for "all varieties of change which are brought about by routine repetition", including emancipation (from the original function), habituation, and automatization. Similarly, Bybee (forthcoming), referring to Haiman's discussion of ritualization, proposes that phonological changes of reduction and fusion are conditioned by the frequent repetition of items that undergo grammatic(al)ization. She attributes an important role in this to processes of habituation and automatization of sequences of units in speech.

It appears to me that this coupling between ritualization on the one hand and repetition and habituation on the other is somewhat unfortunate. The latter concepts are clearly both much wider than ritualization as that term is used in animal ethology.7
7. In everyday language, the word ritual (noun or adjective) has a more general meaning, as in the example a ritual glass of milk before bed, for which the American Heritage Dictionary provides the interpretation "being part of an established routine". This fits the discussion of indirect intentionality in 5.2, but does not exemplify ritualization in the ethological sense.
To a certain extent, repetition is probably a necessary condition for ritualization to appear, but it is hardly a sufficient one — normally some other motivation is needed. For instance, in the fist-shaking example, ritualization takes place because it makes it possible for an agent to solve a conflict without the costly consequences of downright violence. Without this motivation, the fist-shaking behaviour will not emerge.

A final cautioning note: ritualization is actually not necessarily accompanied by a reduction in form. Biologists note that "ritualisation has served to make displays clear and unambiguous so that the observer cannot doubt that the animal is displaying" (Slater (1999: 136)). Thus, those aspects of a ritualized act that are helpful in increasing the reliability and efficacy of the signal will not be reduced but rather enhanced. This could be called an ostensive use of signals — quite a common phenomenon in human life as well. We often wilfully perform acts that are normally involuntary — or at least not wholly under our conscious control — in order to communicate things, e.g. when I cough in order to draw someone's attention to the fact that I am in the room. Another example is the use of stylized facial expressions as conventionalized signals in sign languages (see also 6.2). A general lesson to be drawn from these cases is that there is a limit to how much the performance of a communicative act can be reduced, a limit which is set by the need to safeguard the transmission of the message.
5.4 Entrenchment

I noted in the preceding section that the notion of ritualization has been associated in the linguistic literature with other notions such as "repetition", "habituation", and "automatization". It is perhaps more common, especially in cognitive linguistics, for the latter terms to turn up in connection with what is called entrenchment (the term introduced by Langacker (1987: 59)). The idea is that the mental "trace" of an experience becomes progressively strengthened each time an individual is exposed to it. Entrenchment is thus an extremely general concept, but the term is perhaps most often applied to the entrenchment of complex expressions, which, if frequently used, may become "progressively entrenched, to the point of becoming a unit" (ibid.). Entrenchment in this sense is also frequently invoked as an explanatory concept for processes of change.

Consider, as a simple example, expressions for identifying dates. These are normally formed in English according to the pattern the + ordinal numeral + of + name of month, for instance the thirteenth of June. But certain dates are more important than others and are used very often. Their denominations accordingly become "entrenched" and are understood as units, almost as proper names, e.g. (the) Fourth of July or the First of May.8
Likewise, combinations of verbs and objects that denote salient and frequent activities, such as drink tea, wash one's hands etc., are likely to be entrenched.

The idea of progressive strengthening associated with entrenchment implies that it is dependent on frequency. This dependence cannot be completely straightforward, however. For instance, it is reasonable to assume that the degree of entrenchment depends not only on how often I have heard (or said) something but also on when I heard it and from whom. What I heard during the first years of my life is bound to be more entrenched than things that have come later, but there may of course also be recency effects that work in the other direction. What I hear from members of my own group(s) and from persons whom I want to resemble will become more entrenched than other things, etc. Also, frequency itself is not a simple concept. In addition to the overall frequency of an item, there is also frequency in specific contexts, and frequency of the combinations in which the item occurs. We saw in 2.2 that the relationship between frequency and informational value is confounded by the equivocation between semantic and syntactic information. And we shall see later on that the resistance of elements to being fused with their neighbours may also depend on a number of different factors, such as degree of referentiality (10.7).

Furthermore, if we consider once more the expression Fourth of July, we see that it is not only that this date is often spoken of; it also has quite special characteristics that are not predictable from its being the fourth day of the month of July. In a similar way, September 11 refers not just to that date in general but to specific events that took place on a particular instance of it. That is, the entrenched expression points to a mental node at which non-derivable (if we like, non-compositional) extra-linguistic information is attached, that is, is part of an external pattern (see 3.3) relative to the linguistic system. It is of course likely that some such information will normally be present for combinations of linguistic units that are frequent enough to be candidates for entrenchment. For instance, the activities of drinking tea and, say, drinking beer do not differ just in the identity of the liquid that is consumed; in most cultures, there will be specific occasions at which a certain beverage is drunk, and the social significance of each beverage will be different.

Entrenchment without node-specific information — that is, with preservation of strict compositionality — is a more problematic notion. This depends a little on how we understand the underlying mechanisms.
8. In Soviet jargon, the entrenchment of the expression for the 1st of May even led to it being — rather exceptionally for Russian — univerbated from pervoe maja to pervomaj. This never won general acceptance, however, and is by now most probably thoroughly forgotten.
On one view, any set of co-occurring linguistic items would be represented in the mind as a "node", and entrenchment would then be entirely a matter of degree; that is, it is equal to the weighting of the node in question, which in its turn depends on the number of times the node has been activated. The expression "to the point of becoming a unit" in the Langacker quotation then becomes pointless, however, since any combination would be a unit from the start, unless we define a threshold for units.

If entrenchment leads to the introduction of an additional mental node, it undoubtedly leads to an increase in the complexity of the cognitive system underlying language — the I-language. It would seem that it is analogous to the addition of a lexical item to the lexicon of a language and that it would thus add to the linguistic resources of the language. However, on the assumption that there is no node-specific information, the entrenchment of a complex expression as a unit does not create any possibilities of expression that the component expressions did not give us, and in this sense there is no change in richness of resources, only an addition to the redundancy of the I-language as a representation of the E-language. But we may note here that entrenchment may actually be seen as a type of reanalysis, a concept that we shall return to in 8.5, and it shares with reanalysis the problem of potentially having no observable empirical consequences — it is assumed that the degree of entrenchment of an expression is a consequence of its frequency (and perhaps of other factors as well, as was mentioned above), but we cannot otherwise measure it in the data.

In spite of the formulation "becoming a unit", Langacker clearly does not mean that the components of the entrenched combination lose their identity: rather, a "composite structure" is created (Langacker (2000)) with no "immediate loss of analyzability". But the co-existence of the components and the composite item entails the creation of a hierarchical structure — entrenchment thus not only adds to the number of items but also means that structural complexity increases.

Langacker (2000: 4) speaks of entrenchment as one of "a number of basic and very general psychological phenomena" essential to language. Another such phenomenon, he says, is abstraction, "the emergence of a structure through reinforcement of the commonality inherent in multiple experiences". An important special case is "schematization", which among other things gives rise to "constructional schemata", corresponding to constructions in a language. One may question whether entrenchment and abstraction are really distinct. After all, entrenchment also has to be based on multiple experiences, and what is entrenched always has to be a subset of the features that characterize each of these experiences. For instance, in learning a simple phrase such as Good morning, you have to realize that the time of day, but not for instance the weather, is relevant to its proper use. In this sense, all learning, linguistic or non-linguistic, is schematic: we need to categorize possible situations of use according to a set of relevant parameters. Even more abstract entities than schemata come into play here and are arguably subject to entrenchment.
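The weighting view just sketched can be made concrete in a few lines of code. The following Python fragment is purely illustrative: the decay rate, the prestige factor, and the boosting of early input are invented stand-ins for the factors mentioned above (frequency, recency, and who the speaker is), not parameters from any published model.

import math

class Node:
    """A mental node for an expression; its weight models entrenchment."""
    def __init__(self, form):
        self.form = form
        self.weight = 0.0
        self.last_time = 0.0

    def expose(self, time, prestige=1.0):
        # Earlier activation decays with elapsed time (a recency effect)...
        self.weight *= math.exp(-0.01 * (time - self.last_time))
        # ...and each new exposure adds an increment, scaled by how much
        # the hearer values the source (group membership, role models).
        self.weight += prestige
        self.last_time = time

node = Node("fourth of july")
for t in range(100):
    # exposures early in "life" are weighted more heavily (hypothetical factor)
    node.expose(time=t, prestige=1.5 if t < 20 else 1.0)
print(node.form, round(node.weight, 2))

On this picture, "becoming a unit" is simply a matter of the weight passing some threshold, which is exactly the difficulty noted above.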
Distribution analysis, especially as developed by American structuralists, was based on the idea that each element of a language has a characteristic distribution: "the total of all environments in which it occurs" (Harris (1951: 16)). In a given environment, only a subset of all the elements in a language would typically occur. These subsets would not vary arbitrarily between environments but would rather tend to cluster together: in similar environments, similar subsets would be expected, although total identity between the sets of elements occurring in two different environments would be rare (Harris (1951: 245)). The classes of elements that were postulated in a grammar would be those that are most useful in delimiting subsets that occur in many different environments. In this way, structuralists felt that classes of expressions such as the traditional parts of speech could be placed on a firmer footing (and sometimes be revised).

It is plausible that similar principles operate in learning. Each time a categorization is applied and found useful, it will be reinforced. When learning new schemata, we will tend to use those categorizations that are already firmly entrenched. In traditional terms, the more rules in a grammar that refer to a certain categorization, the more probable it is that a newly introduced rule will also do so. Another way of expressing this is to say that differential treatment is self-reinforcing. If little boys are dressed in blue and little girls in pink clothes, that will certainly strengthen the tendencies to treat them differently in other ways. In several Germanic languages, if a word is not stressed on the first syllable of the stem, it is normally a foreign (Romance) word. It is not improbable that such a circumstance makes it easier to maintain a separate treatment of such words in other respects as well, for instance in inflection.

The division of the elements of a language into classes or categories is motivated by the differential treatment of the elements in question in language use. If two sets of elements cannot be distinguished in their behaviour, we have no reason to treat them as different classes in the description of the language. This is the rationale behind claims to the effect that e.g. adjectives may not always be a separate word class in a language. But it also means that it is in principle possible to quantify the "entrenchment" of a grammatical distinction, in the sense of the number of contexts in the grammar in which the distinction occurs.

This reasoning can also be extended to syntactic notions such as 'subject'. A linguist who analyzes the structure of a linguistic utterance typically treats its different formal properties as arguments for or against a certain analysis. If an NP agrees with the verb, this is an argument that the NP is the subject of the sentence (an argument that may possibly be contradicted by other facts). This way of thinking presupposes a "God's truth" approach, in which there is a true answer to the question of what the structure of the expression is, and that structure determines its form — the genotype determines the phenotype. The alternative way of seeing things would then be as follows: The syntactic structure tells us which NP in a sentence is the subject. (For instance, we could do it the way it is done in the "standard theory" of Chomsky (1965): a subject is defined as the NP immediately dominated by S in the (surface) syntactic structure.) Any syntactic rule which mentions "the subject of the sentence" will apply to this NP. This includes the rules that determine word order, agreement and case marking, but also matters such as binding and control.
In a seminal paper, Keenan (1976) suggested an alternative approach: the notion of subject should be defined in terms of a set of "subject properties", such that NPs could be more or less "subject-like" according to the number of these properties that they exhibit. In particular, an NP occurring in a "non-basic sentence" may obtain some but not all of the "coding properties", such as case marking, word order and agreement, that characterize subjects in "basic sentences". For instance, consider "presentation" constructions like (32).

(32) There are/There's two cats on the mat.

Here, the coding properties may be shared between the dummy subject (there) and the demoted subject (two cats), and intra-linguistic variation is not uncommon, as in English, where there may or may not be agreement between the verb and the demoted subject. Thus, rather than an all-or-nothing choice between applying or not applying the subject rules to a certain NP, we have a likelihood for each rule to apply which is determined by the number of subject properties that the NP has. But the information goes both ways: if a subject-referring rule applies to an NP, the NP thereby becomes more subject-like. This increases the likelihood that other rules will apply to the NP, etc. We should thus think of "subject properties" not as components of the category of subject but rather as "subject criteria", reasons for calling something a subject.

The general opinion of grammarians is that even if there is no total consistency in how 'subject' should be understood in all contexts, the grammar of English would be considerably less straightforward if we tried to do away with the notion. We may contrast this with a syntactic notion such as 'direct object', which has been argued to be redundant in English (Comrie (1989: 67)), or with 'subject' in other languages, where it plays a more modest role or even is not generally unambiguously determinable. In particular, it is not possible to reduce it to any semantic notion, e.g. in terms of thematic roles. In other words, the notion of subject is well entrenched in English.

At this point, we may recall the "Pandemonium model", introduced by Oliver Selfridge in the fifties and described in the influential textbook by Lindsay & Norman (1977). In spite of its already venerable age, the Pandemonium model is still useful as a general metaphor for cognitive processes, including the processing of natural language. In this model, the elements are called "demons", and each demon has a very specific task to fulfil. There is thus a hierarchy consisting of:
– image demons, who record the incoming signal
– feature demons, each of whom looks for a specific feature in the input
– cognitive demons, each of whom looks for a specific pattern in the features recognized by the feature demons (or by other cognitive demons) and who yells when the pattern is found
– a decision demon, who takes the final decision on the basis of the information coming from the cognitive demons
The decision demon chooses the pattern that is announced by the cognitive demon who yells the loudest. How loud the cognitive demons yell depends on how well the incoming data fit the patterns they are looking for, but also on how often and how successfully they have yelled in the past. Metaphorically, we may assume that whenever a higher demon approves an incoming signal from a lower demon, its rating increases. It may also be that the rating is decreased if the lower demon fails to be recognized over some specified period of time. Demons in higher echelons are thus specialized in patterns whose elements are detected by demons on lower levels.

A major question is where those higher demons come from. Having higher demons for all possible combinations of lower ones would quickly lead to a combinatorial explosion problem. In the spirit of "Neural Darwinism" (Edelman (1987)), it could be assumed that demons are created in large numbers but that most of them simply die for lack of success in finding instances of their patterns. The analogy here is with the way antibodies are created in the immune system.

One important feature of the Pandemonium model is that at any point several demons may be active ("yelling") at the same time. It is not necessarily the case that the responses of all but one of these are discarded; rather, the resulting picture may be an ambiguous one, containing several competing analyses of the data. The result may also be non-deterministic, in the sense that it is influenced by "contextual" (i.e. system-external) factors, the general state of the organism, and simple chance.

This takes us back to the discussion of the notion of subject. We can see Keenan's "subject properties" or "subject criteria" as "demons" in the sense of the Pandemonium model. The degree of subjecthood will depend on the total activity of the subject demons. Obviously, things cannot work quite so simply. In the large borderline zone of NPs that have some but not all subject properties, rules do not in general apply stochastically, as the preceding paragraphs might suggest, but in most cases according to language-specific rules. For instance, dialects of English differ in whether the verb in an existential sentence should agree with the demoted subject or not (there are two apples or there is two apples). But what I am suggesting is that whenever there is freedom of choice, and whenever a rule changes, the treatment of an NP, or of a certain type of NPs, will be governed by how often the rest of the system treats these NPs as subjects.
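The model is easy to render as a toy program. In the Python sketch below, everything concrete (the features, the initial ratings, the size of the reinforcement increment) is invented for illustration; the point is only to make the yelling-and-reinforcement mechanism, and its application to Keenan-style subject criteria, explicit.

class CognitiveDemon:
    def __init__(self, name, pattern):
        self.name = name
        self.pattern = pattern   # the set of features this demon looks for
        self.rating = 1.0        # grows with past success

    def yell(self, features):
        # Volume = goodness of fit to the input, scaled by past success.
        fit = len(self.pattern & features) / len(self.pattern)
        return fit * self.rating

def decide(demons, features):
    winner = max(demons, key=lambda d: d.yell(features))
    winner.rating += 0.1         # approval by the decision demon raises the rating
    return winner.name

# "Subject demons": each pattern bundles some of Keenan's subject properties.
demons = [CognitiveDemon("subject", {"agreement", "nominative", "preverbal"}),
          CognitiveDemon("object", {"accusative", "postverbal"})]
print(decide(demons, {"agreement", "preverbal"}))   # -> subject

Note how the reinforcement step implements the self-strengthening discussed above: the more often an NP type is treated as a subject, the louder the subject demon will yell the next time.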
This is entirely analogous to how borderline cases of imprecise (fuzzy, vague) concepts are treated in general. Consider e.g. the question of when a child becomes an adult. The answer to this question varies not only between cultures but also within one and the same culture, and the border may be drawn in different places for, say, voting rights and the right to buy liquor. With increasing age, the chance that a person will be treated as an adult increases. But whenever we take a decision on whether someone should be regarded as an adult, we are influenced by all other decisions that have been taken with regard to that person, or with regard to the kind of situation at hand.

Cognitive linguists tend to speak of entrenchment almost exclusively at the level of the individual. But as we saw in the chess example on p. 66, if Deep Blue and Kasparov did not have mental representations that were somehow compatible with each other — that is, corresponding to a common set of patterns — they could not play against each other. Similarly, if our "cognitive representations of languages" are not compatible with each other, we cannot communicate. Thus, language is also an emergent phenomenon on the social level and is not reducible to individual knowledge.

In addition, even individual habits may be "entrenched" in the external world in various material and immaterial ways which make it more difficult to change them. Economists have discussed the phenomenon of "lock-in" or "path dependence" (e.g. Arthur (1990)) — an early decision constrains later choices in a possibly non-optimal way. A celebrated example is the qwerty keyboard layout, which has long since lost its original motivation (the letters of the alphabet were distributed over the keyboard in such a way that the keys would not jam when pressed in quick succession). Similar phenomena show up in biological evolution, in that natural selection always operates on and is constrained by the already existing gene pool. Cf. also the notion of cultural inertia discussed in 3.6, 4.1 and 5.1.

Opinions are actually divided as to whether the qwerty keyboard is as bad as has been claimed by the adherents of the alternatives, but what is interesting here are the ways in which it can be said to be "entrenched" and why it is difficult to switch to any other keyboard layout. Not only are there tens of millions of typewriters and keyboards made according to the qwerty model, but an enormous amount of time and money has been invested in software, typing course materials, and above all in the typing skills of millions of professional typists and people like you and me. Essentially, a keyboard layout is an abstract information-theoretic object, corresponding to a P-language (see 4.4): a mapping from the set of alphanumeric entities onto a two-dimensional matrix. Still, what is striking is the emergent character of the mode of existence of qwerty: it is the totality of implementations in old-fashioned mechanical devices, modern hardware and software, and human perceptual, cognitive and motor skills that constitutes its footing in the real world and makes it hard to change. The same can of course be said about other cultural phenomena, such as traffic rules (driving on the left or the right) and spoken and written languages.
5.5 Piecemeal learning

At the core of our attempts to understand how we acquire and use language are two contentious and closely interrelated issues concerning the relationship between what is in our heads and what we actually say. In fact, the issues are so closely intertwined that they can be argued to be the same, differing only in what they are applied to — processing, acquisition, or language change. The first is the question of the division of labour between what is stored in our heads and what is produced on-line. The second is the question of the extent to which a child goes beyond the utterances it has heard before in using language.

On-line processing vs. storage. Language is traditionally seen as a system of atomic entities — simplistically identified with words — which can be combined into complex expressions — phrases and sentences. Almost by definition, atomic entities are included among the "listemes", i.e. those elements which have to be listed when describing the language, whereas complex expressions can be generated by rules or processes. Correspondingly, a speaker can be assumed to retrieve the atomic entities from a mental repository, whereas complex expressions can be produced "on-line". I say "can be", because obviously complex expressions could also be stored in memory, to be retrieved when needed rather than generated anew each time.

This is a much-debated question in contemporary linguistics. The "traditional" approach is to minimize assumptions about competence, postulating instead that whatever can be generated on-line also is so generated. Against this, adherents of "usage-based models of language" (see Langacker (1987) and the papers in Barlow & Kemmer (2000)) argue that the cognitive representation of language may be highly redundant. In particular, in their view, high-frequency regular morphological forms and combinations of words are likely to be stored rather than produced on-line. A more conservative position is taken among others by Steven Pinker, in particular in his popularizing book (Pinker (1999)), where, with the support of various kinds of empirical evidence (in particular concerning the ways in which different kinds of verbs in English and other languages behave), he argued that regular and irregular forms are produced by two separate mechanisms: "symbol combination" and "associative memory".

The distance between these positions may be somewhat smaller than it appears at first sight: on the one hand, Pinker (1999: 153) seems to admit that regular forms are sometimes stored and even retrieved directly; on the other, Langacker (2000: 58) acknowledges the existence of a difference in processing between regular and irregular forms, although he questions whether the organization is "strictly dichotomous". Furthermore, when we look more closely at the way Pinker describes how he assumes that verb forms are actually generated, it turns out that the mechanisms may be less separate than they seem at first glance.
Consider the phenomenon of blocking: if an English verb has an irregular past form, the corresponding regular form is usually9 ungrammatical. This would seem to be a quite trivial and seemingly self-evident fact, but it turns out to be somewhat tricky to accommodate. The problem for a model in which regular past forms are generated by rule and irregular ones are retrieved from storage is that it seems that before you can generate a regular past, you must check whether an irregular one exists. This ought to take time, and seemingly decreases the attractiveness of the on-line generation option (if you have to do a lexicon look-up anyway, you might as well retrieve the form at the same time). To solve problems with timing, Pinker (1999: 144) suggests that "words and rules are accessed in parallel", with an inhibitory signal halting the generation if an irregular form is found. It appears that one could then equally well see the two mechanisms as different parts of the same process, which would go a long way towards closing the gap between the two models. But things get even more complex when we include the fact that blocking is not always operative: some verbs have parallel strong and weak forms, e.g. dreamed and dreamt. In such cases, Pinker (1999: 152) argues, the regular form would also have to be obtained by retrieval from storage to avoid blocking — in other words, evidence that not even regular forms are necessarily generated on-line.

There is a further reason why the two models may be less different than it seems, at least with respect to verb morphology. It turns out that if one looks at the way past tense marking operates in spoken English, the overwhelming majority of all verb tokens are marked by irregular, or more precisely improductive, markings. Thus, the productive marker -ed is in the minority. Moreover, the number of high-frequency regular verbs is quite restricted, both with regard to types and with regard to tokens. Since these are the only cases that the two models treat differently, the discussion really concerns a minority of all verb tokens. (See Appendix A for corroborating data.)
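The parallel-race account of blocking can be paraphrased in code. The miniature lexicon below is invented and grossly simplified, and a sequential check stands in for the parallel access Pinker assumes, but the logic is his: a stored irregular form inhibits the output of the -ed rule, while stored doublets like dreamt/dreamed escape the inhibition.

IRREGULARS = {"sing": "sang", "go": "went"}    # retrieved from associative memory
DOUBLETS = {"dream": ["dreamt", "dreamed"]}    # both forms stored, so no blocking

def past_tense(verb):
    # (a sequential check stands in here for the parallel access assumed above)
    if verb in IRREGULARS:
        return [IRREGULARS[verb]]   # inhibitory signal: the -ed rule is blocked
    if verb in DOUBLETS:
        return DOUBLETS[verb]       # a stored regular coexists with the -t form
    return [verb + "ed"]            # default: generated on-line by the rule

print(past_tense("sing"))    # ['sang']
print(past_tense("dream"))   # ['dreamt', 'dreamed']
print(past_tense("walk"))    # ['walked']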
9. More precisely, only about 15 per cent of English irregular verbs have alternative regular past forms.
10. The earliest work using the term "conservatism" in this way seems to be Gropen et al. (1989).
The Growth and Maintenance of Linguistic Complexity
she has heard, with minimal alterations. The issue of conservatism is directly tied in with the question of stored knowledge vs. on-line generation. If you are conservative, you have to learn more. For instance, suppose that you have heard instances of a construction like give somebody something. In principle, this could be abstracted to a schema V NP NP. The question is, can you construct new instances of this schema with other verbs, even if you have not heard them used in this way? A consistent application of the conservatism thesis entails that what is learnt are the individual verbs together with filled-in argument frames (say, give me some ice cream) rather than the general schema. This is analogous to the question of on-line processing vs. storage in morphology forms, where the hypothesis that regular forms are also retrieved from memory implies that each of them has to be learnt separately as well. It appears clear that children who acquire their first language are more conservative than adult second-language learners, in the sense that various types of persistent errors that are typical of the latter are not frequent in child language. For example, when looking at grammaticalization processes in an area such as tense and aspect, it is striking that several of the expansions that take place diachronically correspond to error patterns that are found in second-language learners but hardly ever in children. Second-language learners of English may thus also expand a category such as the present progressive to cases when the simple present is normally used, or use the auxiliary will also e.g. in temporal clauses (When she will come… instead of When she comes…). My claim is not that such deviations from native adult usage never take place in child language but they are certainly not characteristic of it. Children thus seem to be less prone to generalize from the instances that they hear, although cases of over-generalization do occur. I have really touched here on two kinds of conservatism, both of which seem to have support in acquisition data (as suggested for instance by the work of Michael Tomasello (1998, 2000a, 2000b, 2000c)). One is the reluctance to generalize from concrete combinations of items into schemata. Another is the reluctance to generalize from one use of a pattern to another. The latter takes us to another closely related issue, that of the nature of polysemic, or more precisely, multi-use patterns. The nature of multi-use patterns. The quest for maximally general and economical models of language has led linguists to favour treatments in which lexical and grammatical items are assigned general meanings from which specific uses of the items in question can be derived by general processes. However, if children do not generalize from one use to another, each use has to be learnt separately. There is also typological evidence that points in the same direction. Quasi-equivalent patterns in two languages may exhibit minute differences in what uses are possible. If these uses were governed by general principles, one should not expect such variation among languages. I shall use indefinite pronouns as an illustration,
Aspects of linguistic knowledge
relying on the cross-linguistic study in Haspelmath (1997), where the following set of uses, or in his terms, “functions”, of indefinite pronouns is identified: (33) 1. specific, known to the speaker (Somebody called while you were away, guess who!) 2. specific, unknown to the speaker (I heard something, but I couldn’t tell what kind of sound it was) 3. non-specific, irrealis (Please try somewhere else) 4. polar question (Did anybody tell you anything about it?) 5. conditional protasis (If you see anything, tell me immediately) 6. standard of comparison (In Freiburg the weather is nicer than anywhere in Germany) 7. direct negation (Nobody knows the answer) 8. indirect negation (I don’t think that anybody knows the answer) 9. free choice (Anybody can solve this simple problem) An important claim in Haspelmath’s theory is that these uses can be ordered in a map — referred to in Haspelmath (1997) as an “implicational map” and in Haspelmath (forthcoming) as a “semantic map” — as shown in (34) and that an indefinite pronoun series (e.g. all the English indefinite pronouns containing the morpheme some) will always express a contiguous subset of uses on the map. (34) specific known
specific unknown
question
indirect negation
direct negation
conditional
comparative
free choice
irrealis non-specific
For instance, the uses of the English indefinite pronouns can be displayed as in (35).

(35) [Figure: the same map with the regions covered by the English series some, any and no drawn in: roughly, the some-series covers the specific and irrealis uses together with adjacent ones, the any-series the question, conditional, negation, comparative and free-choice uses, and the no-series the direct negation use.]
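The contiguity constraint itself is easy to state computationally: a set of uses is admissible just in case it forms a connected region of the map. The following Python sketch checks this by breadth-first search; note that the edge list is only a simplified approximation of the map in (34), and the exact geometry should be taken from Haspelmath (1997).

from collections import defaultdict, deque

# Nodes numbered as in (33); the edge list approximates the map in (34).
EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (4, 5), (4, 6),
         (6, 7), (5, 8), (6, 8), (8, 9)]
ADJ = defaultdict(set)
for a, b in EDGES:
    ADJ[a].add(b)
    ADJ[b].add(a)

def contiguous(uses):
    """True iff the set of uses forms a connected region of the map."""
    uses = set(uses)
    if not uses:
        return True
    seen, queue = set(), deque([min(uses)])   # start BFS anywhere in the region
    while queue:
        n = queue.popleft()
        if n not in seen:
            seen.add(n)
            queue.extend((ADJ[n] & uses) - seen)
    return seen == uses

print(contiguous({1, 2, 3, 4, 5}))     # a some-like series: True
print(contiguous({4, 5, 6, 7, 8, 9}))  # an any-like series: True
print(contiguous({1, 7}))              # skips the middle of the map: False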
In his book, Haspelmath draws such maps for the indefinite pronouns in 40 different languages. The remarkable thing is that each of these languages has its own unique pattern. The variety is not much smaller at the level of individual pronouns or pronoun series, for which Haspelmath lists no less than 37 different sets of uses chosen from the list in (33). They all, however, obey the contiguity constraint.
Consider, as a concrete example, the use labelled 'free choice'. The 45 patterns (morphemes and/or constructions) that are used for 'free choice'11 can also have other uses, according to the following table:

standard of comparison    36
indirect negation         22
conditional protasis      20
polar question            15
direct negation           11
non-specific irrealis      4
specific unknown           2
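A table of this kind is simply a co-occurrence tally over the patterns in the survey. The sketch below shows the computation on three invented sample patterns; the real input would be the 45 patterns from Haspelmath's data.

from collections import Counter

# Each pattern is represented by its set of uses (sample data, invented).
patterns = [
    {"free choice", "standard of comparison", "indirect negation"},
    {"free choice", "standard of comparison", "conditional protasis"},
    {"free choice", "standard of comparison"},
]

tally = Counter()
for uses in patterns:
    if "free choice" in uses:
        tally.update(uses - {"free choice"})

for use, n in tally.most_common():
    print(f"{use:24} {n}")
# standard of comparison   3
# indirect negation        1
# conditional protasis     1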
What I wish to stress here is the impossibility of predicting from one use of a pattern what the other uses will be. A linguist who wants to describe how indefinite pronouns are used in a specific human language basically cannot avoid giving a full list of the uses of each pronoun. Similarly, a child who acquires the indefinite pronouns of a human language must, for each use of a pattern, be given positive evidence of that particular combination and must learn it as an independent fact about its native language. In my opinion, this holds of both lexical and grammatical elements of a language: the set of uses of an element is not normally predictable from some general principle but can differ from one language to another, even for elements with the same basic meaning.

Although Haspelmath calls his diagrams "semantic maps", the semantic character of the nodes they are built up of is sometimes questionable. Note that several of the labels in (33) refer to syntactic constructions in which the pronouns appear. Consider also in this connection the following "semantic maps", adapted from Haspelmath (forthcoming), showing the uses of the preposition à and dative clitics in French compared to English to:
11. Some languages have more than one pattern in this function, while others do not show up in Haspelmath's charts because they do not use indefinite pronouns here.
(36) Uses of preposition à and dative clitics in French
[Figure: a semantic map over the functions predicative possessor, external possessor, direction, recipient, beneficiary, purpose and experiencer, with the regions covered by the dative clitic and by the preposition à marked.]
(37) Uses of preposition to in English
[Figure: the same map, with the region covered by the preposition to marked.]
What we have here are different contexts in which NPs marked by à and dative clitics show up in French. Arguably, in most of them the prepositions or cases do not have an independent status but appear as auxiliary patterns in larger constructions. In such situations, the larger construction has to be part of the definition of the use of the element, but in the same way as earlier, each use has to be learnt (or described) as an independent fact. I doubt that there is any particularly deep reason why one uses à for predicative possessors in French (Ce chien est à Jean 'This dog belongs to Jean') rather than de as in Spanish (Este perro es de Juan), or why the corresponding use of English to is restricted to the verb belong rather than being possible in copula constructions as well, as is the case in French. The latter fact is an example of how the uses of auxiliary elements are dependent on lexical idiosyncrasies.

For the main topic of this book, this discussion is relevant in that it makes it plausible to assume that language change also takes place at a low level and in a piecemeal fashion. We may further note that children's ability to learn large amounts of low-level facts about language is a prerequisite for the development of phenomena such as lexical idiosyncrasy. On the other hand, this does not preclude the possibility that there are also patterns on a higher level, which may or may not be part of an individual speaker's knowledge of the language.
Chapter 6
Maturation processes
6.1 The notion of maturity

In the Introduction, I stated that linguistic patterns in general can be said to have life cycles in the sense that they pass through a number of different stages, and that we can single out certain linguistic phenomena, called "mature" phenomena, that occur only at later stages of these life cycles and have specific pre-histories. It is now time to examine this notion of maturity and its application more closely. To begin with, let us consider how it should be defined.

Suppose you envy your neighbours, who have a century-old oak in front of their house. The problem is of course that either you must buy another house with an oak or you must wait for a hundred years to see the oak that you just planted get to the same stage. Similarly, in linguistics, we do not expect an isolating language such as Vietnamese to develop anything like the Arabic or Navajo system of verbal morphology overnight, or a genderless language such as Finnish to borrow the Swedish or Russian grammatical gender system wholesale. The generalization is that some linguistic phenomena, like oaks, necessarily take some time to develop and thus have a rather long pre-history. The question is how to formulate this in a more precise way. To understand the problem more clearly, we shall look at a non-linguistic system — the game of chess.

A game state in chess is defined by assigning a square on the 8 × 8 chess board to each piece, black or white. There is of course an astronomical number of game states, making up the state space of the game. Notice, however, that there is exactly one initial state (call it S0), and that furthermore, for any other state Sn to occur, there has to be a way of playing the game so as to get from the initial state to Sn. This means that there may be some ways of positioning the pieces on the board that are simply impossible in the sense of being unreachable from the initial position. For instance, pawns are initially positioned in the second row (from the player's perspective) and can only move forward. This means that a white pawn can never get to row one and a black pawn can never get to row eight. Any game state where a pawn is located in such a place is thus ruled out. Notice that such "synchronic" generalizations about possible game states are secondary to the "diachronic" rules that regulate the movements of pieces in chess.

If a certain state is reachable from the initial state, it will normally be possible to formulate generalizations about the routes (sequences of moves in the game) by which it is reachable.
For instance, if we find a white pawn in row five, we know for certain that there is an earlier state where the same pawn was in row four. In addition, since there is always a finite number of move sequences that lead to a game state Sn, we can identify the shortest possible route from S0 to Sn, and thus determine the distance between them.

Recall now the discussion of length of derivational history in 3.8. Regarding chess as a formal system on a par with a generative grammar or a deductive system for proving theorems from axioms, we can see the sequences of moves that lead up to a game state as derivations that are analogous to the generation of a sentence or the proof of a theorem. The distance from S0 to Sn would then equal the length of derivational history of the game state. (Recall that this does not necessarily reflect the development of structural complexity. Actually, game states in chess tend to become structurally simpler as the game proceeds and the number of pieces on the board diminishes.)

But we could also think of chess as a metaphor for any system that evolves or develops over time. Any such system would have a state space, and any change to the system is equal to a movement in that state space. In most cases, such movements will not be "random walks" but will follow certain principles. Moreover, there will also be constraints on what an initial state of the system will look like, and just as in chess, these patterns together determine what states are possible and by which routes they can be reached. In any such system, then, it will also be meaningful to speak of — if not necessarily possible to determine — the minimal distance from the initial state(s) to any other state. Also, non-initial states "carry information" (in the sense defined in 3.3) about previous states, that is, about their history. We might now define the evolutionary complexity of a non-initial state as the minimal time it takes for the system to reach that state — this is analogous to length of derivational history but measured in real time.
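Evolutionary complexity, so defined, is shortest-path distance in the state space, and for spaces small enough to enumerate it can be computed by breadth-first search. The sketch below uses a deliberately toy system in which states are numbers and a "move" either adds one or doubles; the chess and language state spaces are of course astronomically larger.

from collections import deque

def evolutionary_complexity(start, target, moves):
    """Length of the shortest route from start to target, or None if
    the target is unreachable (an 'impossible' state)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        if state == target:
            return dist
        for nxt in moves(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Toy transition rules: from n you may reach n + 1 or 2 * n.
# One shortest route to 10 is 1, 2, 4, 5, 10: four moves.
print(evolutionary_complexity(1, 10, lambda n: [n + 1, 2 * n]))  # -> 4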
come about in a restricted number of ways. In a sense, then, new grammatical structures are less “directly available” than new lexical items.

Consider plural marking in languages such as Tok Pisin and English. In Tok Pisin, ol, deriving from English all, is regularly used as a plural marker with nouns. The history of this marker, like that of Tok Pisin itself, cannot be very long. But plural marking in Tok Pisin also lacks at least the following properties found in the English number category: (i) inflectional marking; (ii) numerous lexical idiosyncrasies (irregular plurals); (iii) obligatory use of plurals even in contexts where number marking is informationally redundant, e.g. after quantifiers (many books); (iv) involvement in syntactic agreement (verbs, demonstrative pronouns); (v) existence of “pluralia tantum” (lexical items which are plural only: scissors, trousers). It is reasonable to assume that a system like that of English and other Indo-European languages can only come about after a historical development of significant length, involving a number of intermediate stages, where the earlier ones are more like that found in Tok Pisin (although the ultimate source of the plural morpheme may be different). Thus, the number category in English can be assigned a relatively high degree of evolutionary complexity.

Although it is not as easy to determine the evolutionary complexity of a language state as it is to calculate the length of derivational history of a state in chess, we may still identify those phenomena which, like the number category in English, presuppose a non-trivial prehistory, that is, have a non-zero evolutionary complexity. Such phenomena can then be called “mature”. Here is an attempt at a reasonably explicit definition of that notion:

(38) Definition of “mature”
x is a mature phenomenon iff there is some identifiable and non-universal phenomenon or a restricted set of such phenomena y, such that for any language L, if x exists in L there is some ancestor L′ of L such that L′ has y but not x

The condition on y that it be non-universal is necessary to avoid triviality. The proposition that all languages that have x also have an ancestor which has y is supposed to be a “law-like” rather than an accidental generalization; that is, it is not sufficient that all observed languages behave this way, we must also have reason to believe that this holds also for all other languages in the past, present, and future. We must allow for the possibility that a mature phenomenon can arise by several different paths — that is why the definition says “some … phenomenon or a restricted set of such phenomena”. Consider for instance inflectional future tenses. Not only is it the case that they can arise from periphrastic constructions involving auxiliaries of differing origins, but they can also arise from general non-past forms which have given way to expanding progressive constructions. In all cases, however, an inflectional future tense has a history of considerable length behind it.
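The parallel between evolutionary complexity and length of derivational history can be made concrete with a small computational sketch. The following Python fragment is purely illustrative (the toy pawn system and all names in it are my own invention, not part of any chess formalism or of the definition in (38)): it shows how both the minimal distance from an initial state and the unreachability of certain states fall out of a breadth-first search over a state space.

    from collections import deque

    def minimal_distance(initial, target, successors):
        """Shortest number of moves from `initial` to `target`,
        or None if `target` is unreachable from the initial state."""
        seen = {initial}
        queue = deque([(initial, 0)])
        while queue:
            state, dist = queue.popleft()
            if state == target:
                return dist
            for nxt in successors(state):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return None  # the target lies outside the reachable part of the state space

    # Toy system: the row of a single white pawn, which starts on row 2
    # and can only move forward, so row 1 can never be reached.
    successors = lambda row: [row + 1] if row < 8 else []
    print(minimal_distance(2, 5, successors))  # 3, a measurable distance
    print(minimal_distance(2, 1, successors))  # None, an impossible state

In the linguistic analogue, the states are language states, the successor relation comprises the possible transitions between them, and a mature phenomenon is one whose states lie at a non-zero minimal distance from any possible initial state.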
Even if we can hardly give exact measures of maturity, it is clearly gradable — there are phenomena that presuppose a previous state with a phenomenon that is in itself mature: this would be a higher degree of maturity. On the other hand, it may be difficult to delimit maturity downwards. It is often hard to decide if the process by which a structure of some kind arises necessarily goes via intermediate stages. It may, for instance, be argued that if we find an obligatory grammatical marker in a language, there must have been a stage of that language where the same marker was used in the construction(s) in question, although in a less systematic fashion. In other words, a certain degree of maturity could be claimed for any obligatory marker.

It may also be said that the class of phenomena that derive historically from specific sources is of interest as such, even if the members of this class do not always qualify as mature according to the definition given here, as the sources may be universally present in languages.

Note that concepts such as length of derivational history, evolutionary complexity and maturity all pertain to states of a system — in the case of languages, we would say synchronic states. In principle, then, maturity is a synchronic property, although it is based on diachronic constraints.
6.2 Identifying mature phenomena in languages

It is not often that we can observe the full life cycle of a linguistic pattern, since earlier historical stages of languages are more often than not insufficiently documented. It is generally observed in grammaticalization studies, however, that patterns involving words with a complex morphological make-up develop out of syntactic constructions, which in their turn normally originate in free combinations of lexical items. But it may also be noted that in morphology, fusional morphology normally derives from earlier non-fusional, affixal structures.1 Grammatical patterns could thus be assumed to develop in the following stages:

(39) free > periphrastic > affixal > fusional

This schema is in basic agreement with the proposals made by Wilhelm von Humboldt as far back as the early 19th century and has its roots in the 18th century, where Condillac is mentioned as the first proponent (Lehmann (1982: 1)).2
1. It is possible that reductive processes sometimes operate so fast that the affixal stage is not discernible, as when the two independent words you and all fuse into [jɔːl] in some varieties of English. This is rather unlikely in the case of more general systems of non-linear morphological marking such as Germanic strong verbs, however.
2. In recent linguistics, similar ideas have often been expressed as the schema proposed by Talmy Givón (1971) (to be understood as “cyclic waves”): discourse > syntax > morphology > morphophonemics > zero
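Schema (39) can also be read as a constraint on transitions: a grammatical pattern occupies one stage at a time and, setting aside the very fast reductions mentioned in note 1, advances at most one stage per change. A minimal sketch of that reading (the encoding is mine, purely for illustration):

    STAGES = ("free", "periphrastic", "affixal", "fusional")  # schema (39)

    def licit_step(old, new):
        """A pattern may stay where it is or advance to the next stage;
        skipping stages is treated as impossible in this toy model."""
        return STAGES.index(new) - STAGES.index(old) in (0, 1)

    assert licit_step("periphrastic", "affixal")
    assert not licit_step("free", "fusional")  # presupposes the stages in between

On this reading, finding a pattern at a late stage licenses an inference about its past, which is exactly the form of the definition of maturity in (38).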
The last three stages in (39) correspond nicely to the traditional classification of languages in language typology (isolating, agglutinating, and flexional). The difference is that we are speaking here of the development of grammatical patterns, not of languages as wholes. Relating (39) to what was said in the preceding section, we may see it as a scale of increasing maturity, where each stage presupposes the previous one. Fusional morphology presupposes that there was affixal morphology at an earlier stage, and affixal morphology presupposes periphrastic constructions. The focus so far is on morphology — syntax enters the picture only indirectly, to the extent that it is the source of morphological structures.

Two scholars who have recently discussed issues closely related to those treated here are Bernard Comrie and John McWhorter.

“Before complexity”. Comrie (1992) considers the possibility of reconstructing early stages of human language that displayed a lower degree of complexity than present-day languages. Comrie discusses two kinds of evidence:

– extant varieties of languages that might represent less complex systems — child language, deaf sign languages, pidgins and creoles;
– historical linguistics: phenomena that consistently develop from earlier states where those phenomena are absent.
The evidence leads him to conclude that one can identify a class of phenomena which “were not present in early human language, but have arisen on the basis of historical processes of types that can be observed as we examine the attested historical development of languages”. The following would be phenomena of this kind:

– affixal and fusional morphology;
– morphophonemic alternations;
– phonemic tone;
– phonemic vowel nasalization.
Comrie here continues the tradition from historical linguistics, but with an enlarged scope: not only morphology but also phonology is taken into consideration. What connects Comrie’s discussion of these phenomena to the notion of maturity is the dependence on a particular historical development. His focus is not on this notion as such, however, but rather on the possibility of projecting back to an earlier stage in the history of human language where mature elements were lacking. The argument takes the form “since α depends on a historical development, at some point in time α did not exist” and may seem logically unassailable. However, note
that no conclusions can be drawn about the length of this “Garden-of-Eden” stage — if maturation processes started at once, it may have disappeared very quickly. But the serious question is rather whether the conditions for maturation processes have in any way changed over time, that is, if humans have in any sense evolved genetically in such a way that mature linguistic structures have become favoured.

“The creole prototype”. McWhorter (1998) claims that creole languages are a “synchronically definable typological class”. Thus, creole languages, according to him, differ from all other natively spoken human languages by the complete or almost complete absence of the following three phenomena:

– inflectional affixation;
– tone distinguishing monosyllabic lexical items or encoding morphosyntactic distinctions;
– semantically irregular derivational affixation.3
These phenomena have in common that they “combine low perceptual saliency with low import to basic communication” and “only develop internally as the result of gradual development over long periods of time” (McWhorter (1998: 792)). The second of these criteria is basically equivalent to maturity in the sense I have defined it. McWhorter (2001a: 163) goes further by giving the following extensive list of phenomena not found in a set of 19 creole languages, also listed in the paper:

– ergativity;
– grammaticalized evidential marking;
3. In McWhorter (2001a), this is called “opaque lexicalization of derivation-root combinations”. I find this problematic as a criterion. McWhorter says that e.g. in Mon-Khmer languages, which seem to fulfil the first two conditions on creoles, “semantic drift over time has created endless idiosyncratic lexicalizations”. The problem is to determine the relevance of such idiosyncrasies for the synchronic state of a language. McWhorter says about English that it is crucially distinct from creoles in that in words like awful, “semantic drift has relegated the very status of -ful as a suffix to the margins of spontaneous perception” (798). But suppose awful shows up in an English-based creole — an assumption which does not appear particularly implausible. Would it count as a counterexample? I guess McWhorter would say that it would not, since the lexicalization took place in the lexifier, not in the creole. But this is a historical fact, not a synchronic one. And it presupposes that we can distinguish lexifiers from bona fide ancestral languages. McWhorter says that irregular lexicalization is found “in all regular languages”, i.e. in all non-creoles. This actually implies that if the criterion worked, it would by itself be sufficient for distinguishing between creoles and other languages, so, strictly speaking, the prototype would not be needed — the other criteria are redundant. If the irregular lexicalization criterion does not work, on the other hand, it is rather bad news for the creole prototype idea, since the two other criteria do not exclude e.g. Mon-Khmer languages.
– inalienable possessive marking;4
– switch-reference marking;
– inverse marking;
– obviative marking;
– “dummy” verbs;
– syntactic asymmetries between matrix and subordinate clauses;
– grammaticalized subjunctive marking;
– verb-second order;
– clitic movement;
– any pragmatically neutral word order but SVO;
– noun class or grammatical gender marking (analytic or affixal);
– lexically contrastive or morphosyntactic tone (with one exception).
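Taken at face value, the claim that creoles are a “synchronically definable typological class” reduces to a checklist over synchronic grammar descriptions. The sketch below encodes the three prototype criteria as feature labels; the labels and the sample entries are invented for illustration and are not taken from McWhorter’s survey.

    # The three criteria of the creole prototype (McWhorter 1998):
    PROTOTYPE_FEATURES = {
        "inflectional affixation",
        "lexical or morphosyntactic tone",
        "semantically irregular derivation",
    }

    def fits_prototype(attested_features):
        """True iff none of the prototype features is attested,
        approximating 'complete or almost complete absence'."""
        return not (PROTOTYPE_FEATURES & attested_features)

    print(fits_prototype({"SVO order"}))                             # True
    print(fits_prototype({"SVO order", "inflectional affixation"}))  # False

As footnote 3 above makes clear, the hard part is not the bookkeeping but deciding, for a real grammar, whether a feature such as irregular derivation counts as attested in the relevant sense.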
The quest for the “Garden-of-Eden” language. Both Comrie and McWhorter assume that it is possible to identify certain forms of human language that contain less or even none of the complexity conditioned by what I here call maturation processes. This assumption is of course far from unproblematic. Comrie lists three types of language which could provide evidence of “Garden-of-Eden” stages of language: child language, deaf sign language and pidgins and creoles. For McWhorter, the main source of information is creole languages.

As for child language, or data from first language acquisition, it seems clear that even if early stages of child language are less complex than adult language, for obvious reasons, there is little correlation between the order in which linguistic features are acquired and the maturity of those features. In fact, the acquisition of inflectional morphology and word level phonological features seems to start more or less simultaneously with the advent of complex expressions (that is, what is commonly referred to as the two-word stage). Slobin (forthcoming) notes that children under 2 who are exposed to languages with complex morphology such as Turkish or Inuktitut “do not exhibit the sort of ‘pre-grammatical’ speech described by Bickerton, Givón, and others, such as absence of grammatical morphology and reliance on topic-comment word order”. It thus seems that it is hard to identify a “Garden-of-Eden” stage in child language.

Turning now to the sign languages of the deaf, there is no a priori reason to assume that they would differ in complexity from spoken languages. The typological profile of deaf sign languages is no doubt different from that of the average spoken language — but so is that of most individual spoken languages. The main reason why deaf sign languages — on the average — can be expected to contain fewer
4. John McWhorter no longer thinks this item should be included in the list (personal communication, January 2004).
mature structures than spoken languages is that they typically have relatively short histories, as Comrie notes.

As for pidgin and creole languages, it is perhaps questionable if the former should be included, as they by definition are not anyone’s native language. In McWhorter’s work, it is accordingly creoles that are at the centre of attention, having “the world’s simplest grammars”, “by virtue of the fact that they were born as pidgins, and thus stripped of almost all features unnecessary to communication and since then have not existed as natural languages for a long enough time for diachronic drift to create the weight of “ornament” that encrusts older languages” (McWhorter (2001a: 125)). However, this is not the only way of seeing the origin of creoles, and this is one reason why it may be difficult to verify the claim about the absence of mature features in creole languages.

Simplifying the alternatives, we may see creoles either as being “born as pidgins”,5 that is, having developed out of something that is not a native language, as McWhorter claims, or as being an extreme case of contact-induced change, as is argued e.g. by Trudgill (2001). Of course, it is not immediately obvious that these alternatives exclude each other. The kind of complexity-reducing change that would be involved here can be seen as a “filtering” that takes place when a language is acquired in a situation that differs from the ideal of a child picking up its first/native language from other individuals who have acquired it in the same way. To avoid having to define notions such as “first” or “native” language, I shall use the adjective suboptimal for the kind of acquisition where filtering occurs and designate the results of the filtering as suboptimal transmission effects. Obviously, there are different degrees of suboptimality (which is a reason for preferring the terminology proposed here to that of Thomason & Kaufman (1988), who speak of “non-normal transmission”).

The pidgin source hypothesis presupposes that there is a generation whose only input is from non-native speakers, something which, I would assume, is still compatible with the extreme contact view. In addition, however, the pidgin source hypothesis goes further, in that it must be taken to mean that these non-native speakers use something that is stabilized enough to merit the label “pidgin”. It appears that in actual practice, however, we seldom know enough about the situation in which a creole has arisen to be able to verify whether this condition is fulfilled. Nor can we guarantee that there is no “undue” influence from the languages involved in the creole genesis. Although the first generation who acquire a creole as a native language do not “inherit” it from a previous generation, more or less mature features from either the native language(s) of their parents (the substratum language) or the lexifier language could slip through and be rooted in
5. Strictly speaking, this is an inadequate metaphor, since it implies an identity between the creole and its pidgin predecessor.
the new language. Since both the substratum and the lexifier languages can continue to be present in the environment, it could also happen that mature features are adopted from them into the new language at a later stage.

A further difficulty lies in keeping the claims non-circular, or at least falsifiable. A pidgin language is a language which is used in contacts between different groups although it is not the native language of any of the groups. But that is not a sufficient criterion, as it would not exclude languages such as Medieval Latin, which is not usually regarded as a pidgin language. Definitions of “pidgin languages” therefore tend to contain a clause to the effect that a pidgin should have “a typically simplified grammatical structure and reduced lexicon” (Asher (1994: 10.5157)). Thus, in order to be defined as a creole, a language must have as its primary historical source a language which has a sufficiently simplified grammatical structure. No grammatical property of a language can therefore be a counterexample to the thesis that creoles have the world’s simplest grammars, because in order to be a creole, the language has to originate from an earlier language state which did not have that property (a pidgin), and if it has it there are only two logical possibilities: either that stage did not exist, in which case it is not a creole, or the property has been acquired later, in which case it is not a counterexample either, since it just means that the language is on its way to losing its creole character.

What all this means is that the quest for the perfectly pristine, or “Garden-of-Eden” language, may never succeed in the sense that we will not be able to witness a language being created from scratch. In the case of spontaneous genesis of sign languages, to be briefly discussed in 11.5, we do come quite close, but it is not entirely easy to see how this could be replicated for spoken languages.

What phenomena are mature? Reviewing the candidates for inclusion in the class of mature linguistic phenomena, we find that the most obvious one is inflectional morphology, the presence and character of which, as we saw above, has been the basis for the traditional classification of languages into types, and is claimed to be absent from typical creole languages. Inflections, being grammatical items par préférence, are also central in the study of grammaticalization — which has been defined as the development of grammatical items from lexical material. But widening the perspective to word-internal structure in general (in other words, the traditional domain of morphology), we find that the condensation or tightening of a multi-word construction into a morphological unit does not necessarily involve lexical morphemes becoming grammatical markers in the narrow sense. In constructions traditionally labelled “incorporation”, the job otherwise performed by syntax is done within the limits of one word. Such constructions are subject to typological variation of a kind similar to that of inflectional morphology and can in general be assumed to be historically derived from “looser” constructions of a syntactic nature, like derivational patterns (see further Chapter 10). We may thus identify complex word-structure in general as a mature phenomenon.
Inflection also interacts with inherent lexical features of words, perhaps most clearly in the case of grammatical gender. Grammatical gender systems generally presuppose rather long evolutionary chains and are in this sense among the more clearly mature elements of language. Grammatical gender exemplifies the very general phenomenon of lexical idiosyncrasy, that is, when a grammatical rule or process applies to lexical items in a differentiated and unpredictable way, which necessitates assigning “diacritic” features to individual items in the lexicon. Other cases of lexical idiosyncrasy are inflectional classes and idiosyncratic case-marking properties of verbs.

Lexical idiosyncrasy does seem to be an irrational and counterproductive property of language. The changes that introduce it make grammar less predictable and less ordered. Lexical idiosyncrasy would thus seem to be nothing but “historical baggage” or “diachronic junk”. It may be argued, however, that lexical idiosyncrasy introduces potentially useful redundancy and therefore serves a certain synchronic function.

What about maturation in syntax? There is at least one obvious candidate for an advanced syntactic phenomenon, viz. agreement. The interesting thing about agreement in this connection is that it normally involves inflectional morphology, at least when the agreeing unit is lexical. Thus, an adjective may agree with its head noun, but it is somewhat difficult to see what this would mean if there were not different inflectional forms of the adjective. So if inflectional morphology is a mature phenomenon, agreement has to be too. Obviously, inflection plays an important role in syntax in general, case marking perhaps being the most salient example. On the other hand, it is often argued that case inflections and adpositions in principle do the same job, so it is not equally clear that inflectional morphology is a necessary condition here. But we may note some more complex phenomena such as the combination of adpositions and case marking that is found in a number of conservative Indo-European languages (I do not know of any examples from other phyla) where for instance different uses of one and the same preposition may be distinguished by the choice of inflectional case on the governed NP. For instance, cf. the following two Russian sentences, where the preposition v ‘in, into’ governs the locative case, as in (40), when stative location is referred to, or alternatively the accusative case, in a complement of a verb of motion, as in (41).

(40) Russian
Ja ezžu v Rossii.
I travel.prs.1sg in Russia.loc.sg
‘I travel in Russia’

(41) Russian
Ja ezžu v Rossiju.
I travel.prs.1sg to Russia.acc.sg
‘I travel to Russia’
Similar patterns (although differing in the choice of case for stative location) are found in other Slavic languages, as well as in Germanic (Standard German, Dalecarlian, Icelandic), Latin, and Greek. What we see here, then, is a use of a case which is semantically relevant although it appears in a specific syntactic construction and the semantics is not directly derivable from what is usually seen as the primary use of the case in question (that is, it does not follow from the fact that the accusative case is used with direct objects that it should also be used in constructions such as (41)). Such constructions are certainly not possible without a well-developed inflectional system.6

Leaving the syntactic reflexes of inflection, one may ask whether maturation is also visible with regard to word order. It does not appear possible to regard maturity as a property of word order rules in general. Even if many languages are claimed to have free word order it is probably true that some kind of preference at least with respect to such orderings as that between verb and direct object can be found in most languages — including creoles and even pidgins (the latter are often claimed to have rigid word order rules). Word order rules, then, cannot in general be said to derive from non-word order phenomena. On the other hand, word order rules may be of quite different kinds. Most word order rules, in particular those found in pidgin languages and early child language, are like verb–object ordering in the sense that they just state the placement of sister constituents relative to each other, obeying a general condition of adjacency on such constituents — they thus do not violate linearity.

On McWhorter’s extended list of properties that are absent from creoles, one finds three that have to do with word order: verb-second order, clitic movement (more theory-neutrally: special rules of clitic placement e.g. in accordance with Wackernagel’s principle), and “any pragmatically neutral word order but SVO”. I refrain from commenting on the last one. The first two, however, clearly violate the adjacency condition. Verb-second word order, a pervasive phenomenon in Germanic languages except English, belongs to the most difficult features to master for second-language learners and creates problems for children diagnosed with Specific Language Impairment as well (Håkansson (2001)).

Two of the items on McWhorter’s extended list have to do with subordination: “syntactic asymmetries between matrix and subordinate clauses”, and “grammaticalized subjunctive marking” — the latter because subjunctive marking typically shows up in subordinate clauses. Differences between main and subordinate clauses often seem to be due not to developments within the subordinate clauses but, perhaps somewhat surprisingly, to processes that effect changes in main clauses. Thus, subjunctive marking of specific types of subordinate clauses may arise as a
6. I am not claiming that the distinction between stative and dynamic location presupposes maturity, only this particular manifestation of it.
side effect of a development by which new tense and aspect forms take over the functions of old ones in matrix clauses but leave all or some types of subordinate clauses untouched. For instance, the descendants of the indicative forms of Classical Armenian are in Modern Armenian only used in subordinate clauses, being replaced by new periphrastic (originally progressive) forms in main clauses (Bybee (1994: 231)). Such differences typically presuppose rather long grammaticalization chains and are thus mature phenomena.

Going from morphology in another direction — into phonology, we saw above that both Comrie and McWhorter mentioned phonemic tone on their lists. McWhorter has the absence of “lexically contrastive or morphosyntactic tone” as one of the components of his creole prototype, and Comrie claims that “all tonal oppositions in language have nontonal origins”, supporting this with a number of examples of historical developments. Probably, such a claim can be made with about equal force for other phonemic prosodic features. I shall later argue (9.6) that phonemic prosodic features share a number of properties with inflections and lexical features such as gender, in particular in requiring the postulation of non-linear morpheme or word level features. As we shall see, the fourth item on Comrie’s list, phonemic vowel nasalization, is also among the phenomena claimed to manifest morpheme-level autosegmental features in some languages.

It appears that in general, there is a strong correlation between maturity and non-linearity, to the extent that we may even identify maturation processes in grammar and phonology with the development of non-linearity. There are, however, some important exceptions to this. In languages all over the world, questions are characterized by rising intonation. To the extent that this is grammaticalized, it is probably a conventionalization of a natural tendency, i.e. something genetically conditioned. Whether rising intonation is “pre-specified” for questions or rather is originally an expression of something like surprise, is another matter that we need not go into here. This kind of process is perhaps more easily discernible in sign languages, where various facial expressions and similar “natural signals” have been conventionalized (as was mentioned in 5.3), for instance, when raised eyebrows signal a question. What is important here is that such non-learnt features of communication seem to be “immediately available” for use in language and can also be freely conventionalized. The prominence-marking devices discussed in 2.4 are most probably other cases in point.

Summing up the preceding discussion, we may identify the most important types of mature phenomena in languages as follows:

– complex word structure, including
  – inflectional morphology;
  – derivational morphology;
  – incorporating constructions;
– lexical idiosyncrasy, including
  – grammatical gender;
  – inflectional classes;
  – idiosyncratic case marking;
– syntactic phenomena that are dependent on inflectional morphology, including
  – agreement;
  – case marking (partly);
– word order rules over and above internal ordering of sister constituents;
– specific marking of subordinate clauses;
– morpheme and word level features in phonology.
6.3 Naturalness, markedness, and Universal Grammar

Naturalness Theory. As the notion of markedness was originally understood in the Prague school, it was a property of members of phonological and grammatical oppositions. The member which was less frequent, demanded more effort, was more often neutralized, etc., was considered the marked one. Thus, nasal vowels would be marked relative to oral ones, plurals would be marked relative to singulars etc. It was also said that marked terms were acquired later than unmarked terms and that any language in which marked items were found would also have the corresponding unmarked ones. All languages have oral vowels but only some have nasals etc. However, since only the latter can be said to have an opposition oral : nasal in the first place, it is natural to apply the notion of markedness to the opposition as such as well, and from this it is only a small step to generalizing the notion in such a way that it can be applied to basically any linguistic phenomenon, meaning that it is low in frequency (both language-internally and typologically), effort-demanding, difficult to learn etc.

In Natural Morphology, or Naturalness Theory, as developed by Mayerthaler (1980), Wurzel (1984, 1987, 1989) and Dressler (1987a, 1987b), a notion is employed which is basically the inverse of markedness, namely “naturalness” — essentially, what is wholly unmarked is also maximally natural. If a morphological structure is natural, it will be more frequent in languages, will be acquired earlier by children, and will be favoured by processes of change.

Mayerthaler explicates naturalness in morphology in terms of such notions as “(constructional) iconicity”, “uniformity” and “transparency”. Constructional iconicity means that the semantically marked member of an opposition has greater phonetic weight than the unmarked member. Uniformity means adherence to the principle “one form — one function”. Transparency (it is unclear if this should be subsumed under uniformity) includes compositionality of meaning and segmentability of morphological structure. Phenomena such as allomorphy, suppletion,
multiple use and “ambiguous symbolization” run counter to uniformity and transparency. While Mayerthaler’s notion of naturalness is essentially language-independent, Wurzel’s introduces a system-dependent notion that can be summed up as primarily involving “system congruity”. Dressler introduces additional principles of naturalness: “optimal word length” and “indexicality”. The principles of naturalness in morphology may come into conflict above all with the exigencies of phonology, but also, as in particular Dressler emphasizes, with each other. This explains why the ideal state of maximal naturalness is seldom or never attained.

Still, there appears to be a basic contradiction between naturalness as an ideal state towards which languages strive and the idea that linguistic systems will develop over time towards increasing maturity, including precisely those phenomena that are taken to be “unnatural”.7 This contradiction perhaps comes out most clearly when considering the proposal by Trudgill (1983: 102) in which the term “natural” obtains an almost complementary use to that in Naturalness Theory (see further 11.4). The question is, then, if the contradiction can be resolved.8

Much of what is subsumed under naturalness in Naturalness Theory is explicable as lack of complexity, in particular in the form of non-linearity, and the residual part primarily concerns economy of expression. The problems arise with the empirical interpretation of the concepts employed. It seems to me that there are at least two ways of understanding the theory, depending on what status one attributes to naturalness as such, i.e. whether it is primarily empirical or methodological. As an empirical principle, the claim would be that language change is essentially driven by a striving towards naturalness. Such a striving could be understood either as an active force or as an entropic principle, if the loss of complexity entailed by increased naturalness is seen as a result of noise in the process by which language is transmitted from generation to generation. But one could also think of naturalness as a methodological principle, in the sense that deviations from it always have to be motivated. The latter alternative is then fully compatible with claiming that the morphological systems of real-life
7. After pointing to a number of parallels between “natural grammar” and the grammaticalization approach, Heine et al. (1991: 121) note that they “deal with drastically different perspectives of linguistic behavior”.
8. There is a certain similarity between Naturalness Theory and the Aristotelian theory of locomotion. Aristotle and his contemporaries thought that physical objects have a “natural” location towards which they tend to move and where they come to rest. “Unnatural” movement, i.e. movement away from the preferred location, occurs but is “violent” and always has an external cause. This theory, as is well known, was later discredited. When bodies fall towards the earth, we no longer think of it as resulting from their seeking their natural place but resulting from mutual gravitational attraction with the earth.
languages are in fact quite far from being “natural”, and even that they become increasingly so over time. I think one can discern a development in the Naturalness Theory literature from the empirical to the methodological interpretation, which would make it much more consonant with the views taken in this book.

Having said this, I hasten to add that some of the empirical claims made by proponents of Naturalness Theory seem somewhat difficult to accept. For instance, Mayerthaler says that it is generally the non-marked member of an opposition that wins out in language change (Mayerthaler (1980: 4)). But grammaticalization processes characteristically involve the spread of a grammatical construction or marker to new domains, as discussed elsewhere in this book. Most drastically, what was earlier used in highly restricted (“marked”) contexts may end up as an obligatory adornment of a word or phrase class, as when definite articles (typically deriving from demonstrative pronouns) conquer contexts that would normally be thought of as prototypically indefinite and in the end become general markers on nouns (Greenberg (1978a)). Likewise, dative markers first spread from indirect objects to animate direct objects and may then generalize to other direct objects, as seems to be happening in Spanish. In this process, an originally marked member of an opposition may be said to lose its marked character, but this is of course not the same thing as saying that the unmarked member wins out. I shall return to claims made by the proponents of Naturalness Theory in 11.4.

Complexity and Universal Grammar. In 3.3 above, I noted that the complexity of a language could be seen as relative to a general theory of language or “Universal Grammar”. From the point of view of language acquisition, it is possible that part of the information needed to specify a language is given “for free” by the genetic endowment of the child, in some sense. I also stated that a parametric view of Universal Grammar could be interpreted to the effect that the simplest language is the one in which all parameters have default/unmarked values. Becoming more complex then would simply mean obtaining non-default parameter values. As far as I know, generativists do not think of the choice of non-default values as entailing increased complexity. But regardless of this, it is hard to see how a parametric model could adequately handle maturation processes.9

The problem is that the parametric model of Universal Grammar has mainly been developed to deal with syntactic knowledge, and the kind of complexity that arises in maturation to a large extent involves morphological information and information that is connected to specific lexical items in a language. Obviously, the lexicon of a language, in the sense
9. Along much the same lines, McWhorter (2001a: 159) points to discrepancies between the assumed default settings of Universal Grammar and his “complexity metric”, entailing that his hypothesis about the maximal simplicity of creole grammars cannot be equated with a claim that they display unmarked UG parameters.
of information about the meaning and form of lexical items, cannot be reduced to a set of parameter values, and it is generally assumed in generative grammar that this kind of knowledge is of a different kind. Morphology is then supposed to go with the lexicon rather than with syntax. For instance, Uriagereka (1998: 564), discussing the thesis that “language acquisition is more akin to growing than to learning”, says that it does not apply to morphology, which instead belongs to the somewhat mysterious “periphery” of language: “whatever is not directly specified within the innate language faculty and does not directly follow from epigenetic growth”. Not being specified by the innate language faculty, the periphery can hardly partake in explanations of it.

Consider in particular the “poverty of the stimulus” argument, that is, the claim that the input from the environment underspecifies the language system, necessitating the postulation of an innate component. Either the periphery, including morphology, is not underspecified or the argument is invalid.

The innate component together with the critical age hypothesis have been supposed to explain age differences in the capacity for language acquisition; the difficulties adults have in acquiring new languages would be explained by assuming that they no longer have access to Universal Grammar. Notice that this again leaves out the periphery: since it is not driven by the innate language faculty, any age differences here have to be explained in another way. But much of what creates difficulties for late learners undoubtedly would belong to the periphery in the minimalist model.
Chapter 7
Grammatical maturation
7.1 The notions of grammaticalization and grammatical maturation

As we have already seen, much of what happens in maturation processes as defined in Chapter 6 is commonly subsumed under the notion of “grammaticalization” (sometimes called “grammaticization”). In this chapter, we shall look more closely at that notion and how it can be integrated into the more general framework assumed in this book.

After having undergone a period of what Lehmann (1982) has aptly called ‘amnesia’, the study of grammaticalization witnessed an upsurge of interest among linguists towards the end of the 20th century. Although it was extensively studied as a phenomenon even in the 19th century (most often under the heading of ‘agglutination theory’), grammaticalization as a term is usually ascribed to Meillet (1912), whose definition is often quoted in the modified form provided by Kuryłowicz (1965), where it is said to be “a process which turns lexemes into grammatical formatives and makes grammatical formatives still more grammatical”. This approach focuses on the fate of individual items such as words and morphemes. The definition given by Hopper & Traugott (1993), “the process whereby lexical items and constructions come in certain linguistic contexts to serve grammatical functions”, shifts the focus somewhat, widening it to constructions as well.

While I would endorse this definition in principle, I would still like to reformulate it to make more precise the relationship between lexical elements and constructions. A lexical item cannot by itself come to serve a grammatical function; it must do so by virtue of becoming a fixed part of a larger pattern — a grammatical construction. The processes by which “lexical items become grammatical morphemes” — grammaticalization — are thus only parts of the genesis and evolution of grammatical patterns or constructions, which I shall call grammatical maturation for convenience.

When a lexical item becomes a fixed part of a grammatical pattern, I shall say that it is trapped in the pattern. For instance, the English verb go is trapped in the be going to-construction, and the French noun pas ‘step’ is trapped in the negation construction ne…pas. I find the trapping metaphor suitable since the lexical item as used in the construction loses its autonomy semantically and eventually also formally, and its further destiny is dependent on what happens to that construction. In a way, the properties of the element may be said to be epiphenomenal or at least secondary to the properties of the construction.
During its life cycle, a grammatical pattern typically matures in many successive steps, each of which may involve changes of several different types. A theory of grammatical maturation has to provide a characterization of each of these types and of their relationships to each other, in particular of the causal chains that bind them together. Here is, briefly, how I regard it. There are three major components in the maturation of grammatical patterns (a toy illustration follows the list):

– pattern spread: a pattern comes to be used in situations where it was not used before;
– pattern regulation: the choice between two patterns sharing the same niche is constrained in one way or the other;
– pattern adaptation: a pattern undergoes changes which make it better suited to its new uses, changes such as reduction and condensation (tightening).
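As a toy illustration of these components (the setup and the numbers are invented, anticipating the will/intend example discussed below), a niche can be modelled as a frequency distribution over competing patterns: spread shifts probability mass to the newcomer, a zero-sum affair if the overall incidence of the niche stays constant, and regulation then re-partitions the niche so that the choice becomes predictable again.

    # A niche as a frequency distribution over competing patterns (toy numbers).
    niche = {"will-future": 1.0}

    def spread(niche, newcomer, share):
        """Pattern spread: the newcomer captures `share` of the niche.
        Zero-sum: the older patterns jointly lose exactly that much."""
        out = {pattern: freq * (1.0 - share) for pattern, freq in niche.items()}
        out[newcomer] = out.get(newcomer, 0.0) + share
        return out

    niche = spread(niche, "intend-future", 0.3)
    print(niche)  # {'will-future': 0.7, 'intend-future': 0.3}: more variation, less order

    # Pattern regulation: a made-up division of labour that restores
    # predictability by conditioning the choice on context.
    regulation = {"near future": "intend-future", "remote future": "will-future"}

Pattern adaptation, the reduction and condensation of the winning pattern’s form, is not modelled here; it operates on the expression side rather than on the distribution.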
Pattern spread and pattern regulation are analytically different but may not always be temporally distinct in actual processes. As an example of pattern spread, suppose that in some variety of English a verb such as intend starts being used about future events in a context such as It intends to rain, where no intention is involved. Normally, there will already be a way of expressing the same proposition in the language, e.g. It will rain. In the minimal case, then, what happens is that we have two competing ways of saying the same thing rather than one. This means that variation increases and order in language decreases (see 3.2). On the other hand, if the incidence of future time reference remains constant, the invasion of the new pattern into the niche will automatically mean a decrease in the frequency of the old patterns — we are dealing with a zero-sum game: one pattern’s gain is another’s loss.

Often, however, things do not stop here — rather, the choice between the competing patterns is regulated in one way or the other. Pattern regulation may involve either the total disappearance of one or more old patterns from a niche or the introduction of a total or partial division of labour between the new and the old patterns. Variation is thus constrained and order is increased. The essential difference between the two processes — pattern spread and pattern regulation — would thus lie in the effects on language as an ordered system of content-expression pairings: whether order and predictability decrease or increase.

As noted above, pattern spread and pattern regulation are not necessarily distinct in time, or at least not perceptibly so. The spread of a pattern may entail a more or less instantaneous change in the rules that regulate the choice between the spreading pattern and its competitors. In such cases, it might seem that we should instead just speak of a replacement of one pattern by another. However, there are a number of reasons why we should speak consistently of pattern spread and pattern regulation as distinct phenomena. One is that there may indeed be a time-delay between the two: the regulation may come only at a later stage. The second is that pattern regulation may give rise to new boundaries, in effect to the partitioning
of a niche, which may result in the genesis of new grammatical distinctions seemingly “out of thin air”, and thus in an increase in system complexity. The third reason is that there is a possibility that pattern spread and pattern regulation are driven by different forces and that the distribution of roles between children and adults is different for the two (see further discussion in 11.5).

If a pattern comes to replace an older one that has less phonetic weight, pattern spread leads to an increase in verbosity. Here pattern adaptation comes into the picture. Pattern adaptation in the form of phonetic reduction decreases verbosity and is a way of restoring a balance between the role the pattern plays in discourse and its phonetic weight. At the same time, however, reduction tends to preserve grammatical structure. The combination of a reduction in phonetic weight and preservation of grammatical structure leads to the integration of reduced elements into neighbouring words — the process often referred to as univerbation, to a general increase in non-linearity, and often in the end to what I will call featurization, the genesis of higher-level — mainly word-level — features, to be discussed in Chapter 9.

Finally, the changes brought about by pattern spread and pattern regulation may influence the structure of an expression, that is, lead to what is usually called reanalysis. It is possible that reanalysis may also have a more active role in maturation processes. This will be discussed in 8.3.
7.2 Pattern spread

By using the formulation “a pattern comes to be used in situations where it was not used before” we avoid making a difficult and perhaps not always very essential choice between different possibilities: whether the pattern existed before or is newly created, and whether the new situations represent a new and well-delineated use of the pattern or whether we are rather dealing with an increase in frequency of an old use.

Obviously, pattern spread is an extremely general notion which covers many different processes, most of which have little to do with grammatical maturation. By definition, the final output of grammatical maturation should be a grammatical pattern. This does not mean that pattern spread by itself has to have a grammatical result, but it should be the case that the spread triggers the other components of the maturation process — pattern regulation and pattern adaptation. In my opinion, an essential characteristic of grammatical maturation is that the pattern spread leads to a decrease in the rhetorical and/or informational value of the pattern or its component expressions — what I call rhetorical devaluation. Let us look at some cases.

The point of departure for a pattern spread may be a situation where two patterns P1 and P2 are synonymous (at least as far as truth-conditions go) except
for P2 being used under more restricted conditions. For instance, P2 may be used only in situations when a certain effect is intended, for instance, highlighting a particular aspect of the message. The pattern spread would then apply to the more restricted pattern, P2, with the effect that the restrictions are either lifted or weakened — that is, P2 comes to be used also when previously only P1 could be used. If P2 was earlier associated with a highlighting effect, that effect is lost. Let us look at a couple of examples to make this more concrete.

Sentences (42a–b), which differ in the presence of own in (b), mean essentially the same except that Mary’s (exclusive) ownership of the car is emphasized in (b).

(42) a. Mary drove us home in her car.
     b. Mary drove us home in her own car.

Historically, words meaning ‘own’ or ‘proper’ are a possible source for reflexive pronouns. Thus, oma means ‘own’ in Finnish but functions as a regular reflexive possessive in Estonian. Assuming that Finnish represents the older stage, the highlighting element contained in oma has thus been lost in Estonian. A similar relationship holds between words meaning ‘self’ and ordinary reflexives. From the point of view of prominence management, this means that patterns that have been employed for high-prominence items tend to be extended to normal or low prominence uses. Other examples are the genesis of plural markers out of items with the original meaning ‘all’, as in Tok Pisin, or when you all develops into a plural pronoun in some varieties of English. Cf.

(43) a. Mary saw the boys.
     b. Mary saw all the boys.

Whereas (43b) seems straightforwardly translatable into quasi-predicate logic as ‘For every x such that x is a boy, Mary saw x’, it is rather tricky to formulate the truth-conditions for (43a). It may be said that the default interpretation of (a) also is that Mary saw the whole group of boys, but somehow one or two exceptions seem less damaging for the truth of the sentence in (a) than in (b). We may say that all has the effect of highlighting the totality element and that when a word meaning ‘all’ starts being used even when such highlighting is not intended, it is on the way towards becoming a plural marker.

The development of indefinite articles from the numeral ‘one’ can be described in a similar way. Both of the following sentences can be understood as permission to take one apple, but only the first highlights the restriction to one and one alone:

(44) Take one apple!

(45) Take an apple!
A prerequisite for the grammaticalization of the numeral is thus that it comes to be used without this highlighting effect.

The type of changes that I have now exemplified shades into changes in which a device for highlighting or singling out a particular element in an utterance acquires a more general use where the highlighting disappears. Thus, Givón (1976) demonstrated that both direct object case marking (sometimes) and subject and object agreement on the verb (probably always) have a diachronic origin in topicalizing constructions, that is, constructions in which one of the NPs in a sentence is singled out and highlighted.

An example of a development where an erstwhile emphatic word order appears to have been generalized is found in Scandinavian. In earlier forms of Scandinavian, possessive pronouns were generally postposed to nouns. In modern Swedish and Danish, preposing is the rule, although postposing is marginally possible with some kin terms such as far min ‘my father’. Many conservative northern Scandinavian dialects, however, still have the postposed construction as the normal choice, and use preposing only when the possessive pronoun bears emphatic stress. Plausibly, then, this is an intermediate stage in the spread of the preposing pattern and its eventual takeover of the niche previously occupied by the postposing pattern.

Consider also the common development from perfects (as the English I have seen it) to general pasts (as the English I saw it). In Dahl (1985: 138), I argued that the choice between the perfect and the simple past in languages such as English and Swedish is bound up with information structure in that the probability for using the simple past grows as the “time of event” (in the sense of Reichenbach (1947)) becomes more definite or presupposed. Thus, the perfect in Swedish (as in English) is usually compatible only with indefinite time adverbials, but Swedish more easily than English allows definite time adverbials if these express “new information”. When a perfect expands its use and effectually becomes a general past, as in many continental Germanic varieties, we have an example of a pattern spread whereby the most important semantic element of the pattern (the time reference) becomes backgrounded.

But what are the causal mechanisms behind changes involving rhetorical devaluation? One possible explanation involves inflationary processes. In 2.5, we saw that a speaker may get a short-term advantage by using a stronger expression than is warranted by the true circumstances. This may lead to over-use of such expressions, which in its turn causes their devaluation — an inflationary phenomenon. We shall now see how inflationary mechanisms can trigger grammatical maturation.

In Mandarin Chinese, scalar predicates such as kuài ‘fast’ or dà ‘big’ are quasi-obligatorily modified by the intensifier hěn, whose traditional meaning is ‘very’ (Ansaldo (1999: 93), Shi (2002: 194)). The following example from my typological survey of tense–aspect systems (Dahl (1985)) may serve as an illustration. (46) is the normal way of expressing ‘The house is big’ — the alternative without hěn is felt to be rather odd except in some special contexts.
(46) Mandarin Chinese
Zhè suǒ fángzi hěn dà.
this clf house “very” big
‘This house is (very) big’

In fact, when asked to translate English sentences containing the word very, speakers tend to resort to other intensifiers such as fēicháng ‘extremely’ (Ansaldo (1999: 93)). The word hěn has thus undergone a shift, in which it has moved from being an optional intensifier to being an obligatory part of the scalar predicate construction.

Scalar predicates are somewhat special in that their rhetorical properties are very closely tied up with their truth-conditional semantics. According to standard analyses (Bierwisch (1967)), a sentence such as (46) can be paraphrased as The size of this house exceeds the norm, where what the norm is has to be contextually determined. It leaves open how much bigger than the norm the house is, but the difference cannot be too small, because then it wouldn’t be worth noting, or perhaps not even observable. The interesting consequence of this is a quasi-conflation of the notions of truth and relevance for such statements; we might claim that (46) is true if and only if the size of the house is large enough to be worth mentioning.

Intensifiers are used to convey the information that the difference between the parameter value and the norm is greater than would be implied by the plain scalar predicate. If The house is big means ‘The difference between the size of the house and the norm is noteworthy’, The house is very big could be said to mean ‘The noteworthiness of the difference between the size of the house and the norm is noteworthy’. The close relationship between informativity and truth here is emphasized by the fact that a common etymology of intensifiers (including that of very) is ‘truly’.

Furthermore, statements of the form The size of the house is x, where x is some suitable measure, will be less probable and therefore more informative and/or newsworthy as the difference grows between x and whatever the norm is. In any situation where informativity and newsworthiness are valued, you can therefore get some extra mileage out of a statement by implying a slightly higher value of the scalar parameter in question than you can actually vouch for. The simplest way of doing so is by adding an intensifier — which opens up the door for inflation. Listeners will gradually lower their expectations, which will lead to a general weakening of the force of the expressions.

What is somewhat trickier is to explain how an intensifier may be weakened but at the same time become obligatory, rather than be replaced by another one. A parallel to the “tip cycle” (p. 73) suggests a possible mechanism. If a certain expression is used often enough, the non-use of it becomes noteworthy. That is, if you leave out the intensifier, people will start wondering what’s wrong, so you feel obliged to put it in.
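The tip-cycle mechanism lends itself to caricature in a few lines of code. In this toy model (all numbers invented, not fitted to any data), listeners discount the intensifier’s contribution a little more after each round of over-use, so its cash value decays towards zero, at which point omitting it, rather than using it, becomes the noteworthy choice.

    def intensifier_value(boost=1.0, discount=0.9, rounds=25):
        """Cash value of an intensifier under repeated over-use: each
        round, listeners lower their expectations by the discount factor."""
        values = [boost]
        for _ in range(rounds):
            values.append(values[-1] * discount)
        return values

    values = intensifier_value()
    print(round(values[0], 2), round(values[-1], 3))  # 1.0 0.072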
It is reasonable to assume, however, that a certain threshold must be reached before you obtain this effect, and since we are dealing with a "rare event" here — most intensifiers never get that far, as we have seen — it remains unclear exactly what factors trigger off the necessary development to reach that threshold. It is thus unclear how far inflation can be taken as an explanation of rhetorical devaluation, in particular if we want to generalize to cases such as the ones discussed earlier in this section. We shall now consider a number of other mechanisms which may trigger or favour pattern spread, although not necessarily giving rise to rhetorical devaluation.

Problem solving. The term "problem solving" has been used in the grammaticalization literature in the sense of "conceptualization by expressing one thing in terms of another" (Heine et al. (1991: 150–151)), in particular with reference to metaphorical extensions. The cases I shall discuss here are slightly different. In the literature, the spread of a new construction or form is often attributed to speakers' communicative needs. This is often done by assuming that the old construction has left the scene before the new construction is introduced: for instance, case endings disappear and prepositions have to be introduced instead. These explanations are usually unconvincing (Hopper & Traugott (1993: 121)). To start with, if the constructions in question are really necessary for communication, how do the language users survive in the meantime? Also, the chronology does not fit the facts in many observed cases: a new construction may start being used while an old one is still there. And usually there are lots of languages which do without any marking in the corresponding constructions. However, there may still be a grain of truth in this kind of explanation. It turns out that new constructions are often introduced in cases where the old one is unsatisfactory for one reason or the other — as a "last resort" solution. Consider e.g. the choice between the synthetic and the analytic comparative constructions in Swedish. The synthetic (inflectional) construction, characterized by the ending -(a)re, is used with most adjectives, the major exceptions being adjectives with derivational suffixes such as -isk, -ande and -ad (the last two originally being participles). This is hardly an accident; there is apparently reluctance (on whatever level) to add a "heavy" suffix to a word that already contains one. Similarly, in syntax, we know that more verbose relative clause constructions may be introduced as ways of avoiding constraints on the use of empty positions. Thus, in Swedish, resumptive pronouns are regularly used in cases where the ordinary gapping relative construction would result in an empty subject slot in an embedded clause, e.g.

(47) Swedish
student-en som jag inte vet vilket betyg hon fick
student-def rel I not know.prs which grade she get.pst
'(lit.) the student who I don't know what grade she got' (ungrammatical without hon)
When the gapping construction is grammatical, the resumptive pronoun strategy is not allowed. This is a case of "blocking" reminiscent of what is found in morphology, when the existence of an irregular form blocks the generation of a regular one (see p. 97). Notice that resumptive pronoun strategies are used much more widely in many other languages. Relative clauses have functional equivalents in various participial constructions (e.g. English the people who live here: the people living here) that are obviously more mature in that they involve inflectional morphology. Likewise, relative clauses in Bantu languages rely on bound relative markers on the verb, and are thus also more mature. It is entirely possible that there are also differences in maturity among the purely syntactic relative clause constructions, the gapping constructions being "tighter" than the resumptive pronoun strategies. A special case of problem solving is "paradigm gap filling", that is, periphrastic forms are employed to fill empty cells in a paradigm. A classical example is the Latin passive, where the present, imperfect, and future tenses are inflectional and the perfect, pluperfect, and futurum exactum forms are periphrastic. I shall return to this type of periphrasis later (9.3). Sometimes, periphrastic constructions are used instead of morphological marking for non-native parts of the vocabulary. Thus, in Sirionó, possessive pronouns may be joined to a head noun in two ways: either as prefixes or by a periphrastic construction, involving a dummy noun mbae 'thing' (mbae could be called a classifier, but it doesn't really classify, since there is just one such morpheme). The choice depends on whether the head noun is native or not. This yields contrasting pairs of synonymous expressions such as se-rerekua 'my chief' (native): se-mbae patrón 'my boss' (Spanish loan) (personal field data). It appears highly plausible that the encroachment of new constructions on the territory of an old one may start in this way and then spread further. But it remains to be explained why the latter extension occurs.

Extraneous pressures: The "token sandwich" syndrome. By "extraneous pressures" I understand factors that are in principle irrelevant to what is communicated but which induce the speaker to choose a certain way of expression. This would be analogous to the "token sandwich" story in 5.3, where customers were forced to buy sandwiches with drinks whether they wanted them or not. In particular, societal norms regarding politeness, various kinds of taboos and what has now come to be called "political correctness" often force speakers to express themselves in other ways than they would otherwise. These ways are often more verbose, either because they contain markers of deference and the like, or because they use descriptive circumlocutions that are longer than the usual constructions. Consider expressions such as he or she compared to the now disfavoured generic he or African-American instead of negro or black. Perhaps most of these phenomena
stay lexical and do not lead to grammatical changes, but some of them do. In particular, this tends to happen with those phenomena that involve politeness. Politeness expressions both resemble and overlap with grammatical items. They resemble them in that they are more or less obligatory elements which tend to undergo a sometimes quite radical phonetic reduction (consider for instance the Russian word sudar' 'gentleman, sir' which in pre-revolutionary times was used as a regular addition to utterances when speaking to superiors and became reduced to -s). They overlap with grammatical items above all in the domain of terms of address. Second-person pronouns of course often develop from polite phrases such as Spanish usted from Vuestra Merced 'your grace', but politeness marking may also be integrated into verb inflection (regardless of person) as in Japanese, where the verb forms containing the politeness morpheme -mas(u)- (which is said to originate in a verb meaning 'to go to a high or secret place') are obligatory in more formal speech and in addition serve a syntactic function by distinguishing main and subordinate clauses.

Contact-induced change. The distribution of grammatical elements such as articles, case markings, tense and aspect categories etc. is usually highly skewed areally, as work in areal typology has made clear. Also, many cases of such elements spreading very rapidly over large geographical areas are known from history. For instance, definite articles, which are now found in practically all languages in Western Europe, spread from the Mediterranean northwards during the Middle Ages. This suggests that grammaticalization processes are highly sensitive to contact influence, which might in fact be a problem for the notion of linguistic maturation in general, since if patterns were borrowed totally promiscuously, this would threaten to empty of content the condition that a mature structure can only develop in a language which has passed through a specific earlier stage. Happily, the empirical evidence at hand rather suggests that borrowing is constrained in a way that reflects the maturity of the borrowed patterns. Thus, Field (2002: 38) suggests the following "Hierarchy of Borrowability" — cf. (39) on p. 106 above:

(48) content item > function word > agglutinating affix > fusional affix

Thus, what is borrowed, or calqued (i.e. translated), in grammar will most frequently be periphrastic constructions or free markers, and less often affixes, although the latter is also observed to happen. Pattern borrowing is mainly a special case of pattern spread. At least for pattern adaptation, it is harder to see how it could be enhanced by borrowing. It is not wholly implausible that language contacts may induce pattern spread in another way than by native speakers adopting patterns from other languages. It was noted above (p. 98) that second-language speakers may over-generalize e.g. periphrastic tense–aspect constructions such as progressives — thus mirroring a common diachronic development. The relation to the grammatical categories of the
speaker’s native language may here be somewhat indirect: a Swede who over-uses the English present progressive probably sees it as a general counterpart of the Swedish present tense, but it is not clear whether any element is “borrowed”. Particularly in cases of non-optimal transmission, such over-generalizations could eventually also influence the language of native speakers.
7.3 Pattern competition and pattern regulation

As noted above, when a grammatical pattern invades a new niche, there will normally be one or more occupants of that niche — the language already contains ways of saying the same thing. The incoming pattern will thus have to compete with the existing ones. Such a competition may in principle develop in three ways. One is that we reach a stable state with continuing free variation between the patterns. Another is a total victory for the incoming pattern — the previous occupants can no longer be used for that particular function. Finally, the competition may end in a "truce" involving a division of labour between the patterns. The two last-mentioned possibilities — total niche takeover and a truce — can be subsumed under the general notion of pattern regulation, which is what interests us in this section. In the case of total niche takeover, an important consequence follows for the elements that are trapped in the new pattern: they are now obligatory in the niche in question. But strictly speaking, this is secondary to the niche takeover of the larger pattern they are trapped in. Many cases of grammatical change invite an account in which a previously optional element becomes obligatory. This, however, is really only a special case of the more general process by which one construction is replaced by another. If we like, we have obligatoriness at two levels: one consisting in the obligatoriness of the item in the pattern where it appears, the other of the construction in the particular function it has, which is the same as saying that it constitutes the only way of performing that function in that language. (Perhaps it would be more correct to say "the simplest way" than "the only way" in most cases, since there will usually be more complex circumlocutions that will do the same job, more or less.) In the case of a "truce" between competing patterns, they establish a more or less stable co-existence that may be positioned at any point on the continuum from completely free variation to a rigid division of the territory, or to use another metaphor, a fixed division of labour. The tendency to avoid total synonymy in language (see p. 78) is reflected in the fact that the choice between the competitors is seldom completely free. However, it is often the case that many factors are operative at the same time. We shall look at one example, the grammatical marking of future time reference in Western languages, as treated in Hermerén et al. (1994), Schlyter & Sandberg (1994) and Dahl (2000a: 315–317).
As is well known, a periphrastic de-andative construction (i.e. a construction derived from a verb meaning 'go') is found in several Western languages (e.g. je vais travailler 'I shall work' or voy a trabajar 'I shall work'). This construction is gradually taking over the territory of the older, inflectional future (e.g. je travaillerai, trabajaré). Within the EUROTYP Theme Group on Tense and Aspect, a typological questionnaire was administered to native speakers of French and Spanish, making it possible to see how the competition between the two constructions shows up in the ways speakers choose between them in different contexts. Several factors were clearly at work, both stylistic and semantic. Thus, in French, the older construction is associated with formal and written language, but also favoured by the following semantic/pragmatic factors: (i) prediction-based rather than intention-based future time reference; (ii) 3rd person subject; (iii) remoteness in time. The findings for Spanish were analogous. It can be added that the choice between will and be going to in English is somewhat similar, although the semantic factors involved may be slightly different.

Grammatical systems, perhaps in particular verb systems, abound with competing constructions or forms of this kind. When the choice between the competitors is made according to semantic factors, these are often of a rather subtle and elusive kind. A good example is the choice between the perfect and the simple past as ways of expressing past time reference in languages such as English or Scandinavian. It is well known that perfects often take over the territory of past tenses, ousting them completely from their niches (simultaneously losing their character of perfects, of course). This has happened e.g. in the Germanic and Romance languages spoken in an area comprising most of France, southern Germany, Switzerland, Austria and northern Italy. This final victory of the perfect was preceded, however, by a long period of competition between the two. Thus, Squartini & Bertinetto (2000) show that the use of the Simple Past (passato remoto) in Italian decreases gradually not only along the geographical North-South dimension but also along a semantic dimension, with the typical "perfectal" functions at one end via personal and impersonal narration to historical narration at the other. It may thus be fruitful to think of the "opposition" between perfect and simple past as a case of one pattern intruding upon another's territory even in those languages where the two constructions appear to lead a fairly stable co-existence. It is notable that the semantics of the perfect has been one of the most controversial problems of tense–aspect theory. Competitive situations like the ones mentioned here are difficult for the structuralist ideal, with a system of neat oppositions "où tout se tient" ('where everything holds together'). For instance, in the case of future marking in Romance, there seems to be no consistent semantic difference between the inflectional and the periphrastic constructions, although it can be shown that there are a number of semantic factors that influence the choice. It would e.g. be hard to argue that remoteness is part of the meaning of the inflectional future in French, although there is no doubt a tendency to use it more for remote than for near future.
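How several soft factors can jointly steer a forced choice between two near-synonymous constructions can be sketched as a simple weighted model. This is my own illustration, not the analysis used in the questionnaire study: the factor names merely paraphrase the ones just listed, and the weights are invented.

```python
from math import exp

# A hedged sketch of multi-factor pattern competition: each contextual factor
# nudges the speaker towards the older inflectional future. The weights are
# invented illustration values, not estimates from the EUROTYP data.
FACTOR_WEIGHTS = {
    "formal_register": 1.2,
    "prediction_based": 0.8,     # vs. intention-based future time reference
    "third_person_subject": 0.5,
    "remote_in_time": 0.9,
}

def p_inflectional(context: set) -> float:
    """Probability of choosing the inflectional future in a given context."""
    score = -1.0 + sum(w for f, w in FACTOR_WEIGHTS.items() if f in context)
    return 1.0 / (1.0 + exp(-score))

# Formal, remote, prediction-based contexts favour je travaillerai;
# informal immediate-intention contexts favour je vais travailler.
print(f"{p_inflectional({'formal_register', 'remote_in_time', 'prediction_based'}):.2f}")
print(f"{p_inflectional(set()):.2f}")
```

The design point of the sketch is that no single factor is "the meaning" of either construction; each merely shifts the probabilities, which matches the observation that remoteness is a tendency rather than part of the meaning of the inflectional future.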
To give a further illustration of the rather confused situation that may arise as a result of the competition between two constructions, let us consider predicative adjectives in Russian. Russian adjectives have both "short" and "long" forms, the latter being characterized by suffixes that were originally definite articles, but which first totally took over the attributive niche and now have also invaded predicative uses. In the competition, then, the short forms are gradually losing ground. At this point, I wish to quote a Russian normative grammarian, Rozental' (1968), who devotes six pages to explaining how to choose between short and long forms in predicative position. He notes the following differences (the classification is his):

– semantic differences:
  – some more or less technical or derived meanings of individual adjectives demand the long form, e.g. stena gluxaja 'the wall is blank (long form), i.e. has no opening in it' (the primary meaning of gluxoj is 'deaf');
  – some lexical items are only used in the short form, e.g. rad 'happy'; others only take the long form, e.g. many colour terms such as goluboj '(light) blue';
  – lexicalized collocations may demand the short form, e.g. živ i zdorov 'alive and well', or alternatively the long form, e.g. položenie bezvyxodnoe 'the situation is hopeless';
  – adjectives characterizing the weather tend to take the long form, e.g. pogoda prekrasnaja 'the weather is fine';
  – in many cases, long forms denote permanent characteristics, while short forms denote temporary states: on bol'noj 'he is sickly/permanently ill (long form)': on bolen 'he is (acutely) ill (short form)';
  – long forms express an "absolute" characteristic whereas short forms express a property or state that is "relative": komnata nizkaja 'the room (i.e. the ceiling) is low': komnata nizka 'the room is too low';
– grammatical differences:
  – the short form is usually the only possible one when the adjective has some kind of complement or adjunct: on bolen (*bol'noj) anginoj 'he is ill with tonsillitis';
  – …but the long form may appear when the particular adjective lacks the short paradigm, e.g. reka vsja golubaja ot luny 'the whole river is (light) blue from the moon'.
– stylistic differences:
  – the short form has a "categorical" nuance and the long form is "attenuated": ty glupa 'you are stupid (short form)' is an insult but ty glupaja 'you are stupid (long form)' may be said in a friendly way;
  – the short form is "bookish" and the long form is "colloquial", although this, says Rozental', may be partly a secondary effect of other factors (the
uses where short forms are preferred tend to occur more often in the written language). A number of points may be noted here. One is that the competition between the two patterns for the same syntactically defined niche is a necessary precondition for the semantic and stylistic differences that can be found between them. It is only if two patterns are used in the same niche that the choice between them can be connected with semantic and stylistic effects. For this reason, these effects are rather volatile and unstable: they did not exist before the new pattern entered the niche, and if it is allowed to expand further and take over the niche totally, they will disappear again. But as long as this does not happen, the spreading process has resulted in increased semantic complexity (cf. 3.9). Stylistic differences between two patterns in terms of the younger one being more colloquial or informal than the other are of course entirely natural and can be seen as a direct effect of the spreading process. Several of the items in Rozental's list are readily explainable if it is assumed that when long forms were first used in predicative position, they were understood as elliptic or headless noun phrases (this is also the standard explanation). On bol'noj would thus be interpreted as 'he is a sick one'. It is in the uses where such an interpretation is not possible that the short forms survive — when speaking of temporary states or when the adjective has a complement. But at least in the spoken language the long forms have expanded far beyond this — it is already possible to say things like Ja ustalyj s dorogi 'I am tired (long form) from the road' — a state-denoting adjective with a complement. Here, then, is a fairly clear example of a reanalysis: an elliptic noun phrase is reinterpreted as a simple adjective.1

[Footnote 1: Arguably, this is a case where reanalysis is the driving force in the development, rather than being a consequence of it (see further discussion in 8.3).]

On the other hand, there is also something for those who prefer explanations in terms of entrenchment of frequent patterns. We see that there is much lexical idiosyncrasy in the choice between long and short forms. Among adjectives, there are those that show up more often in predicative (as opposed to attributive) position, and it appears that those are also the ones that tend to preserve the short forms. The book I have quoted, Rozental' (1968), presents itself as a treatise on 'practical stylistics', one of its major objects of study being 'synonymy of linguistic means' (sinonimija jazykovyx sredstv). That the bulk of this 400-page book is found under the heading 'Grammatical Stylistics' suggests that this kind of competition between grammatical patterns is the rule rather than the exception. However, cases where there is a clearer division of labour between the older and the newer pattern may also be found. To take a simple example: in many European languages, adjectives can form comparatives and superlatives either with suffixes or
with the help of words like 'more' and 'most'. The synthetic (morphological) construction is the older and more advanced. In all languages, it tends to be used for the most frequent adjectives, but the demarcation line between constructions varies from language to language. Thus, the following adjectives employ the older construction:

– French: only a few high-frequency adjectives, usually suppletive (bon : meilleur; mauvais : pire);
– English: all monosyllabics and some bisyllabics (pretty : prettier);
– Swedish: all adjectives except those formed with certain derivational suffixes (-ande, -isk);
– German: all adjectives.
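The varying cut-off points in this list lend themselves to a schematic decision procedure. The sketch below is my own simplification of the four cases just listed; the conditions (e.g. the two-syllable threshold for English) are rough approximations, not full descriptions of the facts.

```python
# A rough sketch of where the synthetic/analytic demarcation line falls in
# each of the four languages listed above; all conditions are simplified.
SUPPLETIVE_FRENCH = {"bon": "meilleur", "mauvais": "pire"}
HEAVY_SWEDISH_SUFFIXES = ("ande", "isk")

def comparative_strategy(language: str, adjective: str, syllables: int) -> str:
    if language == "German":                      # all adjectives
        return "synthetic"
    if language == "Swedish":                     # 'last resort' analytic form
        return ("analytic" if adjective.endswith(HEAVY_SWEDISH_SUFFIXES)
                else "synthetic")
    if language == "English":                     # monosyllabics, some bisyllabics
        return "synthetic" if syllables <= 2 else "analytic"
    if language == "French":                      # a few suppletive items only
        return "synthetic" if adjective in SUPPLETIVE_FRENCH else "analytic"
    raise ValueError(f"no rule for {language}")

print(comparative_strategy("English", "pretty", 2))    # synthetic: prettier
print(comparative_strategy("Swedish", "praktisk", 2))  # analytic: mer praktisk
```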
What we see here is that the domain of the younger construction has expanded to include varying portions of the lexicon. In French, it covers almost all lexical items. In Swedish, it is used only as a 'last resort' solution when the adjective already has a 'heavy' suffix (see discussion above). In English, the development seems to have stopped half-way. This illustrates a rather interesting phenomenon: between an older and a younger construction there often arises a relatively stable 'line of demarcation' where the grammaticalization process may halt for a long time, perhaps several centuries. In fact, such lines of demarcation constitute a major source of complexity in grammar, in that, in effect, a forced choice between two communicatively equivalent alternatives is created, depending on factors that may be unrelated to the intended message. In other cases of a halted spread of a construction, we find that a new grammatical 'opposition' arises through the necessity of choosing between the new and the old construction. For instance, if markers of indirect objects (datives) are extended to direct objects, but only to those with animate reference, as has happened for instance in Spanish, an 'opposition' between animate and inanimate is created. Another, perhaps clearer example of the creation of a grammatical opposition is the rise of new gender distinctions as a result of the recruitment of demonstrative pronouns as third-person personal pronouns. Some such cases are discussed by Corbett (1991: 312), citing Greenberg (1978a: 78–79). Here, we shall look at a fairly clear example not mentioned by Corbett, viz. Scandinavian. Older stages of Scandinavian had the common three-gender system found in many Indo-European languages both for nominal agreement and in the pronoun system. Modern Standard Swedish and Danish and some varieties of Norwegian have simplified the agreement system by merging the masculine and feminine genders. The third-person pronoun system, on the other hand, has actually become more complex and now forces speakers to choose between four different pronouns. The story behind this involves an encroachment of a demonstrative pronoun on the territory of 3rd person pronouns. There is cross-linguistically a general pressure from demonstrative pronouns
on 3rd person pronouns, but it is strongest at the lower end of the animacy-referentiality cluster of scales. As a result, inanimate pronouns are often identical to demonstrative pronouns, or weaker forms of them. For Scandinavian, this seems to have been the case early on for the neuter pronouns, with the following system still found in many varieties:

(49) Conservative Scandinavian pronoun system
masculine: han
feminine: hon (hun, ho)
neuter: det (also demonstrative)

The masculine and feminine genders in this system comprised both animate and inanimate nouns, whereas the neuter gender was almost exclusively inanimate. In the crucial innovation which was introduced into written Swedish in the 17th century (Wessén (1968: 215)), but which had probably started considerably earlier in the language spoken in cities such as Stockholm, the non-neuter demonstrative den came to be used of inanimate referents instead of the masculine and feminine pronouns han and hon. In other words, there was an encroachment on the domains of these pronouns, which, however, came to a halt at the animate cordon, where it has remained reasonably stable for several hundred years. In the old system, the distinction animate-inanimate was there only covertly, in that animate nouns were normally either masculine or feminine. In the new system, animate and inanimate referents are systematically distinguished in the pronominal system:

(50) Innovated Scandinavian pronoun system
animate: masculine han; feminine hon (hun, ho)
inanimate: non-neuter den (also demonstrative); neuter det (also demonstrative)

What was previously a distinction between two types of pronoun has now acquired a new semantic content through the limitation on the encroachment process. More precisely, the factor that delimited the diachronic expansion of the demonstratives has turned into a condition of use that governs the choice between the pronouns. An interesting question now arises: is it possible for the creation of a demarcation line to have the effect that a pattern which was previously optionally used in a wider range of contexts now becomes obligatory, but simultaneously restricted to a narrower set of uses? This would be analogous to a truce where two armies that have been fighting all over a territory withdraw behind their lines, dividing the territory between them. In a way, this would be a violation of the unidirectionality thesis about grammaticalization, since it implies a partial retreat from an earlier expansion. A well-known example of this kind is do-periphrasis in English, which at an earlier stage (15th–16th centuries) was possible in all sentence types although it is
now restricted to questions, negatives and emphatic sentences (Ellegård (1953)); cf. the famous Shakespeare quotation The lady doth protest too much, methinks. The conditions under which this fixation of the use of do-periphrasis took place are not quite clear. Ellegård says that its more general use may have been restricted to formal speech, which would mean that the “retreat” could be seen as the victory of an informal speech variety. If this is invoked as an explanation, it shows how hard it may be to distinguish dialect competition from language change (see also 7.5). If the area that has been untouched by the expansion is characterizable through a positive criterion, the natural way of describing the resulting situation is by treating that criterion as a distinctive for the receding pattern. This means that it is the old pattern that obtains a new characterization. A typical case of such a development is when subjunctives and similar non-indicative moods arise through the expansion of new verbal patterns in main clauses. The old indicative may in such situations survive in subordinate clauses and become associated with non-assertivity (as in the development of the Armenian subjunctive, mentioned in 6.2). Sometimes, similar developments leave two or more disparate areas untouched, as when an old non-past or imperfective is invaded by a new pattern (usually an expanding progressive) but survives in future and habitual functions. An example would be the Aorist in Turkish. An even more drastic development appears to be represented in Hawaiian (Dahl (1985: 163)), where a previous imperfective (e + V + ana) is used for the somewhat implausible combination future and past progressive. The same construction has a general progressive use in Maori (relatively closely related to Hawaiian in spite of the geographical distance involved). Given that Hawaiian also has a specific present progressive (ke + V + nei) and uses the bare verb for generic and habitual contexts, a development can be reconstructed by which an original progressive has expanded to future uses and then lost the centre of its domain of use to a new progressive construction. Such situations have been labelled “donut categories” (Kemmer (1993)) and do not lend themselves to a neat synchronic description in terms of a Gesamtbedeutung in the sense of a coherent set of semantic properties that would cover all the uses of the category in question and only those.
7.4 The cyclical theory of grammaticalization

A hypothesis which has been quite popular at least since the end of the 19th century is that grammaticalization is cyclic in the strong sense of the word — meaning that there is something at the end of the life cycle of the grammaticalizing item that triggers the genesis of a new item. A closer look reveals that there are rather different views on the nature of the trigger in question. Thus, Gabelentz (1891), who used the metaphor of a spiral to describe the cycle of grammaticalization,
apparently thought that the old expression had to go to zero by way of phonetic reduction before the need for a renewal made itself felt:

"Die Affixe verschleifen sich, verschwinden am Ende spurlos; ihre Functionen aber oder ähnliche bleiben und drängen wider nach Ausdruck." ['The affixes wear away and in the end disappear without a trace; but their functions, or similar ones, remain and press anew for expression.']
When, about twenty years later, Meillet (1912) discusses the problem (without quoting Gabelentz), he attributes the need for renewal not to the disappearance of an expression through phonetic reduction but rather to its loss of expressive value through being used too many times. A recent formulation of the cyclical theory is found in a paper by Geurts (2000: 783), who presents what he calls "the standard view of grammaticalization", citing Gabelentz as the originator of this view, in which grammaticalization would result from the interaction of two forces:

– effectiveness or clarity: "speakers seek to make themselves understood and therefore strive for maximally effective messages"
– efficiency or economy: "there is a general tendency not to expend more energy than is strictly necessary and therefore to prefer economical forms to more elaborate ones"
Grammaticalization is described as occurring in two stages: “Grammaticalization begins when a form α that may be efficient but is felt to lack in effectiveness is replaced by a periphrastic, and therefore less economical, locution β calculated to enhance effectiveness… Then β gets the upper hand and wears down due to the general drive toward efficiency of expression, until it is weakened to the point where it has to be replaced by some γ.”
However, "weakening" here seems to include both phonetic reduction (à la Gabelentz) and loss of expressivity (à la Meillet). The latter is perhaps the most popular explanation for "renewal" in the literature.2 According to Hopper & Traugott (1993: 65), "expressivity serves the dual function of improving informativeness and at the same time allowing the speaker to convey attitudes toward the situation, including the speech situation".3

[Footnote 2: What Keller calls "Lüdtke's Law of Language Change", referring to Lüdtke (1980), is also a cyclical theory which builds on the assumption that expressions in the language over time become "too small to satisfy the redundancy needs of the speaker". As the name suggests, "Lüdtke's Law" aims at a general explanation of language change, which makes it less suitable for explaining the specifics of grammatical maturation.]

[Footnote 3: There seems to be a potential conflict between this statement and the claim made in the same book (and by Traugott in other places) that the grammaticalization of an expression leads to "subjectification", that is, that a grammaticalizing expression tends to acquire uses that are more subjective than the earlier ones — something that, one might imagine, ought to lead to an increase in expressivity, according to the formulation just cited.]

In fact, this suggests that what Hopper
and Traugott have in mind when they use the term "expressivity" is not a unitary phenomenon, one facet of it being Geurts' "effectiveness" and the other closer to Haspelmath's "extravagance" (Haspelmath (1999)), where the former might be defined as the capacity to convey the intended message and the latter as the capacity to satisfy the maxim "talk in such a way that you are noticed". No doubt both of these have their place in the creation of "new ways to say old things" (Hopper & Traugott (1993: 65)). However, it does not appear possible to use this as a general explanation for the initiation of grammatical maturation processes. Notice that many central grammaticalization processes do not involve the creation of new patterns at all. A case in point is the genesis of indefinite articles, where the standard source is the numeral 'one', an element that to the best of my knowledge is universal in language, and is itself rather infrequently renewed. In other words, the process starts with the expansion of an old pattern rather than the creation of a new one: the numeral 'one' also comes to be used in contexts where bare nouns were earlier the normal case. In languages lacking indefinite articles, the use of the numeral 'one' is, as I have argued above, presumably governed by relevance considerations. Thus, the numeral is not normally used with count nouns when the cardinality of the set of objects referred to is uninteresting or predictable from the context. It is a little difficult to see how expressivity or extravagance could be attained by using the indefinite article in those contexts. Let us now consider how well a cyclical theory can account for a case of grammatical maturation discussed in 7.2, that of scalar predication in Mandarin Chinese. Recall that Mandarin Chinese hěn 'very' has undergone a shift, in which it has moved from being an optional intensifier to being an obligatory part of the scalar predicate construction. To sort out what is going on here, we must consider how intensifiers like very function in general. In English, if I want to say that someone is tall, I have a number of options, depending on how much force I want to put into the statement: I can use the adjective tall without any modification, as in (a); I can add the "standard" intensifier very, as in (b), or I can add one of the "strong" intensifiers like those in (c).

(51) a. He is tall.
b. He is very tall.
c. He is extremely/terribly/incredibly/unusually/shockingly tall.

In most languages that I have any knowledge of, intensification of scalar predicates is structured in much the same way. We may note that the "strong" level is normally characterized by diversity: there are usually many different alternatives, and they
tend to be renewed very frequently: the Merriam-Webster Thesaurus gives 45 synonyms for very — and most of them would seem to belong in the strong category. The strong level may be rather unstable as well: Hopper & Traugott (1993: 121) quote the history of intensifiers as "a vivid example of renewal", enumerating awfully, frightfully, fearfully, terribly, incredibly, really, pretty, truly as having been fashionable at different points in time.4 It is evident that the effect that is obtained from a strong intensifier is not compatible with it being used too often — which would be congruent with Meillet's claim that a linguistic element loses in expressiveness every time it is used.5 There are two possible ways of avoiding this: either by replacing the expressions often, as predicted by the cyclic hypothesis, or by having a large inventory that you can choose from, thereby diminishing the risk of frequent repetition. Probably both of these strategies are used at the same time in most languages. But now we must ask: Is this grammaticalization? And does it explain what happens in Mandarin Chinese? The answer has to be "no" to both questions. If words like extremely and shockingly are worn out and discarded, according to the loss-of-expressiveness theory, it is a process that takes place exclusively on the "strong" level of intensification, and there is no obvious reason why this should influence the other levels. And it does not involve any change in the grammatical status of any morpheme or construction. What happens in Mandarin Chinese is quite different: the equivalent to English very expands its domain of use in such a way that it takes over the niche of the ordinary scalar predication construction — it becomes the normal way of saying 'X is tall'. As a result, in order to say 'X is very tall', one has to borrow one of the strong modifiers such as fēicháng 'extremely'. A further important fact that distinguishes the two processes is that whereas renewal of intensifiers is (presumably) something that is constantly going on in every language, this development, although it is fairly frequent from the typological point of view, is still a "rare event" in the sense that the probability that it will happen in any given language within the next generation is relatively small.

[Footnote 4: Hopper & Thompson say about the set of English intensifiers: "Over time, however, we can expect the choices to be reduced, owing to specialization." This is a bit strange, since they have just spoken of the constant renewal of intensifiers, which ought to keep the number relatively constant.]

[Footnote 5: « A chaque fois qu'un élément linguistique est employé, sa valeur expressive diminue et la répétition en devient plus aisée. » ('Each time a linguistic element is used, its expressive value diminishes and its repetition becomes easier.') (Quoted from Meillet (1921: 135))]

[Text box: The enigmatic case of negation]
By the notion of cross-linguistic dispensability, we may define most inflectional categories as belonging to phenogrammatics. There is one salient counterexample among verbal inflectional categories, however: negation. The evolution of negation in languages is often adduced as a paradigm case of grammaticalization, and seems to give support to a cyclical view of this process. Thus, in what has come to be called "Jespersen's Cycle", a negation construction is renewed by the addition of some strengthening morpheme, leading to a "double negative" stage, but where the original negation marker finally disappears, resulting in a return to the point of departure, with a single marker. However, what happens to negation is actually somewhat atypical of grammaticalization. Communicatively, a negative morpheme cannot become redundant, or "incidental" to the basic message in the same way as e.g. an agreement morpheme. In a sentence such as It is not raining, the presence of the word not radically changes the proposition expressed. In fact, the very point of a negated sentence is typically precisely the fact that it is negated. It is difficult to imagine a language in which negation systematically has zero expression. In spite of this, what we observe in language after language is that negation morphemes tend to be unstressed, phonetically reduced and eventually fuse with the finite verb or auxiliary of the sentence, seemingly without any change in the semantics or pragmatics of the negation morpheme (cf. English forms such as won't, don't and ain't). This process of reduction and tightening cannot be accounted for by a change in the informational or rhetorical value of the negation morpheme, but must be explained in some other way. We may note that when we emphatically assert or deny some proposition, this is commonly done by assigning extra stress to the finite element of the sentence. In English, this will often be the dummy auxiliary do. In the Slavic languages, negation morphemes (ne, nie etc.) usually form a prosodic word unit with the verb, although this is not always reflected in the written language. It is thus normal that it is this unit as a whole, rather than the negative morpheme as such, that receives extra emphatic stress. This process, by which prosodic prominence is shifted from the negation morpheme to the finite element, is probably what is behind the reduction of the former and its integration into the latter. In Jespersen's Cycle, something extra happens, viz. some strengthening morpheme (such as pas 'step' in French) is weakened and finally reinterpreted as a standard negation morpheme. Here, the original negation morpheme does become redundant and may even disappear totally. However, it is important to see that the process by which the negation morpheme is reduced and fused with the verb, as in the process giving rise to won't, don't and ain't, is much more widespread than Jespersen's Cycle: the renewal of the negation construction that we see in French is attested only in a minority of the world's languages.
[End of text box]

In (52), I have tried to draw a more adequate picture of what could be called the ecosystem of scalar intensifiers than the usual cyclic model offers.
(52) The ecosystem of scalar intensifiers (schematic rendering of the original "hoop net" diagram; patterns may advance from left to right through three niches):

Strong intensification: awfully ADJ, frightfully ADJ, fearfully ADJ, extremely ADJ, terribly ADJ, incredibly ADJ, really ADJ, pretty ADJ
Standard intensification: very ADJ
Plain scalar predicate: ADJ
The examples are English, but as I noted above, I think the overall structure is very much the same in most languages. Each of the partitions (“niches”) in the “hoop net” corresponds to one of the alternatives in (51). Patterns may advance from left to right, but they may also be discarded at any stage; in fact, only a minority in each niche make their way to the next one,6 which means that the number of inhabitants of the niches decreases very drastically — a process which is confusingly referred to in the grammaticalization literature as “specialization”. Probably the situation in English, with a large number of strong intensifiers, just one standard intensifier7 and no marking of plain scalar predication is the most common one.
[Footnote 6: As David Minugh points out (personal communication), standard intensifiers seem to be shorter than the average strong ones. This may be due to two factors: either the shorter ones have a better chance of getting into the second hoop or standard intensifiers tend to be phonetically reduced. In the case of very, the longer form verily has been disfavoured. We thus seem to be dealing with a case of selection between different elements rather than reduction.]

[Footnote 7: In many languages, one can identify a unique "standard" intensifier (e.g. French très, German sehr, Russian očen'), suggesting that this is a one-member niche. Goddard (2001: 24) says: "On present evidence, it appears that all languages have an intensifying word with the same meaning as English 'very', which can combine with words like 'big' and 'good'." It is noteworthy that very is included among the roughly 40 purported "semantic primes" according to Wierzbicka (1996). A semantic prime in this theory is supposed to be a "linguistic expression whose meaning cannot be paraphrased in any simpler terms" and to have "a lexical equivalent (or a set of equivalents) in all languages". However, the one-member niche hypothesis turns out to meet with difficulties in my own native language, spoken Swedish. I must admit to having been temporarily fooled on this point by the word mycket 'very', which indeed seems to fulfil the criteria for a standard intensifier in written Swedish. In spoken Swedish, on the other hand, mycket turns out to have serious competitors in väldigt 'extremely' and the incorporated jätte- 'giant-'. In the corpus "Samtal i Göteborg" (see fn. 103), I checked combinations of those intensifiers with the high-frequency adjectives stor 'big', liten 'small' and gammal 'old' and found 14 occurrences of mycket, 21 of väldigt and 18 of jätte-. Thus, at least on the frequency criterion, there is really no clear candidate for a standard intensifier in spoken Swedish.]
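To make the "hoop net" picture in (52) concrete, here is a small Monte Carlo sketch of my own, not a model from this chapter: each simulated intensifier is, per generation, either discarded (renewal) or, with a much smaller probability, promoted to the next niche. The probabilities are invented; the only point is that promotion to standard intensifier, let alone to the plain construction, is the rare event described above.

```python
import random

# Toy Monte Carlo sketch of the intensifier ecosystem in (52). Niches:
# 0 = strong intensifier, 1 = standard intensifier, 2 = plain construction.
# DISCARD_P and PROMOTE_P are invented illustration values.
DISCARD_P = 0.30   # per generation: goes out of fashion (renewal)
PROMOTE_P = 0.01   # per generation: advances to the next niche

def fate(generations: int = 100) -> str:
    niche = 0
    for _ in range(generations):
        r = random.random()
        if r < DISCARD_P:
            return "discarded"          # possible at any stage
        if r < DISCARD_P + PROMOTE_P:
            niche += 1
            if niche == 2:
                return "plain construction (cf. Mandarin hěn)"
    return ("still a strong intensifier", "standard intensifier")[niche]

random.seed(1)
outcomes = [fate() for _ in range(10_000)]
for o in sorted(set(outcomes)):
    print(f"{o}: {outcomes.count(o)}")
```

On these (arbitrary) settings, the overwhelming majority of items are discarded and only a handful ever leave the first hoop, which is the drastic narrowing that the grammaticalization literature calls "specialization".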
The cyclical theory really only covers what happens in the first partition of the ecosystem, and cannot tell us anything about why patterns sometimes do not simply slide out the back door but are instead promoted to the niches to the right. Above, I invoked inflationary mechanisms as one possible explanation of the expansion of hěn in Mandarin Chinese. It should be emphasized that the inflation hypothesis is not equivalent to that given by the cyclical theory in its various variants: it does not assume that using the expression by itself "wears it down"; rather, overusing it to attain an extra rhetorical effect in the long run leads to a devaluation of the expression. Furthermore, the process does not render the expression unusable; rather, it becomes appropriate in other contexts.
7.5 Unidirectionality, directionality and problems of identity

An idea that has played an important role in the discussion of grammaticalization is that of its assumed unidirectionality (Hopper & Traugott (1993), Chapter 5), alternatively its irreversibility (Haspelmath (1999)). What I shall do in this section is not to argue for or against the unidirectionality thesis on the basis of empirical examples, as has already been done by many authors, but rather to attempt something more like a deconstruction of the notion. I shall demonstrate how "identity problems" at a number of different levels make it hard to evaluate claims about unidirectionality in that the question of what a counterexample is becomes a matter of interpretation. Philosophers and logicians have at length discussed the problem of "cross-world identification": how do we know if a certain individual object in one possible world is identical to an object in another? The problem is equally difficult when we compare individuals at different points in time. Is Russia in 2001 the same country as the Soviet Union in 1970 or the Russian Empire in 1900? Often, such questions have rather important practical consequences — for instance, does the present Russian government have to pay debts incurred by previous regimes? Likewise, in any branch of science that is concerned with change, one has to decide about cross-temporal identities before one can say what has changed.
Somewhat surprisingly, historical linguists seem to relatively seldom worry about questions of this kind, although change is without any doubt the most central concept in this field. Statements about linguistic change may take slightly different forms, as in the following quotations, which are not provided in order to criticize their authors but to illustrate common usage:

"…an allative or benefactive adposition becomes a dative marker…" (Haspelmath (2000: 789))
"…an expression moves away from the lexical pole and toward the grammatical pole…" (Geurts (2000: 781))
"…the auxiliary which expresses immediate futurity derives historically from the motion verb go…" (Hopper & Traugott (1993: 1))
These three quotations really illustrate three different ways of speaking about change — if we like, three different models. In the first, the construction "x becomes y" is used. This construction is neutral in its interpretation with respect to the question of whether the element that undergoes the change thereby changes its identity or not. If Mary becomes a teacher, she is still Mary; if, on the other hand, an acorn becomes an oak, we would probably not think of the acorn and the oak as being the same individual. When Geurts says that an element moves along the lexical-grammatical continuum, this pictures the element as being "the same" during this process. Hopper & Traugott's way of speaking, on the other hand, suggests that there is a weaker relationship than identity between the source (the motion verb) and the target (the auxiliary) — one is said to be historically derived from the other. In general, it can be said that the whole grammaticalization framework strongly induces the model in which one element develops new properties but keeps its identity. However, there is as far as I can see relatively little that can empirically distinguish this model from one in which identity is not presumed. What is important, and indeed problematic, is what other assumptions we make about the nature of the relationship "x continues y" or "x derives historically from y", most saliently, the assumption that it is in general possible and meaningful to assume that there is a unique historical source for any element of a language. Likewise, even if it is not necessarily interpreted as an identity statement, the "x becomes y" template for describing linguistic change suggests that there is a one-to-one correspondence between linguistic items at different diachronic stages. If one hears that x has become y, the natural assumption is that x has in general become y. However, in language change, notably in grammaticalization, it may happen — or, rather, it is the normal case — that y is the successor of x in some contexts only. So the "x becomes y" template is potentially misleading, and this may become crucial when discussing what the inverse of a given development (whose non-existence supports unidirectionality) is supposed to be. For instance, according to the "x becomes y"
model, we may have an item α which is a demonstrative pronoun at stage I, and a definite article at stage II. The inverse development would mean that the definite article "becomes" a demonstrative pronoun. But this is not an accurate description of what mostly happens. The essential change, as we have seen, is that an item develops a grammatical use, or if we like, acquires a grammatical niche. For instance, at stage I, an item α is used as a demonstrative pronoun, at stage II, α is used both as a demonstrative pronoun and a definite article. Further, more complex developments, such as phonetic reduction of the definite article, or the disappearance of the demonstrative use, are possible but not necessary. So what would the inverse of this process be? Presumably, it would involve the transition from stage II back to stage I: the new use of α gets lost. Since the consequence is that the result of the original change is obliterated and the previous stage is again established, this is a development that could take place very often without anyone even noticing. If an item develops a new use, whereby it acquires new semantic properties and/or undergoes phonological reduction, a potential identity problem arises in that it may not be obvious whether a new item has arisen or whether we are just dealing with a variant of the old one. This may be crucial in some cases. It is rather important for the current conception of grammaticalization to be able to distinguish between two kinds of language change: the replacement of one item or pattern by another, and a change in the properties of one single item (Lehmann (1982: 21)). For instance, when a periphrastic construction starts to be used instead of an inflectional one, this is not regarded as a counterexample to the claim that grammaticalization is unidirectional, since it is a replacement of one construction by another. There are, however, situations that look like the replacement of an item by itself — a phonetically heavier element may replace a reduced element that comes from the same source. Thus, it is not uncommon to observe a process by which a set of enclitic object pronouns have been or are being replaced by the full forms. In Old Russian, there were e.g. strong/weak pairs of the dative and accusative 2nd person singular pronouns: tobě (tebě)/ti (dat.) and tebe/tę (acc.). In Modern Russian, according to the written norm there are only bisyllabic forms: tebe (dat.) and tebja (acc.). In fact, though, new enclitics seem to be developing: one often hears forms such as te and tja. In this case, it may seem relatively uncontroversial that the full and enclitic pronouns were separate items rather than variants of the same, when they were both present in the language. But this is not necessarily always the case. In Swedish, the pronoun jag 'I' has two possible pronunciations: [jɑːg] and [jɑː]. It appears that the "full" variant is gaining ground. This and similar changes are often ascribed to the influence of the written language, but the full pronunciation is used also by quite small children. We thus seem to be dealing with the mirror-image of phonetic
reduction, in spite of the claims that have been made that such a thing is impossible.8 Thus, Keller (1994: 109), quoting Lüdtke (1980), says that since “one cannot be more distinct than perfectly distinct”, there is a limit to articulatory redundancy beyond which lexical means have to be used, and “due to the fact that articulation has an upper limit but not a lower one, linguistic units can only become shorter”,9 and Haspelmath (1998: 321) quotes this as part of the explanation of the unidirectionality of grammaticalization: “it is not possible to introduce additional vowels or consonants to make an utterance more salient and easier to perceive”. Instead, he says, one has to use the syntactic-semantic dimension of variation, that is, to find an expression with a more complex structure. But such claims are empirically false. As we have already seen, phonetic enhancement is a regular part of prominence management in language, and it sometimes involves addition of segments. Thus, words like yes and no tend to be pronounced emphatically, for pragmatic reasons, and the very fact that they function as complete utterances rather than being embedded in a larger structure is sufficient to guarantee that they get maximal resource allocation.10 This may result in the addition of glottal stops (as in the pronunciation orthographized as nope) or a bisyllabic pronunciation. The Scandinavian nej/nei ‘no’ (borrowed into English as nay) is said to derive from the Indo-European negation morpheme ne, and here the enhanced form has been conventionalized. Bisyllabic forms such as Swedish nähej are not uncommon. (Neo-grammarian sound change (see 8.2) may also increase phonetic weight. The Swedish pronoun jag ‘I’ is definitely heavier than its assumed proto-Germanic source *ek.) One problem with moving the focus of attention from morphemes to constructions is that it becomes more difficult to distinguish replacement from change of properties due to the temptation to identify constructions by their functions. For instance, we speak without hesitation about “the transitive construction of language L”.
[Footnote 8: It appears that phonetic strengthening may appear on a fairly large scale in certain genres where phonetic distinctness is valued, and this may potentially spread to other styles. A rather striking example of this is the tendency for many Russian speakers (for instance in public lectures) to use non-reduced forms of prepositions that are normally pronounced proclitically, for instance k našemu priezdu 'to our arrival' pronounced as [kə ˈnaʃəmu priˈjɛzdu] rather than [ˈknaʃəmu priˈjɛzdu], or pod ėtim ponjatiem as [pod ˈɛtim pəˈnjatijəm] rather than [pəˈdɛtim pəˈnjatijəm]. In a way, this is a real-world counterpart of Keller's hypothetical religious community where articulatory sloppiness is regarded as a deadly sin and otherwise universal tendencies towards phonetic reduction are blocked (Keller (1994: 112)).]

[Footnote 9: As Anne Markowski points out, there is of course a lower limit as well — zero.]

[Footnote 10: These are also high-frequency items, and thus a model in which frequency is used as a single explanation of phonetic reduction (see 8.1) falsely predicts that they would tend to be reduced.]
But when a new way of expressing transitive sentences comes into being, is it then the same construction or a new one? In Mandarin Chinese, the direct object may either follow the verb, without any marking, or precede it, marked by ba. Here, the differences between the two ways of constructing a transitive verb phrase are so marked that it feels natural to speak of two constructions. But as the differences become more subtle, we may feel that the two ways of expression are really varieties of the same construction. Thus, in Spanish, the difference lies only in the presence or absence of the preposition a; in Turkish, a direct object may be marked as accusative or have the zero-marked nominative form. An increase in the use of the marked way of expression could then be seen as a change in one construction rather than as a replacement. Similarly, consider infinitive clauses in English. It has been claimed that the use of the so-called infinitive marker to is increasing and that this means that the infinitive clause construction is developing "backwards", towards a more analytic way of expression. But this presupposes that we are dealing with one construction and not with two. Another similar problem arises in situations where a construction has more than one ancestor in the sense of there being two historical sources which seem to have contributed equally to the descendant construction. For instance, in Swedish, there is a new and so far non-standard comparative construction in which the particle än 'than' has been replaced by the expression jämfört med 'compared with'. Instead of describing this as the replacement of one expression by another, as I just did, one could equally well see this as the blend of two constructions.

(53) Swedish
a. Andersson tjänar mer än Pettersson.
Andersson earn.prs more than Pettersson
'Andersson earns more than Pettersson.'
b. Andersson tjänar mycket jämfört med Pettersson.
Andersson earn.prs much compared with Pettersson
'Andersson earns much compared with Pettersson.'
c. Andersson tjänar mer jämfört med Pettersson.
Andersson earn.prs more compared with Pettersson
'Andersson earns more compared with Pettersson.'
The question of identity also concerns the processes of change themselves. Consider the following quotation from Hopper & Traugott (1993: 95): “The basic assumption is that there is a relationship between two stages A and B, such that A occurs before B, but not vice versa. This is what is meant by unidirectionality.”
The main problem with this formulation is that it speaks merely of temporal relationships between stages, and doesn’t really say anything about the processes
that take you from one to the other. Consider the popular but somewhat dangerous activity of ski jumping. Ski jumping would appear to be a prototypical case of a unidirectional process. At the beginning of the jump, you are up (stage A), at the end you are down (stage B), and nobody has yet been observed jumping from the bottom of a ski jump to the top. But of course ski jumpers have other means of getting back to the initial position — taking the ski lift, for instance. So stage A does occur after stage B, in such cases. Intuitively, though, it seems wrong to regard the ski lift ride as a counterexample to the unidirectionality thesis for ski jumping. We would like to restrict counterexamples to ones that somehow use the same or a similar method, or at least pass through the same points on the way back. This becomes very tricky, though. Exactly what are the criteria for defining what counts as a counterexample?

Let us see what happens in grammaticalization. If we give the widest possible definition of this notion, say “the change of lexical items into grammatical ones”, the inverse would simply be “the change of grammatical items into lexical ones”. But would we like to accept any such change as a counterexample to the unidirectionality thesis? For instance, Ramat (1992) presents the derivation of the noun ism from the derivational suffix -ism as an example of “degrammaticalization” and thereby a counterexample to the unidirectionality thesis. It is disputed by Haspelmath (1999: 1048) with a formulation that suggests that he would see it as parallel to the ski lift case (“it seems to be another case of a citation form of a word part taken out of its constructional context, rather than degrammaticalization”). But then the question is whether one could not explain away any putative counterexample in a similar fashion. The unidirectionality thesis thus risks losing much of its empirical content.

Finally, identity problems arise not only with respect to elements of a language and processes of change but also for languages themselves. This may be a greater problem for the theory of language change than is sometimes assumed. Speaking about changes in a language presupposes that we can identify one synchronic state of a language as the continuation of another, earlier state. The conventional wisdom in linguistics is that mixed languages, that is, languages with more than one parent, are very rare. However, when very closely related languages or dialects are involved, it may be quite hard or even impossible to identify a unique parent. For instance, in the very common case of a new regional variety arising from the contact between a standard language and a traditional local dialect, it may not at all be clear which of the two the new dialect descends from.

This indeterminacy in descendancy relations makes it difficult to verify or falsify general claims about language change, at least when they take the form of denials of the possibility of a certain change. Such a denial is in essence a claim to the effect that a synchronic state of type A can never descend from a synchronic state of type B. If we find a group of speakers whose language is of type A and we
know that their grandparents spoke a language of type B, this is only a counterexample if we are certain that no language shift has taken place in between. But in a dialect continuum, the difference between a change in one language between two generations and a shift from one dialect to another is not well-defined. This is the case in particular in contemporary cities, where only a minority of the population may have been born locally — is it a case of language shift when the migrants’ children pick up the dialect of the city?

Questions such as these become relevant for claims about the (uni)directionality of changes such as those involved in grammaticalization. Imagine the following scenario. There is a language community in which some kind of grammaticalization takes place, but in a geographically restricted fashion, resulting in a dialect split, so that speakers of dialect A accept the change but speakers of dialect B preserve the old state of the language. Later on, however, due to prestige and other extralinguistic factors, the speakers of dialect A give up most of their dialectal features and adopt what is essentially dialect B, with the effect that the change is reversed. For this scenario, one might argue that we are really dealing with language shift rather than with language change. We could however imagine that the grammaticalization we are speaking about would be the only feature that would distinguish the two dialects. To circumvent this counterexample to unidirectionality, one would have to rule that contact-induced change does not count. Such a position may seem attractive but immediately runs into problems, considering that the high degree of areality of grammaticalization phenomena suggests that grammaticalization processes are more often than not contact-induced. As a concrete example of a reversed development, consider the extended use of definite articles in certain varieties of Scandinavian (see 5.2.1 and also 10.3 below), which has been stable for a long time; however, under the pressure of Standard Swedish, it is now receding and has disappeared from areas where it can be assumed to have existed previously.

Returning now to the general question of unidirectionality, it seems to me that what we want to know about processes of change is not so much whether they are unidirectional or not — which is really mainly a matter of definition — but to what extent and in what respect they are directional, or directed. Imagine a group of extraterrestrial observers sitting in a spaceship above a middle-sized European industrial city at the beginning of the 20th century and observing a lot of movement. They will be able to discern certain patterns in these movements. In particular, there is a twenty-four hour cycle and a higher-level cycle consisting of seven twenty-four hour cycles. Once in such a longer cycle, they observe a lot of earthlings moving about in what seems to be a random manner. On all other twenty-four hour cycles, they see two big movements, an early one that they label centripetal, and a late one they label centrifugal. A lot of high-level discussion takes place about the possible explanations for these patterns. We earthlings know, however, that what they have seen is (a) the city population taking
their traditional Sunday afternoon stroll; (b) the same population moving to and from work on weekdays.

In a fairly clear sense, the movements in (a) are not directional. That is, if we observe a movement from A to B, we could equally well have observed a movement from B to A, and maybe in fact we do: some people walk from the Town Hall to Constitution Square, and others do it the other way round. In (b), on the other hand, the direction makes a difference. In the morning, people move from the residential areas to the areas where the work-places are, and it would not make sense to do it in the other direction, given normal working hours. (Let us, to make things simpler, assume that this is a humane society, without night workers.) After work, on the other hand, they do walk in the other direction, and now this is what makes sense. So these movements have directionality. Are they also unidirectional? Well, as in the examples above, that really depends on what you mean. If everyone has the same working hours, there are no counterexamples to the claim that everyone walks from A to B in the morning. But this doesn’t mean that nobody ever walks from B to A — in the evening, everyone does.

Directionality thus is a weaker but, I think, more useful notion than unidirectionality. It is still a bit tricky to define, however. One simple way of putting it is to say that it is the negation of the concept of a “random walk”. Some generativists have argued that language change is describable as a random walk between states characterized as parameter settings. The antithesis could be formulated in slogan form as Direction Matters. In slightly more explicit terms: if one has a theory that describes a certain type of language change, swapping the source and the target in your description will usually cause the theory to stop working — which does not mean that there is not another type of language change that happens to have these terms reversed. The notion of maturity, as defined in 6.1, may be seen as a type of directionality, in that a synchronic stage A of a language may presuppose another, earlier, synchronic stage B. It is still weaker than unidirectionality, though: what is said is that we cannot get to A without passing B, but we do not say that one could not get from A back to B.
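The statistical difference between a random walk and a directed process can be illustrated with a small simulation. This is a minimal sketch of the point only, with invented step probabilities, not a model of any actual change:

    import random

    # A random walk: each step is equally likely to go either way, so
    # swapping "source" and "target" changes nothing about the process.
    def random_walk(steps):
        pos = 0
        for _ in range(steps):
            pos += random.choice([-1, 1])
        return pos

    # A directed process: steps are biased one way, so a description with
    # source and target swapped (p_forward = 0.1) is a different process.
    def directed_walk(steps, p_forward=0.9):
        pos = 0
        for _ in range(steps):
            pos += 1 if random.random() < p_forward else -1
        return pos

    random.seed(1)
    print(sum(random_walk(100) for _ in range(1000)) / 1000)    # close to 0
    print(sum(directed_walk(100) for _ in range(1000)) / 1000)  # close to 80

“Direction Matters”, in this toy sense, is simply the claim that the bias parameter differs from 0.5.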
7.6 The rise and fall of semantic redundancy

One characteristic of grammatical constructions, mentioned above, is semantic redundancy — they convey information that is either already known or is irrelevant to the message. How does this situation come about? Here as elsewhere, we cannot give a full account of the causal chain, but we shall look at two types of cases where it seems that the initial stages may be observed.
7.6.1 Alienability and obligatory possessive marking

We are told in textbooks of linguistics that many languages make a distinction between two kinds of possession, called alienable and inalienable. The distinction is explicated in terms such as the permanence, inherentness or essentiality of the possessive relationship and/or the relationality of the head noun. The choice between inalienable and alienable constructions is seldom predictable from such general definitions, however; rather, what the alienability distinction means in most languages is that a set of inherently relational nouns are singled out for special treatment, and that this set always includes members of one or both of the classes (i) kin terms and (ii) body part terms (Nichols (1988)). In the terminology used here, “singled out for special treatment” means that there are (at least) two separate possessive constructions, such that one of them is used for the relational nouns in question. The term “alienability split” therefore seems more adequate than “alienability distinction”.

That nouns used in inalienable constructions are inherently relational means that, strictly speaking, the possessive marker does not express a relation that is separable from the semantics of the noun itself. Labels such as “possessive” and “possessor” are therefore not wholly adequate for them, but I shall use them for convenience. (54)–(55) illustrate alienability splits in two different languages:

(54) Eastern Pomo (McLendon (1975: 92, 108), quoted in Nichols (1992: 118))
a. inalienable
   wí-bayle
   1sg-husband
   ‘my husband’
b. alienable
   wáx šá.ri
   my.gen basket
   ‘my basket’

(55) Maltese (Koptjevskaja-Tamm (1996))
a. inalienable
   bin is-sultân
   son def-king
   ‘the king’s son’
b. inalienable
   id ir-raġel
   hand def-man
   ‘the man’s hand’
c. alienable
   is-siġġu ta’ Pietru
   def-chair of Peter
   ‘Peter’s chair’
We can here observe patterns that are characteristic of alienability splits crosslinguistically:
– Inalienable constructions tend to involve inflection of the possessee-nominal/head-marking (as in Pomo) or to be zero-marked (and involve mere juxtaposition of the possessee and the possessor nominals, as in Maltese)
– Alienable constructions are often periphrastic (as in Maltese) or involve inflectional marking of the possessor-nominal/dependent-marking (as in Pomo)
Of equal importance in this context, however, is the phenomenon of obligatory possessor marking, which tends to occur with the same nouns as those that appear in inalienable constructions. There are really two varieties of this phenomenon. In the first, certain nouns obligatorily carry a possessive affix. An example is:

(56) Navajo
a. shi-ma
   1sg-mother
   ‘my mother’
b. a-ma
   indf-mother
   ‘someone’s mother, mother in general’

In the second variety, the possessor is obligatory but may be either pronominal or lexical. This is found in the Tupí-Guaraní language family, e.g.

(57) Sirionó
a. Juanito ru
   J. father
   ‘Juanito’s father’
b. nde-ru
   2sg-father
   ‘your father’

(cf. Velazquez Castillo (1996: 62) for the corresponding facts in Guaraní.)

Note the parallel to obligatory subjects/subject marking: in English or Swedish, every sentence must (in principle) have a subject, but it may be either pronominal or lexical. By contrast, many languages have an obligatory subject marker in the form of a pronoun or an affix on the verb, irrespective of the presence of a lexical subject. Notice in this connection the “dummy” or “placeholder” possessor a- in Navajo, paralleling dummy subjects in other languages.

Obligatory possessor marking turns out to be more widespread than is usually thought, if one also includes languages where marking is obligatory only in certain contexts. For instance, consider the following Russian sentence:
(58) Russian
Ja povredil nogu.
I hurt.pst.m.sg foot.acc
‘I hurt my foot (lit.: I hurt foot).’

The body part term nogu ‘foot:acc’ here occurs without any specification of the possessor. By contrast, in English, the possessor is obligatorily indicated by a pronoun:

(59) I hurt my foot.

It may be said that this is because singular count nouns must normally have a determiner in English. However, English has here chosen another route than some other languages which have the same constraint. Thus, in Swedish, the translation of (59) is:

(60) Swedish
Jag skadade foten.
I hurt foot.def
‘(lit.) I hurt the foot.’

Here, the body part term foten ‘the foot’ carries a suffixed definite article. It may be noted that this is a somewhat unorthodox use of the definite article: humans have two feet, and a sentence such as (60) does not tell us which foot the speaker hurt. This thus violates the uniqueness constraint that commonly holds for definites. Notice, though, that this pattern is rather limited — the following sentence is not acceptable in a context where the body part in question has not been mentioned before:

(61) Swedish
Jag tittade på foten.
I looked on foot.def
‘I looked at the foot.’

The pattern with the definite article appears to be restricted to predicates in which body parts play a more than accidental role: in addition to ‘hurt (body part)’ we find ‘wash (body part)’, ‘wipe (body part)’, ‘put (body part) on (object)’ etc. It is probably not accidental that these are the same cases as those where we can expect incorporation of body part nouns (see 10.2).

Returning to English, we find that there is some rather unexpected language-internal variation there too. In (59), the possessive pronoun is coreferential with the subject — we may say that the body part NP is under subject control. If it is under object control, on the other hand, the pattern is the same in English as in Swedish — we get a definite article, rather than a possessive pronoun:

(62) I hit him on the head.
(63) ?I hit him on his head.
There is also a definite article if the verb is passive, yielding contrasts of the following kind:

(64) She was hurt in the stomach.
(65) She hurt her stomach.

In other words, the choice of determiner in these constructions in English is at least partly determined by the syntactic structure of the sentence. This is an argument for seeing the English situation as an incipient stage in the evolution of a system of obligatory possessive marking like that found e.g. in Navajo, where the nouns in question never show up without the possessive prefix.11

In the other group of relational nouns typically involved in obligatory possessive marking, we also have a rather complicated pattern. Kinship terms are frequently exceptions to the usual rules about NP determination: they tend to behave like proper names, and are thus used without a determiner even in contexts where it would be obligatory for other nouns, e.g.

(66) Mother was talking to Granny.

In Dahl & Koptjevskaja-Tamm (2001), we argue that this is characteristic of kinship terms that are close to the “parental prototype”, i.e. have one or more of the following properties (formalized in the sketch after this list):
– denote an ascending relation (‘father’ rather than ‘son’) or
– have a unique referent within a family (‘father’ rather than ‘uncle’) or
– the distance to the “ego” is no more than one generation (‘father’ rather than ‘grandfather’)
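A minimal formalization of these conditions, with an invented feature encoding (my own, not from Dahl & Koptjevskaja-Tamm (2001)); the score is meant to capture that closeness to the prototype is a matter of degree:

    # Invented feature encoding of the "parental prototype" conditions.
    def prototype_score(term):
        """Number of prototype properties satisfied (0-3); a higher score
        predicts more proper-name-like behaviour of the kinship term."""
        return (int(term["ascending"])
                + int(term["unique_in_family"])
                + int(term["generation_distance"] <= 1))

    father = {"ascending": True, "unique_in_family": True, "generation_distance": 1}
    uncle = {"ascending": True, "unique_in_family": False, "generation_distance": 1}
    grandson = {"ascending": False, "unique_in_family": False, "generation_distance": 2}

    print(prototype_score(father), prototype_score(uncle), prototype_score(grandson))
    # 3 2 0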
It is possible to observe certain stages in the development of parental-like kinship terms. When they are first introduced into the language, whether as borrowings from other languages or from “Motherese”, the egocentric, proper-name-like use is often the only possible one. At this stage, the terms behave grammatically as proper names do in the language. At a second stage, it becomes possible to use them with third-person possessors, as well, in which case the kinship terms will tend to be treated as ordinary common nouns in the language. They may for instance be used with a definite article, as in English the father. Usually, such forms cannot be used
11. It should be added here that the tendency to use possessive pronouns rather than definite articles is not restricted to body-part terms. There is a clear difference between Swedish and English here. Consider the English phrases lock the car and lock your car (typical signs in garages) — the latter is about twice as common as the former, to judge from a quick Google search. The Swedish phrase lås din bil ‘lock your car’, on the other hand, does not show up at all (with a single exception on a webpage from Finland) although lås bilen ‘lock the car’ is quite common.
with first or second person possessors. In further developments, however, this may become possible and even obligatory: in certain Northern Swedish vernaculars, nouns with suffixed definite articles are the normal way of referring to the speaker’s own relatives, as well, e.g. farfar-n ‘the paternal grandfather-def’ or papp-en ‘daddy-def’.12 Alternatively, it is the construction with a possessive marker that becomes normal or obligatory with all kinds of possessors.

We argued in Dahl & Koptjevskaja-Tamm (2001) that the development just sketched can be seen as an example of lexical integration, a process by which new lexemes are gradually pulled into the grammatical system of the language. This process both resembles and interacts with the maturation processes discussed in this book. Thus, kinship terms seem to have their own life cycles, during which they both acquire new uses for themselves and become part of the domains of expanding grammatical patterns.

Given the inherent relationality of kinship terms and body part nouns, a possessor noun phrase is always meaningful, at least in principle — this is what makes obligatory possessor marking possible in the first place. But the possessive markers used with relational nouns often have a very low informational value. As pointed out in Dahl & Koptjevskaja-Tamm (1998, 2001), kinship terms and body part terms both tend to be used in contexts where the possessor is highly predictable, although they differ in how the choice of possessor is determined in the typical case. Kinship terms are usually anchored to the speech-act participants — they are egocentric and pragmatically anchored. Thus, father is normally ‘my father’ or ‘your father’. Body part terms, on the other hand, are typically syntactically anchored — they tend to occur in certain syntactic constructions, where the possessor is indicated by the subject or some other NP in a determined syntactic position. Thus, ‘Mary hurt foot’ will be interpreted as ‘Mary hurt her foot’.

The rise of obligatory possessive marking means that possessive pronouns (or affixes) enlarge their domain of use and are used even when this is not motivated from a strict communicative point of view. The expansion thus goes from contexts where the possessive pronouns have a high (or relatively high) informational value to contexts where the informational value is low or even nil. It is significant that the development of possessive constructions that leads to alienability splits takes the
12. In fact, there are further complications. In colloquial Swedish, there is a peculiar set of kinship terms derived by the suffix -sa, e.g. farsa ‘father’, morsa ‘mother’, which are normally used with the suffixed definite article -n even with egocentric reference. In middle-class speech, the use of these terms indicates a slightly disrespectful attitude (appropriate for teenagers); in the working-class varieties in some parts of the country, this connotation is lacking. Nobody seems to know the origin of the -sa forms (Lars-Gunnar Andersson, personal communication). One is reminded, however, of the British English working-class use of the wife for ‘my wife’.
same route. Almost invariably, constructions restricted to inalienable possession exhibit characteristics of mature patterns and are demonstrably historically older than the alienable constructions they co-exist with: this is the essence of the crosslinguistic patterns noted above. Why is this? There are actually two possible routes by which an alienability split may arise, as noted by Nichols (1988: 589), one conservative and one innovative: (i) a new construction develops for alienable possession but fails to expand to the inalienable cases; (ii) a split arises in a general possessive construction by inalienables falling prey to phonetic reduction — in the framework proposed here, due to their lesser informational value. The first possibility is well documented in e.g. the Maltese case. The second is somewhat more difficult to establish empirically. A plausible case is the split in Catalan between cases such as ma mare ‘my.f.sg mother’ and la meva casa ‘the my.f.sg house’, where the two forms of the possessive seem to be derived from the same historical source. Likewise, the numerous examples of enclitic possessive pronouns — reduced variants of the usual ones — in different Italian varieties enumerated by Rohlfs (1954) with rare exceptions contain kinship terms and might also be a case in point, to judge from the form of the possessive morphemes — but lacking any real evidence, this must remain speculative.

7.6.2 Locational constructions

Another case of an incipient grammaticalization of a semantically redundant construction is found in a number of languages in northern Europe, in which there is a clear tendency to use posture verbs such as ‘stand’, ‘lie’, and ‘sit’ as main verbs in sentences that specify the location of an object. Consider e.g. the following examples from Swedish:

(67) Swedish
a. Boken ligger på bordet.
   book.def lie.prs on table.def
   ‘The book is on the table.’
b. Boken står på hyllan.
   book.def stand.prs on shelf.def
   ‘The book is on the shelf.’

In general, it is also possible to use a plain copula construction in Swedish, as in (68), but in many contexts, the alternative with a posture verb is felt as more natural.

(68) Swedish
Boken är på hyllan.
book.def be.prs on shelf.def
‘The book is on the shelf.’
The choice of (67b) rather than (68) thus means that the speaker expresses information that is not really essential to the message. This may be said to be the first step towards grammaticalizing the posture verb construction and making it obligatory. Indeed, it is difficult to disentangle semantics, pragmatics and grammar here. In a language such as Swedish, the choice between the copula and a posture verb in a locational construction is apparently made at least partially on other grounds than pure considerations of relevance — this can be interpreted as a shift in the division of labour from the freely manipulable component to the automatized or regulated part of language, if we like, from pragmatics to grammar.

In (67), the choice between the posture verbs ligger ‘lies’ and står ‘stands’ still depends on properties of the situation that is described, that is, the actual horizontal or vertical orientation of the object in question. The posture verb construction is also used, however, for objects whose position is fixed, such as geographical objects, buildings etc. In these cases, the choice of posture verb is already totally conventionalized. For instance, cities ‘lie’ in Swedish but ‘stand’ in Russian (to some extent also in English). For mobile objects as well, the choice of verb sometimes depends more on the type of object than on its actual position. Thus, in Swedish, a tablecloth or a cutting board ‘lies’ on a table, whereas a tray or a plate ‘stands’ (presumably because they have raised edges):

(69) Swedish
a. Skärbrädan ligger på bordet.
   cutting-board.def lie.prs on table.def
   ‘The cutting board is (lit. lies) on the table.’
b. Brickan står på bordet.
   tray.def stand.prs on table.def
   ‘The tray is (lit. stands) on the table.’

It is not implausible that what we see here are the initial stages of the development of a system of “classificatory verbs”, that is, a system where the choice of verb depends on features such as the shape of one of its arguments. Another possible development is for one posture verb to take over totally. This is what appears to have happened in those languages (e.g. Spanish) where a descendant of the Latin verb stare ‘stand’ is now the standard verb in the corresponding constructions. In effect, this means that the redundant expression of posture has now been eliminated: Spanish el libro está en la mesa ‘the book is on the table’ does not tell us anything about the horizontal or vertical placement of the book.

Locational constructions are generally rich in semantically redundant elements. Much of what has been said here about posture verbs has parallels in the system of locative adpositions of languages such as English. Another kind of redundant element is exemplified by the German prefixes hin- and her-, which indicate the direction of a movement relative to the deictic centre, as in:
(70) German
Sie ging in-s Haus herein/hinein.
she go.pst in-def.n.acc house in
‘She went into the house.’

This system has already undergone a development similar to that of stare in Romance: in spoken German, her- has been reduced to r- and has taken over the niche of hin-. The form ’rein thus expresses a movement irrespective of its relation to the deictic centre. Notice also that the morphemes in and -ein derive etymologically from the same source and thus at least historically constitute redundant expressions (see 10.6 for further discussion).

The introduction of semantic redundancy clearly means an increase in system complexity: the speaker is forced to make a choice between several different expressions, depending on factors that are extraneous to the message. The system may stabilize at the higher level of complexity but there may also be levelling, as in the her- and hin- case, leading to a state which is equivalent in complexity to the original one.
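The incipient classificatory-verb system illustrated in (67)–(69) can be caricatured as a small rule system. The sketch below is my own simplification for illustration, not a description of actual Swedish usage: a few object classes have a conventionalized verb (grammar), while the rest are assigned one by actual orientation (semantics):

    # Caricature of an incipient classificatory-verb system of the Swedish
    # type; the object classes and rules are invented for illustration.
    CONVENTIONALIZED = {
        "city": "ligger",          # fixed-position objects
        "tablecloth": "ligger",
        "cutting board": "ligger",
        "tray": "står",            # raised edges
        "plate": "står",
    }

    def posture_verb(obj, orientation="horizontal"):
        if obj in CONVENTIONALIZED:     # choice fixed by object type (grammar)
            return CONVENTIONALIZED[obj]
        # otherwise by the actual position of the object (semantics)
        return "står" if orientation == "vertical" else "ligger"

    print(posture_verb("tray"))               # står, whatever its position
    print(posture_verb("book", "vertical"))   # står, by actual orientation

Total takeover by one verb, as in Spanish, would amount to the function always returning the same value: the choice, and with it the redundant information, disappears.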
Chapter 8
Pattern adaptation
8.1 Introduction

After having spread to new uses, patterns adapt by reducing phonetically and becoming tighter. In this chapter, I shall take a closer look at these processes.
8.2 Reductive change

Neogrammarian vs. adaptive sound change. In discussing the role of phonetic change in linguistic maturation processes, it is necessary to make a distinction between two major types of such change, which are sometimes conflated and subsumed under headings such as “erosion” and “attrition”:
– The first type is that of classical Neogrammarian “sound laws”, ones that hit the lexical items in a language indiscriminately. Examples would be the Great Vowel Shift in English or the consonant shifts at various points in Germanic. Call this Neogrammarian sound change.
– The second type would be a sound change that hits certain expressions as a response to their acquiring new niches or being used more often. For instance, the Spanish phrase Vuestra Merced ‘your grace’ was reduced to Usted as a consequence of being used in lieu of a second person pronoun. Call this adaptive sound change.
It is probably fairly uncontroversial to say that Neogrammarian sound change is reductive in the majority of cases, that is, that it tends to reduce the phonetic weight of utterances. Even if there are quite a few exceptions, the result is that over time, linguistic expressions are shortened and/or simplified in their phonetic make-up. Extreme examples are the evolution from Latin Augusti to French [u] (orthographically août) or Latin aqua to French [o] (orthographically eau, cf. also the similar-sounding Scandinavian cognate å [oː] ‘(small) river’). Indeed, the accumulation of Neogrammarian reductive change over longer periods has many of the characteristics of erosion processes as defined in 4.5: a large number of small changes hitting the object seemingly at random and in the end wearing it down. Clearly, during its life cycle, a grammaticalizing element may well undergo phonetic
changes of this type, which make it lose phonetic weight. But since, by definition, Neogrammarian sound change applies to all elements of language in the same way, it cannot explain those reductive changes that are specific to maturation processes such as grammaticalization. It would also seem that, being essentially a special case of the Second Law of Thermodynamics (see 3.2), it cannot explain how complex grammatical systems are built up either. Instead, Neogrammarian sound change has an important role to play in maturation processes in that it contributes to lexical idiosyncrasy. For instance, the tricky relationship between masculine and feminine forms of French adjectives, as in grand : grande [grã : grãd], petit : petite [pəti : pətit], is largely due to the loss of final consonants and schwas. Here, Neogrammarian sound change clearly leads to an increase in system complexity. On the other hand, if such a change or set of changes leads to the total disappearance of a grammatical marking, this might result in the simplification of the grammatical system, an issue I shall return to below.

Without invalidating the distinction between Neogrammarian and adaptive sound change just introduced, it is necessary to point out some complications. First, we may note that the exceptionlessness of sound-laws has been contested in various ways; the observation that sound change is sometimes implemented through “lexical diffusion” means that a sound change may well apply differentially to lexical items, which makes it at least conceivable that Neogrammarian change might start out as an adaptive sound change. Moreover, a Neogrammarian sound change which is contingent on prosodic prominence — for instance, a change of vowel quality in unstressed syllables — will affect lexical and grammatical morphemes in a differentiated way, and thus be quite similar to an adaptive change. Differences in frequency may play a similar role. Also, it should be noted that the two kinds of reductive change need not really be different in the ways that they change the phonetic shape of words. After all, there are only a limited number of ways to reduce an expression phonetically — it is natural that the same processes should show up in many places.

The causal mechanisms behind adaptive sound change. It will come as no surprise to the reader at this point that I see adaptive sound change as mainly driven by redundancy and prominence management, as outlined in Chapter 2. Adaptive sound change is in my view a reaction to the changed role of an expression, when it is trapped in a complex construction, or when the construction of which it is a part expands its territory. It can be seen as a way of restoring the balance between the communicative role of an expression and its form. It follows that “erosion” and “attrition” are not suitable metaphors for this process — if one needed such a metaphor, it would rather be “trimming”, as suggested in 4.5.
Basically, adaptive phonetic reduction would be a response to a decrease in the informational or rhetorical value of the expression. As a general idea, this is hardly a new one. It is an important component of the work of Helmut Lüdtke, for instance (e.g. Lüdtke (1980)). Givón (1991) proposes a “Quantity Principle” according to which “a larger chunk of information” or “less predictable information” “will be given a larger chunk of code” or “more coding material” as an explanation of the “larger size of, and more prominent stress on, lexical words, as against grammatical morphemes”, although his explanation of the principle is somewhat vague (it “must be sought in the areas of attention and mental effort”). On the other hand, Newmeyer (1998: 254) emphatically rejects the idea of a connection between the amount of information conveyed by an expression and its grammatical status, with slightly dubious arguments, favouring instead an explanation in terms of frequency (which is strange, in view of the intimate relationship between those concepts, see below). Notice, however, that it is not the grammatical status as such that conditions the informational value of an element — rather it is the change of its role in discourse that entails a change in its informational value and thus leads to phonetic reduction. Something similar may happen to e.g. politeness elements which become more or less obligatory and are consequently reduced (see p. 127).

The role of frequency in maturation. The two most salient ways in which the frequency of an item correlates with its behaviour in maturation processes are:
– high frequency items are more resistant to new, expanding patterns
– high frequency items more easily undergo reductive change
Thus, high frequency items are associated both with “conservative” and “innovative” tendencies in maturation (Bybee (2001: 12)). This means that the high frequency items in a category may come to be differentiated from the rest in two different ways — either passively, by not undergoing a change, or actively, by undergoing it. As for the first tendency, it is commonly assumed that frequent items are more entrenched and thus more immune to new patterns. A slightly more indirect explanation is also possible: it is not frequency in itself that saves such items but rather that they are in general acquired early by children.

What about the second tendency? There is a direct link between frequency and informational value — indeed, in information theory they are simply two sides of the same coin.
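The link can be made explicit with the standard information-theoretic definition; the notation below is textbook information theory, not anything specific to this book:

\[
I(w) = -\log_2 P(w), \qquad I(w \mid c) = -\log_2 P(w \mid c)
\]

The more probable an item w is, overall or in a given context c, the less information it carries. The conditional version is the relevant one for what follows, since reduction turns out to track an item's frequency in particular uses and contexts rather than its total frequency.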
To the extent that elements with a low informational value are reduced, elements with a high frequency will be reduced as well. But in the same way as there are different concepts of information, frequency may also be understood in different ways, and not all of them are necessarily equally relevant to phonetic reduction processes. In particular, we will run into difficulties if we just consider the total frequency of an expression in language. If reduction were directly dependent on the token frequency of an item, it ought to follow that all tokens of the item would be equally reduced, which is clearly not the case. Instead, what we regularly observe in grammatical maturation is that an item is reduced in particular uses or particular contexts, which may in the end lead to “divergence” or “split” of the item into two, as in the development of the English indefinite article from the numeral one. In addition, it may be conjectured that the speaker’s calculations of how easy it is for the listener to make the right guess about the expression play a significant role for the degree of reduction. Considerations such as these are, I think, definitely a stumbling block for those who, like Newmeyer (1998), try to explain the phonetic reduction processes connected with grammatical maturation as a frequency effect tout court.

Jurafsky et al. (2001: 229) advance the “Probabilistic Reduction Hypothesis” according to which word forms are reduced when they have a higher probability. They do not give a definition of “probability” but say that it is contingent “on many aspects of its context, including neighboring words, syntactic and lexical structure, semantic expectations, and discourse factors”. This does not really exclude anything and leaves unresolved the question of whether probability is an objective or a subjective concept. In the study presented in the paper, however, an important role is attributed to “predictability from neighboring words” — that is, how probable a certain word form is in the environment of certain other word forms. This probability was calculated on the basis of the text frequencies of the word combinations. Jurafsky et al. say that their results “suggest that probabilistic relations between words must play a role in the mental representation of language”, but again do not make more precise what “probabilistic relations” is supposed to mean.

So far, we have been speaking of redundancy management, by which redundancy is kept at the level which ensures safe transmission at a minimal cost. However, there is a further possible connection between frequency and phonetic weight. One lesson of information theory is that the number of alternative expressions determines their minimal length. Thus, the length of telephone numbers has been growing with the number of subscribers in telephone networks. But not all telephone numbers need to be equally long. Modern telephones offer you the possibility of choosing shortcut numbers. Usually they consist of one digit only and there can therefore only be ten of them. Obviously, you save more energy if you allocate them to the persons you most often make calls to. This fairly elementary principle of optimal coding seems sufficient to account for the observation usually ascribed to Zipf (1935),1 that the length of words is inversely correlated to their frequency, and provides an additional motivation for the phonetic reduction of frequent words — although perhaps it should be put this way: being lazy, we would like to shorten all words, but we can only shorten a few of them without making the system unusable, and thus we had better choose the frequent ones.
1. In the linguistic literature, this is sometimes called “Zipf’s law”, but the principle traditionally referred to by that name is another one: that the relationship between frequency and rank is a power-law function with an exponent close to 1.
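The shortcut-number reasoning is ordinary source coding: an optimal code gives each item roughly -log2(p) symbols, so frequent items come out short. A toy illustration in Python, with invented word counts:

    import math

    # Toy frequency table (counts invented). An optimal code assigns each
    # item roughly -log2(p) symbols: frequent words get short codes.
    counts = {"the": 6000, "of": 3000, "walk": 50, "apocryphal": 2}
    total = sum(counts.values())
    for word, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        p = n / total
        print(f"{word:12s} p = {p:.5f}  ideal code length = {-math.log2(p):4.1f} bits")
    # the          p = 0.66284  ideal code length =  0.6 bits
    # of           p = 0.33142  ideal code length =  1.6 bits
    # walk         p = 0.00552  ideal code length =  7.5 bits
    # apocryphal   p = 0.00022  ideal code length = 12.1 bits

A real prefix code (e.g. Huffman coding) approximates these ideal lengths to within one symbol, so the inverse correlation between frequency and length survives intact.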
Other explanations for the connection between frequency and phonetic reduction have been proposed. Bybee & Hopper (2001: 11) say that “the origins of reduction are in the automatization of neuro-motor sequences which comes about with repetition” and Bybee (2001: 11) argues for a gradual theory of phonetic change “with each token of use having a potential effect on representation”, which would lead to a faster rate of change for high frequency items as their production is automated through repetition (see also 5.3).2 It is not wholly clear to me if such a mechanism would primarily speed up reductive processes that exist in the language anyway or if it could be seen as a more general explanation for phonetic reduction in grammatical maturation.

Bybee (forthcoming) notes the so-called “priming effect” in word recognition tasks, that is, the tendency for recently presented words to be accessed much more quickly than words that are presented for the first time. She claims that this effect shows up in natural discourse as well, pointing to a number of studies that show that “one of the best predictors of the use of a construction is its use in immediately preceding discourse”. These results are somewhat baffling in that there is also a directly opposite tendency — to avoid repetition (of course, this tendency could also be ascribed to a priming effect). One possibility is that the anti-repetitive tendency is operative above all in written and formal spoken discourse. In any case, an interesting but neglected effect of grammaticalization is that any reluctance to repeat the items in question disappears. Consider the following evidence from Finnish. As described in traditional grammars, Finnish has no definite article. In the spoken language, however, the demonstrative se is often used in a way that has led to claims about its becoming grammaticalized as a definite article (Laury (1997)). It appears that the development is at least far from being completed, but certain speakers do use se consistently with animate (and often with inanimate) noun phrases in anaphoric function (Juvonen (2000: 136)). In the following example, forms of se used adnominally are in boldface:

(71) Finnish
…niin sit se mies meni ja,
…so then this man go.pst and
osti ne kaikki ilmapallot
buy.pst this.pl all balloon.nom.pl
ja anto ne sille pojalle,
and give.pst them this.all boy.all
ja sit se poika…
and then this boy
‘so then the man went and bought all the balloons and gave them to the boy, and the boy…’

In other styles, or by other speakers, such a repeated use of se would be seen as unduly repetitious (Päivi Juvonen, personal communication). It may be noted in this context that it seems rather unlikely for a speaker of English to react against there being too many thes in a sentence, even in formal written language. It thus appears that grammaticalized items become less salient in the processing of discourse, even to the extent that they become hard to perceive consciously.

2. This view is in fact reminiscent of the idea that grammatical morphemes are “worn down” by use, as formulated by Gabelentz (see above 7.4).

The role of prosody. One would expect that prosodic notions would be given a prominent place in the discussion of grammaticalization and maturation processes in general. A major characteristic of grammatical elements is that they tend to be either unstressed or share a stress domain together with some lexical item. Furthermore, as we have seen, reductive phonetic change may also target unstressed syllables in general. Having independent stress is one of the major criteria for wordhood: the compacting of multi-word into one-word structures is central to maturation processes, so changes in stress patterns will necessarily be a central part of them. However, strangely enough, prosody is very much neglected in grammaticalization research. For instance, in the standard textbook Hopper & Traugott (1993), one looks in vain in the subject index not only for the term “prosody” but also for related ones such as “stress” and “intonation”, and in the text, the loss of “a stress or tone accent” is mentioned only in passing as an example of “phonological concomitants of morphologization” (p. 145). Very much the same could be said of recent treatments of incorporating constructions. Detailed accounts of such constructions may be totally silent about prosody, which means that the reader is often left in the dark with regard to the reasons why something is seen as a one-word construction in the first place.

The phenomena subsumed under prosody are intimately connected with what I called “prominence management” (2.4). Thus, stress, quantity, and intonation all involve the distribution of energy among the components of an utterance. To a large extent, as I argued there, prosodic phenomena may be seen as conventionalizations of natural, partly non-learned mechanisms, and as the development of system-level out of user-level prominence management. The processes involved in this conventionalization are not well understood although they are most certainly of direct relevance to many of the topics discussed in this book. Of even more crucial importance, but not much better understood, are the changes in prosodic status that expressions undergo when they mature.
The central theoretical issue, in my opinion, is the place of such changes in the causal chains underlying maturation processes. Are they just “phonological concomitants of morphologization”, as suggested by the treatment in Hopper & Traugott (1993), or can it even be the other way round — that morphologization is at least partly driven by prosodic change? To me, it would appear natural to assume that the ways in which we structure an expression into units such as phrases and words are highly dependent on the prosodic and rhythmic properties of the expression, and that those properties are partly determined by other, independent factors involving information structure and prominence management. But this does not exclude, of course, that there are also influences going in the other direction. In particular, given that there exist a certain number of patterns in a language, each of which is conventionally connected with certain prosodic properties, we may expect new expressions to assimilate to those patterns. That is, when for instance a combination of words is “entrenched” and lexicalized, there will be a finite number of choices for what the expression will look like prosodically.

When an expression gets trapped in a larger construction and loses its rhetorical value and informational autonomy, it will also naturally be treated as a less prominent item and be downgraded prosodically. What kind of phonetic reduction that entails in concrete terms, however, seems to some extent to depend on factors having to do with the overall structure of the language. It is well known that closely related languages may differ significantly in the degree to which non-prominent segments, such as vowels in unstressed syllables, are reduced phonetically. Cf. such pairs of more or less closely related languages as Danish – Norwegian, Portuguese – Spanish and Russian – Polish, where the first member exhibits a much stronger reduction of unstressed syllables in general. Moreover, although inflectional morphemes are often unstressed and thus subject to reduction, the more general statement is that they comprise a prosodic domain with the word stem they are attached to. This means that inflectional morphemes sometimes carry full stress and are thus not reduced. In a language like Russian, with complex stress alternations in inflectional paradigms, one and the same inflectional suffix may contain a fully stressed and prominent vowel or be realized as a schwa, depending on the stress pattern of the lexical item. Cf. the Russian genitive singular form flót-a [ˈflotə] ‘fleet’ from flot [flot] on the one hand, and kot-á [kɐˈta] ‘tom-cat’ from kot [kot], on the other. Possibly, the fact that an inflection is sometimes manifested in a more salient form has a stabilizing function in that it helps to preserve its identity in the system.

Morpheme-size reduction. There is a type of reduction that shows up fairly often in grammatical maturation and that does not look like phonetic reduction. I am referring to those cases where a morpheme disappears in what at least appears to be a one-step change rather than a gradual reduction to zero. A nice illustration is the Swedish construction kommer att + Verb, used in statements with future time
reference. In older stages of Swedish, as is still the case in Danish and Norwegian, it had the form

present of komma ‘come’ + till ‘(preposition) to’ + infinitive marker att + infinitive of main Verb

In modern Swedish, the preposition till ‘to’ has been dropped, and in a development that has not yet been accepted as standard language, the infinitive marker is also dropped, yielding simply kommer + Verb. If this were phonetic reduction, we would, I think, rather expect till and att to fuse.

In Russian, the old Slavic perfect, which consisted of a copula and a participle, has developed into a past tense, and in this process the copula was deleted. Similar developments are found in other Slavic languages. In Bulgarian, there is a functional differentiation between forms with and without copula (although there is disagreement about its status). Such a differentiation is also found in e.g. the Indic languages. In written Swedish, the perfect-forming auxiliary ha ‘have’ may be omitted, but only in subordinate clauses. This possibility apparently developed under the influence of German, although it no longer exists in Modern German.

The final effect of such developments is to create one-word forms out of periphrastic constructions without there being any fusion. If the traditional definition of grammaticalization is taken literally, these cases would not be covered, since there is really no morpheme that acquires a more grammatical function. In the examples I have given, it is the free markers that are dropped, while affixes are retained.
8.3 The concerted scales model of grammaticalization

Scholars studying grammaticalization have observed that there seems to be a relatively high degree of correlation between grammaticalizing items in different languages with respect to the relation between form and meaning. Thus, we noted in Bybee & Dahl (1989: 68) that “the phonological reduction necessary for affixation moves hand in hand with the reduction of semantic content in grammaticization”. In what I shall refer to as the concerted scales model of grammaticalization, this correlation is made explicit, in that the degree of grammaticalization of an item is understood as depending on the value of a set of parameters. One of the most explicit formulations of this model is found in Lehmann (1985: 306), where grammaticalization is understood as the parallel change along six scales or parameters applying to linguistic signs:
– integrity: the “substantial size, both on the semantic and phonological sides” of a sign
– scope: “the extent of the construction which it enters or helps to form”
– paradigmaticity: the degree to which a sign “enters a paradigm, is integrated into it and dependent on it”
– bondedness: “the degree to which it depends on, or attaches to” other signs in the syntagm
– paradigmatic variability: “the possibility of using other signs in its stead or of omitting it altogether”
– syntagmatic variability: “the possibility of shifting it around in its construction”
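To make the model concrete, the six parameters can be thought of as coordinates of an item. The sketch below is my own illustration, with invented values (each scale normalized to 0–1, higher meaning more grammaticalized); it anticipates the correlation hypothesis discussed next:

    # Lehmann's six parameters as positions on normalized 0-1 scales.
    # All values below are invented for illustration.
    SCALES = ["integrity", "scope", "paradigmaticity", "bondedness",
              "paradigmatic_variability", "syntagmatic_variability"]

    def is_concerted(item, tolerance=0.2):
        """True if every scale value lies within `tolerance` of the mean,
        i.e. the item occupies roughly the same position on each scale."""
        values = [item[s] for s in SCALES]
        mean = sum(values) / len(values)
        return all(abs(v - mean) <= tolerance for v in values)

    # English to + infinitive: advanced on most scales but weakly bonded,
    # the kind of exception discussed below.
    to_infinitive = {"integrity": 0.8, "scope": 0.7, "paradigmaticity": 0.8,
                     "bondedness": 0.3, "paradigmatic_variability": 0.8,
                     "syntagmatic_variability": 0.7}
    print(is_concerted(to_infinitive))  # False: bondedness lags behind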
Lehmann further hypothesizes that the scales are correlated in such a way that an item (in a given synchronic state of a language) tends to occupy more or less the same position on each. For each of these scales, there is an associated process by which a sign moves from one end to the other. Thus, the first scale, integrity, is associated with the process of attrition, defined as “the gradual loss of semantic and phonological substance”. Herein lies a paradox, in that the highest degree of grammaticalization turns out to entail the total annihilation of the item: when attrition comes to its logical end-point, the item has totally lost both meaning and overt expression. Lehmann says: “If grammaticalization proceeds further, the parameter of integrity assumes the value ‘zero’, whereas the others cease to be applicable.” The total loss of phonological substance or phonetic weight does indeed happen, as in the following example of a wholly unmarked possessive construction in Old French, where the original genitive marking has been reduced to zero:

(72) Old French
la bouche sa mere
the.f.sg mouth his.f.sg mother
‘his mother’s mouth’ (Herslund (1980: 126))

But it is certainly counter-intuitive to assume that all juxtapositional constructions are necessarily preceded by a long grammaticalization process and that they could not be found in a “Garden-of-Eden” language. And if zero-marked constructions can arise from scratch, we are left with two kinds of zero-marking, one that is the final output of grammaticalization and thus presumably maximally grammaticalized and one that has not undergone any grammaticalization at all. The same applies to the last phase of the process referred to by Lehmann as “coalescence”, that is, increase in bondedness, “in which the grammaticalized item loses its morpheme identity, becoming an integral part of another morpheme”,3 and which can thus be labeled “fusion or merger” (Lehmann (1982: 148)). Again, if an item is completely absorbed into another item, it presumably ceases to exist as something that can be ascribed a degree of grammaticalization, and the resulting state of the language is impossible to distinguish from a “Garden-of-Eden” case.

3. It turns out that Lehmann’s examples of fusion do not quite fit his description. Consider what he calls “symbolic alternations”, for instance the pair foot : feet in English. Here, as Lehmann himself notes, there is no fusion on the semantic side between the ‘foot’ and ‘plural’ components in feet, and it may be added that the ‘plural’ component is not only semantic but also has a grammatical role to play, e.g. in agreement. Contrast this with a genuine case of merger: the development of the form data in English, which for many speakers is not the plural of datum but rather an isolated mass noun, taking singular agreement.

It appears that one source of the problem is a confusion between two ways of understanding “degree of grammaticalization”. One is essentially diachronic: what is measured is the amount of change that the item has undergone; the other is synchronic, measuring parameters such as “phonetic weight” and “size of paradigm”. When an item is grammaticalized, it tends to undergo phonetic reduction and become “smaller”. But we cannot, in principle, tell from the size of an item if it has been reduced or not. For instance, the Spanish formal second person pronoun usted ‘you (polite)’ has relatively recently been reduced from Vuestra Merced ‘your Grace’ but is still heavier than the informal pronoun tú ‘you’, which has been monosyllabic since prehistoric times. From the synchronic point of view, a parameter such as Lehmann’s “integrity” thus cannot be used as a measure of degree of grammaticalization.

But there is also another aspect, one which I think takes us to the essence of maturation processes in general. Although Lehmann assumes as a general rule that an item has the same position on any two scales at any given time, he allows for the possibility that the correlation is not complete. The construction to + infinitive in English is a case in point: it is less “bonded” than one would expect in view of its advancement on the other scales. Such situations would be exceptional in Lehmann’s view, however. I now want to claim that in fact, a certain lack of correlation is in a sense even necessary. Consider the case of the token sandwiches discussed in 5.3. Swedish law prohibited the serving of liquor without food. The customers who just wanted to drink thus saw the de facto cost of liquor increase and tried to restore the balance in the cost-benefit calculation by reducing the “culinary substance”, ordering only a minimal sandwich. But they were not able to go all the way: they could not reduce the sandwich to zero, even if they wanted to, because that would mean that the demands of the law would not be fulfilled. In a similar way, when a linguistic item acquires a new, typically more grammatical, use, its informational value diminishes. This upsets the balance in the speaker’s cost-benefit calculation, which motivates a counter-move — a reduction in phonetic weight.

In many cases, it would seem that the informational value of grammatical markings is zero or very close to it. After all, we can usually find
languages or dialects that live happily without those particular markings. Consider, as a concrete example, plural endings in the context of a numeral, e.g. three apple-s, where it would seem that the plurality is unambiguously signalled by the word three and the plural morpheme -s would thus be completely superfluous. Indeed, in many languages where nouns inflect for number, plural markers are omitted after numerals. In English and its ancestors, on the other hand, they have been used obligatorily over millennia, in spite of being almost empty from the informational point of view. Similar examples are found everywhere in languages, but attract little attention, since the absence of change is usually not considered to need any explanation.

However, from a slightly different perspective, it has been proposed that synchronic phonological processes (the “phonological rules” of generative phonology) tend to be blocked “in environments where their free application would wipe out morphological distinctions on the surface” (Kiparsky (1992: 87)), and this proposal has been supported by observations such as the following. The conditions under which speakers of American English delete /t/ and /d/ in word-final position have been studied in great detail, and it has consistently been found that the /t/ in a monomorphemic word such as past is more likely to be deleted than the past tense ending in passed. Past forms such as kept, on the other hand, where the past tense is signalled also in the stem, behave like an intermediate case. Thus, it would seem that the /t/ is sheltered from deletion when its presence is necessary to uphold the morphological tense distinction. However, Labov (1994: Chapter 19) contests this conclusion, adducing counterevidence from a number of other investigations, concerning such phenomena as the deletion of final /s/ in various dialects of Spanish, where it even seems that /s/ is more likely to disappear when it is the single grammatical marker of plurality.

The issue is a tricky one, among other things because it is not always clear what one is discussing. The general view Labov argues against is one he ascribes to “functionalism”, viz. that sound change is constrained by speakers’ striving to choose the most efficient and effective way to put across their meaning. Thus, alternatives that lead to misunderstanding will be avoided. But this is in fact rather different from a tendency to preserve grammatical markers, which would also affect those grammatical markers that are informationally redundant, e.g. -s in three apples. In fact, Labov quotes an interpretation of the /t/ deletion data from Guy (1980), according to which the intermediate incidence of deletion of /t/ in “semiweak” verbs such as kept is due to a difference between speakers who analyze this segment as a separate morpheme and those who do not, which suggests a dependence on grammatical structure rather than on informativity. Similarly, the observation that /t/ in regular past participles is deleted as often as in the regular past tense forms contradicts the ambiguity avoidance principle (given that the former generally occur in constructions where they are not ambiguous) but not the hypothesis that grammatical markers are in general protected. We may also note
that the difference between monomorphemic and bimorphemic words goes beyond the cases that are subsumable under deletion rules. Walsh & Parker (1983) found that the plural /s/ in laps has a longer duration than the /s/ in the monomorphemic word lapse. What this suggests is that (at least some) grammatical markers are given a slightly greater prominence, that is, the speaker spends more energy on them, than their informational value would warrant. This, however, is not explicable in terms of the speaker’s need to be understood, and is thus not a “functional” explanation in Labov’s sense. In order to account for a tendency to spend extra energy on grammatical morphemes, we have to invoke the norms of the language, more precisely the grammar. In fact, it could be argued that claiming that there is such a tendency is no different from saying that the morphemes are obligatory, which presumably must mean that they have to cross some threshold of audibility. Having claimed that language change is not governed by speakers’ communicative needs, Labov suggests that there is a tendency for languages to “preserve their means of conveying information, more or less, by one route or another” (1994: 568). This would thus be an emergent tendency in the sense that it can only be defined on the system level rather than on the level of the individual user. Labov adduces two examples from French to illustrate what he means. One is the replacement of the plural feminine article las by the masculine les, which made it possible to preserve the difference between singular and plural in spite of the loss of final /s/, potentially leading to a merger of las with singular la. The other is the introduction of obligatory subject pronouns to compensate for the loss of person and number distinctions in the verb. The latter case thus involves the replacement of one marking pattern by a new, more verbose one. It was noted on p. 115 that explanations of the spread of a new construction or form in terms of speakers’ communicative needs are usually unconvincing. It is not obvious that it helps to move the account to a higher level. In the next section, we shall look more closely at the phenomenon of preservation of structure in reductive sound change.
8.4 Preservation of structural complexity

Morphologization. Preservation of structure in spite of loss of phonetic weight is a phenomenon that can take place both in Neogrammarian and adaptive sound change. Consider a constructed example. Suppose we have a language in which -i is used as a plural suffix, yielding pairs of word forms such as

(73) tat : tat-i

Suppose, further, a sound change to the effect that the final i is dropped. One possible result is that the distinction singular : plural is now formally unmarked (at
least in this word) — we will simply have

(74) tat : tat

However, we may also imagine that the lost final vowel leaves a trace behind in the form of e.g.
– palatalization of the final consonant: tat’
– palatalization/raising of the stem vowel: tät
– lengthening of the stem vowel: tāt
– an “incomplete” tone contour, reinterpreted as a rising tone.
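The mechanics of such a trace-leaving reduction can be made concrete with a small simulation. The sketch below is purely illustrative; the toy forms and the palatalization rule are invented for the purpose and drawn from no particular language. It simply shows how a reductive rule that conditions a change on the segment it deletes keeps the two cells of the paradigm distinct.

```python
# Toy simulation of a reductive sound change that leaves a trace:
# word-final /i/ is deleted, but palatalizes the preceding consonant
# first, so the singular : plural contrast of (73) survives as tat : tat'.

def delete_final_i(form):
    """Drop a word-final i, leaving palatalization (') as a trace."""
    if form.endswith("i"):
        return form[:-1] + "'"   # the rest is pronounced as if i were there
    return form

paradigm = {"singular": "tat", "plural": "tati"}            # cf. (73)
reduced = {cell: delete_final_i(f) for cell, f in paradigm.items()}

print(reduced)                    # {'singular': 'tat', 'plural': "tat'"}
assert reduced["singular"] != reduced["plural"]   # distinction upheld
```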
What is common to all these possibilities is that when the final vowel is deleted, the rest of the word continues to be pronounced as if it were still there. As a result, the contrast between the two forms is upheld, and the singular : plural distinction survives. This kind of structure-preserving change, where structural complexity is preserved although the phonetic weight of the expression decreases, occurs over and over again in maturation processes. Indeed, this is important since it makes it possible for the later stages of these processes to be qualitatively different from the initial state. That is in a sense necessary: if structural complexity were not preserved, grammatical structure would be obliterated, and verbose grammatical marking would tend to undergo phonetic reduction even to the point where there would be no reason to speak of maturity, grammaticalization, or whatever.

Changes of the kind I am speaking about here are commonly treated as cases of “morphologization”. Thus, what was originally an allophonic feature, e.g. the fronting of a back vowel in the vicinity of a front vowel, comes to be the single distinctive property of a form in a morphological paradigm. Lass (1990) calls this “exaptation”, borrowing a term from evolutionary biology, which was introduced by Stephen Gould and Elizabeth Vrba for “any organ not evolved under natural selection for its current use — either because it performed a different function in ancestors … or because it represented a nonfunctional part available for later cooptation” (Gould (1991: 144)). An example of an exaptation would be the development of fins out of legs in seals and other marine mammals.4

The analogy between biology and linguistics leaks, however. If a previously terrestrial mammal comes to live in water, swimming becomes more useful than walking — thus legs may acquire a new function. But in the case of the morphologization of allophonic features, no new function arises. Rather, if for instance an inflectional suffix disappears, whatever allophonic features were conditioned by it may contribute to
the preservation of the distinctness of the form — thus these features take over a function that was already there. In fact, in enhancing the redundancy of the form, they served this function even before the disappearance of the suffix, and this fact may even increase the probability of the reductive change. Better analogues of exaptation in linguistics would be those cases of “pattern spread” where an already existing pattern is employed in a new use, or in the lexical realm, when a word such as mouse is recruited to denote a new kind of object such as a computer pointing device. For the cases mentioned earlier, I personally prefer Matisoff’s term “Cheshirization” (1991) (after the Cheshire Cat in Alice in Wonderland, who “vanished quite slowly, beginning with the end of the tail, and ending with the grin, which remained some time after the rest of it had gone”).

4. Dennett (1996: 281) criticizes the notion of exaptation as being empty: “every adaptation is one sort of exaptation or the other — this is trivial, since no function is eternal; if you go back far enough, you will find that every adaptation has developed out of predecessor structures each of which either had some other use or no use at all.”
8.5 Reanalysis and structural change

According to Hopper & Traugott (1993: 32) “the most important mechanism for grammaticalization, as for all change” is unquestionably reanalysis, a notion defined by Langacker (1977: 58) as: “a change in the structure of an expression or class of expressions that does not involve any immediate or intrinsic modification of its surface manifestation”.
There is, however, no unanimity with regard to the relation between reanalysis and grammaticalization. At one end of the scale, we find those generativists who agree with Hopper & Traugott in attributing a central role to reanalysis but who differ from them in seeing grammaticalization as secondary to it, even as an epiphenomenon (e.g. Newmeyer (1998)). At the other, we find Haspelmath (1998), who argues that reanalysis is a very restricted phenomenon with little or no relevance to grammaticalization. It could be argued that Langacker’s definition really says nothing substantial except that reanalysis is structural change. The clause “that does not involve any immediate or intrinsic modification of its surface manifestation” can hardly exclude any cases of structural change from being labelled as reanalysis since you cannot ever really tell if any observed modification of the surface manifestation is really “immediate” or not. The idea behind the cited clause is that overt manifestations of the structural change (sometimes referred to as “actualization”) will typically come about only gradually. This theory has an obvious problem: it is not verifiable. As Haspelmath (1998: 341) notes, “it allows one to posit non-manifested reanalysis as one pleases”. Hopper & Traugott (1993: 63) state categorically about reanalysis: “It is covert.” Given the above, I think it can be said that the explanatory value of the notion of reanalysis for grammaticalization has been somewhat overrated in the literature,
the more so since the way the term is used often makes it more or less vacuous or at least synonymous with “structural change”. Thus, when Harris & Campbell (1995: 71) say that “through reanalysis, postpositions have developed from nominals which bear certain case endings in Finnish,” the phrase “through reanalysis” tells us nothing that is not already inferrable from the rest of the sentence, since a change from a case-inflected NP into a postposition can hardly be anything but reanalysis, on Langacker’s definition. Similarly, when the same authors say that the main clause in the French construction est-ce que [ɛskə] ‘is it that’ + S has been “reanalyzed as a sentence-initial question particle”, this could equally well be formulated as “has developed into a sentence-initial question particle”, or even, “is used like a sentence-initial question particle”. Still, when Hopper & Traugott (1993) and Harris & Campbell (1995) speak of reanalysis as a major “mechanism” in grammaticalization and language change, respectively, this indicates that something more substantial is intended: a mechanism suggests a causal chain. Cf. also in this connection Traugott (1994: 1483), where reanalysis is said to be the most usual among “mechanisms of change leading to grammaticalization”.

The development of French [ɛskə] is worth considering in some detail. Presumably, what happens in such a case is that a periphrastic construction like ‘is it the case that S?’ comes to be used increasingly as a standard way (not necessarily the only one) to formulate yes-no questions. In being a fixed string introducing questions, [ɛskə] is thus functionally indistinguishable from a question particle. What is problematic here is whether this fact already makes it a question particle, or if something more is required. We may assume that after a while the connection to the source is lost and speakers acquire [ɛskə] as a single morpheme, at which stage its status as a question particle seems indisputable. If this is what reanalysis means, it amounts to loss of structural information, or “forgetting”. On the other hand, such a complete fusion is not usually seen as a necessary condition for analyzing something as a grammatical morpheme. The essential problem, again, is whether categorization is determined by function, as a usage-based model would imply, or if it has a more active role as the driving force in the process.

Haspelmath (1998: 339–341) argues that reanalysis must be seen as abrupt, whereas grammaticalization is gradual. Among his examples of gradualness in grammaticalization is the change from oblique experiencer to subject with verbs such as like in English, where “there is widespread agreement that such a change can be gradual in that the experiencer nominal may acquire the various subject properties one after the other rather than all simultaneously”5 (339). One may ask
5. Gildea (1997) gives an account of a development of transitive (ergative) constructions out of passives in Cariban languages along similar lines.
how different this is from the account given of the gradual actualization of reanalysis. The problem, I think, boils down to the question of the status of the structural analysis. All authors seem to agree that there must already be a potential ambiguity in the structure before the reanalysis takes place. On the other hand, according to the view proposed by Timberlake (1977), there is a strict temporal ordering between reanalysis and actualization, and a speaker will in principle never have more than one analysis in his or her head. Even if consequences of the old analysis may linger on for a long time, they will be wholly unmotivated by the structure (Timberlake (1977: 153)). Pace Timberlake and Haspelmath (1998: 340–341), who thinks that it is “highly implausible” that a speaker can entertain two different analyses at the same time, I would submit that there is not really any very good argument against such a possibility. Given the view of grammatical structure suggested in 5.4 above, this would in fact be a normal situation. For instance, in the transition from oblique experiencer to subject, the old analysis of the NP as an indirect object would gradually fade while the new analysis gains in strength. Haspelmath makes a valid point in this connection, however. In the “abrupt” view, the gradual actualization is caused by the reanalysis — the output gradually adapts to the structure. On the assumption that both analyses are present during the whole process, there is no automatic reason why the new analysis should be preferred to the old one. Actually, this is only a problem in the initial phases of the process. As soon as the new analysis becomes stronger, it will be able to “exert attraction” and influence the transition from the old analysis. But to explain why the process starts in the first place we have to invoke other forces that may not always be easy to identify. That structural change takes place in grammaticalization and in maturational processes in general is in my opinion hard to deny, and whether we should call it “reanalysis” or not seems to be a terminological question. What is more problematic is what place structural change has in the causal chain underlying these processes, in particular, what role it has in relation to reductive changes in the output. Hopper & Traugott (1993) make somewhat contradictory statements about this. On the one hand, “signal simplicity typically results from the routinization (idiomatization) of expressions” (p. 64), on the other, it is said about the development of the progressive that “once the reanalysis has occurred, be going to can undergo changes typical of auxiliaries, such as phonological reduction” (p. 3). Also, it is said that morphologization (presumably a type of reanalysis) is “accompanied by” reductions (p. 145). Traugott (1994: 1482) is more unequivocal in putting reanalysis first: “Reanalysis involving boundary loss…is a prototypical case of grammaticalization, since it involves reduction and subsequent phonological attrition.” Harris & Campbell (1995) allow explicitly for both possibilities. Thus, they take the view that reanalysis may be triggered by a phonological change, such as the “erosion” of the Nuclear Micronesian verb that preceded the rise of the incorporating
construction discussed in (10.2), but also mention a contrary example: the Old Georgian relative pronoun romel(-i) was reanalyzed as a relative particle and later phonologically reduced to ro(m) (p. 81). In my own view, those reductive processes that are specific to maturational processes are driven by changes in the informational status of expressions, and structural changes will in general be caused by rather than be the causes of reduction. On the other hand, more general phonological changes may also motivate restructurings, as when grammatical morphemes are totally obliterated.

As mentioned above, it is usually claimed that a structural ambiguity must exist for reanalysis to take place. If this ambiguity is the result of another recent change, that change is naturally seen as the original trigger of the reanalysis. If, on the other hand, the ambiguity has been “endemic” for a longer time, it is harder to see how reanalysis happens. As Haspelmath notes, rather disparate phenomena have been subsumed under reanalysis. Thus, a major distinction should be drawn between
– relabelling or recategorization, that is, the assignment of a different category label to a single element in a structure;
– rebracketing or better: regrouping, that is, changes in the ways elements are grouped or otherwise interrelated.
The distinction plays a crucial role in Haspelmath’s refutation of the relevance of reanalysis for grammaticalization. Relabelling does occur in grammaticalization, he says, but it is not really reanalysis, since it exhibits properties that are not characteristic of reanalysis, such as gradualness and unidirectionality. Rebracketing (regrouping), on the other hand, is indeed reanalysis, but is not grammaticalization. Two of Haspelmath’s examples of rebracketing are:

– the integration of the prepositional phrase for NP with a following infinitival clause in the history of English:

(75) (it is good (for him) (to stay)) > (it is good (for him to stay)) [my example]

– the rise of possessive NPs from external possessor constructions as in German dialects:

(76) German
 a. Da zerriss (dem Jungen) (seine Hose).
    then rip.pst (def.dat.m.sg boy.dat.sg) (his.nom.f.sg trousers)
    ‘Then the pants tore on the boy.’
 b. Da zerriss (dem Jungen seine Hose).
    then rip.pst (def.dat.m.sg boy.dat.sg his.nom.f.sg trousers)
    ‘Then the boy’s pants tore.’
This latter case exemplifies a rather common development, at least in European languages, although there are different varieties — the possessive pronoun may or may not be present, like the explicit dative case marking. The examples cited cannot be subsumed under reanalysis, Haspelmath says, because in them (i) “no particular element needs to become more grammatical(ized) as a result of the change” and “the whole construction does not necessarily become tighter”, (ii) hierarchical relations are changed in an abrupt manner, (iii) “reanalysis is potentially reversible, whereas grammaticalization is essentially irreversible”.

The notion of abruptness was discussed above (p. 172). The fact that no particular element becomes grammaticalized appears irrelevant in a construction-oriented view of grammaticalization (grammatical maturation). With regard to tightness, it does seem to me that in both the examples discussed here, there is in fact an increase, since in both cases, we witness the creation of a new complex syntactic unit, in (75) an embedded clause construction, in (76), a possessive NP. (The notion of tightness will be further discussed in 8.6.) As for the reversibility of the changes, Haspelmath acknowledges that there are no attested cases of reversal of these particular developments, and it does seem rather improbable that they could occur.6

As evidence of regrouping as a directed change, consider also a ditransitive schema like the following:

(77) NP1 shake NP2 hand

that is, an agentive subject and a verb to shake with two arguments, one denoting a person and the other a part of that person’s body. Cross-linguistically, such a schema may be the source of two possible developments, both involving regrouping. One is the one exemplified in (76) above and is most simply displayed as

(78) NP1 shake [NP2 hand]NP3

where NP3 is a possessive noun phrase. In the other, “shake” and “hand” are integrated in that “hand” is incorporated into the verb (see also 10.2):

(79) NP1 hand-shake NP2

(in real life, this is probably most common if the verb and the object are contiguous in the source). In both cases, there is an increase in tightness — a transfer of complexity downwards, from the VP level to one of the VP’s constituents, and reverse changes are unlikely. These common developments thus exhibit a clear directionality.
6. Admittedly, there are developments (as in Modern Greek) where an original genitive takes over the territory of a dative case, which might be interpreted as a reversal.
Now, the development in (76) differs in an interesting way from other cases of putative reanalysis discussed in this section. The reanalysis becomes manifest only when the expression dem Jungen seine Hose starts being used outside of the external possessor construction, as a possessive noun phrase. At the same time, it is hard to see how that could take place without the reanalysis having already happened. Arguably, then, the reanalysis is a precondition for the pattern spread, rather than the other way around, as in for instance the creation of the French question particle. This takes us back to the issue of the nature of analogical change. Hopper & Traugott (1993: 56) make a rather sharp distinction between reanalysis and analogy, saying that reanalysis “refers to the development of new out of old structures” and is covert, whereas analogy “refers to the attraction of extant forms to already existing constructions”. However, as noted by Kiparsky (1992), analogy may presuppose reanalysis. Consider for instance a word such as workaholic formed by analogy to alcoholic and presupposing an ahistoric analysis of the latter into alco- and -holic (or -aholic?). Similarly, cheeseburger and chickenburger exemplify a pattern X-burger which is inspired by hamburger but demands that one forgets about the connection between this word and the city of Hamburg, and Irangate and other names of political scandals in -gate rest on a reanalysis of Watergate. Presumably, the possibility of interpreting the morpheme -gate as meaning ‘political scandal’ has existed ever since the Watergate break-in but became manifest only when the other names were created. More serious-sounding examples are the forms yourn and hisn found in certain dialects of English, analogical formations based on mine and thine, where the final -n, which is originally part of the stem, was reanalyzed as a suffix, differentiating the independent possessive pronouns from the bound ones. But the main empirical evidence for this reanalysis is the existence of the analogically formed yourn and hisn. Until these forms come about, the correct morphological analysis of mine remains undecided, like the fate of Schrödinger’s cat as long as it is locked up in its box, in the famous thought experiment from quantum physics. In fact, Hopper & Traugott (1993: 57) come close to this line of thinking when they say that “the workings of analogy, since they are overt, are in many cases the prime evidence…that a change has taken place”. But this undermines the neat distinction between reanalysis and analogy, the latter becoming the empirical manifestation of the former. Reanalysis and the generativist critique of grammaticalization. Generative linguists, who tend to be skeptical towards the notion of grammaticalization, or at least towards some of the claims by functional linguists in connection with it, often argue that grammaticalization is in fact reducible to reanalysis. Let us look at two such proposals. I mentioned above Newmeyer’s objections against treating the numeral ‘1’ and indefinite articles as the same category. Let us look at his argument in more detail.
Newmeyer (1998) attempts to “deconstruct” the notion of grammaticalization in two steps. In the first step, he argues that grammaticalization must by definition involve reanalysis “since we say that grammaticalization has taken place only if there has been a downgrading, that is a reanalysis, from a structure with a lesser degree of grammatical function to one with a higher degree”, and that when Heine et al. (1991: 2) say “assumes a (more) grammatical function”, this “seems like a simple rephrasing of ‘is reanalyzed categorially’” (244). After a discussion of the grammaticalization of articles, he concludes that he knows “of no convincing cases involving a lexical unit or structure assuming a (more) grammatical function in which reanalysis is not implicated” (245).

The crucial point here is whether differences in function have to entail differences in categorial status. In fact, the question of whether two different uses of phonetically identical (or similar) items should be counted as belonging to the same lexeme and thus to the same category or rather as belonging to two different lexemes from different categories is notoriously one of the more difficult problems of identity in linguistics (see 7.5). Newmeyer argues against earlier claims (Heine et al. (1991: 219)) to the effect that there is grammaticalization without category change in cases such as the development of indefinite and definite articles from numerals and demonstratives, respectively. Thus, according to him, “a longstanding view of the numeral/article distinctions is that the former belong to the category Q(uantifier), while the latter do not” (244).

My own personal view on this matter is coloured by the fact that in the variety of Swedish that I speak, there is no clear phonetic distinction between the numeral ‘one’ and the indefinite article — rather there is a continuum from [ɛnː] to [ən], and the spelling is en for both. I remember being a bit confused when I first heard at school that there was supposedly a difference, wondering if this was not an idea inappropriately copied from English grammar. The Swedish Academy Grammar (Teleman et al. (1999: 1.203)) defines “indefinite article” as “the numeral ‘one’ when it is unstressed and is not used to contrast against other numerals but is still often syntactically obligatory, instead marking the referent as individuated and indefinite” [my translation]. Indeed, one may wonder if the exact categorial status of morphemes like Swedish en is not in the eye of the linguist rather than anywhere else in such cases.

Other examples of the same kind not discussed by Newmeyer are the use of prepositions such as English by or French par for marking agentive phrases in passive constructions, which would appear to be more grammatical than other uses of the same words. Similarly, compare the use of adpositions for direct and indirect object marking to the same elements when appearing in locative constructions. The heterogeneity of the class of prepositions is one of the major problems for the hypothesis that there is a neat division between “content words” and “function words” (Hudson (2000)). Likewise, when possessive markers become obligatory with inalienable
nouns (see 7.6.1), this is arguably a case of grammaticalization, but so far nobody has argued (to my knowledge) that it entails a change in categorial status.

The second step in Newmeyer’s deconstruction of the notion of grammaticalization consists in demonstrating that there is no necessary connection between reanalysis, on the one hand, and the other changes that have been argued to accompany grammaticalization, on the other. So far I can agree, but I think he goes too far when trying to imply that the processes are wholly independent of each other. In the section devoted to the problem (252–259) there are two lines of argument that are not kept apart very clearly. One line aims at showing that a certain change α can appear without another change β, and the other that even if α and β occur together, there is no direct causal link between them. I find the first line of argument rather irrelevant to the issue — it is like saying that since love and sex can occur without each other, it is not interesting to look at what happens when they occur together. Judging from what he says on p. 250, Newmeyer does think that downgrading reanalysis and semantic change may come together and be causally linked. For phonetic reduction, as we have already seen in 8.2, he takes the stronger position that it is not really connected to other components of grammaticalization.

In the other generativist critique of the notion of grammaticalization that I want to mention, Roberts & Roussou (1999) attribute an even stronger role to reanalysis, claiming that grammaticalization is essentially “an instance of reanalyzing lexical into functional material”. Moreover, the general driving force is a striving for structural simplification. Curiously enough, however, their way of arguing for this is the opposite of Newmeyer’s, since they do not see the alleged component processes of grammaticalization as independent of the reanalysis; rather, they find that they are all “straightforwardly captured by the idea that grammaticalization involves the development of new exponents of functional categories”. The arguments they give are basically of the form “Functional categories tend to have property P; it is therefore natural that x gets P when x becomes a functional category”. For instance, concerning phonetic reduction (“attrition”): “lexical heads typically have more semantic and phonological content than functional heads, and as such the development of lexical into functional material will typically involve the loss of such content” (Roberts & Roussou (1999: 1012))
Obviously, this is only valid as an explanation if the loss of content does not only typically accompany a change from lexical to functional but is also caused by it, and it then remains to explain the mechanism behind this.7 Furthermore, one
would also have to give an account of the consequences of the structural simplification on the use of an element. If a verb meaning want is reanalyzed as a future auxiliary, it must change its behaviour in discourse quite radically — among other things, it has to be used much more often.

7. The same thought was actually expressed in Newmeyer’s generativist critique of the functionalist view on grammaticalization: “I do not disagree with the observation of Heine (1994: 267) that ‘When a given linguistic unit is grammaticalized, its phonetic shape tends to undergo erosion’. But the question is why this should be the case.” (1998: 253)
8.6 Tightness and condensation

Many scholars have used terms like ‘tight’ or ‘dense’ to describe the character of the structures that arise diachronically through processes like grammaticalization. For instance, Givón (1979) describes ‘a number of recurring themes in diachronic syntax’ which ‘represent processes by which loose, paratactic, pragmatic discourse structures develop — over time — into tight, grammaticalized syntactic structures’. What Givón had in mind were developments such as that from topic-comment structures to subject–predicate constructions. But also the change from phrase-level to word-level constructions can be described in terms of tightness, and it is those that I shall have most to say about.

Compare a complex noun phrase such as Peter’s field with the one-word place name Petersfield. If we agree that the latter is “tighter” than the former, and also assume that Petersfield is diachronically derived from Peter’s field, the condensation process has resulted in the transformation of a phrase into a word. Simultaneously, the previously independent words Peter’s and field, which made up the phrase, have been demoted into bound elements of the newly created word Petersfield. Assuming the three levels of phrase, word, and morpheme, we can depict the process as one of hierarchical lowering or downgrading:

(80)  level       before tightening       after tightening
      phrase      [Peter’s field]         —
      word        [Peter’s] [field]       [Petersfield]
      morpheme    [Peter] [’s]            [Peter] [s] [field]
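For readers who find a procedural rendering helpful, the hierarchical lowering in (80) can also be sketched as a toy transformation on nested structures. The representation below is my own illustration, not a formal proposal; its point is that the three morphemes persist throughout, so nothing is lost, everything is regrouped one level down.

```python
# Tightening as hierarchical lowering, cf. (80): a phrase consisting of
# the words [Peter 's] and [field] is demoted to a single word whose
# morphemes are [Peter] [s] [field]. Nested lists stand in for the
# levels phrase > word > morpheme.

phrase = [["Peter", "'s"], ["field"]]    # phrase = words = morphemes

def tighten(phrase):
    """Demote a whole phrase to one word: all its morphemes survive."""
    return [m for word in phrase for m in word]

word = tighten(phrase)                       # ['Peter', "'s", 'field']
print("".join(m.strip("'") for m in word))   # Petersfield
```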
Another way of describing the same thing is to say that a multi-word construction has been transformed into a one-word pattern. In Dahl (2001a), I used this as a basis for defining the notion of tightening, but such a definition misses the point that the whole structure is lowered in the hierarchy. Notice that although the structure moves downwards, it is not really reduced: the elements that make it up are preserved, although demoted to a lower level. In fact, this is characteristic of the tightening processes that we can observe in the maturation of complex linguistic
patterns, in particular of the ones subsumed under grammaticalization — as we have already seen, they can be said to adhere to a general principle of preservation of structure, or if we like, preservation of structural complexity. In fact, it leads to an increase in structural complexity in that more structure is compacted into each word-level unit. In addition, word-level structure tends to be more non-linear than phrase-level structure. In this connection, McWhorter (2001a: 137) argues that even if there is no independent reason for assuming that inflectional marking is more complex than marking with free morphemes, inflection is “more often than not” accompanied by complexifying factors, such as morphophonemic alternations, suppletion, and declensional and arbitrary allomorphy, which, according to him, exert a load on processing. It is perhaps more correct to think of inflectionalization as a necessary than as a sufficient condition for these phenomena, which often develop late or not at all in languages with inflectional systems (thus, there seems to be rather little of them in the Altaic languages — the schoolbook examples of agglutinating languages). It therefore appears desirable to keep the phenomena in question analytically apart from inflection as such. Also, although the phenomena McWhorter mentions are indeed difficult for second-language learners, it is far from obvious that they in general make processing more difficult. The concept of a word is of course far from straightforward. Among many other things, it may be taken to pertain either to grammatical or phonological structure. Thus, one often speaks of “phonological words” which are not necessarily the same units as words at the grammatical level. Correspondingly, in speaking of tightness, one may be thinking either of the grammatical structure or its phonological manifestation. In the latter case, a unit might be said to become tighter if it, for instance, becomes integrated prosodically. In this book, I see changes in structure as typically driven by phonological changes, which are in turn driven by changes in the use of patterns (see 8.1). Thus, structural condensation would depend on phonological condensation — the fusion of two words into one is conditioned by their having been phonologically integrated. The distinction between what is stored and what is produced on-line is usually thought to be reflected in the traditional division between lexicon and grammar, or more narrowly, syntax. The traditional lexical unit coincides with the word, and indeed, it seems difficult to deny words a central status in the lexicon. As linguistic patterns mature, they tend to pass from phrase-level constructions to word-level ones. The transition to the word level does not take place abruptly but rather gradually, in several steps. During this process, then, patterns acquire properties that are characteristic of word-level patterns. In spite of the salient status of the concept of word in our everyday thinking about language, our understanding of the nature of words is still limited. In my view, two interrelated characteristics of words are essential here. One is that a word is the smallest unit that is informationally
autonomous in the sense defined in (2.4). The other is that the word is typically a stored unit rather than one generated on-line. This, in turn, is not unrelated to the fact that words are produced within a limited time-span, which is bound to put a limit on the maximal complexity of the operations that can be performed on word-level structures. We can therefore expect that word-level patterns will exhibit a lesser degree of flexibility and freedom in their structure than syntactic patterns. Productivity is one parameter that is likely to suffer when proceeding from looser to tighter structures. Also, new combinations are more likely to be taken as additions to the lexicon rather than creations on-the-spot.

Two somewhat specific constraints deserve special mention. If an item is stored, rather than produced on-line, this concerns all its aspects. In particular, its meaning will normally also be retrieved from permanent storage rather than computed compositionally, from the constituents. It is a common characteristic of word-level patterns such as compounding that the resulting expression should denote a “unitary concept”. For instance, a compound consisting of an adjective and a noun would tend to denote a well-established and stable “kind” rather than the accidental combination of two properties that the corresponding syntactic combination may well express. (Regrettably, minimal pairs of this type in English are somewhat hard to come by, so the standard example blackbird vs. black bird will have to do here.) The “unitary concept” constraint is most likely a direct consequence of the stored character of lexical information. The same is probably also true of the well-known fact that referential expressions do not fit well into the internal structure of lexical items, which has its counterpart in the apparently universal reluctance to incorporate definite noun phrases. In Chapter 10, we shall see how the constraints on word-level patterns are manifested in incorporation processes.
Chapter 9
Featurization
9.1 Introduction

In 7.1, I defined featurization as “the genesis of higher-level — mainly word-level — features”. In this chapter, I shall look more closely at this phenomenon, which to my knowledge has never been singled out as a separate and unitary concept. Featurization can be seen as a special case of the development of non-linear structures in language, and thus of structural complexity of a particular kind. More specifically, linguistic structure involves not only components related by linear order and the properties of those components, but also properties of hierarchically higher entities. While this is fairly uncontroversial for syntax, many schools of linguistics have been skeptical towards the assumption that non-linear structures are necessary in phonology (at least outside prosody) and morphology. Recently, non-linear models have become more popular in those fields. What I want to show here is that the genesis of non-linearity is an important part of maturation processes in language, in that non-linear components normally develop through such processes from less mature, linear structures. Another way of looking at these processes is in terms of an increase in abstractness, in the sense that structural units arise that do not correspond in a one-to-one fashion to identifiable and independent segments of the output. An increase in abstractness will ceteris paribus increase system complexity, since it makes the mapping between structure and output less straightforward.
9.2 Abstract features in grammar

9.2.1 General

Spoken French contains quite a few word-forms that consist of a single phonological segment. The descriptions given by traditional grammars of some of these are astonishingly complex, however. For instance, the second word in the sentence Pierre a deux filles ‘Pierre has two daughters’ is pronounced [a] but is said to be “the third person singular of the present indicative of the verb avoir ‘have’”. This is certainly no joke: Any change in any of these grammatical parameters is likely to
entail the choice of another form of avoir. Thus, “the first person singular of the present indicative of the verb avoir” is the equally monosegmental [e] (written ai), “the third person singular of the present subjunctive of the verb avoir” is [ɛ] (written ait) etc. These French examples display a discrepancy between output complexity and structural complexity at the morphological level — a single phoneme may simultaneously be the exponent of a lexical item and at least four morphological categories.

The French verb form a is the result of a process of radical phonetic reduction: it derives etymologically from Latin habet. What is remarkable, however, is that the grammatical characterizations of Latin habet and French a are identical — habet is the third person singular of the present indicative of the verb habere ‘have’.1 What we see here, then, is an illustration of the preservation of structural complexity in spite of the reduction of phonetic weight. Furthermore, since habe-t is a regular Second Conjugation verb form, segmentable into at least two if not more morphemes, and the French forms are formed in an almost totally idiosyncratic way, we also see the growth of system complexity at the morphological level. To see the genesis of the structural complexity of the form, we would have to go much further back into the pre-history of Indo-European: most of it will certainly remain hidden forever.

All this presupposes, however, that we accept the traditional morphological description of Latin and French. It is therefore appropriate to consider some problems of morphological description or modelling.

9.2.2 Models of morphological structure

We teach our beginning linguistics students that linguistic utterances are built up from building blocks that are called morphemes. In traditional grammar (as in the common-sense view prevalent at least in Western culture), the word has always been the basic building block. The notion of morpheme was introduced by structuralists only about a century ago, and has been hailed as a great insight. It is indisputable that a word form such as English apples consists of two smaller parts: apple, the stem, and -s, the plural ending, yielding a morphematic analysis as follows:

(81) apple + Plural → apples

Likewise, a verb form such as called would be analyzed as
1. As pointed out by the anonymous referee, Latin habet and a are not functionally equivalent, rather habet corresponds to il a ‘he has’ in French, Latin being a pro-drop language. My point still remains, however, and is perhaps even strengthened by this observation: the development of a more verbose syntax in French is not accompanied by a reduction of the morphological system in spite of a radical phonetic reduction.
(82) call + Past → called

However, if we try to apply morphemic analysis to a form such as drank, we run into problems. Drank is obviously parallel to called in that both are past tense forms (of drink and call, respectively) but unlike called, it cannot be segmented into two parts in any natural way. A mode of representation that can be used for both called and drank will have to be more abstract in the sense that the elements of the representation will not correspond to linearly ordered components of the word form. Rather, it will look something like:

(83) {call, Past} → called
(84) {drink, Past} → drank

where call is a lexical item (lexeme) and Past a property or feature. In more complex cases, several such properties — corresponding to the traditional grammatical or inflectional categories — will be needed to identify an inflectional word form. For instance, the Latin and French verb forms a and habet would be analyzed as follows:

(85) {av-, Present, Indicative, 3rdPerson, Singular} → a
(86) {habe-, Present, Indicative, 3rdPerson, Singular} → habet

Representations such as (81)–(82) correspond to what linguists such as Hockett (1958) and Matthews (1991) have called the ‘Item-and-Arrangement’ model of morphological structure (IA). (83)–(86), on the other hand, conform to what they call the ‘Word-and-Paradigm’ model (WP). Of these two models, Item-and-Arrangement is more consonant with the structuralist morpheme-based thinking, whereas Word-and-Paradigm, as the name indicates, is more in accordance with the traditional word-based description. To sum up the differences, Item-and-Arrangement representations consist of a linearly ordered set of entities of equal status called “morphemes” — the “rosary view” of morphology. Word-and-Paradigm representations consist of a lexeme and an unordered set of morphological properties.

Assuming that morphology relates two kinds of representations of words — one structure which is “visible” to syntax and another structure which is “visible” to phonology — we may say that Item-and-Arrangement and Word-and-Paradigm make different claims about the relationship between these representations, in that Item-and-Arrangement requires a more direct correspondence — ideally a one-to-one relationship — between the elements at the two levels of representation. In keeping with the general morpheme-based ideology of structuralism, Item-and-Arrangement also implies that there are no essential differences in nature between syntactic and morphological objects, or between the morphological objects visible to the syntax and those visible to the phonology — they are all ordered sequences of smaller objects.
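The difference between the two models can be restated as a difference in data structures. The following Python sketch is my own illustration; the function names and the dictionary encoding are invented for the purpose and are not Hockett’s or Matthews’s formalism. An Item-and-Arrangement analysis concatenates an ordered list of morphs, while a Word-and-Paradigm analysis maps a lexeme together with an unordered property set onto a whole word form, which is what lets it handle drank and French a as easily as called.

```python
# Item-and-Arrangement: a word form is an ordered sequence of morphs.
def realize_ia(morphs):
    return "".join(morphs)

realize_ia(["apple", "s"])   # 'apples', cf. (81)
realize_ia(["call", "ed"])   # 'called', cf. (82)
# ...but no natural segmentation of 'drank' fits this mould.

# Word-and-Paradigm: a lexeme plus an UNORDERED set of properties is
# mapped onto a whole word form; no internal segmentation is implied.
WP_FORMS = {
    ("call", frozenset({"Past"})): "called",
    ("drink", frozenset({"Past"})): "drank",   # cf. (84)
    ("av-", frozenset({"Present", "Indicative", "3rdPerson", "Singular"})): "a",
}

def realize_wp(lexeme, properties):
    return WP_FORMS[(lexeme, frozenset(properties))]

print(realize_wp("drink", {"Past"}))   # drank
print(realize_wp("av-", {"Singular", "3rdPerson", "Indicative", "Present"}))  # a
# The order in which the properties are listed is irrelevant,
# exactly as in an unordered property set.
```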
However, as we have already seen, trying to analyze concrete word forms in Item-and-Arrangement may be problematic. Matthews (1991) and Anderson (1992: 51–56) discuss in detail different types of cases which have difficulty fitting into the Item-and-Arrangement model, such as zero expression, fused expression, suppletion, infixing, distributed realization etc. Moreover, syntactic processes such as agreement seem to require abstract properties of the kind envisaged in the Word-and-Paradigm model rather than the classical morphemes of Item-and-Arrangement. But note that not all morphology is equally recalcitrant. As noted by e.g. Matthews, Item-and-Arrangement analyses work fairly well for many languages, in particular for those traditionally labelled “agglutinating”. More generally, if we make a division of morphology into “agglutinating” and “fusional”, it is the latter that really motivates the postulation of a Word-and-Paradigm model. And since fusional structures depend diachronically upon agglutinating ones, it follows that the gap between the “rosary” ideal of structuralism and morphological reality arises and grows with the gradual maturation of linguistic structures.

What this suggests is that Word-and-Paradigm-type structures, consisting of non-linear components such as the features Present, Indicative etc. in (83) and (85), are a result of maturation processes, and that they crystallize in a gradual fashion to become distinct from the concrete analysis of words into “morphs”. Rather than seeing this as a simple accumulation of more and more properties, we may describe it by using the metaphor of a picture or a text emerging and becoming clear on a screen when the focus is adjusted (cf. discussion of subjecthood in 5.4).

Strictly speaking, a separate Word-and-Paradigm structure may arise in at least two ways. On the one hand, we may be dealing with retention of structure: when the surface manifestations of morphemes fuse (which normally happens through processes of phonetic reduction), their identity at the level of grammatical description may still be preserved, resulting in a discrepancy between choice structure and output structure. On the other, maturation processes may also lead to the creation of new structure, for instance, when morphologically simple forms are reanalyzed as members of paradigms and consequently, are assigned a feature structure. Let us look closer at some different cases.

In the narrowest interpretation of the word, fusion takes place when the boundary between two previously separate morphemes is blurred by a reduction process. For instance, in the past tense form bled of the verb bleed we can no longer see where the stem ends and the past tense marker -d begins. At the same time, bled uncontroversially continues to be a member of the paradigm of the lexeme bleed, and there is no doubt that it is grammatically opposed to e.g. the present tense form bleeds. The coalescence of the past tense morpheme with the final consonant of the stem has thus not had any influence on the analysis on a more abstract level. The term “fusion” is however often used in a looser way, where no diachronic process of coalescence between previously separate morphemes can be identified.
One very frequent phenomenon is thus what we, following Matthews, may call distributed realization — Anderson’s term “reciprocal conditioning” is probably a somewhat more restricted notion but covers much of the same ground. It is common for inflectional features to be manifested not only in an affix but also in the choice between different variants of the stem of the word. As an example, consider the Latin present first person singular form fac-io ‘I do’ and the perfect first person singular form fec-i ‘I did’. The realization of the opposition between present and perfect can thus be said to be distributed over more than one word element. In more extreme cases, an inflectional category may exert influence at four or five points in a word. Distributed realization may arise through a number of diachronic processes, but the result, as in the cases of fusion proper, is one of mismatch between surface and abstract structures in morphology. Notice that distributed realization is a case of the information spread-out discussed in 2.3, and thus could be a device for increasing redundancy in order to ensure safe transmission.

The term portmanteau expression is used by structuralists to indicate that one and the same morpheme at the same time marks several inflectional categories, as when Russian -am as in kot-am ‘tom cats’ simultaneously marks plural number and dative case. This is not necessarily a result of coalescence — in many cases there is no reason to believe that there ever were separate morphemes for the different categories. The most trivial case is when what looks like a portmanteau morpheme is in fact a combination of a designated marker of one category and zero marking of another, as in Latin ablative singular mensā where it is reasonable to assume that there has never been a separate marker for the singular number. (See below for further discussion of zero marking.) Likewise, in many languages, there are special markers for animate and/or definite direct objects, but they have not (in the normal case) arisen from fusion of a marker of animacy and/or definiteness with an accusative marker but rather through the expansion of a case marker from indirect objects (which are typically animate and definite) to animate direct objects. Such a process may be seen as creating new inflectional (or in some cases, lexical) categories, thus increasing rather than just preserving morphological complexity (both structural and systemic).

In suppletion, forms share the same paradigm without being derivable or even in any way related to each other. Most commonly, suppletion involves the alternation of stems that have separate historical sources, like English go : went or the synonymous French aller with third person present and future forms va and ira. Such differences in diachronic origin are sometimes taken as criterial for the notion of suppletion. From the synchronic point of view, however, there may not really be any fundamental difference between “true” suppletion and cases where phonological change has totally obscured the relationship between the forms of a paradigm. Thus, the Swedish verb gå [goː] ‘go, walk’ has the past tense form gick [jikː], which at least in the spoken language bears little resemblance to the rest of the paradigm, although etymologically the source is the same.
Suppletion has traditionally been seen as a marginal anomaly and has only recently been subject to systematic study (see Veselinova (2003) for a review of the research on the topic). In the context of this chapter, it is of interest from several points of view. To start with, suppletion gives evidence for, or rather presupposes, the reality of paradigms, and thus of lexical items as abstract entities which are separate from their concrete realizations. Someone who wants to avoid postulating such entities could suggest that suppletive forms such as go : went do not really belong to one and the same paradigm, rather we have two separate although defective paradigms belonging to separate although synonymous lexemes. However, some well-known phenomena argue against this. Thus, suppletive forms such as went tend to show up also in various kinds of lexicalized expressions where go is a component, e.g. a compound verb such as undergo : underwent. This is typically the case even when the lexeme is grammaticalized for instance as an auxiliary, as with French aller, which also has suppletive inflection in constructions such as je vais partir : nous allons partir ‘I am going to leave : we are going to leave’. Anyone who argues for suppletive forms belonging to separate paradigms also must explain the existence of suppletive paradigms with perfect complementarity: the forms that are missing in one paradigm are exactly the ones found in the other. Another way of formulating this is to say that the existence of a suppletive form causes the productive formation to be blocked: we have neither *goed nor *badder in English, these forms being blocked by went and worse. At the very least, this suggests a strong association between the forms in the suppletive paradigm. On the other hand, the question of identity or non-identity between paradigms is not really an all-or-nothing matter: blocking may not always fully work.

The existence of suppletion furthermore inspires speculation on the nature of the processing of morphological forms. Some types of suppletion are very frequent. Thus, an estimated 25–30 per cent of all languages exhibit suppletion between tense–aspect categories with verbs meaning ‘come’ and/or ‘go’ (Veselinova (2003)). In some cases, we can observe how paradigms of entirely different verbs have fused in the course of the historical development of these languages. Between Latin and French, forms from no less than three original paradigms have become united in the verb aller. It is hard to accommodate this fact with the view that suppletion is an aberrant and dysfunctional phenomenon. It may actually be the case that suppletion has some advantages in the processing of high-frequency items, as argued for instance by Ronneberger-Sibold (1980, 1987). If you have to look up the forms anyway, you may as well make them more different from each other.
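The blocking effect just described can be pictured as lookup-before-computation. The sketch below is a deliberately naive illustration and makes no claim about actual mental processing; it merely shows how a stored whole form, if one exists, pre-empts the productive rule, so that the blocked form is never generated at all.

```python
# Blocking as lookup-before-computation: a stored suppletive form
# pre-empts the productive rule, so *goed is never generated.

SUPPLETIVE_PAST = {"go": "went", "be": "was"}   # stored whole forms

def past_tense(verb):
    if verb in SUPPLETIVE_PAST:        # storage is consulted first ...
        return SUPPLETIVE_PAST[verb]   # ... and blocks the regular rule
    return verb + "ed"                 # productive fallback: call > called

print(past_tense("call"))   # called
print(past_tense("go"))     # went (never 'goed')
```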
In this connection, some phenomena that parallel suppletion may be noted. For instance, alphabetic writing systems are based on the principle that each word is represented by a sequence of graphemes corresponding to the phonemes that it consists of in the spoken language. However, high frequency words such as and may sometimes be rendered in an irregular, “suppletive” way: either by a unique symbol such as & or by some idiosyncratic spelling such as och [ɔk] ‘and’ in Swedish (the only native word where k is spelled as ch).

When forms from different lexical items are united in an inflectional paradigm, it may increase complexity in several different ways. Using totally unrelated stems for different forms in the same paradigm quite obviously complicates the mapping between lexical items and inflectional forms. But the recruitment of a form into a paradigm may also mean that this form acquires morphological features that it did not have before. This may be seen as a special case of a somewhat wider phenomenon. Consider nouns such as people and cattle, which denote non-individualized collections of animate entities. Due to their meaning, such nouns in many languages come to be treated grammatically as if they were plural — which primarily means that they take plural agreement.2 It is not possible to subsume this under grammaticalization as it is traditionally defined, as it does not involve a change in the grammatical status of any overt morpheme. Rather, if we assume that overt plurals have arisen through a standard process of grammaticalization, the assignment of plural number to a word like people is a minor embellishment of this process, which, however, helps to increase the non-linear character of the category of number in the language, in that “plural” becomes nothing but an abstract grammatical feature of the word forms.

A further step is to integrate the word forms in question into a larger paradigm. In English, people and cattle are not usually seen as part of suppletive paradigms. However, when asked to provide the plural of person, 40 per cent of a group of American English speakers answered people (Ljuba Veselinova, personal communication). In the traditional descriptions of many languages, the corresponding words are indeed seen as forming one paradigm, e.g. Russian čelovek : ljudi. (Arguably, ljudi has a plural ending -i and is therefore not a wholly parallel case.)

Syncretism is the term used in traditional grammar to denote the formal identity of two or more members of an inflectional paradigm, as when the genitive singular, dative singular and nominative plural of Latin mensa ‘table’ all have the form mensae. At first sight, syncretism might seem to involve a simplification of the paradigm, and thus not belong in a discussion of the creation of new structure. However, what is interesting with syncretism in the context of this chapter is what it implies for the relationship between the levels of morphological representation. We would not postulate syncretism if the genitive were always identical to the dative; then, we would simply say that there is just one category rather than two.
2. The plurality of people is apparently a recent phenomenon in English. Compare Acts 11:24 in the King James Version (1611), much people was added unto the Lord, to a modern version: many people were added to the Lord.
Syncretism presupposes that even if two cells in the paradigm are identical in form, there is good reason to suppose that they are indeed two cells rather than one. This may happen in different situations. One is when the cells involved have totally different characterizations, as in the Latin mensae case. Another is when two forms differ on one dimension only but the distinction is necessary in some other part of the paradigm. In Latin, the dative and ablative cases are never formally distinct in the plural but are still supposed to be there in the paradigm since they are systematically distinguished in the singular. The most controversial situation is when the distinction is only motivated by its necessity in some other paradigm, that is, with another lexical item, because such cases make possible an analysis in which the distinction is seen as undefined for the lexical items where it is not overtly marked. We shall return to this kind of problem below. In general, syncretism implies that the relationship between surface and underlying representations is not one-to-one and that the underlying structure is more complex than would theoretically be needed to distinguish the set of forms in question. It could also be argued that syncretism of the mensae type, where the identical forms differ on at least two dimensions, is in fact an economical way of creating "smart redundancy" (see 2.3). The question of the nature of zero marking is highly relevant to the issues discussed here. If a grammatical feature lacks an overt exponent, this may be due to a number of causes. To start with, a marker may have been wholly obliterated by phonetic reduction processes. Thus, due to the general truncation of final segments in French, many nouns and adjectives do not overtly distinguish singular and plural, or masculine and feminine in the spoken language (e.g. [blø], which may represent orthographic bleu, bleue, bleus and bleues). In many cases, such zero marking is only apparent, in that there is some other difference between the forms. Also, the markers may be zero in some and non-zero in other contexts (as when the French plural [z] shows up before a vowel — the sandhi phenomenon called liaison). Zero marking due to phonetic reduction is really only a special case of the mismatches between output and abstract structures which arise from such diachronic processes. Other types of zero marking are of quite another nature, as we shall now see. If a phenomenon occurs as a matter of course, more specifically, in the majority of a certain set of situations, its absence becomes noteworthy. In particular, if x is obligatory under certain circumstances according to some societal rule, the absence of x forces us to assume one of two possibilities: either those circumstances do not obtain or the rule is being violated. This applies directly to grammatical marking. If the use of a grammatical marker, or of a certain construction, is obligatory (or less strictly, is expected) under certain circumstances, the non-use of that grammatical marker or construction induces the conclusion that the circumstances are not present, or else the speaker is violating the rules of the language. Thus the absence of an item becomes meaningful (Bybee et al. (1994: 294–295)), or perhaps better, informative (since normally the "meaning" would be incidental to the intended message).
For instance, suppose that it becomes obligatory, or at least a routine, to use a certain marker when conveying information from a second-hand source. As a result, leaving out that marker will suggest that the speaker has first-hand information — e.g. that she has witnessed the event she is describing. Furthermore, this leads to a situation where it becomes natural to see the two alternatives 'presence of marker : absence of marker' as members of a binary grammatical opposition — a grammatical category of evidentiality. To the extent that we feel the need for postulating such entities in grammars, they are clear examples of maturation (or grammaticalization) processes leading to the increase in system complexity through the creation of abstract entities that do not correspond directly to concrete elements in the signal. Grammatical oppositions played a central role in structuralist thinking, in particular in the European tradition of linguistic structuralism. In fact, it can be said that for many linguists of this school, oppositions, especially binary ones, were the prototypical model for the organization of language as a whole. The Saussurean principle that there is nothing but differences in language reflects the same way of thinking. According to this idea, the interpretation of an expression is determined by the other members of the paradigm that the expression enters into, and the meaning of what you say depends as much on what you do not say as on what you do say. However, treating this as a general principle of language blurs the fact that other elements of the language may not work the same way as obligatory grammatical categories in this respect. For instance, in a language where the use of second-hand information markers is not obligatory but rather guided by considerations of relevance, the absence of such a marker does not necessarily lead to the assumption that the speaker has witnessed the event but only implies that she does not deem it necessary to reveal the source of information. In the lexical realm, it is even less probable that the absence of something becomes meaningful. There are two different levels at which we can speak of obligatoriness in connection with inflectional categories. On the one hand, we may speak of the obligatoriness of a marker, on the other, of the obligatoriness of a category. For instance, if there is an obligatory marker of second-hand information, we could also say that the category of evidentiality is obligatory — a considerably more abstract statement. Notice that in order to verify it, we cannot just look at individual sentences, since an obligatory category is not necessarily overtly marked in every case. (87)–(88) do not contain a single overt grammatical morpheme.
(87) Sheep bleat.
(88) Fish swim.
Yet anyone who has any acquaintance with English grammar would agree that they contain a plural subject noun and a verb in the present tense. What the obligatoriness of a category really means is that it contains at least one member that is
obligatory given certain circumstances. We say that (87) is a tensed sentence even in the absence of an overt tense marker because we relate it to the overtly marked sentence (89).
(89) Sheep bleated.
The establishment of a grammatical opposition between the presence and absence of a marker somewhat paradoxically opens the door for the possibility of letting this grammatical opposition also be realized in other and seemingly opposite ways: the presence of one marker may be seen as equivalent to the absence of another. In structuralist terms, a "privative" or asymmetric opposition may develop into an "equipollent" or symmetric one, that is, one in which the two members are on a par with each other. In one type of development, two privative oppositions merge. For instance, a number system may involve both plural and singulative markers, as in the following Maasai examples (Payne (n.d.); markers in bold):
(90) Maasai
     singular            plural
     ɛnk-ají 'house'     ɪnk-ájí-jík 'houses'
     ɔl-ákír-á 'star'    ɪl-ákír 'stars'
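The featural reading of such a system can be given a toy rendering in code. The sketch below is a minimal illustration (in Python); the Maasai forms are those of the table above, but the data layout and all names are my own invention:

    # Two privative oppositions (plural marking, singulative marking) read
    # as exponents of a single abstract NUMBER feature. Forms follow the
    # table above; the representation itself is hypothetical.
    LEXICON = {
        # lexeme: (singular form, plural form)
        "house": ("ɛnk-ají", "ɪnk-ájí-jík"),   # plural overtly marked (-jík)
        "star":  ("ɔl-ákír-á", "ɪl-ákír"),     # singular overtly marked (-á)
    }

    def realize(lexeme, number):
        """Map lexeme + abstract feature value to a surface form. Whether
        the exponent is a plural suffix, a singulative suffix, or nothing
        at all is invisible at this level: NUMBER is simply a property of
        the whole word form."""
        singular, plural = LEXICON[lexeme]
        return singular if number == "SG" else plural

    assert realize("house", "PL") == "ɪnk-ájí-jík"
    assert realize("star", "SG") == "ɔl-ákír-á"

On this view, the direction of overt marking (plural vs. singulative) is a fact about exponence rather than about the feature itself, which is what the merger of the two privative oppositions amounts to.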
Other examples are found in tense–aspect systems, in particular, with respect to the perfective–imperfective distinction. In Classical Greek (and other early Indo-European languages), verb morphology employs different stems to form e.g. the present and the aorist, but there is no systematic derivational relationship between them — rather, several different morphological processes may serve to mark off one of the stems, or both. Cf.
– from the root lab-: lambanō 'I take' : e-lab-on 'I took'
– from the root lu-: luō 'I solve' : e-lu-s-a 'I solved'
– from the root gnō-: gi-gnō-skō 'I learn' : e-gnōn 'I learnt'
The formatives in question apparently had more specific, 'Aktionsart'-like meanings before they were recruited into the aorist : present (perfective : imperfective) opposition. The aspect system found in Russian, with counterparts in other Slavic languages, shows another variant of the creation of an equipollent opposition. Although the details of the process are partly hidden in prehistory, we may observe in the present-day system of Russian that whereas adding a preverb to a simplex imperfective verb yields a perfective verb, the addition of an imperfectivizing suffix to the latter takes us back to imperfectivity:
(91) Russian
     imperfective                  perfective
     pisat' 'write'
                                   perepisat' 're-write'
     perepisyvat' 're-write'
In other words, simplex imperfective verbs and prefixed verbs with imperfectivizing suffixes are treated as equivalent by the aspect system, although they have very different structures. The creation of equipollence is yet another way of emancipating grammatical structure from concrete morphemes in that what are originally markers with opposite effects come to be seen as parts of the same general and more abstract scheme, that is, they become markers of the two poles of a binary opposition. This may be taken even further in that an opposition may come to involve more than two values. For instance, in the early Indo-European system, nouns have up to eight cases and three numbers. It is taken for granted in traditional grammar that for instance all the six cases in Latin (possibly excluding the vocative) are on a par with and constitute alternatives to each other — any noun form has one and only one case, adjectives agree "in case" with nouns etc. Such equality between cases is actually quite unexpected, given that cases typically are rather different with respect to degree of maturation and most probably have originated at different points in time. Another, slightly different process, which we have already touched upon in connection with suppletion, is the blurring of the distinction between lexical classes and morphological features. In English, collective nouns, when used of sets of persons, may take plural agreement, although they do not contain a plural morpheme. On the other hand, some words — pluralia tantum such as scissors — are always plural, regardless of what they refer to. That is, it is no longer possible to see plural as an operation that converts a singular noun into a plural one by adding an ending — it is an abstract feature of the noun or noun phrase, which can be determined by many different criteria. Similarly, although most treatments of Russian aspect say that simplex verbs are imperfective, a sizeable number are actually perfective, in particular ones with inherent telic meaning, such as dat' 'give', stat' 'stand up, become', sest' 'sit down', leč' 'lie down'. Inherently atelic verbs, on the other hand, are often unpaired imperfectives. Thus, the imperfective : perfective opposition in Russian is to a significant degree based on lexical meaning. As we saw above, the morphological properties of a word may depend on the absence of a marker as well as on its presence. However, this is pointless if the marker is not present in other word forms. There are really two kinds of reasons why we say that a form such as sheep in (87) is plural rather than outside the system of number altogether. One is that most words (such as goats or lambs) do have overt plural markings in a similar context. The other is that other words in the sentence (such as bleat in (87)) agree with sheep — that is, they "inherit" its number. Saussure stressed
the dependence of the properties of linguistic elements on their paradigmatic relations. (Had the term "emergent property" been more in vogue in his time, he might have used it in this connection.) But we see here that the dependence extends to what structuralists called the syntagmatic axis — we ascribe properties to items not on the basis of their own behaviour but on that of their neighbours. When applied to morphology, traditional definitions of grammaticalization like those derived from Meillet, which focus on the development of grammatical markers or formatives from lexical elements, strongly induce a morpheme-based approach, detracting from the question discussed in this chapter, viz. the genesis of morphological features. This may result in a discrepancy between Word-and-Paradigm or feature-based synchronic morphological descriptions and the diachronic accounts of their development. It also detracts from the structure-building aspects of grammatical maturation processes. On the other hand, we do have to think of those processes as the spread of markers, particularly if we want to explain the distribution of overt vs. zero marking of grammatical oppositions. In other words, it is sometimes necessary to be able to look at grammatical structures in more than one way.
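As a concrete, if deliberately simplified, rendering of this dual perspective, consider the following sketch (the Latin forms are those cited earlier in the chapter; the representations and names are assumed for the purpose of illustration):

    # The same Latin word form viewed in two ways: as a linear sequence of
    # morphemes, and as a lexeme paired with an unordered feature bundle
    # (the Word-and-Paradigm view).
    morphemic_view = ["mens", "ae"]                   # stem + ending, ordered

    wp_view = ("MENSA", frozenset({"NOM", "PL"}))     # lexeme + feature set

    # On the second view, syncretism is simply a many-to-one mapping from
    # cells (feature bundles) to surface forms:
    PARADIGM = {
        frozenset({"NOM", "SG"}): "mensa",
        frozenset({"GEN", "SG"}): "mensae",
        frozenset({"DAT", "SG"}): "mensae",
        frozenset({"NOM", "PL"}): "mensae",
    }

    syncretic_cells = [cell for cell, form in PARADIGM.items() if form == "mensae"]
    assert len(syncretic_cells) == 3    # one surface form, three distinct cells

Nothing in the surface string distinguishes the three syncretic cells; the distinctions live entirely at the feature level, which is one sense in which the underlying structure is more complex than the set of surface forms alone would require.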
9.3 The inflectional model
As we have already seen, the Word-and-Paradigm model is really a theory of the input to the morphological component of grammar: what information morphology needs in order to output the right kind of phonological structure. As was noted above, Word-and-Paradigm representations consist of a lexeme and an unordered set of morphological properties. It is, however, a bit more constrained than this characterization implies: the properties are really values of a small set of parameters that are specific to a part of speech — inflectional categories. Thus, a Latin noun is said to have the inflectional categories number and case — the former has two values, singular and plural, and the latter six: nominative, genitive, dative, accusative, ablative, and vocative. The total number of possible combinations of values is thus 2 × 6 = 12. It is typical of inflectional systems that they contain such closed sets of possible values that can be arranged in paradigms, more or less neatly. In order to appreciate fully the implications of this state of affairs, we must make a digression and discuss some systems that do not have this architecture. One such system is tense logic, as formulated by the logician Arthur Prior (1957, 1967, 1968). Prior obtained tense logic by enriching the usual systems of propositional and predicate logic with a set of tense operators, such as F 'it will be the case (at some time in the future) that…', or P 'it was the case (at some time in the past)
that…'.3 If p is a sentence meaning 'It is raining', we might thus form a new sentence Fp 'It will be the case that it is raining'. An operator can thus be described as an expression that forms new expressions out of already existing ones. However, in an approach where the notion of construction is central, it might be more natural to say that an operator is really the head of a construction. In any case, an essential feature of operators in systems like that of Prior is that input and output belong to the same category. You start out with a sentence and the result is a sentence. This means that an operator can be applied again to its own output (e.g. FFp 'It will be the case that it will be the case that it is raining'), or to the output of another operator (FPp 'It will be the case that it was the case that it is raining'). A single unrestricted operator is sufficient to generate an infinite set of expressions. For instance, with the help of an operator expressing the successor relation, it is possible to generate expressions denoting all the natural numbers (of which there are, as is well known, infinitely many). Consider now instead a system of features, binary or otherwise. Jakobson et al. (1963) proposed that all speech-sounds are representable in a universal system of twelve binary features. Such a system of course has a much larger state-space than e.g. the tense system of a natural language with two or three binary features. However, even if the number of possibilities generated by such a system is large — 2¹² or 4096 — it is, crucially, finite, in contradistinction to an operator system like Prior's. The question is now what kind of system we need. It is easy to show that Prior's system generates possibilities that are not realized in natural languages. Assume for instance that we regard English will as equivalent to Prior's operator F. There is no way of getting two instances of will into one clause: *It will will be raining. Prior's own circumlocution 'it will be the case that…' is in fact nothing but a sly attempt to get around this uncomfortable fact. On the other hand, it is clear that there are differences here between different languages and between different expressions in any one language. In English, the modals may and will are combinable with the perfect, but only in one direction: it may have been that way but not *it has might been that way. In Swedish, on the other hand, both det kan ha varit så 'it may have been so' and det har kunnat vara så *'it has could been so' are possible. In English, there is also a difference between will and be going to:
(92) Many people have been going to marry Susan. (Dahl (1985: 18))
(93) *Many people have would marry Susan.
and the following, with two be going to, is at least understandable, if not exactly natural:
3. Actually, this looks more like the perfect than the past. But this is not relevant to the discussion here.
(94) Many people are going to have been going to marry Susan.
It will be objected that this is because true modals in English lack perfect participles. On the other hand, it appears that the development of similar restrictions on combinability is part of the maturation of the constructions. It is hardly an accident that the less mature be going to construction is more easily embeddable than the more mature will. Similarly, whereas the Perfect can be embedded under other operators, this is totally impossible for the Simple Past in English. Matthews (1991: 213–214) claims that non-recursivity4 is a characteristic trait of morphology: one cannot, he says, form a Future stem of a Latin verb, then an Imperfect from that and apply the Future once more to the result. Although some counterexamples are found in derivational morphology — Matthews mentions cases like Turkish double causatives and Portuguese double diminutives — inflectional morphology is always strictly non-recursive, he claims. In fact, this follows directly from the assumption that the contribution of inflectional morphology is a fixed set of features or dimensions with a fixed set of values. If the ultimate sources of inflectional paradigms are syntactic structures built up by recursive phrasal constructions, it follows that maturation processes leading to inflectional structures must involve a gradual reduction of recursivity. Note, however, that we also find recursivity lacking in some places in syntax where it would be expected from the logician's point of view. Sentence negation is the most obvious example. In most languages, a simple sentence cannot be negated more than once: *It was not not raining is not possible. (Apparent counterexamples seem always to contain some intervening logical operator, as in a Russian example such as Ona ne možet ne otvetit' '(lit.) She cannot not answer = She cannot help answering'.) A non-recursive construction has ceteris paribus a lesser expressive power in that it generates a finite set of expressions rather than an infinite one. A maturation process that leads from recursivity to non-recursivity thus means a decrease in expressivity. However, this decrease concerns the construction in question, not necessarily the language as a whole. I noted above that the Word-and-Paradigm model is more attractive for "fusional" than for "agglutinating" morphology. However, within one and the same language it is not equally applicable to all kinds of morphology. Above, the main
4. Matthews defines "recursive" in the following, somewhat vague, way (Matthews (1991: 213)): "one may build a sentence by the repetition — once, twice or, in principle, indefinitely — of the same or essentially the same process". In the tradition of generative grammar, a "recursive rule" is one where the same category shows up on both sides of the arrow, e.g. S → S and S. A grammar containing such a rule will generate an infinite number of sentences. The property of recursivity should rather be ascribed to the grammar as a whole, however — for the output to be infinite it is sufficient that the same category occurs twice on the same branch in the phrase marker.
topic has been inflectional morphology. Derivational morphology has seldom been subject to an analysis in terms of paradigms, and would most probably fall outside the domain of applicability of the Word-and-Paradigm model entirely. In fact, it may be argued that to the extent that a set of derivationally related forms come to be naturally seen as belonging to a paradigm, they will be seen as inflectional rather than as derivational. For instance, suffixes such as -ess in English or -in in German, which form nouns specifically denoting women, are usually seen as clear cases of derivational morphemes. But when the use of such suffixes becomes increasingly general and obligatory whenever a woman is referred to, it also becomes more tempting to see them as parts of an inflectional paradigm (Jobin (2004)). On the other hand, there is also a gradation within the categories traditionally treated as inflectional with respect to how motivated a Word-and-Paradigm analysis is for them. Again, this gradation seems to coincide with what has been seen as an inflection-derivation continuum: the prototypical inflectional categories are also those which most strongly suggest the Word-and-Paradigm model. It has been proposed (Booij (1993)) that there are really two kinds of inflection: “contextual” and “inherent”, where the contextual categories would coincide with the inflectional prototype, including agreement categories and “structural case”. The main basis for the contextual : inherent distinction, which has gained a certain popularity, would be whether a category is determined by the syntactic context or rather expresses “independent information”, which the speaker is free to manipulate. One objection is that one and the same category may be determined in several ways. For instance, according to Booij’s proposal, the category of number is “inherent” for nouns; however, the choice between singular and plural in quantifier phrases is typically determined by language-specific rules which depend on the syntactic context. Similarly, tense, aspect, and mood are all subsumed under “inherent” morphology, but in particular the choice of mood tends to be forced in certain syntactic contexts (e.g. certain types of embedded clauses). Such a syntactic dependence is characteristic of more advanced stages of grammaticalization, that is, categories move closer towards the inflectional prototype as they mature. As was mentioned in 7.2, the Latin passive is a mixture of inflectional and periphrastic forms, as shown in the following table:5 (95) Latin passive
             active                      passive
    present  amo 'I love'                amor 'I am loved'
    perfect  amavi 'I (have) loved'      amatus sum 'I have been/was loved'
5. What I state here is inspired by Börjars et al. (1996), but I would not want to claim that my way of representing the problems is entirely compatible with theirs.
We seem to be dealing here with a complete fusion of two different patterns, in that the periphrastic forms are in all respects functional counterparts of the inflectional ones, something that is most strikingly illustrated by the fact that deponential verbs ("passiva tantum") also use the mixed paradigm:
(96) Latin deponentials
             active
    present  loquor 'I speak'
    perfect  locutus sum 'I spoke/have spoken'
This is unlike other languages with two different passive constructions, one inflectional and one periphrastic, such as Swedish, where they are in competition with each other, or Russian, where there is a division of labour, in that the reflexive-based construction is only possible with imperfective verbs and the periphrastic construction is restricted to perfective verbs. If we postulate a Word-and-Paradigm description of the inflectional forms of the verb and in addition want to obtain a unified characterization of the whole paradigm in (96), we have to assume that locutus sum also has, at some level, an analysis in terms of abstract features. What this means is that the necessity for such an analysis is not always restricted to the word level, and that diachronically, the separation of the abstract feature level from the sequential morphemic analysis starts even before the integration of the elements into one word.
9.4 Agreement: Where syntax, morphology and lexicon meet
Agreement is curious as a syntactic phenomenon in that it largely presupposes inflectional morphology. If inflectional morphology is taken to also include distinctions made in pronominal systems, the only possible counterexamples are some borderline cases where it looks as if the agreeing morpheme is not bound. However, the latter may be seen rather as incipient agreement, and may tell us something about its origins (which are otherwise rather obscure). In 9.3, "agreement features" were cited as prototypical cases of contextual inflection. The features referred to were features of the agreeing entity — the agreement target — rather than those of the entity agreed with — the agreement source. Number and gender are agreement features of adjectives but not (normally) of nouns, falling instead under "inherent features", whether these be inflectional (as number and case) or lexical (as gender). But the property of being agreed with is also something acquired in a maturation process. For instance, I argued above that when words such as people and cattle start being treated as plural by agreement rules in English, this means that plural as a grammatical entity becomes abstracted from
any concrete plural morpheme. Target and source agreement features overlap but do not coincide; source agreement features are, as we have seen, partially lexical (that is, pertain to lexemes rather than to word forms). Arguably, then, there is not a simple scale going from derivation to inflection; source agreement features can be regarded as constituting a prototype of their own. Within source agreement features, perhaps the most salient place is taken up by gender. Gender as a grammatical phenomenon is so closely linked up with agreement that manifestation in agreement is often taken to be a definitional criterion for gender (Corbett (1991)). According to the standard view, gender is a lexical property of nouns that manifests itself primarily in the choice of inflectional features on words that have a certain syntactic relationship to that noun. For instance, in the phrase ein gross-es Haus ‘a big house’, the neuter gender of the noun Haus ‘house’ shows up on the modifying adjective gross-es ‘big-neut’ as the ending -es. Gender thus crosscuts lexicon, morphology and syntax; gender, inflectional morphology and syntactic agreement make up an interesting cluster of phenomena that all belong to the later stages of maturation processes. It is not entirely easy to determine what languages have and what languages do not have gender in their grammars, since the notion may be delimited in several ways. In particular, languages such as English, where gender is only reflected in the choice of third-person pronouns, and is based on properties of referents rather than on a classification of nouns, may or may not be regarded as having gender. In the following, I shall try to make clear at every point what I mean, insofar as it makes a difference. The following quotation (Bichakjian (1999)) expresses, albeit in an unusually categorical way, a fairly common opinion about grammatical gender: “… the slightest familiarity with English is sufficient to realize that gender for inanimate referents and a dual serve absolutely no linguistic function. Hence, their steady disappearance.”
Trudgill (1999: 148) is only a little more cautious when he says: “[G]rammatical gender marking in languages such as European languages which have only two or three genders seems to be almost totally non-functional…”
…and McWhorter (2001a: 129) agrees, saying: “Grammatical gender affixes, beyond the extent to which they distinguish natural (biological) gender, do not mark any real-world entity or category or serve any communicative need…”
Gender would thus be a prototypical case of “historical junk”: complexity which has arisen through a “gradual evolution of a sort which proceeded quite independently of communicative necessity, and must be adjudged happenstance accretion” (McWhorter, ibid.).
The readiness with which grammatical gender is denied any raison d'être seems unmotivated to me. I shall now review a number of characteristics of gender that in my opinion create problems for the "historical junk" theory.
The cross-linguistic frequency of grammatical gender. Nichols (1992: 124) says that "just over one-fourth" of the languages in her world-wide sample have "gender or some other variety of nominal classification". However, there is evidence to suggest that this figure may be too low: languages of the New World represent as much as two fifths of the sample and among them, there are very few languages with gender/class systems. In the remainder of the sample, the percentage of such languages is as high as 40. Likewise, among ten Old World languages classified by Nichols as "isolates", there are four with genders/classes, which is an illustration of the fact that gender is found in many different unrelated languages. Furthermore, Greville Corbett (personal communication) reports that in the sample prepared for Corbett (forthcoming), which contains 256 languages, there are 112 (44 per cent) languages with and 144 (56 per cent) without gender systems. Supposing that the general percentage is at least 40, it may be estimated that about half of all languages with inflectional morphology have gender systems (defined as "classes of nouns reflected in the behaviour of associated words") — the general conclusion is thus that the probability that gender or class systems should arise in a language is quite high.
The cross-linguistic uniformity of gender systems. Typological work (Corbett (1991), Dahl (2000c, 2000d)) has shown that gender systems in human languages obey a number of generalizations, such as the following:
1. In any gender system, there is a general semantically based principle for assigning gender to animate nouns and noun phrases.
2. The domain of the principle referred to in (1) may be cut off at different points of the animacy scale (animacy hierarchy): between humans and animals, between higher and lower animals, or between animals and inanimates.
3. All animates above the cut-off point may either be assigned to the same gender or there may be further divisions.
4. If the principle referred to in (1) distributes animate NPs among different genders, sex is the major criterion.
Furthermore, I argued in Dahl (2000c) that many (or most) seemingly complex gender systems are in fact decomposable into "elementary gender systems", which all obey the principles above. A further principle that may or may not be followed is that of "leakage" from animate to non-animate genders: inanimates are referred to (with or without any apparent semantic motivation) using one or more genders involved in semantically based animate gender assignment.
The complexity of the genesis of gender systems. In most cases, we cannot reconstruct the details of the historical developments that have given rise to the
gender/agreement systems we find in languages. What we know is that these developments must involve a rather long chain of events. Corbett (1991: Chapter 10) sketches a few possible scenarios. He notes two ultimate possible sources for gender morphemes: ordinary lexical nouns and demonstrative pronouns. Nouns such as 'thing', 'man', 'woman' and 'person' may acquire a function as classifiers, to be used (among other contexts) together with demonstratives, and when demonstratives and classifiers fuse, the result is gender or class distinctions among demonstratives, which may spread further if they also come to be used as anaphoric pronouns, as is common. There is a shorter route to this stage: as mentioned in 7.3, even without the help of classifiers, a gender distinction may arise among anaphoric pronouns, as the result of an invasion of demonstratives restricted to the inanimate domain. But this is still a far cry from the gender systems of the Indo-European or Afro-Asiatic languages. We have to explain at least two things: how gender markers come to be obligatory parts of e.g. adjectives and verbs, and how non-semantic gender arises. For the first question, the standard theory is that subject and object markers on verbs derive from pronouns and that agreement markers on adjectives derive from pronouns or articles. In both cases, one has to assume a stage where there is obligatory "redundant" or doubled use of these pronouns or articles. For the second question, the natural assumption is that animate gender(s) come to be extended to inanimates through various processes, although the full story of how a system like e.g. that of French or German arises has not yet been told. There may be alternative accounts, but it is unlikely that they will be simpler. In terms of the complexity of the development that precedes them, gender systems undoubtedly belong to the most mature phenomena in language.
The genetic stability of gender systems. In several large phyla, some of which have a long documented history, gender systems are found in the majority of all languages, with sufficient uniformity to suggest that the systems have been inherited from the common proto-language. In the two phyla with the most well-documented history, Afro-Asiatic and Indo-European, the original gender systems6 are preserved wholly or partially in at least some languages in virtually all branches. Thus, the only branch of Indo-European where the old gender system has been wholly lost is Armenian, and the total number of individual languages where this has happened is also quite low. The demise of the gender agreement system in English in combination with the reduction of the Indo-European three genders to a two-gender system in many Germanic and Romance languages is probably the reason for the widespread idea that gender "is on its way out" in Indo-European
6. Obviously, we do not know precisely what the gender systems of the proto-languages (if these were definable synchronic language stages) were like. The cross-branch uniformity makes it probable that the common origin of the systems was not too different.
languages. It is important to bear in mind here that in the majority of cases we are dealing with a reduction rather than a disappearance of the gender system, and that the reduction appears to be highly correlated with the degree of external contacts and with other similar processes of change (see further discussion in 11.4). In addition, new gender distinctions have been introduced in several cases. Thus, animate-inanimate distinctions, wholly or partially independent of the old gender system, have been introduced generally in Slavic nominal agreement and inflection systems, and in Standard Central Scandinavian pronouns (see 7.3). To take another example of the stability of gender systems, out of 29 North East Caucasian languages, all except two have gender systems (traditionally called "class systems"): Lezgian and Udi, the latter of which is spoken outside the core area of the family. Given these facts, it appears that the half-life of a well-established gender distinction must be several millennia.
The strong filtering effects on gender of suboptimal acquisition. As is well known, second-language learners have great problems with grammatical gender, both in learning to apply agreement rules and in learning the genders of individual words. For immigrant children in Sweden, it has been demonstrated that the age when they enter the public day-care system has an effect on the acquisition of gender in Swedish, with older children making significantly more errors (Andersson (1993)).
The frequency of manifestations of gender systems in discourse. Grammatical gender tends to be manifested with great frequency in discourse. To try and give exact statistics is probably not particularly meaningful, but consider as an illustration the following paragraph from the website of the Spanish newspaper El País, where I have marked in boldface those nouns whose genders are manifested by agreement on some other word in the text:
(97) La Sección Séptima de la Sala Tercera del Tribunal Supremo ha desestimado, aunque no de forma unánime, la petición de los sindicatos de suspender cautelarmente los servicios mínimos fijados para Radio Televisión Española, radiodifusión sonora y televisión, y sociedades de salvamento y seguridad marítima. En todo caso, la tramitación judicial de los recursos presentados por UGT y CC OO continuará por los cauces habituales.
The number of such nouns is 13, in a text of 64 words. In many cases, the gender is marked twice, both by the choice of article and on the following adjective. In spoken discourse, the average complexity of noun phrases and therefore also the frequency of gender marking is most certainly lower, but the main point remains: gender decisions have to be made very often in discourse. In addition, gender markers constitute a significant part of what Spanish speakers say. In a text like the one above, the gender markers -a and -o make up more than ten per cent of all
syllables7 — they must thus be among the most frequent morphemes in the language. It is obvious that the cost of a gender system like the Spanish one is considerable, as the following admittedly absurd calculations show. With 350 million speakers of Spanish in the world who speak on average a couple of hours every day, abolishing gender agreement could save tens of millions of hours of conversation every day. In the same way, an average Spanish novel could be 20–30 pages shorter if gender markers were deleted. The above observations give rise to the following questions: Why do such complex chains of events as those giving rise to gender systems take place so often, and why are the results so similar? If gender systems serve no communicative need, why are they so stable? In particular, given that gender markers are among the most frequent morphemes in a language, why aren't they subject to phonetic reduction? In fact, gender markers sometimes display a quite unexpected capacity for survival. Consider the development of the definite article in the Romance languages, the feminine form of which is la, a or something similar, the Latin source being the feminine accusative singular of the demonstrative ille – illam. In French, regular sound changes should have given [il] (cf. la ville 'the town' from Latin illam villam 'that estate'), but exceptionally the unstressed second syllable with the gender marker is retained, yielding la. In Portuguese, the gender marker a is even the only part of illam to survive in the feminine article. Again, if gender is historical junk, one wonders why precisely this junk is chosen to be preserved above anything else. Not everyone would agree that gender has no communicative function, of course. A frequently mentioned possible function is that of reference tracking. In a sentence such as John told Mary that she could help him, the genders of the pronouns undoubtedly help in disambiguating them. However, reference tracking does not explain the use of gender e.g. in NP-internal agreement, and it may also be questioned how frequently disambiguation of the kind illustrated occurs in natural texts. In particular, pronominal reference to inanimate entities is relatively infrequent (Dahl & Fraurud (1996)), and having more than one inanimate gender for this purpose thus seems to be overkill (in spite of claims to the contrary, e.g. Zubin & Köpcke (1986)). Likewise, given that animate and inanimate NPs tend to show up in distinct syntactic positions, a referential ambiguity between two referents that differ in animacy is unlikely, and the not infrequent systems where there are only two pronominal genders — animate and inanimate — would add little to the efficiency of reference tracking. But reference tracking may be of greater importance in a language where other grammatical marking is scarce, as suggested for Nunggubuyu
7. More exactly: -a- and -o- show up 17 times as agreeing morphemes, which is 11 per cent of the 154 syllables in the text. In addition, seven nouns end in -a and -o, and if these occurrences are also counted as manifesting gender, the total is 24, which is 16 per cent of all syllables.
by Heath (1983), and can probably be seen as one of several functions of grammatical gender that are all part of a larger scheme of redundancy management. I would like to suggest as a candidate for another and in my view more generally applicable function of grammatical gender a mechanism that would be rather analogous to the "checksum digit" system discussed in 2.3. What would this mean? Consider the French sentence (98), rendered in IPA notation in (99).
(98) French
     Le renard voit la souris.
     def.m fox see.prs.3sg def.f mouse
     'The fox sees the mouse'
(99) lərənar vwa lasuri
Each of the two noun phrases le renard 'the fox' and la souris 'the mouse' consists of three syllables. Notice, however, that these syllables differ significantly in predictability ("syntactic information"): while the two last ones depend on the choice of a lexical item from among thousands (or tens of thousands) of possible candidates, the alternatives to the definite articles are very few in number. This is paralleled in the phonological structure in a way that is perhaps best illustrated by the English sentence (100).
(100) The fox sees the mouse.
Here, even if the grammatical and lexical morphemes are one syllable each, the grammatical morpheme contains a smaller number of segments, and the segments are both "lighter" and chosen from a smaller inventory (Willerman (1994)). As we have already seen, a common explanation of the difference in make-up between lexical and grammatical morphemes is that the latter have been reduced because they are more predictable, but as I have argued, this is in a way putting the cart before the horse. The question is: why are predictable items needed at all? My suggestion, then, is that a grammatical marker such as a definite article acts as a check on the lexical item that it is associated with. In this light, the role of grammatical gender becomes understandable — it is essentially an error-checking mechanism. We know that a masculine article has to go with a masculine noun; if we perceive any other combination, we know that something has gone wrong. Also, we see that the (presumably universal) pattern "heavy lexical item — light grammatical marker" may have a rationale in the optimization of the distribution of information in the signal, although I would prefer to leave it to a more mathematically-minded person to do the actual calculations needed to confirm this claim.
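As a very first step in that direction, the error-checking idea can at least be given a toy demonstration (in Python). The sketch is merely suggestive; its two-word lexicon, its noise scenario and all names are invented for the purpose:

    # Gender agreement as an error check, in the spirit of a checksum digit.
    GENDER = {"renard": "M", "souris": "F"}      # the nouns of (98)
    ARTICLE = {"M": "le", "F": "la"}

    def produce(noun):
        """The speaker makes the article agree with the noun's gender."""
        return ARTICLE[GENDER[noun]], noun

    def consistent(article, noun):
        """The listener's check: does the short, predictable item (the
        article) match the long, information-rich item (the noun)?"""
        return ARTICLE[GENDER[noun]] == article

    article, noun = produce("souris")            # ('la', 'souris')
    misheard = (article, "renard")               # noise corrupts the noun
    assert not consistent(*misheard)             # the mismatch flags an error

With two genders spread roughly evenly over the lexicon, a check of this kind would catch about half of all misidentifications of a noun: modest coverage, but bought with material that is short, light and predictable, which is just the "smart redundancy" profile discussed in 2.3.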
9.5 Can we do without abstract features?
In Cognitive Grammar, as developed in the works of Ronald Langacker and his followers, an important principle is that grammatical units must be "intrinsically symbolic", that is, they must be "bipolar", "consisting of a semantic unit defining one pole and a phonological unit defining the other" (Langacker (1991: 16)). The requirement that grammatical units have semantic content is taken to proscribe "purely grammatical" constructs, such as "contentless features or empty diacritics" and "syntactic dummies with neither semantic nor phonological content, introduced solely to drive the formal machinery of autonomous syntax" (Langacker (1991: 19)). An example of an "empty diacritic" would be a lexical feature marking a noun as belonging to a certain gender (1991: 312). As a serious attempt to describe language without entities of the kind argued for in this chapter, Langacker's treatment deserves some attention. First, on the "content requirement" in general. Langacker speaks of his theory as being "highly restrictive" in this respect, contributing to it being "intrinsically desirable from the standpoint of scientific method (not to mention esthetically and intuitively)" (1991: 312). Apparently, the idea is that by disallowing non-symbolic entities one applies Occam's Razor, not allowing any unnecessary entities. However, this works only if it means that we can thereby reduce the overall complexity of the description. If the entities that we are considering have to be postulated in any case, it is not obvious that it is "intrinsically desirable" to assign a content to them — on the contrary, assigning a semantic content to a unit will ceteris paribus complicate the description, even if it may seem to lead to greater generality. Furthermore, such an assignment is not in general falsifiable, if no constraints are put on the ways in which contents are assigned. For this reason, I want to claim that the "content requirement" is not at all "intrinsically desirable" — on the contrary, by being unfalsifiable, it is highly undesirable from a methodological point of view. As a minimum, it is not sufficient to show that it is possible to assign a content to all grammatical units — one also has to show that it is necessary, the burden of proof being on the adherents of the "content requirement", not on the sceptics. Next, let us consider Langacker's treatment of alleged "empty diacritics". As Langacker points out, grammatical gender and the "apparent arbitrariness of gender assignment" (1991: 319) are a prima facie challenge to the "content requirement". The language he chooses to discuss is Spanish, where (as we saw in the previous sections) there are two genders (masculine and feminine), at least partly arbitrarily assigned and essential for agreement processes. A great number of its nouns end in -o and -a; with rare exceptions, nouns in -o are masculine and nouns in -a feminine. Langacker suggests that -o and -a are part of constructional schemata for nouns; for animate nouns, they have the meanings 'male' and 'female', respectively; for inanimate nouns, on the other hand, they both mean "thing", that is, "semantically,
each is then equivalent to the noun-class schema" (306). In other words, -o and -a, traditionally seen as gender markers, are (i) synonymous, (ii) do not have any meaning apart from that of the patterns they appear in, which, in addition, is a maximally general one. It should be obvious that by such a procedure any element can be assigned a semantic content. Notice, in particular, that we get a huge set of minimal pairs which differ formally only in the choice between -o and -a and which would be synonymous according to Langacker's treatment but which differ systematically in grammaticality — seemingly a counterexample to the claim that "grammar reduces to the structuring and symbolization of conceptual content" (Langacker (1999: 1)). The gender of nouns is not always overtly recognizable: for instance, the fact that the word lápiz 'pencil' is masculine would, according to mainstream grammatical practice, have to be individually specified for that item in the lexicon. This is not possible in Cognitive Grammar, since that would involve an "empty diacritic". Instead, a phrase such as lápiz corto 'short pencil' is accounted for by a special case (a "subschema") of the usual attributive adjective + noun pattern as follows, where the capitalized words stand for semantic units and the lower case ones for phonological ones:
(101) [[PENCIL/lápiz][RELATION(THING)/… -o]]
— basically saying that lápiz can be combined with an adjective (a word that expresses a relation) ending in -o. Obviously, more is needed for a full account of gender agreement in Spanish, and Langacker postulates, without giving any details, that there are "higher-level schemata" which generalize over the patterns where gender agreement shows up. But these patterns are not necessarily restricted to noun modifiers but can involve rather different syntactic configurations (anaphoric relationships between noun phrases, predicative adjectives agreeing with their subjects or with other arguments etc.) and it is not self-evident that higher-level schemata can be abstracted out of them. Furthermore, in a language with a more complex morphology than Spanish, e.g. Latin, these schemata will have to accommodate different declensional patterns and the interaction with other agreement categories such as case and number. Langacker notes that "to say that a noun is grammatically masculine is merely to say that it occurs in certain constructions" (312), that is, in those combinations where the relevant masculine agreements occur. This is true, but the fact that the set of constructions in question may be quite large and heterogeneous combined with the non-trivial circumstance that Spanish and Latin nouns show an astonishing consistency in this matter — either they occur in these constructions or they don't — makes it very tempting to introduce "masculine" if not as an "empty diacritic", then at least as an abbreviatory device. But the main objection is one similar to that made above: being masculine — that is, occurring in certain constructions — is not reducible to a statement about the structuring and symbolization of conceptual content.
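How much the "abbreviatory device" buys can be shown in a few lines. The following sketch illustrates the argument only, not the machinery of any particular theory; the mini-lexicon and all names are invented:

    # Gender as a lexical feature stored once per noun and consulted by
    # several otherwise unrelated constructions.
    LEXICON = {"lápiz": "M", "mesa": "F"}         # Spanish nouns

    ARTICLE = {"M": "el", "F": "la"}
    ADJECTIVE = {("corto", "M"): "corto", ("corto", "F"): "corta"}
    PRONOUN = {"M": "él", "F": "ella"}            # anaphoric reference

    def noun_phrase(noun, adjective):
        gender = LEXICON[noun]                    # one lookup serves them all
        return f"{ARTICLE[gender]} {noun} {ADJECTIVE[(adjective, gender)]}"

    assert noun_phrase("lápiz", "corto") == "el lápiz corto"
    assert noun_phrase("mesa", "corto") == "la mesa corta"
    assert PRONOUN[LEXICON["lápiz"]] == "él"      # a third construction,
                                                  # same single lexical entry

The consistency noted above, namely that a noun either occurs in all the "masculine" constructions or in none of them, falls out of the single lexical entry; that is the descriptive economy the feature buys, whatever one decides about its semantic content.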
9.6 Parallels in phonology
As it turns out, featurization in morphology has some parallels in phonology, which are too striking to be left unnoticed. In this section, I shall try to characterize them even if it will have to be in a somewhat sketchy way. An obvious place to look for non-linear phenomena in phonology is prosody — obvious enough for the term "suprasegmental" to be applied to them. Thus, in languages with lexical tone, a word is characterized phonologically not simply in terms of a sequence of phonemes but also in terms of the choice of tone contour. In many systems, tone contours can be seen as being made up of sequences of high and low tones. The phonological representation would then have to consist of several parallel "tiers". In the systems common in South East Asia, where words tend to be monosyllabic, the tone contour, chosen from among a small set of possibilities, is typically a property of the whole word. This is also true of tonal word accent systems such as those of Norwegian and Swedish, where each word form has one of two word accents (called "accent I" or "acute" and "accent II" or "grave"). However, here the choice of tone typically carries grammatical rather than lexical information. The idea that phenomena normally regarded as non-prosodic might also be subsumable under a non-linear analysis seems to have originated in the thirties with J. R. Firth in his "Prosodic Analysis" but became popular in wider circles with the advent of "Autosegmental Phonology" (Goldsmith (1976)). The cases where an analysis in terms of higher-level features offers itself most obviously are harmony phenomena of different kinds: the well-known type of front/back harmony found in Finnish and Turkish but also for instance "advanced tongue root" (ATR) harmony in many languages, e.g. in Maasai (Payne (n.d.)), with the [−ATR] vowels [ɪɛʊɔ] and the [+ATR] vowels [ieuo]. Expressed in simple terms: each syllable in a word receives the same value for some phonological feature; in other words, the choice is made just once in each word. Cf. the Maasai pair adɔ́ 'to be red' : adorú 'to become red' — in the latter word, the final vowel of the stem harmonizes with the suffix -ru, becoming [+ATR]. The "once per word" constraint can also apply to e.g. nasalization, e.g. Tuyuca (Barnes (1996)), glottalization, e.g. Cuzco Quechua (Parker & Weber (1996)), and aspiration, e.g. Cuzco Quechua (ibid.) and Palula (Henrik Liljegren, personal communication). Lopes & Parker (1999) analyze Yuhup as having morpheme-level features of constricted glottis, spread glottis and nasalization. A further example is Danish "stød", often regarded as a glottal stop but characterized by Fischer Jørgensen (1989) as "phonetically, a phonation type reminiscent of creaky voice and, phonologically, a prosodic feature bound to a definite syllable in certain word types". Even if all of these features would probably be called "prosodies" in the Firthian school, I shall use "suprasegmental" as a less misleading general term for features that do not pertain to single segments. It does not appear possible to divide features
into segmental and suprasegmental once and for all — different systems may ask for different solutions. This also goes for features traditionally called prosodic. Stress and tone are typically realized either on the syllable or the word level and are thus relatively unequivocally suprasegmental. In the case of stress, it is typically (almost by definition) a matter of relative prominence, which makes it natural that it will obey the principle “exactly one per word”. Length of vowels and consonants, on the other hand, is often seen as a feature of individual segments, although in some languages such an analysis is less attractive. For instance, in Swedish, as in some other Germanic languages, there is normally exactly one long segment associated with each word stress, although it may be either the vowel or the first postvocalic consonant of the stressed syllable. In the case of non-prosodic suprasegmental features, it seems that their suprasegmental character is an even more contingent affair. Thus, nasalization appears to be a well-behaved segmental feature in many languages; that is, a given word may contain one or more nasal vowels, freely distributed over the word. In the case of something like front/back vowel harmony in Finnish, harmony is of course by definition a relational phenomenon and thus cannot be segmental. On the other hand, the front/back distinction itself may be segmental. It is reasonable to assume that harmony phenomena in phonology arise through assimilation processes. However, harmony systems are often asymmetric, like the one found in Finnish. This means that it is the front/back feature in the stem that is the “independent variable” rather than that of the ending: each stem is either front or back and the ending agrees with it, e.g. Finnish Turu-ssa ‘in Turku’ vs. Töölö-ssä ‘in Töölö’. But there are also languages where there is no vowel harmony but rounded front vowels are only found in stressed syllables. An example is Estonian, a close relative of Finnish (with some marginal exceptions in loanwords; Diana Krull, personal communication). In other words, the choice of a rounded front vowel can only be made once in a word (if it is not a compound). This fact makes the systems look less different. There is a general tendency in phonological systems for more “marked” distinctions to be restricted to stressed or prominent syllables — in other words, unstressed syllables tend to have a reduced phonological system. If this is the case, it follows more or less automatically that the marked distinction in question can only be realized once per word. In other words, there is a natural connection between restricting some features to stressed syllables and the rise of word-level features. When we look more closely at non-linearity in phonology and grammar, we can see both parallels and overlaps. The most obvious parallel is in the fact that in both domains, we encounter word-level features. The phenomena overlap in the sense that word-level features at the morphological level may be expressed wholly or partially as word-level phonological features. Diachronically, this takes place relatively late in the development of a morphological feature, but, as we have seen already, it can be seen as a consolidation of the Word-and-Paradigm analysis of the form.
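The asymmetric Finnish-type system described above can likewise be rendered in a few lines. This is a deliberately minimal sketch: only the inessive -ssa/-ssä alternation from the Turku/Töölö example is modelled, and the treatment of stems with only neutral vowels is a simplification:

    # The stem's front/back value is the "independent variable", fixed once
    # per (non-compound) word; the ending merely agrees with it.
    FRONT, BACK = set("äöy"), set("aou")

    def harmony_class(stem):
        """The last harmonic vowel of the stem decides; stems with only
        neutral vowels (i, e) default to front, a simplification."""
        for ch in reversed(stem.lower()):
            if ch in FRONT:
                return "front"
            if ch in BACK:
                return "back"
        return "front"

    def inessive(stem):
        """-ssa after back stems, -ssä after front stems."""
        return stem + ("ssa" if harmony_class(stem) == "back" else "ssä")

    assert inessive("Turku") == "Turkussa"    # 'in Turku'
    assert inessive("Töölö") == "Töölössä"    # 'in Töölö'

The point is that the suffix does not carry the front/back choice itself; it queries a value fixed once at the word level, which parallels the shift from concrete marker to word-level grammatical feature discussed earlier in this chapter.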
The ideas of autosegmental phonology have also been applied to morphology. A multi-tier analysis of "nonconcatenative" morphology was proposed in McCarthy (1981) and applied to verb morphology in Semitic languages. Thus, a Modern Standard Arabic verb form such as kuttib 'it was written' was suggested to consist of three components: a "consonant melody" /ktb/ representing the root 'write', a "vowel melody" /u-i/ representing the inflection (perfective passive), and a "skeletal morpheme" CVCCVC representing the derivational category 'causative' (McCarthy (1994: 2599)).

Non-phonological word-level features may be either inflectional (such as case or number) or lexical (such as gender). There is a parallel in phonology in that word-level features may be predominantly lexically or predominantly grammatically determined, as was noted above for tone and pitch accents. Notice also the rather close parallel between agreement in gender and harmony phenomena of the type illustrated by the Finnish vowel harmony example above: a grammatical morpheme takes different forms depending on properties of the lexical morpheme it depends on.

The presence of a suprasegmental tier in phonological structure is bound to have consequences for the ways in which phonetic/phonological information is processed by speakers and listeners. Someone who hears the Yoruba word méjì 'two' must not only register the phoneme sequence /meji/ but also, separately, the tonal sequence HL, whose realization goes on at the same time. In pronouncing a Swedish word, I must time the fall in fundamental frequency after the stressed syllable in such a way that the listener can identify the right tonal pattern simultaneously with the segmental structure of the word. It would appear that suprasegmentality puts higher demands on the processing capacity of the language users (which does not mean that it is difficult for them, provided that they have the proper equipment). For other cases of non-linearity in grammar, this is perhaps less obvious, but it is still hard to deny that such phenomena as word-level grammatical features constitute a significant addition to structural complexity, not only in degree but also in kind.

As we have already seen, suprasegmentals, inflections and lexical features such as gender belong to the mature type of linguistic phenomena, being dependent on more or less complex chains of historical changes. Typologically, their role in languages varies from nil to being highly elaborate and salient. This would perhaps not be so remarkable if we were dealing with individual linguistic phenomena, but here we have whole components of language systems that may be totally absent. Furthermore, the phenomena in question are highly prone to being filtered out in suboptimal language acquisition. This last-mentioned fact would be explained if we assume that smooth acquisition of word-level features presupposes access to a specific learning mechanism, one which is only available in optimal language acquisition. What makes such a hypothesis particularly attractive here is that we are dealing with components of language that, as we have noted, do seem to put special requirements on processing, as well.
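As a final illustration, the multi-tier composition of kuttib can be stated mechanically once the association lines between template slots and melody elements are given. The sketch below is mine and merely illustrative: the links are listed explicitly (each slot names its tier and the melody position it spells out), whereas in McCarthy's analysis they follow from general association conventions; merge_tiers is an invented helper name.

    # Assembling Arabic kuttib from the three tiers cited above.
    ROOT = "ktb"      # consonant melody, the root 'write'
    VOCALISM = "ui"   # vowel melody, perfective passive
    # CVCCVC template; the doubled medial C is linked twice to root
    # consonant t, which is what yields the gemination.
    TEMPLATE = [("C", 0), ("V", 0), ("C", 1), ("C", 1), ("V", 1), ("C", 2)]

    def merge_tiers(root, vocalism, template):
        melodies = {"C": root, "V": vocalism}
        return "".join(melodies[tier][pos] for tier, pos in template)

    print(merge_tiers(ROOT, VOCALISM, TEMPLATE))  # kuttib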
Chapter 10
Incorporating patterns
10.1 Introduction

Consider the following sentence from the Öömrang dialect of Northern Frisian:

(102) North Frisian (Öömrang)
      Hat as uun't eerdaapler-skelin.
      she is in/at_the potato(es)-peeling
      'She is peeling potatoes.' (Ebert (2000: 610))

(102) is an example of a common type of progressive construction, based on the schema 'X is at Y-ing', where 'Y-ing' is a nominalization. What is peculiar here is that the direct object eerdaapler 'potatoes' merges with, or to use the established term, is incorporated into, the nominalized verb form skelin 'peeling', forming one word. This complex word-level pattern has thus taken over a niche that is normally reserved for syntactic constructions, as in English peeling potatoes. In fact, this is the only way a direct object can be accommodated in this particular pattern in North Frisian. However, the incorporating construction is much more restricted than syntactic transitive patterns usually are — only bare nouns can be incorporated, and with any complex or definite noun phrase other constructions have to be used, for instance:

(103) North Frisian (Öömrang)
      Hat as diarbi 6 pünj eerdaapler tu skelin.
      she is there-at 6 pound potatoes to peel
      'She is peeling 6 pounds of potatoes.' (ibid.)

(102) thus illustrates the phenomenon of noun incorporation, which is perhaps not so often reported from the languages of Europe, but tends to be seen as a typical property of polysynthetic languages. Diachronically at least, eerdaapler-skelin is a compound noun. The relationship between compounding and noun incorporation, and the question of whether noun incorporation is "lexical" or "syntactic", have been contentious issues since at least the beginning of the 20th century, when they
were the object of a heated discussion between Edward Sapir and Alfred Kroeber.1

Noun incorporation is often defined narrowly as the integration of a noun into a verbal complex. I think both noun incorporation and "ordinary" compounding are better seen in a broader perspective, in which they are treated together with, on the one hand, various other types of incorporating patterns and, on the other, multi-word patterns which deviate from the prototypical syntactic constructions by being in different ways "tighter" and thus approaching an incorporated status. I shall use the term incorporating pattern as a general term for a one-word pattern which contains more than one lexical element, including both phenomena that have been called incorporation in earlier treatments and those that are usually called "compounds". This implies only that I think they can be subsumed under one heading, not that one of the notions has priority over the other.

As (102) illustrates, the rise of noun incorporation may be part of grammatical maturation — in this case, of the development of a progressive construction. In other respects, as well, incorporating patterns are highly relevant to the themes discussed in this book. In various ways, they exhibit behaviour typical of mature patterns. Thus, incorporation is a major source of complex word-structure, and of particular interest in that it seems to be doing the work usually relegated to syntax. Although (102) is hardly directly diachronically (or synchronically, for that matter) related to any corresponding non-incorporated source, it is reasonable to assume that the eventual sources of incorporating constructions are syntactic, and that the tighter syntactic constructions represent intermediate stages on a diachronic path towards incorporation.

In this chapter, I shall first look at the cases of incorporation that have traditionally been seen as central and then at some interesting borderline phenomena. I shall then give an overview of NP-internal incorporation and discuss a few other specific phenomena before trying to come to a conclusion about the nature of incorporating patterns and their place in maturation processes.
10.2 "Classical noun incorporation"

By "classical noun incorporation", I will refer to the patterns that fit common definitions like "the morphological construction where a nominal lexical element is added to a verbal lexical element; the resulting construction being a verb and a single word" (de Reuse (1994: 2842)).
1. Kroeber (1910), Sapir (1911), Kroeber (1911).
The following maximally simple example from Classical Nahuatl illustrates the concept:

(104) Classical Nahuatl
      Ni-naka-kwa.
      1sg-flesh-eat
      'I eat meat.' (Launey (1999: 352))

Classical noun incorporation is perhaps the best-described type of incorporating construction, and most of what I will rehearse here will necessarily be a repetition of well-known facts. Classical noun incorporation is often associated with polysynthetic languages, especially those spoken in North America, but exists in more or less well-developed forms in many languages all over the world. With regard to morphological marking, incorporated nouns are usually deprived of case endings, determiners etc. There is usually no agreement marking on the verb (but see the discussion of Southern Tiwa below). Regrettably, little or nothing tends to be said about prosody in the literature. Semantically and pragmatically, incorporation of subjects and objects is in general restricted in various ways:

– reduction of productivity: in the list in de Reuse (1994: 2845) of languages with classical noun incorporation, it is said to be non-productive in about twenty cases (it is difficult to give an exact number, since languages and language families are mixed in de Reuse's list);
– constraints on referentiality, including animacy and agency: in most languages, only inanimate nouns can be incorporated. Sometimes there is also a restriction to indefinites or non-specific cases. Subject incorporation is usually quite limited. Agentive nouns are apparently never incorporated (probably this holds generally of "essentially animate arguments" (Dahl (2000b)), that is, slots restricted to NPs with animate referents);
– "unitary concepts": verb complexes with incorporated nouns tend to refer to "habitual, permanent, chronic, specialized, characteristic, or unintentional" activities or states (de Reuse (1994: 2845)).
In an influential paper, Mithun (1984) claims that there are four hierarchically ordered types of noun incorporation that together form a diachronic path, which can be interrupted at any point:

– Type I. "Lexical compounding"
– Type II. "Manipulation of case"
– Type III. "Manipulation of discourse structure"
– Type IV. "Classificatory noun incorporation"
Type I, "Lexical compounding", includes both straightforward classical noun incorporation, as exemplified e.g. by (104), and what Mithun refers to as "composition by
juxtaposition". She quotes the following sentence pair from Mokilese (original source: Harrison (1976)) as an example of the phenomenon, which she says is common in Oceanic languages:

(105) Mokilese
      a. Ngoah kohkoa oaring-kai.
         I grind coconut-these
         'I am grinding these coconuts.'
      b. Ngoah ko oaring.
         I grind coconut
         'I am grinding coconuts.'

Although the verb phrase in (105b) is written as two words, which retain their independent stress patterns, it has a semantics distinct from that of (a), indicating "unitary, institutionalized activities", and there are various other indicators "that the compound functions syntactically as a unit". Notice that this type of construction does not fit the definition of classical noun incorporation above, since it does not have a clear one-word character. I shall return to it in 10.3.

The defining property of Type II, "Manipulation of case", is that another argument of the clause is permitted to occupy the case role vacated by the incorporated noun (Mithun (1984: 859)). The following examples from Guaraní (Velazquez Castillo (1996: 133)) illustrate this type.

(106) Guaraní
      a. Intransitive, non-incorporated
         Che-resa hovy
         1intr-eye blue
         'My eyes are blue.'
      b. Intransitive, incorporated
         (Che) che-resa-rovy
         (I) 1intr-eye-blue
         'I am blue-eyed.'
      c. Transitive, incorporated
         A-hova-hei-ta pe-mitã
         1acc-face-wash-fut that-child
         'I'll wash that child's face.'
      d. Transitive, non-incorporated
         A-johei-ta pe-mitã rova
         1acc-wash-fut that-child face
         'I'll wash that child's face.'

As these examples show, a body-part noun can be incorporated into the verb in Guaraní, at the same time as a noun phrase referring to the owner of the body part
shows up as an argument of the verb. This construction appears to be particularly common in South American languages, and indeed often is the only possible noun incorporation construction. Dixon & Aikhenvald (1999: 9) state as a common property of languages in the Amazonian area that if incorporation is possible, "typically only those nouns which are obligatorily possessed can be incorporated". Given that obligatorily possessed nouns usually include body-part nouns (see 7.6.1), we may take this to refer to Type II incorporation, in which case it contradicts Mithun's claim that languages with productive Type II incorporation always have Type I incorporation.

Mithun proposes that Type II incorporation develops from Type I when the system is "extended to permit a significant oblique argument to assume the syntactic role vacated by an IN [incorporated noun]". Notice that the nouns occurring in Type II incorporation — typically body-part nouns with definite reference — are a rather different set than the one most often associated with Type I incorporation — indefinite mass and plural nouns. Thus, in the absence of any concrete diachronic data, I would rather propose that Type II incorporation develops directly from syntactic "possessor ascension" constructions (see also 8.3), as illustrated by the translation of (106d) into Russian:

(107) Russian
      Ja vymoju lico ėtomu rebenku.
      I wash.pfv.npst.1sg face this.dat.m.sg child.dat.m.sg
      'I'll wash that child's face.'

The hypothesis that Type I is the historical source is more attractive in the case of Types III and IV. In Type III, "Manipulation of discourse structure", noun incorporation, according to Mithun (1984), is "also used to background known or incidental information within portions of discourse". She quotes some Huautla Nahuatl examples from Merlan (1976):

(108) Huautla Nahuatl
      A: Askeman ti-'-kwa nakatl.
         never you-it-eat meat
      B: Na' ipanima ni-naka-kwa.
         I always I-meat-eat
      'You never eat meat — I eat meat all the time.'

(109) Huautla Nahuatl
      A: Kanke eltok kočillo? Na' ni-'-neki amanci.
         where is knife I I-it-want now
      B: Ya' ki-kočillo-tete'ki panci
         he (he)it-knife-cut bread
      'Where is the knife? I want it now — He cut the bread with it (the knife).'

In these examples, a lexical item shows up twice, the first time independently, the second time incorporated into a verb complex, and the second instance obtains a
kind of anaphoric function. Merlan (1976: 184), quoting more than ten cases, says about them that "incorporation serves to maintain definiteness of the discourse referent by signaling coreferentiality with a previously-occurring NP adjunct". However, most of her examples are like (108) in that the incorporated noun is translated into English as an indefinite noun phrase or as the dummy pronoun one.2 The expressions "discourse referent" and "coreferentiality" are thus a bit misleading, since what is shared between the two instances of the noun is not a specific individual referent but rather the generic concept denoted by the noun. (109) is an exception to this in that it appears to be a case of regular co-reference — in the absence of more detailed documentation, it is not possible to judge how frequent such cases are.

Evans (1997: 405) says about the Gun-djeihmi dialect of Mayali (Gunwinggu) that "incorporated nominals are frequently used to track nonhuman NPs in discourse, just as pronominal prefixes are". The examples he cites seem more like regular coreferential anaphora than most of Merlan's examples, e.g. where kurlah 'pelt' in the second clause clearly refers back to the hides mentioned in the first clause:

(110) Gun-djeihmi
      Kun-kurlah a-ka-ni djamun-djahdjam
      cl:IV-pelt 1/3-take-pst.cl:I dangerous-place
      a-kurlah-wo-ni kun-warde an-wo-ni.
      1/3-pelt-give-pst.cl:I IV-money 3/1-give-pst.cl:I
      'I would take the hides to the police station. I would give them (the hides) to him and he would give me money.'

What can be noted here is that the quoted examples of Type III incorporated nouns have a minimal informational value and seem to obey the usual constraints on incorporation such as non-agentivity.

Mithun's Type IV is what she calls "classificatory noun incorporation". In this type, the incorporated noun is "doubled" by an external NP specifying the referent in question. The following example (Mithun (1984: 867); original source: Oates (1964)) is from the Kunwinjku dialect of Mayali (Gunwinggu):

(111) Kunwinjku
      …bene-dulg-naŋ mangaralaljmayn
      …they_two-tree-saw cashew_nut
      '… They saw a cashew tree.'
2. To emphasize this point, I have rendered Merlan's literal English translation of (108) rather than quoting Mithun's 'I eat it (meat) all the time'.
Sadock (1991) speaks of a more general phenomenon of "stranding", where it seems that only part of a noun phrase has been incorporated, leaving the rest stranded outside, as in this rather dramatic example from West Greenlandic, where the incorporated element is the numeral 'two' and the stranded element the rest of the expression 'two and a half':

(112) West Greenlandic
      Marlo-raar-poq affar-mil-lu.
      two-3sg.catch-ind half-ins-and
      'He caught two and a half.'

The most plausible diachronic source for stranding and classificatory incorporation is afterthought constructions. Thus, (111) might be diachronically derived from something like They saw a tree, a cashew nut tree.

Southern Tiwa is often noted in the literature as a potential counterexample to various theoretical claims about noun incorporation. (113) exemplifies the Southern Tiwa noun incorporation construction (Allen et al. (1984: 295)):

(113) Southern Tiwa
      Bi-seuan-mũ-ban
      1sg.3sg-man-see-pst
      'I saw the men.'

In fact, this construction deviates from the typical pattern for noun incorporation on no less than three counts, all of which are illustrated by (113):

– noun incorporation is possible even with animate referents (but not with proper names);
– noun incorporation is obligatory for the following types of direct objects: (i) inanimate nouns; (ii) nonhuman animate singular nouns and human plural nouns, if not qualified by a demonstrative or numeral; (iii) human animate singular nouns if the subject is third person (a decision procedure schematized in the sketch below);
– the agreement prefix (which is initial in the verb complex) carries information about the incorporated object as well.
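For concreteness, the obligatoriness conditions in the second point can be restated as a small decision procedure. The sketch below is mine, based only on the three-part list above; the function and feature names are invented for the illustration, and configurations the list does not mention (e.g. nonhuman animate plurals) are simply returned as non-obligatory.

    # When is object incorporation obligatory in Southern Tiwa?
    # A restatement of the conditions listed above (after Allen et al.
    # 1984); proper names, which resist incorporation, are set aside.

    def incorporation_obligatory(animate, human, plural, qualified,
                                 subject_person):
        """qualified = modified by a demonstrative or numeral."""
        if not animate:
            return True                   # (i) inanimate objects: always
        if human and not plural:
            return subject_person == 3    # (iii) human singulars
        if (not human and not plural) or (human and plural):
            return not qualified          # (ii) unless dem./num.-qualified
        return False                      # cases not covered by the list

    # (113) 'I saw the men': human plural object, unqualified
    print(incorporation_obligatory(animate=True, human=True, plural=True,
                                   qualified=False, subject_person=1))  # True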
Southern Tiwa also allows incorporation of inanimate subjects. It appears that the deviations all point in the same direction: Southern Tiwa noun incorporation is less "tight" than the garden-variety noun incorporation construction, i.e. it has preserved more of the properties of a syntactic construction. In particular, we may notice that it is obligatory for a rather wide range of cases, which implies that it is also fully productive and not subject to "unitary concept" constraints.
10.3 Quasi-incorporation

As noted above, constructions may share some but not all of the properties of prototypical incorporation. In particular, the "incorporated" element may, wholly or partly, retain its status as an independent word, Mithun's examples of "composition by juxtaposition", as seen in (105) above, being cases in point. In the literature, there are different opinions as to whether such cases should be considered as incorporation or not. Thus, Miner (1986) wants to distinguish incorporation from what he calls "noun-stripping", a phenomenon "whereby nominals…are rendered indefinite — modifiers, determiners, number affixes, etc. are 'stripped away' — and enter into closely-knit units with their verbs, but stop short of actually being incorporated" (p. 243).3 For instance, in Zuñi, according to Miner, definite nouns normally have obligatory number-marking suffixes. In the constructions in question, these suffixes are omitted, but word stress is retained, thus:

(114) Zuñi
      čá tékałašna
      child neglect
      'neglecting your children'

The phenomenon of reduced grammatical marking without loss of word-status is in fact very widespread in constructions which seem functionally analogous to noun incorporation. Grammatical marking may be wholly absent, as in the Zuñi example, or only partially reduced. The latter possibility is illustrated by the following sentence pair from Hungarian (Kiefer (1990–91)):

(115) Hungarian
      a. Éva level-ek-et ír.
         E. letter-pl-acc write.prs.3sg
         'Éva is writing letters.' (Kiefer (1990–91: 153))
      b. Pisti level-et ír.
         P. letter-acc write.prs.3sg
         'Steve is writing letters/a letter (is engaged in letter-writing).' (Kiefer (1990–91: 151))

In (115a), there is both a plural and an accusative marker on the direct object, and it is naturally interpreted as referring to more than one letter. In (115b), on the other hand, there is no plural marker, and the interpretation allows for both 'a
3. Miner's use of the terms "definite" and "indefinite" is a bit confusing. It would seem that nouns are either definite and "dressed" or indefinite and "stripped". What is unclear is whether there are other ways of making a noun indefinite than stripping it.
letter' and 'letters'. The accusative marker -et, on the other hand, is retained. In other languages, it may be precisely the lack of case-marking that is the most salient characteristic of similar constructions. Thus, in Turkish, there are minimal pairs such as the following (Nilsson (1985: 24)), where the direct object is in the accusative case in (a) but zero-marked in (b):

(116) Turkish
      a. Ayşe balığı tutuyor.
         A. fish.acc catch.prs.3sg
         'Ayşe is catching the fish.'
      b. Ayşe balık tutuyor.
         A. fish catch.prs.3sg
         'Ayşe is catching fish.'

The reduction of grammatical marking may affect not only bound but also free markers. In Swedish, alongside singular indefinite noun phrases introduced by an indefinite article, as in (117a), we frequently find bare singular count nouns in examples like (117b):

(117) Swedish
      a. Vi har en häst.
         we have.prs a horse
         'We have a horse.'
      b. Vi har häst.
         we have.prs horse
         'We have a horse, i.e. we are horse-owners.'

In contradistinction to the Hungarian example, (117b) has a singular reading only. It presupposes a community where horse-owning is something normal, and where, in addition, people have one horse, rather than several. Thus, the phenomenon of reduced grammatical marking turns out to be quite varied. Moreover, in the constructions mentioned, it is accompanied by various other properties. Returning to Kiefer's account of Hungarian, what he calls "incorporated" objects, such as the one in (115b), have the following characteristics, which distinguish them from ordinary objects in syntactically formed verb phrases:

– non-referentiality: they cannot be referred back to by an anaphoric pronoun;
– non-modifiability: they do not admit the addition of a modifier;
– they form a single phonological unit with the verb — the verb is unstressed;
– they are commonly lexicalized;
– they have a fixed position immediately before the verb;
– they cannot co-occur with a verbal prefix;
– the singular–plural opposition is neutralized (see the sketch below).
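The last property can be made explicit with a toy rendering of the morphology in (115). The sketch below is mine and covers only this one noun; the invented helper hungarian_object hard-codes the morpheme shapes taken from Kiefer's examples, ignoring linking vowels and vowel harmony.

    # Hungarian object marking in (115): in the quasi-incorporated
    # construction the plural slot is unavailable, so level-et covers
    # both 'a letter' and 'letters', while the accusative -et survives.

    def hungarian_object(stem, plural, incorporated):
        parts = [stem]
        if plural and not incorporated:  # number neutralized under incorporation
            parts.append("ek")
        parts.append("et")               # accusative marked in both constructions
        return "-".join(parts)

    print(hungarian_object("level", plural=True, incorporated=False))  # level-ek-et
    print(hungarian_object("level", plural=True, incorporated=True))   # level-et

The Turkish pattern in (116) is, in a sense, the mirror image: there it is the case slot rather than the number slot that is given up in the tight construction.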
Similarly, we find that in Swedish, häst in (117b) is not generally expandable: *vi har brun häst 'we have brown horse' is strange. It also has lowered referentiality: (117b) is definitely not a good way of starting a story where the horse in question is going to be the protagonist. In the same way, if we continue the Turkish examples (116a–b) with (118), onu 'he/she/it' is referentially ambiguous between 'Ayşe' and 'the fish' after (a) but can refer only to 'Ayşe' in (b) (Nilsson (1985: 25)):

(118) Turkish
      Onu gördün mü?
      her/it see.pst.2sg q
      'Did you see her/it?'

Arguably, we may have reasons to pick out constructions that exhibit similar incorporation-like properties even if they do not involve a reduction of grammatical marking. Thus, the impossibility of adding a modifier which was said to characterize Hungarian "incorporated" objects finds a parallel in some English verb + adjective combinations, discussed by Williams (1997):

(119) a. John wiped clean the table.
      b. John wiped the table clean.

Only in (b) can clean be modified by an adverb:

(120) a. *John wiped very clean the table.
      b. John wiped the table very clean.

On these grounds, it might thus be claimed that clean in (a) is on its way to becoming incorporated into the verb wipe. The combination wipe clean still differs from a normal English verb by allowing inflectional affixes on the first rather than on the second element, though.

For constructions where, in the words of Miner (1986), elements "enter into closely-knit units…but stop short of actually being incorporated", I propose the term quasi-incorporation, which I find preferable to the "analytic incorporation" proposed by Nedergaard-Thomsen (1992), since it is more compatible with seeing true one-word constructions as the primary cases of incorporation.

Unexpected grammatical marking in quasi-incorporation. We have seen that quasi-incorporated patterns are often characterized by the total or partial lack of "normal" grammatical marking. In some cases it looks as if what we are getting is not lack of marking but rather the "wrong" marking, that is, a use of a grammatical marker that seems to be against the normal rules. For instance, in Northern Swedish vernaculars, constructions with bare singular count nouns are often translated as definite noun phrases. Thus, (117b) in Elfdalian would be
(121) Elfdalian
      Am est-n.
      have.1pl horse-def
      'We have a horse, i.e. we are horse-owners.'

Cf. also an example like the following:

(122) Elfdalian
      Ǫ jät suppų min stjiedn.
      she eat.prs soup.def.acc with spoon.def.dat
      'She eats soup with a spoon.'

In both of these examples, the use of the definite form of the noun is wholly unexpected, from the point of view of current theories of the definite article. The phenomenon is not unique, though. A possible translation of (122) into French is

(123) French
      Elle mange la soupe à la cuillère
      she eat.prs def soup with def spoon
      'She eats soup with a spoon.'

where a definite NP is used after the preposition à. With this preposition, the definite article seems more or less obligatory. With the synonymous preposition avec, the definite article is possible, but the preferred variant appears to be with an indefinite NP:

(124) French
      Elle mange la soupe avec une cuillère.
      she eat.prs def soup with a spoon
      'She eats soup with a spoon.'

Grammars of French tend to ignore examples like (123) or treat them as generic, which is hardly an adequate characterization synchronically. Diachronically, on the other hand, it is probable that such uses of definite marking are indeed extensions of generic uses, like (27)–(28) in Chapter 5, and like them, are examples of what may happen to definite articles at advanced stages of their life cycles, when they have expanded far from their original niches.
10.4 Lexical affixes

In the preceding section, we saw one type of borderline case of incorporation. Problems in delimiting the notion of incorporation may also arise from the fact that what is incorporated may not always be an element that can show up as an
independent word. Thus, languages from a number of different North American families (Salishan, Tsimshian, Wakashan) exhibit lexical affixes — morphemes that are like grammatical affixes in that they may be non-syllabic, they must always be attached to a root (and cannot function as roots themselves) and form relatively closed classes, but retain concrete meanings that would in most languages be connected with lexical items or stems (Mithun (1997, 1998)). Thus, Bella Coola has a suffix -lst translated as 'rock' and another -nalus»lsaR 'between the toes' (Mithun (1997)). The affixes are used in forming both nouns and verbs (or predicates), as when the Bella Coola suffix -uc 'mouth' is used in the noun sqal-uc 'fruit' or the predicate aRpusm-uc 'have a swollen mouth'.

In general, these languages contain ordinary lexical items with the same or similar meanings as the affixes. Sometimes, these are clearly etymologically cognate to the affixes, in which case the affixes show signs of having undergone phonological reduction, as in the Nisga'a pair is‘kw- (verb stem) 'to stink' : is‘ (affix) 'smell of'. In other cases, however, there is no such link to free lexical items. In spite of their concrete, word-like semantics, lexical affixes tend to have meanings which are "more general and diffuse than those of their stem counterparts" (Mithun (1997: 364)), as exemplified by the Bella Coola pair cuca (stem) 'mouth' : -uc (affix) 'mouth, lips, hair around the mouth, implements associated with eating, orifices in general including doors and the opening of bottles, nets and bags, and for edges, particularly of bodies in water'. Moreover, says Mithun, they may function "to regulate the flow of information in connected speech" (1998: 294), e.g. by serving to "background information that is established, predictable, or incidental". The number of lexical affixes in a language can be quite large — some 200 to 400. They retain not only (part of) their concrete meanings but also their productivity, i.e. their capacity to enter into new combinations (although the affix classes themselves are usually closed).

The Eskimo languages exhibit a somewhat special pattern in that special verb roots are used in incorporating constructions in combination with nouns that also appear freely:

(125) Eastern Canadian Inuktitut
      a. Jaani-up iqaluk niri-janga.
         J.-erg fish.abs eat-par.3sg/3sg
         'Johnny is eating/ate (the) fish.'
      b. Jaani iqaluturluq < iqaluq-tuq-juq.
         J.-abs fish-eat-par.3sg/3sg
         'Johnny is eating/ate (the) fish.' (Allen (1996: 158))

In the Inuktitut example (125), the incorporating -tuq and the independent niri- both mean 'eat' but are separate lexical items. The question of the status of these
and similar patterns — whether they should be regarded as incorporation or not — has been discussed for about a century without being finally settled.

Obviously, the borderline between "ordinary" incorporation/compounding and lexical affixes is a rather thin one. Ideally, in the first case we would be dealing with elements that can also be used as independent words, whereas in the second case we would have items that only show up as affixes. However, as we have seen, some of the affixes still retain a connection to their free cognates, and an alternative analysis might be possible where they would be seen as bound variants of them. Words do in fact often change their appearance when they are parts of compounds, as in Swedish gata 'street' + korsning 'crossing' → gatu-korsning 'street crossing'.

In fact, the phenomenon of lexical elements that only show up as bound morphemes is a very general one. Examples can be found for instance in English. Thus, to some extent matching Bella Coola -lst 'rock', there is an English morpheme petro- 'rock', as in petroglyph 'rock carving', petrogenesis 'the formation of rocks' etc., which occurs only as the first part of a compound. Functionally, morphemes such as -pod 'foot', occurring in more than a hundred English words listed in the OED (almost exclusively bahuvrihi compounds), come closer to the lexical affixes of North American languages. In English, elements of this kind typically originate as borrowings. Thus, Latin and particularly Greek compounds have entered the modern European languages in great numbers, but many of the elements appearing in those (e.g. -pod) have become productive, some not only in combination with each other but also with native elements, such as pseudo- in pseudo-life and pseudo-word.

Lexical affixes are important theoretically in several ways. As Mithun points out, they are a challenge to common ideas of the division of labour between different kinds of elements in language, as they show that processes which are similar to those found in grammaticalization may take place without the final result being grammatical items of the usual type. What this means, in my opinion, is that classical grammaticalization is part of a larger class of processes of grammatical change, but also, crucially, that this is not the whole story — the genesis of grammatical categories cannot be fully explained by those processes. Another relevant point here is that by preserving much of their original concrete meaning, lexical affixes serve as an illustration of the thesis that an important source of structural complexity in language lies in the retention of properties from earlier stages of the life cycles of constructions.
10.5 Overview of NP-internal incorporation and quasi-incorporation While “classical noun incorporation” has been at the centre of an intensive debate for a long time, some other types of incorporation have received much less attention. This applies to several of the patterns I shall discuss in this section, which have
in common that some lexical element becomes an incorporated or quasi-incorporated element of the head noun of a noun phrase. Before going into these, however, I shall make some observations about "ordinary" noun compounding, which of course has been extensively discussed in the literature, but perhaps less often in this kind of context.

10.5.1 Compound nouns

Compound nouns are a common phenomenon, but languages differ significantly as to the role and frequency of noun compounding, and there are also quite marked differences with respect to individual types of compounds. In fact, rather than thinking of compound nouns as a general undifferentiated category, it appears more adequate to speak of a relatively large number of possible compounding constructions which may be manifested in languages, sometimes in rather different ways. For instance, consider a type of compound that is extremely useful in our technological society: the combination of a general noun such as machine with some verb indicating its function, e.g. washing-machine. Notice that English compounds of this type are consistently expressed as V-ing-N. In many other Germanic languages, on the other hand, the verb stem is simply juxtaposed to the noun: German Waschmaschine, Swedish tvättmaskin, making it difficult to tell if the first component is a verb or a noun. In French, this type of compound simply doesn't exist: one has to use what is in essence a periphrastic construction — machine à laver 'machine to wash'. In Russian, one would normally use an adjectival construction: stiral'naja mašina, where stiral'naja is an adjective meaning something like 'pertaining to washing'. In fact, the existence of compounds may be no worse as a basis for a typology of languages than traditional criteria such as the existence of inflectional morphology. From the point of view of a compound-challenged language like Russian, Swedish and German are almost polysynthetic.

Instead of the compound nouns typical of Germanic languages, Romance languages often have phrasal constructions of the type vêtements de femme 'women's clothing', where the modifying noun is used without a determiner — arguably a case of quasi-incorporation. Consider also in this connection English expressions such as sheep's clothing.

The degree of "tightness" of compounds may vary, even within a language. Let us take Burmese as an example. Burmese is a language that is characterized by an extensive use of nominal compounding,4 and also, apparently, by a rather fuzzy
4. Verb + verb compounding is also quite extensively used in Burmese, but noun + verb compounding appears to be fairly restricted; the canonical cases of noun incorporation do not seem to be possible.
borderline between compounds and NPs with phrasal attributes. Okell (1969: 49) speaks of the distinction in terms of "tighter" and "looser" linkage between the elements of noun expressions, saying that for the looser cases, "it is more convenient to refer to the expression as a 'noun phrase' rather than as a 'compound noun'". But this has no bearing on the structural relationship between the elements, and there are many borderline cases, he says, and he uses "compound noun" as a convenient abbreviation for both kinds.

Okell (1969: 50) lists a number of features that are "relevant to the tightness or looseness of the link between the members in a compound noun". First, loose links are characterized by "separability" — which here is actually the same as expandability — and reversibility (of element order). Thus, the loose character of (126a) is seen in the fact that it can be expanded into (b) or reversed as in (c).

(126) Burmese
      a. eìñci a˘pya
         shirt blue
         'blue shirt'
      b. eìñci le‘tou a˘pya
         shirt short-sleeved blue
         'blue short-sleeved shirt'
      c. a˘pya eìñci
         blue shirt
         'blue shirt'
On the other hand, the tightness of an expression may be manifested in internal sandhi, as when the first consonant of the second component is voiced (indicated by underlining), e.g. thoùñpei 'three feet' (< thoùñ 'three' + pei 'foot'), or in phonological reduction (Okell's term: "weakening"), as in hna˘pei5 'two feet' (< hni‘ 'two' + pei 'foot'). Finally, there may be variants with and without the "formative prefix" a˘ on the second element, as in mye‘ a˘yaìñ or mye‘yaìñ 'wild grass' (< mye‘ 'grass' + a˘yaìñ 'wild').

Which semantic or other factors determine whether a Burmese complex noun expression is given a loose or a tight expression is not clear, but the tight ones do not necessarily correspond to noun + noun compounds in other languages. In particular, there exist a certain set of "head-following attributes tightly linked with the head, which occur… with such a wide variety of heads, and so frequently, that they are classified here as 'auxiliary members' in compound nouns — or briefly 'auxiliary nouns'" (Okell (1969: 82)). In addition, most of these are so-called "bound nouns", that is, they occur exclusively in compounds. The list of auxiliary
5. The symbol a˘ represents [ə].
nouns includes several morphemes labelled 'plural' and a number of quantifiers such as taìñ 'every' and si 'each', but also some words expressing prototypical adjectival meanings (see 10.5.2 below).

It is of course common for more lexicalized compounds to show a greater degree of phonological reduction etc. than newly formed ones. Sunday and Monday are obviously compounds of sun and moon + day, and yet an astronomer who lectures on the sun one day and on the moon the other might call these days sun day and moon day with full vowels and a very different prosody. But we may also observe that compound constructions change over time, becoming more "rigid" in their structure. In Swedish, compound nouns are with few exceptions stressed according to the principle: main stress on the first element, secondary stress on the last element. This is quite independent of the internal structure of the compound: for instance, when the words region 'region' and the already compounded sjukhus 'hospital' are joined to form the compound regionsjukhus 'regional hospital', the primary stress (on sjuk) of the second component disappears but the secondary stress (on hus) stays. There are two kinds of consequences:

– It is not possible to use stress as a disambiguating device in compounds. For instance, the compound noun stor-stads-gata 'big-town-street' can (at least in principle) be parsed as [stor-stads]-gata 'street in a big town' and stor-[stads-gata] 'big street in a town' and will have the same stress in both cases (as the sketch below illustrates). This is in contrast to languages such as German, and even Standard Finland Swedish, where such words would be distinguished by the placement of secondary stress.
– It is not possible to use stress to focus on an element of a compound. Again, we may compare with German, where it is very common for elements of compounds to receive contrastive stress. For instance, the words ˈArbeitgeber 'employer' and ˈArbeitnehmer 'employee' both have the main stress on the first element, but when they are conjoined they are normally pronounced as Arbeitˈgeber und Arbeitˈnehmer. This is in general not a natural thing to do in Swedish; the main stress stays on the first element: ˈarbetsgivare och ˈarbetstagare.
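The first of these consequences can be made vivid with a small illustration of my own: the hypothetical helpers below apply the Swedish rule (main stress on the first element, secondary stress on the last) to the flat sequence of compound members, so the two bracketings of stor-stads-gata necessarily come out identical.

    # Swedish compound stress sees only the linear string of elements,
    # never the internal bracketing, so it cannot disambiguate parses.

    def compound_stress(elements):
        """Primary stress (ˈ) on the first element, secondary (ˌ) on the last."""
        return "".join(["ˈ" + elements[0], *elements[1:-1], "ˌ" + elements[-1]])

    def flatten(tree):
        """Reduce any bracketing to its flat sequence of elements."""
        if isinstance(tree, str):
            return [tree]
        return [leaf for branch in tree for leaf in flatten(branch)]

    street_in_big_town = (("stor", "stads"), "gata")
    big_street_in_town = ("stor", ("stads", "gata"))
    print(compound_stress(flatten(street_in_big_town)))  # ˈstorstadsˌgata
    print(compound_stress(flatten(big_street_in_town)))  # ˈstorstadsˌgata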
What we see here is the association of compounds with a fixed and particular prosodic pattern, which does not normally show up in simplex words or in phrases — it is thus unique to compounds.6 I think this evolution of specific prosodic patterns for
6. Not quite, though. Swedish has a class of "false compounds", such as löjtnant 'lieutenant' and paradis 'paradise', which are pronounced according to the compound pattern — the second element has secondary stress and contains a long segment (e.g. the vowel i in paradis). Most of these are loanwords with the "abnormal" property of being polysyllabic yet not analyzable into morphemes. Apparently, treating them as compounds is a way of accommodating them in the phonological system. This means, however, that the compound pattern has expanded so as to also include some morphologically simplex words.
compounds is one way in which compounds can be said to represent mature structures. The genesis of these patterns remains obscure, however.

Other unique (and mature) features of compounds have a more transparent origin. Thus, in Germanic languages, there are often traces of genitive marking on the first element of compounds. Sometimes, this is manifested straightforwardly as a linking -s-, homonymous with the genitive ending of nouns, as in Swedish bord-s-ben 'table leg'. At other times, however, the first element preserves an older genitive form, which only shows up synchronically in this context, e.g. Swedish kyrko-gård 'churchyard' (from kyrka 'church') or fåra-kläder 'sheep's clothing' (from får 'sheep').

In English noun-noun compounds (or words etymologically derived from such compounds), we find at least three different prosodic patterns, ordered by degree of tightness:

– "phrasal stress", with the main stress on the last element, as in orange mármalade;
– "compound stress", with the main stress on the first element and secondary stress on the last element, as in bírdsong;
– simplex word pronunciation, as in Monday.
The English system illustrates a feature that seems to be fairly general, viz. that there are only a few prosodic patterns that cover a large number of compound types, often cross-cutting them. Semantically and syntactically similar compounds may thus display different degrees of tightness. For instance, street names usually have phrasal stress, as in Massachusetts Avenue, with the notable exception of those that end in Street, which have compound stress, as in Oxford Street. (See also below 10.5.5.) Many cases are quite idiosyncratic. Spencer (forthcoming) says that "[t]he variation in stress patterns prompts us to think of analogical or connectionist explanations rather than ascribing them to some set of general grammatical rules". It may be noticed that compound stress sometimes also shows up with phrases that look like ordinary possessive constructions, such as Pláto's Problem. On the other hand, noun + noun combinations with phrasal stress in English tend not to be clearly differentiated from other attributive structures. An alternative would be to see them as adjective-noun constructions with zero-derived de-nominal adjectives.

10.5.2 Adjective + noun compounding

As is well known, adjectives do not always form a separate word class in languages. Under this heading, we shall look at functional equivalents of English adjective + noun combinations, irrespective of the word class status of the first elements, but referring to them as "adjectives" for simplicity.

Incorporation of adjectives into noun phrases is a process that has received relatively little attention in the linguistic literature. When it appears in languages of
a polysynthetic character, it tends to be taken as a matter of course, and examples are more or less quoted in passing. Compounds consisting of an adjective and a noun are common in many languages. In textbooks, the difference between compounds and syntactic word combinations is often illustrated by the English minimal pair blackbird vs. black bird. In English, adjective-noun compounds are usually heavily lexicalized and the process is only marginally productive. In other Germanic languages such as Swedish and German, the number of such compounds is much higher and it is easy to create new ones. For instance, a mainframe computer may be called stordator 'big-computer' in Swedish and Grossrechner in German. But these compounds still obey a "unitary concept" constraint. A stordator is a specific kind of computer, not one that just happens to be a bit bigger than usual.

A special type of adjective + noun compound is what is called "bahuvrihi", following the Sanskrit tradition. A bahuvrihi compound such as bluetooth (as in the name of the Danish king Harold Bluetooth) does not denote an object described by the components but rather, by metonymy, an individual which is characterized by it.

Of more direct interest to the questions discussed in this chapter are cases in which tighter combinations of adjectives and nouns have developed without being constrained in the ways compounds usually are. Such constructions can be found in a number of different languages, and seem to follow some general tendencies. I shall now review some of them — I suspect that there are many more examples, but these constructions are not always very salient in grammars.

Lakota. In Lakota, according to Boas & Deloria (1941), "[n]eutral verbs performing the function of adjectives are compounded with the noun which they follow" (157). Adjectives are thus not treated as a separate word class but rather as a functional notion — they are stative verbs in attributive position. Boas & Deloria say: "The adjective is identical with the neutral verb. As a verb it retains its independent accent, as adjective it loses it." (69) Although it is possible to join a neutral verb and a noun by putting the verb into a full relative clause, compounding appears to be the only way of constructing a noun-adjective combination within a single clause, and it could thus be argued that we are dealing here with a case of obligatory incorporation. (127) illustrates the word t‘ø´ka‘ 'large' in predicative and attributive function:

(127) Lakota
      a. ṡų́ka kį t‘ø´ka‘
         dog def large
         'the dog is large'
      b. ṡų́ka t‘ø`ka
         dog large
         'large dog'
"Neutral" verbs in Lakota are opposed to "active" verbs and include statives but also some dynamic verbs that do not relate "exclusively to animate beings", such as 'tremble'. Presumably not all of these show up as adjectives. The meanings of the "neutral verbs" that are mentioned by Boas & Deloria as occurring in attributive function include 'hot', 'long', 'red', 'pleasant'.

Compounds in Lakota obey special accenting rules. In the section on accent (Boas & Deloria (1941: 21)) it is stated that if the first part of the compound is bisyllabic, the second part retains "a very weak secondary accent". Accordingly, the first syllable of the adjective in (127) is marked by `. This mark is absent in the words cited by Boas & Deloria where the adjective is monosyllabic, although they do not give a rule for this. If the first part of the compound is monosyllabic, the placement of the accent depends on "syntactic rules", Boas & Deloria say (1941: 21). However, the examples of compounds with monosyllabic nouns that they cite do not differ in accentuation from the bisyllabic cases. On the other hand, "[w]hen noun and adjective are thoroughly amalgamated into one concept, the first stem, if monosyllabic, loses its accent which falls on the second syllable". Cf. the two examples c'éġa-zi 'yellow kettle' and c'eḣ-zí 'brass kettle'. Thus, the latter type is presumably both more highly lexicalized and tighter than the former. (See also Rankin et al. (2002) for discussion.)

Burmese. As noted above, "compound nouns" in Burmese may contain adjective-like elements, as in a˘pyouhlá 'pretty girl' (< a˘pyou 'girl' + a˘hlá 'pretty') or na˘hpyu 'white cow' (< nwa 'cow' + a˘hpyu 'white'). In particular, among the "auxiliary nouns", which only occur in compounds, we find cì 'large, great, much, very', as in tai‘cì 'large building', and (hka˘)leì 'young, small, little', as in le‘hswèei‘hka˘leì 'small handbag', exemplifying prototypical adjectival meanings. (To judge by the example kou Tiñcì 'old Ko Tin', cì can also have the reading 'old'.)

Chukchi (Muravyova (1998: 526)). In Chukchi, there are three types of adjectives: (1) qualitative adjectives formed on the pattern n-Stem-qin: n-erme-qin 'strong'; (2) possessive adjectives formed from nouns by the suffixes -in, -ənin(e) or -ərgin: Tutun-ənin 'belonging to Tutun'; (3) relational adjectives marked with -kin: aŋqa-ken 'relating to the sea'. Attributive adjectives are obligatorily incorporated when the head noun is in any other case than the absolutive. (Incorporation is sometimes also possible with absolutive nouns, but it is unclear if there are any principles.) Non-incorporated adjectives agree with the head noun in number, although agreement is optional with possessive adjectives. Incorporated qualitative adjectives lose the circumfixal marker n-…-qin:
(128) Chukchi
      Non-incorporated: n-ilgə-qin-Ø qora-ŋə 'a white reindeer (abs.sg)'
      Incorporated: elgə-qora-ta 'a white reindeer (erg.)'
Relational adjectives keep their derivational suffix (but do not agree in number):

(129) Chukchi
      Non-incorporated: aŋqa-kena-t galga-t 'sea birds (abs.pl)'
      Incorporated: aŋqa-kena-galga-ta 'sea birds (erg.)'
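The alternation in (128)–(129) amounts to a simple rule: under incorporation, the qualitative circumfix n-…-qin is stripped, while the relational -kena- survives as a link between adjective and noun. The sketch below is my own restatement, with an invented function name; the stem shapes, case endings and the ilg- ~ elg- vowel alternation are copied from the examples rather than derived.

    # Building Chukchi incorporated NPs as in (128)-(129): the
    # n-...-qin circumfix of qualitative adjectives does not survive
    # incorporation, whereas relational -kena- does.

    def incorporated_np(adj_stem, noun, case_ending, relational):
        link = "-kena-" if relational else "-"
        return adj_stem + link + noun + "-" + case_ending

    print(incorporated_np("elgə", "qora", "ta", relational=False))
    # elgə-qora-ta        'a white reindeer (erg.)'
    print(incorporated_np("aŋqa", "galga", "ta", relational=True))
    # aŋqa-kena-galga-ta  'sea birds (erg.)'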
Possessive adjectives are said to be incorporated less often than other types of adjectives (it is not clear how that meshes with obligatoriness). Among the other languages from the Chukotko-Kamchatkan family, adjective incorporation is also found in Koryak but not in Itel'men (Volodin (1976: 322)).

Celtic. In the Celtic languages, attributive adjectives typically follow their head nouns; an alternative preposing, and generally tighter, construction is also usually possible (Croft & Deligianni (forthcoming)). Thus, in Old Irish (Thurneysen (1909)), attributive adjectives were either postposed to the head noun, agreeing in gender, number, and case, or preposed, without agreement and forming a compound with the noun. In such a compound, the main word stress was on the adjective (stress was generally word-initial) and the initial consonant of the noun was subject to lenition. Examples:

(130) Old Irish
      Preposed: fír-brithem 'righteous judge'; ar noíb-bríathraib 'before the holy words'; in nuae-thintúd sa 'this new translation'
      Postposed: bretha fír-a 'just judgments'; húanaib aidmib noíb-aib 'of the holy instruments'; á cétal nuae 'the new song'
In a few cases, different stems were used in the two constructions: preposed dag- or deg- 'good' vs. postposed maith 'good' (e.g. dagfer vs. fer maith 'good man'), preposed droch- or drog- 'bad' vs. postposed olc(c) 'bad'. In some cases, the compound construction was the only one possible: mí- 'bad, ill-, mis-, wrong', so-, su- 'good', bith- 'lasting, eternal'; however, the examples given by Thurneysen all seem
to be lexicalized, e.g. so-chor 'good agreement, advantage'. Some derived adjectives were excluded from the compound construction. In the more recent Celtic languages, compound-like preposed attributive adjectives are also found, although it appears that there is a tendency for the construction to fossilize:

– In Modern Irish, "many adjectival meanings are expressed by means of prefixes" (Mac Eoin (1993: 117)), e.g. dea- 'good', droch- 'bad', so- 'good', do- 'bad', in- 'capable of', etc.
– In Scottish Gaelic, a small number of common adjectives are preposed to the noun and form "quasi-compounds" with them, e.g. seann chù 'old dog', droch thìde 'bad weather', deagh dhuine 'an excellent fellow' (Gillies (1993: 203)).
– In Manx, the following "adjectival prefixes" are found: ard 'high, main, chief' (ard valley 'city'), drogh 'bad' (drogh ven 'bad woman'), shenn 'old' (shenn ven 'old woman'), reih 'choice' (y reih dooinney 'the best man') (Broderick (1993: 240)).
– In Welsh, preposed adjectives are said to occur mainly in poetry, except for a few such as hen 'old', which however must be postposed when modified: hen ddyn 'old man' (with mutation of the initial consonant in dyn 'man') but dyn hen iawn 'a very old man' (Watkins (1993: 331)).
– For Cornish, hen 'old' and tebel 'evil' are given as examples of adjectives that preceded the noun and caused lenition of it (George (1993: 440)).
– In Breton, a few adjectives precede and cause mutation of their head: gwall 'famous', krak 'puny' (always); hir 'long', berr 'short', kaezh 'poor', gwir 'true' (sometimes) (Hemon (1970: 32)).
Romance. Phenomena somewhat similar to those in Celtic are found in the geographically and genetically close Romance languages, where both preposed and postposed attributive adjectives occur, and the preposed ones tend to have "tight" properties. The set of adjectives that can be preposed to the noun is usually restricted, and the list tends to include at least some of the "prototypical" adjectives such as 'good' and 'big' but not necessarily all of them. Let us consider some of the rules given for Spanish in standard grammars, such as Fält (2000):

– the default placement is postnominal;
– the most common prenominal adjectives are grande 'big, large', bueno 'good', and malo 'bad, evil';
– grande, bueno, and malo tend to be used postnominally (i) for contrast; (ii) when modified; (iii) in a more "objective" sense: un hombre grande 'a (physically) big man' vs. un gran hombre 'a (spiritually) great man';
– prenominal placement is common (especially in narrative writing) when the property ascribed is "inherent" or "commonly known", e.g. la bella princesa 'the beautiful princess';
– prenominal placement may be used to obtain a "balance" when there are two modifying adjectives: un importante centro ferroviario 'an important railway junction';
– differences in meaning are sometimes found, e.g. un viejo amigo 'an old (= long-time) friend' : un amigo viejo 'an aged friend'; un nuevo vaso 'another glass' : un vaso nuevo 'a new glass'.
In some cases, notably with adjectives having the meanings 'big' and 'beautiful', preposed adjectives may be reduced — the ending and sometimes the final consonant are dropped, as in Italian il bel paese 'the beautiful country'7 (instead of il bello paese) or Spanish el gran libro 'the big book' (instead of el grande libro).

A slightly different pattern is found in French, where the more tightly integrated patterns — NPs with preposed attributive adjectives — are paradoxically enough phonologically heavier than the less integrated ones. French attributive adjectives may both precede and follow the head noun, but so-called liaison — the non-application of the process of final consonant deletion before a following vowel — can only occur when the adjective precedes, as in un petit ami [œ̃ pətit ami]. Given that final consonant deletion typically occurs at the end of a phonological phrase, it is of course natural that it is less likely to take place within a tight construction. Frequency effects may also play a role, as suggested by Bybee (2000: 349), although it is not so easy to disentangle them from other factors.8

Southern Ute. Another fairly clear case, from another part of the world, is Southern Ute, as described in Southern Ute Tribe (1980: 290), where it is said that attributive adjectives may be either postposed to nouns, in which case they carry agreement suffixes and are said to have a "contrastive/identifying function", or else preposed, in which case they are optionally9 incorporated:

(131) postposed, non-incorporated:
      Kavá sá-gˆa-ru-m¸u¸ 'u y¸a'¸áy-kya.
      horse.sbj white-an-sbj that-sbj die-anterior
      'The white horse died.' (rather than the black one)

(132) preposed, incorporated:
      Sá-gavá 'u y¸a'¸áy-kya.
      white-horse.sbj that.sbj die-anterior
      'The white horse died.'
7. Written as one word and capitalized (Il Belpaese), this is used as a lexicalized way of referring to Italy.
8. I am grateful to Mats Forsgren for discussions of this issue.
9. The grammar gives no examples of preposed adjectives that are not incorporated, and it is unclear if they agree with the head noun or not.
Scandinavian. In the traditional dialects spoken in a rather large area of Northern Scandinavia — comprising parts of Norway, Sweden, and Finland — incorporation is a common and often the standard way of including a modifying adjective in a definite noun phrase. Consider e.g. the following example from Elfdalian and its counterpart in Swedish:

(133) Elfdalian
      swart-rattsj-in
      black-dog-def.nom.sg
      'the black dog'

(134) Swedish
      den svart-a hund-en
      def black-wk dog-def
      'the black dog'

Both Elfdalian and Swedish have suffixed definite articles, but Swedish has, in addition, a preposed article when there is an adjectival modifier. However, in the traditional dialects in the adjective incorporation area, the use of a preposed article is very restricted and can probably be ascribed to recent influence from Standard Swedish. In Swedish, the adjective takes what is traditionally called a "weak" ending (possibly a development of an erstwhile definite article on adjectives); in most of northern Scandinavia, the weak endings of adjectives tend to be deleted in a process referred to as apocope. Since the adjective incorporation area is properly included in the apocope area, it is not implausible that the latter created the preconditions for the former.

Apocope was probably originally a wholly phonologically conditioned process applying to word-final unstressed vowels in non-utterance-final position after a stressed long syllable (a syllable which contained at least one long segment). In Modern Elfdalian, many words still alternate between apocopated and non-apocopated forms depending on the position in the sentence. The process is no longer purely phonological, though, since many words (especially new additions to the lexicon) do not participate in it. Apocope leaves a trace behind in that the distinction between the two Scandinavian tonal word accents (see p. 205) is preserved even though the resulting word might consist of a single syllable: the tone contour "spills over" on the first syllable of the next word, as it were. Apocope does not apply to words whose stressed syllable is short (i.e. both the vowel and the following consonant are short).

The prosodic pattern in a phrase consisting of an apocopated adjective and a noun is relatively similar to that of compound nouns, and it is possible that this made it easier for the adjective pattern to merge with the compound noun pattern. However, not only apocopated adjectives but also those with short stem syllables (where the ending is not apocopated) take part in the incorporation pattern. We
231
232 The Growth and Maintenance of Linguistic Complexity
(135) Elfdalian
ber-o-kwið-n
naked-wk-belly-def
‘the naked belly’

This shows that even if apocope may have been an important factor in the genesis of adjective incorporation, the latter is not a straightforward consequence of it. On the other hand, it also shows that adjective incorporation is not simply a generalization of the Adj+N compound patterns that are found in most Germanic languages, including older stages, but rather derives from a syntactic combination of adjectives and nouns.

Within the adjective incorporation area, there are certain important differences. Comparing the Upper Norrland region in the north of Sweden with Elfdalian (representing Dalecarlian from northernmost Central Sweden), it can be generally said that adjective incorporation is more restricted in the south. In Upper Norrland, not only plain adjectives but also e.g. superlatives and ordinals such as ‘first’ and ‘last’ can be incorporated. Thus, in Upper Norrland we find incorporated forms such as fössjt-gânga ‘the first time’ corresponding to Elfdalian fuäst gandjin. Why these forms have resisted incorporation in the south is unclear. It may be noted that such constructions often have peculiar properties — in Swedish they tend to lack the preposed article, for instance (första gången ‘the first time’).

Another important feature of Elfdalian is that there is an alternative to incorporation in that the demonstrative an dar ‘that’ can be used without its original deictic force and is grammaticalizing into a preposed definite article with adjectives. Instead of (133) one might thus say

(136) Elfdalian
an dar swart rattsj-in
that there black dog-def.nom.sg
‘the black dog’

where the adjective and the noun do not form one word. The competition between the two constructions can be seen in the translation of a Swedish novel, Hunden ‘The dog’ by Kerstin Ekman, into Elfdalian (the translator was Bengt Åkerberg).10 Before I comment further on the result of that investigation, I wish to consider definite noun phrases with adjective modifiers in general.

10. Bengt Åkerberg, who is himself a native speaker of Elfdalian, checked the translation with a number of other speakers. It therefore seems relatively safe to say that the text is a good representation of native intuition.
One important point here is that this is a very infrequent construction in spoken language, as was noted by Thompson (1988).11 In a corpus of half a million words of spoken Swedish — corresponding to 1250 printed pages — there were only 253 examples of the pattern

(137) den/det/dom Adj-e/a N-def

that is, the standard form of such NPs in Swedish. (Comparatives and superlatives were excluded from this count.) This is equivalent to about one in ten minutes of conversation, or once in five printed pages. It is important to note that I am speaking of definite NPs here — the overwhelming majority of attributive adjectives show up in indefinite NPs. In addition, it turns out that a few adjectival lexemes had a rather dominant place among those examples: about 40 per cent consisted of tokens of the four adjectives stor ‘big’, liten ‘small’, gammal ‘old’, ny ‘new’. It is probably no accident that these items are among the cross-linguistically prototypical adjectives in the sense that they show up in practically every language that has a separate class of adjectives. In written dialect texts, which are either direct renderings of spoken language or else tend to be close to spoken language in form, the corresponding patterns also show up very sparingly.

11. The allegation in Newmeyer (1998: 41) that Thompson’s findings, which are corroborated by my data, were an artifact of her corpus seems unfounded. The data Newmeyer quotes from Chafe (1982) only show that definite attributive adjectives do exist even in spoken language, a fact which Thompson does not deny — she only says they are infrequent.

By contrast, in Kerstin Ekman’s novel, the frequency of this construction was 279 in about one hundred pages, that is, on average three per printed page, or approximately ten times as many as in the spoken corpus. In addition, the distribution of different adjectival lexemes is very different. The four “top” adjectives mentioned above account for 26 tokens or less than 10 per cent of the total. It is fairly clear that definite NPs with adjective modifiers have a rather different role in the genre represented by this novel. Instead of simply helping to identify the referent of the NP, adding a modifying adjective to a definite NP in such texts is often a device to add information edgewise — consider examples such as det starka ljuset från himlen ‘the strong light from the sky’ or den mörkgröna bladfällen ‘the dark-green pelt of leaves’.

Someone who wants to translate such a text into a language with a very restricted written tradition faces a peculiar situation: it is necessary to decide how to say things that have never or very seldom been said before in that language. In this sense, the translated text is not a natural sample of the language, and this might call the results into doubt. On the other hand, the translation may also be seen as a (partly unintentional) grammatical experiment — what happens if a native speaker is forced to express all these definite NPs with the adjectival modifiers retained? And the patterns in the results turn out to be quite significant.
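As a rough cross-check of the frequency figures just cited (the conversational speech rate of about 200 words per minute is my inference from the figures, not a rate stated in the text):

\[
\frac{1250\ \text{pages}}{253\ \text{tokens}} \approx 4.9\ \text{pages per token},
\qquad
\frac{500{,}000\ \text{words}}{253\ \text{tokens}} \approx 1980\ \text{words per token} \approx 10\ \text{minutes at c. 200 words per minute}
\]

\[
\frac{279/100\ \text{tokens per page}}{253/1250\ \text{tokens per page}} \approx \frac{2.8}{0.2} \approx 14
\]

The ratio between the two text types is thus roughly an order of magnitude, consistent with the ‘approximately ten times’ stated above.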
In the original Swedish text, there were 279 cases of NPs with at least one adjectival modifier and a preposed definite article. Of these, 108 (39 per cent) were translated by using the an dar construction, and of those that were not, 53 (19 per cent) involved adjective incorporation. The total number of incorporated adjectives in the material is 172; this figure also includes, for instance, adjectives in indefinite NPs and cases in which a compound noun with an adjective was used in the Swedish original as well. The rest of the 279 definite tokens were translated in various indirect ways (e.g. with relative clauses), testifying to the difficulties in finding a natural translation with an attributive adjective. Thus, the an dar construction is by far the most common translation of Swedish definite noun phrases, suggesting that it may be on its way towards taking over the niche of the incorporation construction. It should be emphasized, however, that the percentage of incorporated adjectives may be higher in spoken language, where there are far fewer attributive adjectives in definite NPs and the distribution of individual adjectives is quite different.

Among the adjectives that are incorporated, we can first note that there are 16 occurrences of the three “prototypical” adjectives stur ‘big’, lissl ‘small’ and gambel/gamt ‘old’ (the fourth adjective from the top group — ny ‘new’ — occurs only once in the original and the translation is not incorporated). In particular, the adjective stur ‘big’ is incorporated 10 out of 12 times. In other words, these prototypical adjectives have an incorporation propensity that is about three times higher than average. Among other adjectives that occur incorporated more than once we find gryön ‘green’, guäl ‘yellow’, langg ‘long’, swart ‘black’, wåt ‘wet’. Except for the last one, all of them belong to the semantic groups that are likely to show up as adjectives. There were also clear correlations between propensity for incorporation and parameters such as frequency and length. Out of 29 examples of (single) adjectives with more than one syllable, only four were incorporated.

Only once were two Swedish adjectives translated as a double incorporation (lausug-wait-kwiðn ‘the lice-ridden white belly’). This seems to be another difference between the southern and northern parts of the adjective incorporation area: in the north, there is no reluctance to incorporate more than one adjective. This was reflected in the translations provided by informants from Arjeplog (Upper Norrland) and Älvdalen for the Swedish sentence Den lilla vita katten sprang in i det stora röda huset ‘The little white cat ran into the big red house’:

(138) Swedish
Den lilla vita katten sprang in
def little.wk white.wk cat.def run.pst in
i det stora röda huset.
in def big.wk red.wk house.def
‘The little white cat ran into the big red house.’
(139) Arjeplog
Lill-vit-katt-n sprang in i sto-rö-hus-e.
little-white-cat-def run.pst in in big-red-house-def
‘The little white cat ran into the big red house.’

(140) Elfdalian (Älvdalen)
An dar lissl wait mass kåjt in
that there little white cat run.pst in
i e dar stur roð ausað.
in that there big red house.def
‘The little white cat ran into the big red house.’

In Ossmol, the dialect spoken in Orsa, about 50 kilometers from Älvdalen, the incorporating construction seems to have disappeared almost totally in favour of the an dar construction, except in clearly lexicalized phrases and with the two adjectives ny ‘new’ and gambel ‘old’. Generalizing about the competition between the two constructions, it appears that the incorporating construction survives better with “core” or “prototypical” adjectives, and that it has particular difficulties in the case of multiple modifiers.12

Summing up. Combinations of adjectives and nouns may become tightened and integrated into a one-word construction without losing their productivity. Such a development seems to be favoured by a low prominence of the adjective but may also be preceded by more general reduction processes. Diachronically, there seems to be a tendency for the tighter constructions to be restricted to a few adjectives, usually “prototypical” ones, such as ‘big’, ‘small’, ‘old’, ‘new’, ‘good’, ‘bad’, i.e. the ones that show up in languages in which adjectives are a closed class with a small number of members (Dixon (1977)). The existence of such languages suggests that there is a tendency to give special grammatical treatment to this group of concepts, which may well be connected with greater inherent tightness. In Celtic, Romance, and Southern Ute, we may note the contrast between tighter preposing constructions and looser postposing ones, manifested both in lack of agreement and in the application of sandhi rules such as liaison.
12. One would also expect such difficulties to occur when the adjective is modified by an adverb. However, it turns out that there are no such cases in the material! The conclusion is that even in a literary text such as Kerstin Ekman’s novel, with a comparatively high frequency of definite NPs with adjectival modifiers, the adjectives are themselves seldom modified. An Internet search reveals that such cases do occur, although much more infrequently than with indefinites (this goes for both Swedish and English). Thus, the phrase a very big is about twenty times as frequent as the phrase the very big.
Similar differences seem to be quite widespread. In the context of a general discussion of the relationship between tightness (“linguistic distance”) and NP-internal word order, Croft & Deligianni (forthcoming) quote Komi and Persian as examples of languages where preposed adjective constructions lack the grammatical markings of the corresponding postposed ones. They argue for a general principle according to which preposed modifiers would be more tightly integrated.13

Even in English, differences between preposed and postposed modifiers supporting such a claim may be found. Adjectives in the default preposed slot cannot be expanded by posthead modifiers; the position after the head noun is then exploited: a cinema close to you rather than *a close to you cinema. (Swedish has basically the same constraint. See however 10.8 below for some complications.) In order to avoid the difficulties this would create for X-bar theory, Stowell (1981) (quoted in Bhat (1994)) proposed that prenominal adjectives in English should be derived by lexical rules. Croft & Deligianni (forthcoming) advance the hypothesis that “prenominal modifiers are syntactically more tightly integrated into the noun phrase than postnominal modifiers”. However, it is questionable whether the hypothesis holds for other NP-internal elements than adjectives: in many cases, demonstratives/articles and possessive pronouns seem to be more easily integrated with the noun when postposed to it.

One notable feature of adjective incorporation in Scandinavian and Chukotko-Kamchatkan languages is that it is obligatory — that is, the only available construction — in some grammatical contexts, although it is difficult to make any cross-linguistic generalization about possible conditions for such obligatoriness. In the case of noun incorporation, obligatoriness appears to be a relatively rare phenomenon, Southern Tiwa being a glaring exception.

10.5.3 Possessive NP constructions

There are several examples of possessive NP constructions (Mary’s book) with incorporating features (cf. also above about possessive adjectives in Slavic). Most celebrated is the Afro-Asiatic “construct state construction”, represented in the Semitic and Egyptian branches of this phylum, where a noun in a form called “construct state” representing the possessee is juxtaposed to a possessor noun phrase, as in the following example:
13. It seems to me that this principle becomes more plausible if it is restricted to lexical elements such as adjectives and NPs as possessors. For demonstratives and pronominal possessors, the number of counterexamples is quite large (some of them are mentioned in Croft & Deligianni (forthcoming)).
(141) Modern Hebrew
beit ha-mora
house.cs def-teacher
‘the house of the teacher’

In Classical and Modern Standard Arabic, the possessor NP is in the genitive case, as in:

(142) Modern Standard Arabic
mu’allimu l-madrasati
teacher.cs def-school.gen
‘the teacher of the school’

This most probably represents an older stage of development. There are a number of observations that support the incorporation analysis of the construct state construction. The following points are based on the discussion in Borer et al. (1994):

– The form referred to as “construct state” is really a stem without any affixes. It does not carry word stress and it may be more or less reduced phonologically, as in Modern Hebrew bayit ‘house’: beit ‘house (construct state)’.
– In spite of the fact that most of the languages involved have an NP-initial definite article, an NP formed with the construct state construction does not have such an article; rather, the definite marking of the possessor NP is enough for the whole NP to be regarded as definite, as in (141)–(142).
– The possessee noun cannot be directly modified, for instance by an adjective; instead, adjectives modifying the whole possessive NP show up after the possessor phrase. If the possessor phrase is itself modified, a “nested” structure is obtained as in the following example:
(143) Modern Hebrew
kis‘ot [ha-kita ha-xadasha] ha-civ‘onim
chairs.m.pl [the-class the-new] the-colourful.m.pl
‘the colourful chairs of the new class’

It may be noted that the absence of definiteness marking on the possessee head noun is a feature also found e.g. in the Celtic and the Germanic languages — cf. English Mary’s book vs. the book of Mary. Since Celtic also has a possessee–possessor word order, some researchers have found the similarities to the Afro-Asiatic construction sufficient reason for extending the term “construct state” to those languages also. Another parallel is that cited by Longobardi (1994) from Romance. Apparently, in a number of Romance languages, cognates of casa ‘house’ and a few other nouns may appear — sometimes in a phonologically reduced form — in a peculiar possessive construction exemplified by Catalan ca’l mestre ‘the teacher’s house’. It may be added that a very similar construction, although with an adverbial function, must be the source of the French preposition chez ‘at (the place of)’ and the synonymous Scandinavian hos (< hus ‘house’).

Discussing the construct state construction in Old Egyptian, Kammerzell (2000) speaks of it as involving true compound nouns. Around 2500 B.C.E. it was apparently the standard construction for all kinds of nominal possessors:14

(144) Old Egyptian
a. inalienable
Aal-Áwan
face.cs-brother
‘the brother’s face’
b. alienable
t’apat-Áwan
boat.cs-brother
‘the brother’s boat’

However, in Middle Egyptian (around 2000 B.C.E.), a new construction was introduced for alienable possession, involving a demonstrative pronoun. At this stage, then, compounding was used only for the expression of inalienable possession (characterized as “above all body parts, kinship terms and entities indispensably connected with a particular individual such as name, household property” (Kammerzell (2000: 100))). (145) shows the two possibilities:

(145) Middle Egyptian
a. inalienable
Aar-Ásan
face-brother
‘the brother’s face’
b. alienable
t’apat nət-san
boat det.f-brother
‘the brother’s boat’

A similar division of labour between the descendant of the construct state construction and an innovative periphrastic construction is found in Maltese (Koptjevskaja-Tamm (1996) and Dahl & Koptjevskaja-Tamm (1998)), although here, the inalienable construction involves the juxtaposition of independent words rather than a compound. Kammerzell discusses the Egyptian data mainly in relation to the hypothesis in Nichols (1988: 576–578) about the connection between inalienable possession and head marking; from the perspective taken in the two papers just referred to, it is another example of the genesis of an alienability distinction through the expansion of a new periphrastic construction encroaching on the territory of an old construction with bound marking (see 7.6.1).

14. I am using Kammerzell’s phonological representation rather than the traditional Egyptologist transcription that leaves out the vowels.
In the Upper Norrland dialect area in Northern Sweden, an incorporating construction is common as one of several alternative ways of expressing NPs with possessive modifiers. The possessor noun is preposed to the head noun without any marking, and the result is apparently indistinguishable from an ordinary compound. The possessor nouns in the examples given in the literature are mainly proper names and kinship terms, e.g. pappaskjorta ‘father’s shirt’ (Lövånger, Västerbotten (Holm (1942))); Pel Pescha-héstn ‘Per Persson’s horse’ (Luleå, Norrbotten; Nordström (1925)). Notice that the area in question is included in the larger area in which attributive adjectives in definite noun phrases are normally incorporated in the NP.

The possessive constructions with incorporated possessors are remarkable in that they involve the incorporation of highly referential noun phrases, in the case of the Semitic construct state construction even NPs with definite articles. We may also note that when phonetic reduction takes place, it affects the possessee noun (the head noun) rather than the possessor noun.

10.5.4 Co-compounds

akṣarāṇām akāro ’smi dvandvaḥ sāmāsikasya ca
aham evākṣayaḥ kālo dhātāham viśvatomukhaḥ
‘Of letters I am the letter A, and among compound words I am the dual compound. I am also inexhaustible time, and of creators I am Brahma.’
Bhagavad-Gita 10.33
A type of compound deserving special attention is the one variously called coordinative or (with the traditional term) dvandva compound, illustrated by the following example from Vedic Sanskrit:

(146) Vedic Sanskrit
ajāváyas ‘goats and sheep’ (< ajá ‘goat’ + ávi ‘sheep’)
The following account relies heavily on Wälchli (2003), whose term “co-compound” I shall also adopt as a general name for the phenomenon. Sanskrit dvandva constructions are, for all practical purposes, single words — although it is apparently relatively rare that the development goes that far (Wälchli (2003)). It is more common for the components to preserve some autonomy (thus, in writing, they are often hyphenated). From a typological point of view, it can be said that there is a continuum from coordination of two totally separate syntactic phrases to compounds of the Sanskrit type, but in individual languages, coordinating constructions may be positioned at different points in this continuum. It is thus common that one and the same language has one “looser” and one “tighter” way of coordinating nouns and noun phrases. Consider the following examples from Tamil:

(147) Tamil
a. Akkaav-um aṇṇan-um neettu vantaaṅka.
elder.sister-coord elder.brother-coord yesterday come-pst-3rat.pl
‘Elder sister and elder brother came yesterday.’ (Asher (1982: 72))
b. Appaa-va-mmaa eṅka poonaaṅka.
father-and-mother where go-pst-3rat.pl
‘Where did mother and father go?’ (Asher (1982: 206))

Here, (a) contains two conjoined noun phrases, both marked by the coordinating suffix -um, but in (b) ‘father and mother’ is expressed by a compound noun.

Tightness in coordinating constructions may be manifested in various ways. Tighter constructions often have nothing corresponding to a conjunction or marker of coordination, or else the marker is reduced relative to what is found in looser constructions. In English, the phonetic realization of the conjunction and varies from [ænd] via [ən] to [n]. The spellings ’n (as in Tom’n Jerry) and & (as in Smith & Sons) can be seen as representing tight coordination in the written language. Tighter constructions also tend to have deviant grammatical marking of the components. Wälchli (2003) distinguishes three types:

– zero marking, where neither component receives an expected marking, as in Father and son went for a walk (instead of The father and the son…);
– single marking, where the expected marking shows up on one component only, as in the house and garden (rather than the house and the garden), or Hopper & Thompson’s theory (rather than Hopper’s & Thompson’s theory);
– double marking, where both components are marked, as expected, but where this marking may differ from what would be expected, as when components with singular reference are marked as dual or plural (see below for an example from Vedic Sanskrit).
Tighter constructions are used above all for combinations of items that occur together “naturally” and are thus understood as conceptual units, such as ‘father and mother’, ‘knife and fork’, ‘night and day’ etc. They are often heavily lexicalized, in which case patterns such as rhyme, ablaut and alliteration are also very common (Ourn & Haiman (2000)). Productivity may vary. For instance, the tight pattern mentioned above from Tamil is said to be ‘not very productive’ (Asher (1982: 206)). On the other hand, patterns of this type play an important role in the formation of lexical items in many languages. For instance, Ourn & Haiman (2000) report that there are hundreds of “near synonym compounds” in Khmer, such as chap rfhah ‘fast’ (where the components also mean ‘fast’ or ‘quick’). As Wälchli (2003) shows, this is an areal phenomenon typical of East Asian languages.

The productivity of less tight constructions is often greater. Thus, in Germanic languages, we find that what Lambrecht (1984) calls “binomial expressions” are quite widespread. This type of conjoined noun phrase is characterized by the absence of determiners, as in examples such as Father and son went for a walk. Lambrecht distinguishes three types of binomials in German:

– Type A. Irreversible binomials, as in Mackie und Polly wurden Mann und Frau ‘Mackie and Polly became husband and wife’
– Type B. Pre-schematized binomials, e.g. Mann und Frau bilden eine biologische Einheit ‘Man and woman form a biological unit’
– Type C. Contextualized, e.g. Mann und Frau verließen das Zimmer ‘(The) man and (the) wife left the room’
Type A is wholly conventionalized (“freezes”). Binomials of Type B are “semantically motivated but not necessarily conventional”. Both Type A and B “are motivated by some pre-existing cognitive relation that holds between their members”. Binomials of Type C are motivated by context only, and are thus in a way similar to Mithun’s Type C. Lambrecht gives the following example to show how this works:

(148) German
Er ging in den Laden, um ein Hemd und ein Messer zu kaufen. Er fand, was er brauchte, und nachdem er Hemd und Messer bezahlt hatte, verließ er zufrieden den Laden.
‘He went into the store to buy a shirt and a knife. He found what he needed, and after he had paid for (the) shirt and (the) knife, he left the store satisfied.’

Here, the binomial expression Hemd und Messer has an anaphoric function, referring to a pair of objects that were introduced in the preceding sentence. There is no pre-established conceptual relationship between these two objects, however.
The development of co-compounds towards a tighter construction can be followed in Sanskrit. In Vedic Sanskrit (Macdonell (1916: 269)) the most common type of dvandva compounds was formed from names of deities, where each member had dual number and a separate accent:

(149) Vedic Sanskrit
mitrā́-váruṇā
mitra.du.nom-varuna.du.nom
‘Mitra and Varuna’

In oblique cases, the tendency was to modify the case ending only of the last component, e.g. mitrā́-váruṇayoḥ in the genitive rather than mitráyor-váruṇayoḥ. In a later development, the first component may assume the stem form rather than the dual, as in indra-vāyū́ ‘Indra and Vayu’ — here the first element has also lost its stress and the last syllable of the second element takes an acute accent. This, then, illustrates the progressive formal reduction and integration of the components of the pattern.

10.5.5 Titles and other proprial classifiers

Combinations of titles and proper names behave peculiarly in many languages, in that there is often a small class of nouns denoting socially relevant categories of persons that can be combined with proper names without being equipped with the customary grammatical trappings of a full-blown noun phrase. Cf.

(150) I met Professor Smith.
(151) *I met Lecturer Smith. (cf.: I met Smith, the lecturer.)

In other languages, this class of nouns may be larger. In Swedish, for example, it is possible to say

(152) Swedish
Jag träffade lektor Smith.
I meet.pst lecturer Smith
‘(lit.) I met lecturer Smith.’

but on the other hand, if a modifier is added to the title, the definite form of the noun has to be used:

(153) Swedish
Jag träffade den engelsk-e lektorn Smith.
I meet.pst def English-wk.m.sg lecturer Smith
‘I met the English lecturer Smith.’

The articleless construction is thus still quite restricted. In Finnish, the question of articles does not arise, but it is noteworthy that titles are not case-marked:
(154) Finnish
Tapasin lehtori Leino-n.
meet.pst.1sg lecturer Leino-acc
‘I met the lecturer Leino.’

Titles of this kind, then, arguably represent a proto-incorporating construction. Phonetic reduction is common, as in French monsieur [m(ə)sjø] < mon seigneur [mɔ̃ sɛɲœr]. True incorporation of titles also occurs in many languages. Finnish thus has an alternative construction in which the title follows the name. In the written language, the name and the title are then written together with a hyphen, as in Antti-setä ‘uncle Antti’. In this construction, only the title is declined:

(155) Finnish
Tapasin Antti-sedän.
meet.pst.1sg Antti-uncle.acc
‘I met uncle Antti.’

Similar constructions are found e.g. in Japanese: Tanaka-san ‘Mr. Tanaka’ and Elfdalian: Ands-bil ‘uncle Anders’. Titles are of course prime examples of politeness items, and as such, tend to be used obligatorily, both in vocative and referential uses. It is therefore no surprise that they are subject to reduction and tightening processes. Similar embellishments of names, more or less obligatory, are found in prefixes to names of saints, e.g. Saint in St. John, also phonetically reduced to [sənt] or even just [sn̩] (and correspondingly, in the written language, abbreviated to St.). In the same way, Spanish santo ‘saint’ is reduced to San before names, e.g. San Francisco.

A peculiar development is seen in Mandarin Chinese, where the words lao ‘old’ and xiao ‘young’ are obligatory with monosyllabic surnames, both in referential and vocative uses: thus a person called Zhao has to be called lao Zhao or xiao Zhao. Disyllabic surnames, on the other hand, are normally used without these adjectives. According to Shi (2002: 71), the explanation lies in the general preference for disyllabic patterns in Modern Mandarin Chinese.

Titles are actually only one particular case of a more general phenomenon which to my knowledge has not received proper attention in the literature, namely what I shall call proprial classifiers, that is, elements that more or less systematically or even obligatorily attach to proper names of certain semantic types, thus exemplifying a kind of “grammaticalization”. An example from English would be Mount, which seems to be rarely used anywhere else than as prefixed to the name of a mountain, as in Mount Everest. A well-developed system is found in official Russian, where any toponym is in principle prefixed by a noun, most often abbreviated, e.g. gorod Moskva or g. Moskva ‘(the city of) Moscow’, selo Bliny-S”edeny ‘(the village of) Bliny-S”edeny’ etc.
One reason that proprial classifiers are not noticed is that they are understood as part of the name. In many languages, toponyms in particular are often compounds in which one element identifies the type of entity referred to, as in Cape Town. The identifying element then often varies and its status as a classifier is perhaps not always very clear. In the case of English street names, however, we saw above that there is one specific such element — Street — that is associated with a different stress pattern and can therefore be claimed to be an incipient case of grammaticalization.

There are also very productive compound-forming processes of this type. In English, an ethnic group like the Sirionó in Bolivia may be referred to collectively as the Sirionó or the Sirionós. In Swedish, however, the preferred way is Sirionófolket ‘the Sirionó people’ or (less politically correctly) Sirionóindianerna ‘the Sirionó Indians’. To some extent, this may be ascribed to an uncertainty about what the definite plural of Sirionó would be. In a similar way, one may speak of one car made by Volvo as en Volvo ‘a Volvo’ but ‘two Volvos’ tends to be två Volvobilar ‘two Volvo cars’ to avoid problems with plural formation.

Among titles, it is not uncommon for there to be elements that display a particular behaviour which suggests that they have moved further along the “grammaticalization” path. For instance, in German, Herr and Frau tend to be used even if other titles are present, more or less obligatorily and always in initial position, e.g. Frau Professor Schmidt.
10.6 Incorporation of locational and directional adverbs/particles

Elements such as English out and up tend to be subject to many different processes that straddle the borderline between grammaticalization and incorporation. These processes differ from classical grammaticalization in that it is often not an individual element that acquires a grammatical use but rather a whole set of elements that are recruited into a construction. Moreover, they are also different from typical cases of incorporation in that what is incorporated is a grammatically closed class of items. The most important processes are well-known from the history of English and other languages:

– a locational/directional adverb may develop into an adposition, e.g. English up as in up the hill;
– a locational/directional adverb may be recruited to form a complex adposition, as in English into or upon (cognates of the same morphemes appear in reduced form in Swedish på ‘on’ < upp å ‘up on’);
– a locational/directional adverb may be incorporated into the verb complex, as in English upgrade;
– the combination of a locational/directional adverb and a simple verb may be lexicalized as one unit, as in English grow up.
At least in the languages of Europe, it appears to be quite common for a directional element to double, that is, cognate morphemes appear in two positions, both in the verb and as an adposition, or once as a particle and once as an adposition:

(156) Russian
On vo-šel v dom.
he in-go.pst in house
‘He went into the house.’

(157) Swedish
Han gick in i huset.
he go.pst in in house.def
‘He went into the house.’

In the Germanic languages, there have been at least two different waves of incorporation of locational particles, resulting in two synchronic layers. In the first round, particles developed into unstressed prefixes, sometimes with syntactic functions such as transitivization (as in German be-malen ‘paint’). In Scandinavian, these prefixes were in general deleted, giving rise to a zero-marked transitivizing process. The following round resulted in a somewhat chaotic picture, in that incorporation tended to be incomplete. The choice between incorporated and non-incorporated particles here follows different rules in the various languages. For most verbs belonging to the second wave in German, the choice depends mainly on the position in the sentence: if the verb is in the final position, the particle is incorporated, otherwise it is normally free. In Scandinavian, the choice is partly stylistic, but in many cases depends on lexical idiosyncrasies.

As is well known, incorporated locational adverbs play an important role in the aspectual systems of the Slavic languages, and similar phenomena can also be found in many of the geographically adjacent languages, such as Baltic and Finno-Ugric languages. In this context, what is worth noting is that this is a rather peculiar case of grammaticalization: it is not a specific morpheme that has been assigned a grammatical role; rather, it is the members of the construction “locational prefix + simple verb” that have obtained the categorization “perfective”, and subsequently been integrated into imperfective : perfective verb pairs (see also discussion in 9.3). Since the origins of the system go back to prehistoric times, it is rather difficult to specify exactly what happened, but here we obviously see the result of a complex interaction between different kinds of maturation processes.

Craig & Hale (1988) describe the behaviour of what they call “relational preverbs” in a number of languages from the Americas, as exemplified by the following:
(158) Rama
Ka-na-ngalbi-u
rp/from-1-run-tns
‘I ran away from (him).’

(159) Nadëb
Bxaah kalapéé ya-sooh
tree child on-sit
‘The child is sitting on the tree.’

They hypothesize that the relational preverbs derive historically from postpositions, as in the following example:

(160) Rama
Na-ngalbi-u naing taata kang.
1-run-tns my father from
‘I ran away from my father.’

An alternative possibility that they do not consider but which would bring the situation in these languages more in line with what we find in Europe would be to assume that there is no direct link between postpositions and preverbs but that they are both derived from an adverbial particle like away in English.
10.7 Referentiality constraints

We have seen that the degree of referentiality of an expression may influence its incorporability. There are in fact various other phenomena that suggest that referential lexical noun phrases (and perhaps stressed pronouns, as well) on the one hand show greater internal integrity, yet integrate less well into surrounding expressions. In the study of “island phenomena” in syntax, it has been noted that referential expressions make up less easily penetrable islands than non-referential noun phrases. In Swedish, a gap after a topicalized or interrogative NP in a relative phrase is more acceptable if the relative phrase is not part of a referential NP (Allwood (1975–76)):

(161) Swedish
Vilken författare känner du ingen som har skrivit om?
which author know you nobody who have.prs write.sup about
‘What author do you know nobody who has written about?’
(162) Swedish
?Vilken författare känner du
which author know you
kvinnan som har skrivit om?
woman.def who have.prs write.sup about
‘What author do you know the woman who has written about?’

Likewise, a reflexive pronoun in a relative clause can have an antecedent in the matrix clause if the NP that contains the relative clause is not referential:

(163) Swedish
Han har ingen som tar hand om sig.
he have.prs nobody that take.prs hand about himself
‘He has nobody who takes care of him.’

(164) Swedish
*Han talar med kvinnan som tar hand om sig.
he talks with woman.def who take.prs hand about himself
‘He is talking to the woman who takes care of him.’

Consider, further, the possibility of “de re” readings of referential expressions in modal contexts, without the existence of which the following sentence would attribute a contradictory statement, not just a false one, to the defendant:

(165) The defendant stubbornly claimed that she had not stolen the jewels that she had stolen.

In the more likely “de re” reading, on which the defendant is claimed to be lying rather than contradicting herself, the noun phrase the jewels that she had stolen is interpreted outside of the scope of the verb claimed. The defendant referred in her statement to a specific set of jewels, but she did not use the description in (165) — that is added by the person who utters that sentence. In other words, the semantic interpretation of such sentences is not strictly compositional — rather, the NP that gets the “de re” reading is interpreted as if it were outside of the clause it appears in. This speaks in favour of a processing model in which the processing of referential NPs is at least partly independent of the rest of the sentence that they appear in.

Notice, though, that there are some striking counterexamples to the generalization that high referentiality bars incorporation. Thus, among the examples of dvandva compounds from Vedic Sanskrit, a salient role was played by combinations such as ‘Mitra and Varuna’, which were said to refer to pairs of deities. Likewise, the NPs with incorporated possessors from North Swedish and Old Egyptian included examples such as ‘Daddy’s shirt’, ‘the brother’s boat’ and the like. One property that distinguishes these examples from the ones where referentiality constraints are operative is fairly obvious — in the case of both the co-compounds and the possessive constructions it is not only what is incorporated that is referential; the resulting phrase is also a referential, definite noun phrase. It is thus possible that being a referential expression is an obstacle to being incorporated into a structure only if that structure is not itself a referential expression.
10.8 Incorporating patterns in the making?

An easy to find place — postmodifiers of prenominal adjectives in English. It was said in 10.5.2 that English attributive adjectives in prenominal position cannot take postmodifiers. However, this constraint is not absolute, as noted in Huddleston & Pullum (2002: 551), who quote examples such as a better than average result and his hard as nails attitude to the workers, and note that comparative complements “are permitted provided that they are very short, usually than or as + a single word, which cannot be a referential NP”. In other words, although prenominal adjectives can be expanded rightwards, these expansions themselves seem to be subject to expandability and referentiality constraints. We could not for instance get *a better than ours result (Huddleston & Pullum’s example) or *a smarter than his parents kid.

Huddleston & Pullum also note the existence of “hollow” premodifying infinitival clauses but seem to want to say that they are restricted to expressions like a ready-to-eat TV meal which have “something of the character of a fixed phrase”. This formulation appears too restrictive if applied to actual usage. Thus, Huddleston & Pullum star the expression an easy to find place, yet it has over 1200 hits on the Internet (according to Google).15 In fact, the infinitival construction seems to be fairly productive: patterns with substantial numbers of Internet occurrences are for instance easy to recognize + Noun and difficult to read + Noun. Prepositional phrases, which are not mentioned by Huddleston & Pullum, also occur: a full of energy person, a close to nature life, a limited by guarantee company (all actual examples from the Internet).

It is thus not impossible that postmodifiers of attributive adjectives are becoming more common and less restricted in English. In Quirk et al. (1985: 1349), postmodifiers are qualified as “ad hoc ‘compounds’”, which are hyphenated “for typographical clarity”. The notion “ad hoc ‘compound’” is not defined, but there is a reference to another section, treating examples such as their day-to-day complaints and her too-simple-to-be-true dress, which are said to have “a flavour of originality, convention-flouting, and provisional or nonce awkwardness” (Quirk et al. (1985: 1336)).

15. It is of course rather difficult to tell how many of the Google occurrences were produced by native speakers of English. However, the claim that such expressions are fairly common in contemporary English is corroborated by the following observation: in one single official document from the State of North Carolina (http://ssw.unc.edu/cares/rk/recordkeeping.pdf) I found four instances of easy-to-find used in a prenominal, attributive position.
It is possible that these are in fact properties of an innovative and not yet fully established pattern in the language. An interesting feature here is the tendency to view postmodified attributive adjective phrases as one-word constructions, manifested in Quirk et al.’s treatment of them as “compounds” (in quotes in the original) and the tendency to hyphenate them. On the Internet, uses with and without hyphens seem to be about equally frequent (search engines generally neglect hyphens, so it is hard to be more exact). It is as if the pre-nominal slot in noun phrases is too tight for multi-word constructions, at least if they are right-branching, and one has to compress the adjective phrase to make it fit. It is tempting to speculate that we are witnessing an incipient incorporating pattern. On the other hand, if it is true that English is becoming more liberal in allowing postmodification of pre-nominal adjectives, it could be argued that the slot is in fact becoming less tight. We shall see somewhat similar examples below.

Incorporation into verbs from noun compounds. Many languages have extensive N+N compounding without permitting classical noun incorporation, i.e. N+V compounding. It is not clear if this pattern is general enough for us to establish a typological hierarchy. The question nevertheless arises as to whether it is possible for a classical noun incorporation pattern to develop out of N+N compounding. Between nouns and verbs there is a large grey area inhabited by various nominalizations, de-verbal adjectives and non-finite verb forms. Here is a possible bridgehead for incorporating constructions, since the borderline between what is possible and what is not is sometimes rather thin. Consider the pattern N+V-ing in English, exemplified by coffee-drinking and pipe-smoking. Although these words are hardly acceptable in predicative position, e.g. ??She is coffee-drinking, they are perfectly possible as modifiers of nouns, e.g. today’s coffee-drinking public. An example such as a Gauloise-smoking Frenchman also seems possible, showing that the pattern is productive.

The step to a true predicative use of similar patterns is quite small, though, if for instance a compound nominalization is combined with a “light verb”. An interesting example of this from English is the pattern Noun + bashing, involving the verb bash, in the sense “attack physically or verbally”, which is apparently relatively recent in English, to judge from the fact that older dictionaries do not mention it. The compounding pattern Noun + bashing is very productive, with examples such as gay-bashing, Chomsky-bashing, China-bashing, monogamy-bashing etc. In a further step, such compounds can be complements of “light verbs” such as do, commit, engage in. Cf. the following Internet examples:

(166) We are positive, and don’t do male-bashing.
(167) The occupiers produced leaflets and did lecture-bashing to urge other students to join the sit-in.
(168) …friends of Pakistan who love Pakistan and would never engage in “Pakistan-bashing”…

…and most impressively, the following excerpt from a discussion group, demonstrating the possibility of creating new instances of this construction “on the fly”:

(169) [First discussant:] This thread has become a Java vs. Objective-C forum. I’ve refrained from doing any Objective-C bashing simply because I don’t know Obj-C that well. …
[Second discussant:] I have not been doing java bashing. I don’t know Java. I have been doing Apple bashing, Java-bug bashing, java-limitations bashing, and auto-GC-no-hints-available bashing.

Another variant is found in the following Swedish example, where the noun trappa ‘staircase’ has been incorporated into the nominalization städning ‘cleaning’, to fill its object slot:

(170) Swedish
Trappstädning pågår.
staircase_cleaning go.on.prs
‘Staircases are being cleaned (lit. staircase-cleaning is going on).’

More systematic uses of similar patterns seem to be quite common cross-linguistically, particularly in progressive and habitual constructions. The North Frisian construction quoted at the beginning of this chapter is a case in point. It contains a nominalized verb (formally identical to the infinitive, but preceded by the definite article), which allows compounding according to normal rules for nouns. There are counterparts to the North Frisian construction in many West Germanic varieties, and in most of these, the incorporated noun behaves the way one would expect of the first component of a compound noun. For instance, a full noun phrase can normally only go outside the prepositional phrase, as in (b):

(171) German (regional)
a. Ich bin am Eis-essen.
I be.prs.1sg at_def.dat.m.sg ice-cream_eating
‘I am eating ice-cream.’
b. Ich bin das Eis am Essen.
I be.prs.1sg def.nom.n.sg ice-cream at+def.dat.m.sg eating
‘I am eating the ice-cream.’

However, there are signs that this orderly situation may be breaking up, at least in some Germanic varieties. Thus, Ebert (2000: 611) reports that some Züritüütsch speakers accept (172), with a definite noun phrase squeezed in between the am and the nominalized verb:
(172) Züritüütsch
Si isch am t’ herdöpfel schele.
she is at the potatoes peel
‘She is peeling the potatoes.’

One may also note that complete prepositional phrases are readily accepted in many varieties. Ebert (ibid.) quotes examples from Standard German, Öömrang North Frisian and Frysk (West Frisian), such as

(173) German
Sie sind am Kohlen-in-den-Keller tragen.
they be.prs.3pl at_def.dat.m.sg coals-in-the-cellar carry
‘They are carrying coals into the basement.’

A similar process seems to be going on in the German construction zum + nominalized verb, which expresses the goal of some action. Abraham (1995) characterizes (174) as “oberdeutsch-umgangssprachlich” (Upper German colloquial):

(174) German
Wir haben die Leute
we have.prs.1pl def.pl.nom people
zu-m frische Muscheln essen eingeladen.
to-def.n.sg fresh.pl mussel.pl eat.inf invite.pst.ptcp
‘We have invited the people to eat fresh mussels.’

In this case, a plural-marked noun with an adjectival modifier — again something that could not normally be the first component of a compound — takes the place of the incorporated noun. I say “takes the place of” because it is actually not clear that we are dealing with incorporation any longer. T’herdöpfel schele and frische Muscheln essen look like phrases rather than words. The incorporating construction has expanded its domain by allowing phrases in the direct object slot, but at the same time it seems to be losing its character of incorporation. This process is noteworthy since it shows how a looser construction may arise out of a tighter one — a counterexample to claims about the unidirectionality of maturation processes. On the other hand, it may be objected that there is no strict descendancy relation between the constructions in question. Again, we see an illustration of the identity problems of constructions that were discussed in 7.5.
10.9 Explaining incorporation

We have seen in the preceding sections that there are a number of pervasive tendencies in incorporating and quasi-incorporating patterns, although it is not the case that all such patterns show all these tendencies — in particular, quasi-incorporating ones may display them to varying degrees. I shall now try to review the properties that tend to characterize these patterns, looking for the factors that may lie behind them and the diachronic processes that give rise to them. To start with, some properties pertain to patterns as wholes:

– Constraints on productivity. Many patterns are wholly unproductive or productive only to a limited extent.
– Tendency to lexicalization. Many patterns have a large proportion of lexicalized members (this is related to constraints on productivity but not necessarily identical to it).
The connection between lexicalization and tightness is hardly astonishing. It appears natural that highly entrenched and conventionalized combinations of lexical items should receive a tight and integrated expression. Rather, what needs to be explained is why the correlation is not perfect: how a pattern may be tight and still remain productive.

– “Unitary concept” constraints. It is often said that the members of incorporating/quasi-incorporating patterns must denote “unitary concepts” or e.g. “institutionalized”, “stereotypical” or “permanent” activities.
“Unitary concept” constraints are connected to lexicalization in the sense that it is of course such concepts that stand a chance of being lexicalized (5.4). On the other hand, the phenomenon seems to go beyond lexicalization and entrenchment. Consider e.g. what I called “proprial classifiers” above. As noted, formations such as Swedish Sirionófolket ‘the Sirionó people’ are fully productive, and I may well use such a compound to refer to a group that my audience have never heard of. Perhaps what is at stake here is something that you could call “nameworthiness”. The entity must have a status that in principle makes it possible to invent a name for it.

Typically, there are also constraints on what can go into a slot in an incorporating or quasi-incorporating construction:

– Expandability constraints. These constraints show themselves most clearly in quasi-incorporating patterns, distinguishing these patterns clearly from ordinary syntactic constructions — it is often the case that you cannot add a modifier to a quasi-incorporated element.
– Constraints on prominence management. It is often difficult or impossible to apply focusing or other prominence-changing operations to an incorporated or quasi-incorporated element.
It is natural to connect this with the word-like prosody that characterizes members of the patterns in question (see below). A property that is not connected to prosody in the same way is that incorporated slots normally cannot be “filled by empty categories”, as would be necessary for wh-questions and relative clause constructions to target them. Finally, there are tendencies in how the patterns are manifested:

– Tight prosody. This is a property that pertains to the pattern as a whole. An integrated intonational contour appears to be common to all incorporating and quasi-incorporating patterns, to the extent that it is possible to obtain information about them. The extent to which tighter, word-like patterns occur is more variable.
– Lacking or otherwise constrained morphological marking. We have seen that incorporated and quasi-incorporated elements may be wholly (more often in the case of the latter) or partially deprived of the morphological marking that would be expected in similar syntactic constructions.
– Reduction in the form of incorporated elements. An incorporated or quasi-incorporated element may have a reduced phonetic weight relative to its non-incorporated manifestations.
We may thus note that incorporated elements may be characterized both by a typically “conservative” property — they have withstood the expansion of morphological marking — and an “innovative” one — they are more reduced than others. Obviously, there must be other factors that explain why some items are singled out in this way. The last two properties mentioned here parallel what was said in 8.2 about the behaviour of frequent items with respect to grammatical maturation. Frequency may be a factor which is also relevant for incorporability. We saw in 10.5.2 that among adjectives, the most frequent ones cross-linguistically are the ones that tend to show up most often in tight and incorporated patterns. To judge from the data from Dalecarlian, however, this may partly be a question of which adjectives survive in a receding pattern.

One of the factors that influence which entities obtain morphological marking is referentiality. Notice that for direct object marking, there is a neat inverse relationship between marking by adpositions or case and incorporability: highly referential objects are marked most often typologically and first in diachronic developments and are almost never incorporated; non-referential objects are marked least and last and are often incorporated.

As was noted at the beginning of this chapter, the question of whether incorporation should be seen as a lexical or a syntactic phenomenon has been and still is at the centre of the debate. It might be said that, in anyone’s theory, incorporating patterns put the definition of “lexicon” and “lexical item” to the test.
In a simplistic and traditional view, a lexicon is a list of the words or lexemes in a language, and the grammar consists of two parts: morphology, which provides inflectional forms of the lexemes, and syntax, which provides the rules for combining words into larger expressions. The main problem with this view is the identification of words with “listemes”, the entities that have to be listed rather than generated by rule. As we have already seen in this book, such entities may be both smaller and larger than words. The existence of productive derivational processes makes it impossible to list all the words in a language; the existence of phrasal idioms and the like makes it impossible to derive all multi-word entities by rule. To save the situation, we may re-define the lexicon either as the repository of all listemes in the language, of whatever size, or as the set of all words in a language, whether they are derivable by rule or not. It appears to me that linguists tend to be rather confused on this point. We thus find proposals both to the effect that lexical items may consist of more than one word, and to the effect that lexical items may be derived by rule, but they are seldom pitted against each other as alternatives. This leads to difficulties, at least for modular theories, in that it is no longer clear what the essence of the lexicon is. In the worst case, we may end up with lexical items that are neither words nor listemes. For instance, the locational and directional particles discussed in 10.6 may be either bound or free. Since one and the same combination of a verb and a particle may vary between a “compound” and a two-word phrase in a given language, we may want to see the two-word combinations as lexical as well, in particular as many of them are clearly lexicalized or conventionalized, often with non-compositional meanings. On the other hand, we also find productive verb– particle patterns, which would seem to belong to derivational morphology as long as their members are single words. But there is then no principled way to exclude productive formation of two-word combinations in the lexicon. “Lexicalization” is a term which is no more clearly defined than “lexicon”, although the intuitive idea seems to be that something — primarily a complex expression of some sort — enters the lexicon, or becomes a lexical item. It may be noted that this presupposes that the entities in question may also exist without becoming lexicalized — that is, that e.g. morphological “nonce formations” are not lexical items. Lexicalized phrases, or “lexphrases” for short, is the topic of Anward & Linell (1976), a paper written in Swedish which predates the discussion of incorporation of the two last decades, and indeed does not at all mention incorporation (at least not under that name). Lexphrases are said to be opposed to, on the one hand, wordlevel lexical items, on the other, productive syntactic constructions. In their paper, Anward & Linell focus on those types of lexphrases as are characterized by “connected prosody”, or what is called sammanfattningsaccent or “unit accentuation” in the Scandinavian literature, which means that the phrase is pronounced as one prosodic
For instance, in the phrase spela kort16 ‘play cards’, the main stress is on kort ‘cards’ and the verb spela is de-accented. Anward & Linell identify a fairly large number of structural types of lexphrases with unit accentuation, including both noun phrases, adjective phrases and verb phrases. Among the noun phrases, most are “complex proper names” such as Vita Huset ‘the White House’, but there are also phrases such as kallt krig ‘cold war’. The APs and VPs include adjectives and verbs with various types of complements and adjuncts. Coordinated lexphrases occur among all the phrase types. The list of properties said to characterize lexphrases is quite long. The most important ones are as follows:

– Lexphrases must be syntactically well-integrated constituents. This basically seems to exclude lexphrases that are sentences or predicate phrases (rather than verb phrases) in the sense of Chomsky (1965).
– Lexphrases do not express “specific judgments” but consist of “amalgamated concepts”.
– They often contain “non-derivable meaning components”: this seems to be equivalent to “non-compositional interpretation”.
– Components of lexphrases cannot be independently modified.
– There are strong restrictions on the inflection of components.
– Lexphrases often contain morphological and syntactic idiosyncrasies and irregularities.
– There tend to be restrictions on variation in word order and choice of construction type (e.g. active — passive) in lexphrases.
– Components of lexphrases cannot be separately questioned, negated or assigned contrastive stress.
– Components of lexphrases cannot have specific reference.
– Lexphrases are “anaphoric islands”, that is, a component of a lexphrase cannot be the antecedent of an anaphoric expression.
What is striking here is the similarity between the properties of lexicalized phrases listed here and those that were found earlier in this chapter to be characteristic of incorporating and quasi-incorporating constructions. So how does lexicalization relate to those phenomena? In spite of their emphasis on the prosodic character of the lexphrases they discuss, Anward & Linell note that “unit accentuation” is neither a sufficient nor a necessary property of lexphrases.
16. For the sake of clarity: although kort ‘cards’ in spela kort looks like a bare noun such as häst ‘horse’ in (117b), it should rather be understood as a plural with zero ending. (Neuter nouns ending in a consonant, like kort, normally have no ending in the plural in Swedish.) Anward & Linell also give examples of overt plurals in lexphrases, e.g. jaga flugor ‘chase flies’.
In particular, they point out that this prosodic pattern is also found in such non-lexicalized verb phrases as dricka vin ‘drink wine’, where the object is a bare mass noun and the verb has imperfective aspect, that is, precisely those cases where classical noun incorporation is found in other languages. In fact, Nedergaard-Thomsen (1992) uses “unit accentuation” in Danish as a major criterion for what he calls “analytic incorporation”. It is thus plausible that whatever conditions there are for the prosodic pattern are also a prerequisite for lexicalization. Lexicalization, however, happens only in a subset of all cases. Anward & Linell suggest that phrases may undergo a “temporary lexicalization” which may last only for the rest of a conversation. Cases in point would be when speakers decide on referring to a certain object by a specific definite description such as the guy from Paris. The idea is further developed by Wälchli (2003), who sees temporary lexicalization as a preliminary stage to permanent lexicalization, in a “drift towards the permanent lexicon”. Cognitive linguists would probably prefer to speak of a gradual process of entrenchment. It is perhaps not obvious that the notion of temporary lexicalization can also cover cases like ‘drink wine’.

Part of the difficulties in determining the lexicon may lie in the assumption of modularity. Looking at the different phenomena on the borderline between incorporation and syntax, one sees that although there seems to be a universal set of properties that characterize word-internal structures as opposed to syntactic ones, it is common for expressions to have a subset of these only. Thus, even seemingly simple properties such as expandability may be only partially manifest: English preposed adjectives accept modifiers to the left (a very big house) but not, in general, to the right (*the biggest in town house). Similarly, criteria that work in one language may not work in another: in Turkish, lack of case marking is a hallmark of quasi-incorporated nouns, but not in English, where there is no case marking on nouns in general, or in Hungarian, where quasi-incorporated nouns are marked for case (although not for number).

This last circumstance brings us to a rather tricky problem. If lack of object case marking is one of the major signals of quasi-incorporation in a language, it would appear that the situation has come about through a “halted” grammaticalization. The quasi-incorporated nouns are those that have withstood the expansion of the object case marking. The question is now: were they already quasi-incorporated before the case marking was introduced for the other objects? And in that case, what were the criteria? Notice that a criterion such as “x is not modified” does not in any way violate the normal well-formedness constraints on object noun phrases — rather, it picks out a subset of them. And as long as we only have such criteria we do not really have any grounds for saying that incorporated objects are a separate pattern. The introduction of case marking creates such a criterion, but by changing the rest of the objects, rather than the quasi-incorporated ones, which strictly speaking remain what they have always been. That such a passive change would all of a sudden relegate them from the syntax into the lexicon appears a bit far-fetched.
The alternative is that the lexical rules exist in all languages, although they are covert in some. That is, even a language such as English does have lexically derived objects, although you cannot see it. It seems to me preferable to assume that incorporating and quasi-incorporating patterns arise by a gradual accretion of properties in subsets of members of syntactic constructions, with the ensuing tightening of the members of those subsets and the final emergence of a separate pattern. But what is the general mechanism here, and is there a general principle for separating out the subsets that receive special treatment? I think that a possible solution could be found if we assumed that there are in fact two organizational principles at work. I shall sketch such a solution, although I am aware that it may be a bit too programmatic to be viable at this point, so I present it more as a point of departure for further deliberations.

If you want to give a course of lectures, the lectures normally have a set length, say 45 minutes. Still, the topic of the course may be such that some sub-topics take up much more time than others and have to be allotted several lectures. This creates a mismatch between the hierarchical thematic structure of the course and the division into 45-minute lectures. Analogous situations are found in many places, and are in fact difficult to avoid in any situation where information has to be given in instalments. Files sent on the Internet are delivered in “packets” of a pre-specified length. Suppose in fact that similar principles hold for spoken language, that is, that utterances are delivered in units that we can call “packets”, and that those represent the maximal domain of incorporation processes (what does not fit into one packet cannot be incorporated).

Here is how it might work. A packet would typically be a word or a short phrase, pronounced as at most one prosodic unit (with a unitary intonational contour). It does not seem to be possible to identify it with prosodically defined notions such as “tone unit” or “intonation unit” (in the sense of Chafe (1987)), however. Thus, one intonation unit could encompass more than one packet as defined below. (On the other hand, it probably holds that a packet is normally contained within one intonation unit.) Among the most salient properties of packets are:

– They are highly integrated: in production, they probably correspond to a single command at some level. As a consequence, there are constraints on what you can do with the elements of a packet.
– There are strong constraints on their internal complexity. A possible metaphor is to think of the packets as containers with limited capacity — you cannot cram more than a certain amount of stuff into them. However, it may be more a question of how complex the structure can be.
When a complex linguistic expression is built, a speaker puts as much structure as possible into each packet. If there is too much structure for one packet, or if any other constraint on what can be done within a packet is violated, the material has to be divided into two.
The following are the constraints that seem to apply to packets in general (where “element” means “proper element”, i.e. something that is smaller than the packet itself):

– Lexical referential expressions do not easily fit into packets. A lexical referential expression (LRE) is a noun phrase containing at least one lexical element and having specific reference. Prototypical examples of lexical referential expressions are definite descriptions such as the crocodile or proper names such as Mary. Noun phrases that clearly do not qualify are, on the one hand, pronouns, on the other, bare nouns such as crocodiles.
– An element of a packet may not be independently focused, highlighted, emphasized etc.
– An element of a packet should not have an internal syntactic structure of its own. (What counts as an “internal syntactic structure” is not entirely clear.)
Here are some practical examples:

1. She drinks tea consists of one packet, since there is no element that cannot be accommodated.
2. Mary drinks tea consists of two packets, Mary and drinks tea. Motivation: Mary is an LRE.
3. Mary drinks the tea consists of three packets, Mary, drinks, and the tea. Motivation: Mary and the tea are LREs.
4. Mary drinks TEA (with extra stress on TEA) consists of three packets, Mary, drinks, and TEA. Motivation: TEA is emphasized.
5. Mary drinks tea and coffee consists of three packets, Mary, drinks, and tea and coffee. Motivation: tea and coffee has a complex internal structure.

Although “packet” does not coincide with the traditional notion of “word”, one can say that a packet represents the maximal possible word. That is, the elements of a packet can in principle be integrated into one word, but words do not usually go beyond one packet. Lexicalization usually takes place within packets. However, idioms are sometimes quite complex and apparently must be analyzed as consisting of more than one packet, e.g. lead someone down the garden path. Syntactic constructions are to some extent independent of packet structure, that is, the same syntactic construction (e.g. a transitive verb phrase) may be wholly contained in one packet or involve two separate packets, in which case we obtain a trans-packet syntactic relationship (e.g. between the verb and the direct object in a transitive construction). In the latter case, one of the elements is typically a packet in itself.
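As a rough illustration of how these constraints might operate, the five examples above can be reproduced with a toy segmentation procedure (entirely a sketch of my own, not a worked-out model; each word is annotated by hand for whether one of the constraints forces it into a packet of its own):

    # A word that is a lexical referential expression, is emphasized, or has
    # internal syntactic structure must stand alone; the rest are packed together.
    def packets(words):
        result, current = [], []
        for word, stands_alone in words:
            if stands_alone:
                if current:
                    result.append(current)
                    current = []
                result.append([word])
            else:
                current.append(word)
        if current:
            result.append(current)
        return result

    print(packets([("She", False), ("drinks", False), ("tea", False)]))
    # [['She', 'drinks', 'tea']] -- one packet (example 1)

    print(packets([("Mary", True), ("drinks", False), ("tea", False)]))
    # [['Mary'], ['drinks', 'tea']] -- two packets (example 2)

    print(packets([("Mary", True), ("drinks", False), ("the tea", True)]))
    # [['Mary'], ['drinks'], ['the tea']] -- three packets (example 3);
    # examples 4 and 5 come out the same way, with TEA and tea and coffee
    # flagged as standing alone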
However, syntactic constructions may become restricted to one- or two-packet structures, in one of the following ways:

– A grammatical maturation process which results in the obligatory grammatical marking of the elements of a construction may be restricted to two-packet cases.
– A reduction process may apply only to one-packet cases, resulting e.g. in a distinction between two-word and one-word constructions.
On the whole, elements of one-packet constructions will tend to have less explicit grammatical marking and be more reduced than elements of two-packet constructions. However, a grammatical marking which is initially obligatory only in two-packet cases may later spread also to one-packet cases, resulting in the obliteration of the formal distinction between the constructions.
Chapter 11
Stability and change
11.1 Introduction

In this chapter, I turn to questions concerning the factors underlying stability and change in linguistic systems, particularly from the perspective of maturation processes.
11.2 Measuring stability

To understand the processes by which linguistic complexity arises and disappears, it would be of great value to be able to measure the empirical probabilities of different types of events that take place in language change, and, from a different perspective, to measure the stability of different kinds of linguistic structures. It will not be possible, in general, to calculate anything close to precise probabilities for grammatical changes. One reason is the limitations of the empirical material — for the majority of the world’s languages we know hardly anything about their history. Another is the disturbing effect of areal influence. Looking at a certain area during a certain period, one may get the impression that a given type of change is highly frequent and perhaps even necessary, since it seems to be taking place in practically every language encountered. The history of linguistics is full of hasty conclusions drawn from such biased material. However, preciseness is not essential: what we want to know is, on the one hand, the general order of magnitude of the probabilities, on the other, approximate relationships between different types of changes — i.e. whether one is more probable than the other.

Traditionally, a related but slightly different question has been more in focus: Which elements of a language are frequently borrowed, and which elements tend to resist borrowing? The standard answer is that grammar and core vocabulary comprise the parts of language that are less easily borrowed from one language to another. Morris Swadesh, the founder of lexicostatistics, tried to make the idea of core vocabulary more precise and also widened the question to cover all kinds of lexical change. The basis of lexicostatistics was the claim that the words on Swadesh’s 100-word and 200-word lists were replaced in languages at a reasonably constant rate: in one millennium, about 15 per cent would be replaced.
“Replacement rate per millennium” is thus a possible measure of stability, one which can easily be converted into “replacement rate per generation”, if we assume that generations in human populations are of standard duration (say, 25 years). Another useful measure is “half-life”, used extensively in natural science in the sense of the time required for a sample or quantity to be reduced to half its value — the most well-known case being the time it takes for the radioactivity of a substance to fall by half. This is the same as the median (not the mean!) life-length of an item. A 15 per cent replacement rate per millennium means an approximate half-life of 4300 years,1 that is, after this time 50 per cent of the original vocabulary will remain. After two half-lives (in this case 8530 years) 25 per cent remains, after three half-lives 12.5 per cent, etc. (Obviously, new elements may be added, but they do not count in these calculations.)

It turns out that there is great variation in the stability of individual lexical items (or cross-linguistically, of sets of synonymous lexical items). This is the case even within Swadesh’s list, in spite of the fact that the items there were chosen for their stability (Dyen et al. (1992)). What I want to do here is to establish, in a rough way, the upper and lower boundaries of this variation among high-frequency vocabulary items. This will provide us with a benchmark when assessing the stability of grammatical phenomena.

As a concrete example, consider the words for ‘three’ and ‘girl’ in some Romance languages (main source: Malherbe & Rosenberg (1996)). Starting with the words for ‘three’, we can see that they are all cognates — after 2000 years, no Romance language has replaced the Latin tres by another item. In fact, the word is probably at least three times as old, as it is supposed to have been inherited from Proto-Indo-European, and it is preserved in virtually all modern Indo-European languages. To judge from these facts (which admittedly may not necessarily be extendable to other languages), words for ‘three’ are extremely stable — to the extent that it is quite difficult to assign a value to their stability. It may well be much higher than the normal 4300-year half-life for the words in the Swadesh lists. Let us however not try to give an exact figure here but rather somewhat arbitrarily postulate 6000 years as an upper limit for the half-life of stable lexical items.
1. The Thinkquest web site offers a convenient calculator for computing these values (http://library.thinkquest.org/11771/english/hi/math/calcs/halflife.html).
(175)
Language     ‘three’   ‘girl’
Latin        tres      puella
Asturian     tres      moza
Catalan      tres      minyona
Corsican     trè       giuvanotta
French       trois     (jeune) fille
Galician     tres      rapaza
Gascon       tres      gojata
Italian      tre       ragazza; fanciulla
Portuguese   três      menina; rapariga
Provencal    tres      chato
Romanian     trei      fată
Romansh      trais     giuvna
Sardinian    tres      pitzinna
Sicilian     tri       picciotta
Spanish      tres      muchacha
Venetian     tri       fìa; tóxa
Walloon      treus     båcele
The words for ‘girl’ are very different. We can see that after 2000 years, all the daughter languages have replaced the Latin puella. In addition, most of the words are clearly unrelated to each other, so there must have been many separate replacement events. This indicates a replacement rate that is much higher than the one Swadesh assumes for his lists. The situation found in the Romance languages is mirrored by a similar diversity in the Germanic languages. In German dialects, at least nine different non-cognate words are found (König (1978: 167)), not counting different diminutive formations of the same root as in Mädchen and Mädel. In a parallel corpus of 35 Swedish dialects, there were eight different non-cognate translations of ‘girl’ in one and the same sentence. (Words for ‘boy’ are similarly diverse.) Even so, this does not demonstrate whether the situation is generalizable to other language families. But since we are only trying to establish the limits of variation here, let us consider the consequences of hypothetically assuming that (175) represents the normal situation. In that case, we have a phenomenon that is unstable enough for the average retention rate to be less than 1/15 after two thousand years. This means that the maximal half-life of such a phenomenon is 512 years. Even if we could certainly find more extreme cases of instability — consider for instance slang words such as cool and hip, which are often replaced more than once in a generation — a half-life of 500 years may be a suitable lower limit for the stability in the lexicon. The normal range of variation would thus be between 500 and 6000 years, and this will be our yardstick.
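The arithmetic behind these limits is easy to make explicit. The following lines (a minimal sketch in Python; the function names are my own) reproduce both the roughly 4300-year half-life implied by Swadesh’s replacement rate and the 512-year lower limit derived from the ‘girl’ data:

    import math

    def half_life_from_rate(replacement_rate, period_years):
        # Half-life implied by a constant replacement rate over a given period.
        retention = 1 - replacement_rate
        return period_years * math.log(0.5) / math.log(retention)

    def half_life_from_retention(retained_fraction, elapsed_years):
        # Half-life implied by the fraction of items retained after a given time.
        return elapsed_years * math.log(0.5) / math.log(retained_fraction)

    # Swadesh lists: 15 per cent replaced per millennium.
    print(round(half_life_from_rate(0.15, 1000)))        # ~4265 years

    # Words for 'girl': retention of at most 1/15 after two millennia.
    print(round(half_life_from_retention(1/15, 2000)))   # ~512 years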
Kings and languages

To appreciate the figures given in this section, it may be useful to compare the stability of languages to that of some other social phenomena. Consider for instance monarchy, a prime example of a non-genetically inherited institution, and the 14 independent states (some of them “double monarchies”) ruled by kings or emperors in Europe in 1900 (this excludes the kingdoms that were part of the German Reich). If we go back 200 years, we find that only six of these existed as monarchies in 1700, and if we instead go forward to the year 2000, we again find only six of them left. (There are actually seven monarchies in Europe today, but this is because the union between Sweden and Norway split up in 1905.) Basing the calculation on the 20th century only, we obtain a half-life of 75 years. The figure may be higher if we look at a longer period but we will hardly reach the lower limit of what is assumed in the main text to be the normal range of variation for lexical items (500 to 6000 years). In other words, languages, or at least linguistic elements, are quite stable phenomena as social phenomena go. Some elements of languages, such as the most stable lexical items and phenomena like ablaut in Semitic, are probably among the oldest traceable and still current cultural phenomena. The only serious competitors are general technological innovations such as agriculture. More specific elements of human culture such as religions, calendars etc. seldom go much further back than 2,000–3,000 years.
The above discussion of the range of variation in the stability of vocabulary items was intended as a background to the question which is more directly pertinent to the main topic of this book, viz. the probability of grammatical change and the stability of grammatical phenomena. Here, the problems present themselves slightly differently. The birth and death of lexical items are (at least in simple cases) events of the same character: one lexeme is replaced by another. Grammaticalization, on the other hand, is a multi-stage process, and the appearance of a grammatical item is not equivalent to its disappearance. In his discussion of the dependence of the frequency of grammatical phenomena on language change, Greenberg (1978b) distinguishes two main factors: the “probability of origin from other states” and “stability”. In other words: for a certain type of grammatical item, say a future auxiliary construction, we have to ask at least the following two different questions: (i) what is the probability that it will come into existence? (ii) what is the probability that it will disappear (in one way or another) once it has appeared? For the frequency of the item in question to be constant in the world’s languages, however, these two probabilities must more or less balance each other.2
2. This is analogous to the case of genes that condition hereditary diseases: the increase in mortality before successful reproduction (which eliminates the genes from the gene pool) must be balanced by the rate of mutations that re-introduce the gene.
Rather than looking for the half-life of a phenomenon, we must look for the chance that a certain type of change will happen in a language during a given time span. How could this be done? Let the Romance languages again serve as an example of a language family. At the synchronic level, it presents itself to us simply as a set of languages. The first thing we want to know is how many generations of language development lie between these languages and their assumed common origin. To this end, we try to reconstruct the Stammbaum of the family. At this point, we are not interested in the actual details of the tree and we can therefore make the simplifying assumption that the tree is a perfectly binary one. Here is for instance such a Stammbaum for a family of 16, which we shall (somewhat arbitrarily) assume to be the size of the Romance family:

(176) [A perfectly binary family tree with 16 languages at the leaves. Figure not reproduced.]
The first step in the calculation of the number of generations is to figure out how many arcs — that is, connections between adjacent nodes — the tree contains. This turns out to be a relatively simple task: the general formula is (2 × n) − 2, where n is the number of terminal nodes, i.e. languages at the time of observation. A 16-language family thus has 30 arcs. For all practical purposes, the number of arcs can be set equal to twice the number of languages. We now need to know how many generations each arc represents. Obviously, this cannot be determined with any precision: what we need and can hope for is a reasonable estimate. Assuming that we are still speaking of the Romance family, its total age, that is, the time elapsed from the time when the proto-language was spoken, is two millennia. The distance from the root of the tree to the leaves being 4 arcs, each arc should correspond to 500 years. If one generation is estimated at 25 years, we will then have 20 generations per arc, yielding 600 generations for our somewhat idealized model of Romance.
Suppose now we have some specific type of change, for instance the replacement of the word for ‘girl’ that we looked at above. In the list of 16 Romance languages above, we can see that at least 13 different (in the sense of non-cognate) words are used, meaning that the change under discussion has taken place at least 13 times during the history of the Romance languages. The probability of the event taking place during any one generation would thus be 13/600 ≈ 0.02, or 1 in 50 generations. Actually, this represents the absolute minimum and is almost certainly much too low, since it makes it rather unlikely that the Latin word would have been replaced in all the daughter languages, and with such varying results. But the minimal number of events needed for a certain result to be obtained gives us a rough indicator of the frequency of the type of event in question.
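The whole calculation can be condensed into a short script (a sketch under the idealizing assumptions just made; all figures are taken from the text):

    import math

    # Idealized binary Stammbaum for a 16-language family (the Romance model).
    n_languages = 16
    arcs = 2 * n_languages - 2                     # (2 x n) - 2 = 30 arcs
    depth_in_arcs = int(math.log2(n_languages))    # 4 arcs from root to leaf
    years_per_arc = 2000 / depth_in_arcs           # family age 2000 years -> 500
    generations_per_arc = years_per_arc / 25       # one generation = 25 years -> 20
    total_generations = arcs * generations_per_arc # 600

    # At least 13 independent replacements of the word for 'girl':
    print(13 / total_generations)                  # ~0.02, about 1 in 50 generations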
Let us now return to grammatical change and look at a concrete case as an illustration. A suitable point of departure is the following statement by Leonard Bloomfield (1933: 415):

Merging of two words into one is excessively rare; the best-known instance is the origin of the future tense-forms in the Romance languages from phrases of infinitive plus ‘have’: Latin amare habeo [aˈmaːre ˈhabeoː] ‘I have to, am to love’ > French aimerai [ɛmre] ‘(I) shall love’; Latin amare habet [aˈmaːre ˈhabet] ‘he has to, is to love’ > French aimera [ɛmra] ‘(he) will love,’ and so on. This development must have taken place under very unusual conditions; above all, we must remember that Latin and Romance have a complicated set of verb-inflections which served as a model for one-word tense-forms.3

Is Bloomfield right — is univerbation an excessively rare event? The case he mentions is of course one of the standard examples of grammaticalization. It also happens to be a phenomenon that we have reasonably reliable data about. In (177), adapted from Dahl (2000a), I try to show the distribution of the major “gram families” that are used for future time reference in the languages of Europe. A gram family was defined in Dahl (2000e: 7) as a set of language-specific grams that can be hypothesized to have arisen through one and the same historical process — either by being inherited from a common parent language or as a result of historical contact. By definition, then, each gram family (if correctly identified) represents one independent initial event. Restricting ourselves to the Indo-European phylum, the number of languages involved is 38, which would mean a tree with 76 arcs, corresponding to about 1200 generations. The question is now, how often has an event which is equivalent to the one Bloomfield describes taken place?
3. Bloomfield seems to want to imply that univerbation is made possible or at least more probable by the previous existence of similarly structured inflections in the language. In fact, we cannot exclude that such an influence exists, but it can hardly be a necessary condition for univerbation to take place: that would mean that an isolating language would be doomed to be isolating for ever.
It turns out that there are four “gram families” in Europe which may be regarded as clear cases of inflectional futures.4 They are marked in grey in (177). Two of them, those found in the Celtic and the Baltic languages, have prehistoric origins, and how they arose is not clear. The two others are the Romance inflectional future that the French example belongs to and an almost exactly analogous formation in Ukrainian (which, however, is restricted to imperfective verbs). Let us thus say that an inflectional future has arisen 2–4 times in the Indo-European languages of Europe, that is, with an estimated frequency of one event in 300–600 generations, which yields a per-generation probability of roughly 0.002–0.003.

(177) Future “gram families” in Europe
[Map of Europe locating the languages of the sample (Nor, Ice, Swe, Fin, Dan, Eng, Fr, Spn, Rus, Ukr, etc.) and outlining the areas of the major future “gram families”; the four inflectional futures (Celtic, Baltic, Romance, Ukrainian) are marked in grey. Figure not reproduced.]
A comparison with other cases of univerbation suggests that this is indeed the correct order of magnitude. If we look at the Scandinavian languages, we find two reasonably well-documented cases of univerbation: the development of morphological middles/passives out of reflexives and the rise of suffixal definite articles. Looking around, we find that each of these developments has exactly one clear counterpart in the rest of Europe: the East Slavic suffixed reflexives, which have developed various middle and passive uses,5 and the suffixed articles in the Balkan languages. Irrespective of whether these are independent events or not, we obtain an incidence which is at the lower end of the one assumed for inflectional futures. So, is Bloomfield’s claim confirmed? Well, it really depends on what is meant by “rare”.
4. The use of the present of perfective verbs in e.g. Russian is a borderline case: it is clear, though, that it has not arisen by univerbation in the way the French future has.
5. The cliticized reflexives of the other Slavic languages, like those of Romance, are excluded here as they do not represent univerbation (creation of complex words) in the strict sense.
If, for instance, we consider the total of all grammatical constructions in a relatively large set of languages over a relatively long period of time — say, all European languages since the introduction of writing — it is highly probable that we will be able to observe a number of univerbation events. If we, on the other hand, consider one specific construction during one single generation, we can see that the probability that univerbation will take place is quite low — perhaps something like 1 in 500 (comparable to the probability of replacing a maximally stable lexeme such as ‘two’ or ‘heart’). As I have said earlier, what is important here is not the exact figure but the order of magnitude. An event with such a low probability is in a fairly concrete sense a “rare” one. This is relevant above all with respect to the choice of explanations as to why the event in question — in this case univerbation — takes place. The problem is that explanations of rare events are often “too good”, that is, the mechanisms that they propose are such that one would predict that the event would occur in the majority or at least in a large proportion of all possible cases. On the other hand, Bloomfield’s use of the intensifier “excessively” may still be questioned. Someone who claims that a certain type of event is “excessively rare” should not only have fairly good empirical evidence for it but should also make explicit what the criteria are.

So far, we have been looking at inflectional patterns. What about non-inflectional ones? Returning to the figure above, we can see that the number of non-inflectional gram families is considerably larger than that of inflectional ones — eleven such gram families are noted in the figure. In addition, practically all of these have a recent history, which is reflected in the fact that their diachronic sources are still transparent in most cases: at most, they go back 1000–1500 years. In other words, the events that have conditioned the existence of the present-day non-inflectional future patterns in European Indo-European languages have all taken place during a relatively short period of time. A rough estimate gives a frequency of 1 in 100 generations — considerably more frequent than the inflectional futures, though still less frequent than the replacement of the word for ‘girl’ in the Romance languages. This is consistent with the claim that the early steps in a grammaticalization process are more easily taken than the later ones. Obviously, the data presented here are much too limited to draw any general conclusions, but the idea certainly also fits the general impression we have as linguists.

Let us now proceed to the other question formulated above: Given that it has appeared, what is the probability that a specific grammatical item will disappear (in one way or another)? One way of answering this is to look at the ages of existing grammatical items. I have already mentioned the fact that non-inflectionally expressed future grams can usually be shown to be relatively young, whereas the inflectional futures in Europe are partly of pre-historic origin. This suggests that there may be a greater volatility in the earlier stages of grammatical maturation. For many grammatical phenomena, it may well be that they either disappear (a majority) or make it to a more mature stage (a minority).
Recall at this point the “hoop net” model presented in 7.4. Also, as it turns out, mature patterns, such as those manifested in inflectional morphology, can be astonishingly stable. Consider, for instance, the Germanic languages. It is probably no great exaggeration to say that the reconstructed inflectional system of Proto-Germanic or the attested system of Gothic is after two millennia still preserved in its general traits in the more conservative present-day Germanic languages such as (High) German, Icelandic and Dalecarlian in Sweden. This is obscured by the fact that a relatively radical reduction of the system has taken place in the languages spoken along the coasts of the North Sea and the Baltic (English, Dutch, Low German, Standard Central Scandinavian).6 But what is striking is that even in those languages parts of the system are quite well preserved.

Let us have a closer look at one particular example, the Germanic ablaut system. Ablaut (also called “apophony”) is the traditional German-derived term for vowel alternations such as those found in “strong” verbs such as drink : drank : drunk. Ablaut is exploited in the morphology of many different languages, perhaps most often in the tense–aspect systems. The Germanic strong verb system is represented in all Germanic languages, with the possible exception of Afrikaans.7 Although the particular implementation of the ablaut system in the verb morphology is unique to Germanic, ablaut as a general phenomenon goes back much further than Proto-Germanic and is indeed reflected in one way or the other in many branches of Indo-European. Since the earliest written sources of Germanic go back almost two millennia, it is possible to make some fairly specific claims about the stability of the system.

In English grammar, strong verbs are often referred to as “irregular”, which is probably due to the system being more reduced there than in most other Germanic languages. But even elsewhere they are often seen as a kind of anomaly in the language, which is on its way out — the picture one tends to get is one of a steady stream of strong verbs becoming weak, or regular. Indeed, it seems that the set of strong verbs is steadily shrinking: for instance, according to Augst (1975: 254), the number of strong verbs decreased from 349 in Old High German (750–1050 C. E.) to 169 in Modern German, a loss of 51 per cent.
6. I use this newly coined term to refer to Standard Danish, Standard Norwegian Bokmål and Standard Swedish and various regional varieties of these languages, as opposed to Icelandic, Faroese, Standard Norwegian Nynorsk and most spoken Scandinavian dialects in Norway, northern Sweden, and east of the Baltic.
7. This claim presupposes that creole languages with Germanic lexifiers do not count as Germanic languages.
8. I find this figure (349 strong verbs in Old High German) slightly curious. As far as I can see, Augst provides neither a source for it, nor a list of verbs. Bittner (1996: 135) quotes Augst as an authority, but his statistics of different types of strong verbs in Old High German (p. 133) rely on Hempen (1988) and are based on 291 verbs only.
These figures have to be seen in perspective, however, and to that end we shall have a closer look at the developments in question.

The Germanic strong verbs are traditionally divided into ablaut classes, according to the particular patterns of stem vowel alternation. Grammars usually list seven classes. The seventh class involves reduplication in addition to ablaut, and has usually not survived as a separate pattern. The six pure ablaut classes, on the other hand, are more stable. Below, examples are given from Gothic and Swedish. For each verb, the infinitive, preterite and past participle are given.

(178) Ablaut classes in Gothic and Swedish

Class  Gothic                         Swedish                      Gloss
I      skeinan – skain – skinans     skina – sken – (skinit)9     ‘shine’
II     liugan – laug – lugans        ljuga – ljög – ljugen        ‘lie’
III    brinnan – brann – brunnans    brinna – brann – brunnen     ‘burn’
IV     baíran – bar – baúrans        bära – bar – buren           ‘carry, bear’
V      giban – gaf – gibans          giva – gav – given           ‘give’
VI     faran – fôr – farans          fara – for – faren           ‘go, travel’
As can be seen from (178), the ablaut patterns are almost identical in Gothic and Swedish, disregarding some relatively minor phonological changes. The attested form of Gothic derives from the 4th century C. E., but the presumed common ancestor of Gothic and Swedish was spoken 500–1000 years earlier. To simplify things, let us say that we are dealing with a time distance of two millennia. We may thus claim that at the level of ablaut patterns, the system has been quite stable during this time. For individual verbs, though, the situation is a bit different. As I mentioned earlier, the number of strong verbs seems to have dropped quite considerably, at least if we assume that Old High German reflects the original situation.10 But it is worth looking a bit more closely at the statistics. There are at least two important observations.
9. Skina has no past participle in Swedish, so I instead give the supine form, which is historically the neuter form of the past participle.
10. It is not so easy to establish how many strong verbs there were in Proto-Germanic. In Gothic, there are only a little more than a hundred attested strong verbs, but that may at least partly be due to the limited availability of texts. It is not impossible that there was actually an initial increase, in view of the considerable number of strong verbs that are not traceable back to Proto-Germanic.
One is that, as Augst emphasizes, the losses have not been evenly spread over time. In Middle High German, there were almost as many strong verbs as in Old High German (339 according to Augst) — at this period, the losses were largely compensated by gains (there were in fact 39 new strong verbs in Middle High German), so the real decrease in number took place between Middle High German and Modern German. The variation over time suggests that large-scale losses depend on some “crisis situation” for the language, whether this is caused by internal or external factors, or a combination of both.

The second observation is that the majority of the losses are not due to regularization, that is, strong verbs becoming weak, but rather to the disappearance of the lexical items in question from the language. Thus, between OHG and MHG, 8 verbs became weak and 41 were lost altogether, and between MHG and Modern German 54 became weak and 119 dropped out of the language. After a millennium, then, about a third of the surviving strong verbs have been regularized. This may seem a high number, but it would still yield a half-life for the system of 1700 years, were it not for the general loss of lexical items, which has reduced the half-life to 1000 years. It might be argued that the lost verbs should also count, since they might disappear precisely because they are strong (and presumably problematic). However, Augst (1975: 254) also cites a general retention rate of verbs from Old High German to Modern German of 48 per cent, which should be compared to 57 per cent for the strong verbs. The strong verbs thus have been better preserved than average, which is to be expected, given their higher frequency.

The figures from German show that there may be great variation over time as to the stability of the system. We can also see variation both between languages and between different parts of the system. Thus, the ablaut classes in (178) are not equally stable. Let us look at a stable class in Swedish, Class I, with about 30 members. To get a maximally long time perspective, I have compared these verbs to the set of Class I verbs shared between Old High German and Gothic, that is, verbs that are likely to have been inherited from Proto-Germanic. There were 19 such verbs. It turns out that of these, 14 also have cognates in Swedish, of which 9 are still members of Class I (bita ‘bite’, driva ‘drift’, gripa ‘seize’, lida ‘suffer’, niga ‘curtsey’, skina ‘shine’, skita ‘shit’, smita ‘escape’, stiga ‘rise’) and 5 (bida ‘wait’, leja ‘hire’, snida ‘cut’, spy ‘spew, vomit’, te (sig) ‘appear’) have become weak verbs (probably only one of these latter 5, spy ‘spew, vomit,’ is particularly frequent in today’s spoken language). Five of the roots common to Gothic and Old High German (*risan ‘rise’, *liban ‘remain’, *skidan ‘separate’, *þihan ‘thrive’, *widan ‘bind’) do not have direct cognates in Swedish. In other words, of the original group, about 50 per cent are retained as strong verbs, and the rest are evenly divided between renegades (those having become weak) and drop-outs (those having disappeared). Probably the method by which I have chosen these verbs makes them generally more stable than the average strong verb.
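Re-using the half-life formula from 11.2, both figures just cited can be checked directly (a minimal sketch; the retention fractions follow Augst’s counts as quoted above):

    import math

    def half_life_from_retention(retained_fraction, elapsed_years):
        return elapsed_years * math.log(0.5) / math.log(retained_fraction)

    # Regularization alone: about two thirds of the surviving strong verbs
    # remain strong after roughly a millennium.
    print(round(half_life_from_retention(2/3, 1000)))      # ~1710 years

    # Regularization plus lexical loss: 169 of 349 OHG strong verbs remain.
    print(round(half_life_from_retention(169/349, 1000)))  # ~960 years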
As it turns out, however, far from all Swedish Class I verbs are derived from Proto-Germanic. In addition to the 9 verbs already accounted for, there are at most 7 other verbs for which this is probable. Two verbs (svika ‘fail’, svida ‘sting’) are not attested outside Scandinavia but are old enough to plausibly derive from Common Scandinavian. The rest — 12 verbs — are likely later additions to Class I. In some cases, they were borrowed from Latin or Romance (skriva ‘write’, pipa ‘squeal’) or from Middle Low German (bli ‘become’, knipa ‘pinch’) but the rest are verbs that have either arisen later in Swedish or verbs that previously belonged to the weak conjugations, having become strong relatively recently (skrika ‘scream’, snika ‘sneak’, kvida ‘wail’, vina ‘whine’, glida ‘slide’, fisa ‘fart’, kliva ‘stride’, strida ‘fight’).

A similar picture obtains for Class I verbs in German. Of the 19 verbs common to Gothic and Old High German, 13 still belong to Class I — in a few cases with added prefixes only — while 3 have become weak and 3 have disappeared from the language. Of the other verbs in Modern German Class I, 5 are probably from Proto-Germanic, 7 are not known outside West Germanic, 1 has been transferred from Class VII and 6 are new.

English, however, is rather different. Of the 19 verbs common to Gothic and Old High German, only 6 have Modern English cognates (drive, smite, shine, bide, rise, bite) that are still conjugated according to the original pattern (e.g. drive – drove – driven). 2 verbs (grip, spew) have been regularized, one (shit) vacillates between several different inflectional patterns and 10 have no cognates in Modern English. In addition to the six verbs already mentioned, there are only three verbs in English that could claim Class I membership — ride, which is probably Proto-Germanic, thrive (a likely loan from Scandinavian) and the rather marginal shrive, which like schreiben in German and skriva in Swedish derives from Latin scribere. In English, then, the Germanic Class I ablaut pattern has become rather marginalized. But again, the main factor behind this is not regularizing, but rather a general renewal of the vocabulary.

As noted above, Class I belongs to the more stable of the six ablaut classes. What happened to the others? Let us look at Class VI, which has some surviving members but really seems to have virtually collapsed in Swedish. Applying the same approach again, we find 17 Class VI verbs that were common to Gothic and Old High German. 14 of these have cognates in Modern Swedish. Here, a fairly large group — 9 verbs — have migrated to the weak conjugations, although only two of them (baka ‘bake’, skada ‘damage’) have joined the productive (1st) conjugation. Seven verbs still preserve ablaut alternations, at least in the past tense, but various phonological changes have obscured the original pattern and in effect made the inflection of several of these verbs less predictable (the forms quoted here are the infinitive, the past tense and the supine):11
(179) Swedish
dra: drog: dragit       ‘pull’
ta: tog: tagit          ‘take’
fara: for: farit        ‘go, travel’
svära: svor: svurit     ‘swear’
le: log: lett           ‘smile’
stå: stod: stått        ‘stand’
slå: slog: slagit       ‘hit’
This class, then, has split up into one group that has become more regular, and another that has become more irregular. What we can see, then, is that the strong verb system in languages such as German and Swedish has been slimmed but is still a central part of the verbal morphology. Patterns of this kind lose members through at least three kinds of processes: regularization (i.e. transfer to more productive patterns), irregularization (primarily through the effect of phonetic reduction processes) and lexical loss (items are eliminated from the language altogether). Although regularization tends to be considered the main factor here — and Naturalness Theory certainly implies that it should be — we see that the other types are often equally or more important. As it seems, the regularizing tendency may be as low as one per cent per generation, which is in the same range as, or even lower than, the general lexical replacement rate among verbs. In my opinion, it may be seen as natural “leakage” rather than as due to an active force. But clearly a pattern or set of patterns may be subject to a “crisis”, leading to a breakdown where none or only residual members survive. Certainly many factors can contribute to such a crisis (see for instance discussions in Augst (1975), Bittner (1996) and Wurzel (1987)), but, as is often the case, too little emphasis may have been given to external factors, that is, to the effects of suboptimal transmission.

The Germanic ablaut system is not unique in its stability. A rather extreme example is found in the Afro-Asiatic phylum, which has the longest documented history of all language families. The Semitic languages are famous for their verb morphology, involving three-consonant roots, consonant gradations and bisyllabic vowel alternation patterns which might be labelled “hyper-ablaut” — undoubtedly a highly mature system. However, parallels are found also in other branches of Afro-Asiatic.
11. It is possible that the past forms ending in -og (drog, tog, log) should be seen as representing what Bybee & Slobin (1982) have called a product-oriented schema, and that this schema is attracting new members from other patterns, such as colloquial stog of stå ‘stand’ (rather than the standard stod) and dog of dö ‘die’.
The similarities in form and function are sufficient for a common origin in Proto-Afro-Asiatic to have been proposed (Greenberg (1952)). Thus, in four branches of Afro-Asiatic, we find internal ablaut to -a- and medial consonant gemination characterizing imperfective verb forms, as in the following table (Hayward (2000: 91)):

(180) Ablaut patterns in Afro-Asiatic

                                           imperfective   perfective
Tuareg (Berber, Algeria)                   iffaγ          ifeγ
Migaama (Chadic, Chad)                     ‘àpàllá        ‘àpìlé
Akkadian (Semitic, ancient Mesopotamia)    ikabbit        ikbit
Beja (Cushitic, Sudan)                     isdabil        iidbil
Tigrinya (Semitic, Ethiopia)               y-äsebber      säbärä
Now, Proto-Afro-Asiatic is of “extreme antiquity” (Hayward (2000: 74)). Estimates such as that of Diakonoff (1988), who puts it before 8,000 B. C. E., would make it roughly contemporaneous with the introduction of agriculture. In order to establish the venerable age of the Afro-Asiatic ablaut system, we do not need any speculative hypotheses, however, since we know for certain that it was operative in Akkadian, with preserved texts from 2,500 B. C. E., and even the rather cautious hypothesis that the system goes back to Proto-Semitic gives us an order of magnitude of at least five millennia.

A major difference between the Semitic and the Germanic ablaut systems is the role they play in the grammatical system of the language in general. For instance, in various forms of Arabic, the combined system of consonant and vowel stem alternations illustrated above is used generally for verbs (including newly-coined ones), extensively and productively in word formation and for many nouns as well, in the formation of plurals (a pattern which has also been claimed to be of common Afro-Asiatic origin). Arguably, then, the Semitic system has become “entrenched” in the respective languages to a much higher degree than the strong verbs of Germanic. Its loss would imply a major upheaval in the language, an upheaval which, characteristically, is only documented in Arabic-based pidgins and creoles.

Ablaut patterns have some specific properties that may contribute to their astonishing stability. Although ablaut is without any doubt a non-linear phenomenon, it is realized segmentally rather than prosodically; moreover, it is realized in a very salient way, as an alternation of stem vowels, which tend to have full stress, and exploiting basic distinctions found in almost all vowel systems. It is plausible that this makes ablaut patterns less likely to be subject to reduction processes, and it may also make them less sensitive to suboptimal transmission effects (at least in less radical cases — we may note that creoles based on Arabic or Germanic languages do not usually preserve the ablaut systems).
Another observation that was made early on in linguistics but whose significance is somewhat more difficult to determine is that vowel alternations analogous to those of ablaut patterns are found in various expressive formations such as onomatopoeia and ideophones, e.g. ding dang dong, zig-zag, which hardly presuppose a long prehistory and cannot therefore be said to be mature in any sense. In those cases where it is possible to have an opinion about the historical sources of ablaut patterns, such as Indo-European, they generally seem to derive from prosodic alternations. For instance, stressed and unstressed tokens of a certain stem morpheme may be subject to different phonological developments and thus be differentiated as to vowel quality. If, for one reason or another, the stress distinction disappears, the differences in vowel quality may remain. If we assume that prosodic morphological alternations generally arise from segmentally expressed patterns, we obtain a seemingly circular development, from segmental to prosodic expression and back again to segmental expression. However, this does not result in a return to the point of departure, since ablaut, as we have already noted, has a clear non-linear character. As a result of the process of maturation, the system has landed on a higher level, but in a stable state.

Impressionistically, later stages of maturation processes seem to be more stable than earlier ones. I noted above that periphrastic patterns arise more often than inflectional ones. But they may also be more short-lived in general. Consider e.g. the perfect, a tense–aspect “gram type” that is typically expressed periphrastically, as in the English have + Past Participle pattern (Dahl (1985), Bybee & Dahl (1989), Bybee et al. (1994)). In his discussion of the perfect, Lindstedt (2000) refers to Greenberg (1978b: 75–76), who makes the point that high propensity to arise and high propensity to disappear are not necessarily connected. Thus, even if we might expect “that certain phenomena are wide-spread in language because the ways they can arise are frequent and their stability, once they occur, is high”, and that rare phenomena both rarely arise and are rarely likely to disappear, we could also find frequently arising but unstable phenomena (Greenberg’s example is vowel nasalization) and rarely arising and stable phenomena (Greenberg mentions vowel harmony). The perfect, as Lindstedt points out, is of the former kind: “a gram type that is frequent, that is to say, likely to appear in different languages, but unstable, as it often tends to be lost.” One reason why the perfect is unstable, he says, is that it tends to develop into something else. Perfects may thus become hodiernal pasts, general past tenses, perfectives, or evidentials. In contrast, past tenses and perfectives rarely develop into anything else: they seem to be, in a sense, the stable final point of the development.

Both mature and less mature patterns are constantly threatened by competitors, but less mature patterns may actually be at greater risk, since they are in a zone where newcomers are more likely. As can be seen from (177), the areas of the future “gram families” overlap quite considerably.
In English, for instance, shall Verb, will Verb and be going to Verb are three major periphrastic patterns12 for expressing future time reference, with considerable instability at the borders. It is also likely that many incipient patterns never really get off the ground and disappear after a while without being noticed by any grammarians.
11.3 Do languages become more complex over time?

The question I wish to address here is whether there is a sense in which there is a trend towards greater complexity in language change. This issue is obviously parallel to the corresponding question with respect to biological evolution. Although evolution at first glance seems to create more complex creatures — after all, humans are more complex than amoebae — it has been argued that this may be illusory in the following sense. Suppose we have a simple computer program that, starting out with a set of strings that are all one character long, randomly either adds or deletes a character, over and over again. In the beginning, all strings will be very short, but after a while, longer and longer strings will show up. This is not because there is a trend towards longer strings but because it takes some time for them to develop, given that we are starting out with maximally short strings. The same might be true of language change: maybe languages do not tend towards greater complexity, it is just that after a while, some of them will have become complex, while others remain simple. The assumption is then that complicating and simplifying changes are equally probable. But is this compatible with the facts? If you actually run the program just described, and plot the length of the strings on a graph, results will tend to have a peak close to the minimal value and a fairly long slope to the right:
12. The actual competition in the future time reference domain is even greater. In addition to zero marking, i.e. the simple present tense, there are also (i) the is to Verb pattern; (ii) the present progressive; and (iii) will + progressive, which often has a non-compositional interpretation, as in We will be serving dinner at five o’clock (which means that dinner will begin at five, not that it will be going on at that point).
(181) [Distribution of string lengths produced by the program: a peak close to the minimal value with a long slope to the right. Figure not reproduced.]
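The program described above is simple to reconstruct; the following minimal sketch (my own, assuming that deletion is simply blocked at length zero) produces a distribution of exactly this shape:

    import random
    from collections import Counter

    def simulate(n_strings=10_000, steps=200):
        # Each string starts at length 1; each step adds or deletes one
        # character with equal probability (deletion skipped at length 0).
        lengths = [1] * n_strings
        for _ in range(steps):
            for i, length in enumerate(lengths):
                if random.random() < 0.5:
                    lengths[i] = length + 1
                elif length > 0:
                    lengths[i] = length - 1
        return Counter(lengths)

    # The histogram peaks near the minimum, with a long tail to the right.
    for length, count in sorted(simulate().items())[:15]:
        print(length, count)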
In other words, in such a development, most values would be close to the initial value. Running the program for a longer time will have the effect of extending the slope farther to the right, but will not change the general form of the curve. In terms of linguistic complexity, this would mean that if all languages start out with zero complexity and simplifying and complexifying changes are equally probable, most languages will necessarily stay at a complexity level close to zero.

As was noted above, McWhorter (2001a) argues that creole languages are maximally simple, and that all other languages have developed away from the creole prototype by adding various kinds of complexities. In addition, he argues that simplification and complication are both active in “older” (i.e. non-creole) languages, and that because of this and the fact that these processes affect only a subset of the grammar, “an older language retains at all times a degree of complexity alongside the simplifications it is undergoing” — i.e. no “older” language can ever return to the low level of complexity found in a creole. Although he does not give us a concrete idea of the distribution of complexity in quantitative terms, the claims just quoted seem to imply a picture which is rather different from what is shown in (181) and perhaps rather like the following:

(182) [A distribution of complexity whose peak lies well above zero. Figure not reproduced.]
Assuming that all complexity in language above the level found in prototypical creole languages is historical in origin, that is, that all extant languages derive from an ancestor which was maximally simple, it does not appear possible to arrive at such a distribution without also assuming that complicating changes are more probable than
simplifying ones, at least up to the average level of complexity in languages. There is in fact at least some concrete empirical support for the claim that linguistic complexity is distributed more like (182) than like (181). One of the structural features studied by Nichols (1992) in a sample of 175 languages is what she calls “morphological complexity”. In fact, the measure is based only on a few types of morphological markings, viz. the marking of subjects and objects (including head-marking such as agreement markings on the verb, dependent marking such as case markings etc. on the NPs themselves and “detached marking” such as clitics). The distribution of the complexity parameter in her sample looks like this:
(183) [Histogram omitted: x-axis ‘complexity’, y-axis ‘number of languages’; the distribution is roughly bell-shaped]

Thus, Nichols’ data suggest that at least as far as morphological complexity goes, the distribution looks rather like the classical bell curve, with a clear gap between zero and the lowest attested values, which is also reasonably close to the picture suggested by McWhorter. (Of course, Nichols’ data are also compatible with the alternative hypothesis that languages started their development at a relatively high level of complexity and have stayed there ever since.) Further support for the idea that complicating changes are more probable than simplifying ones comes from a somewhat unexpected quarter, viz. Naturalness Theory (Natural Morphology). According to the tenets of this school (see also Section 6.3), “natural” or “unmarked” structures are “preferred” in languages, which among other things means that “languages tend to change from what is more marked to what is less marked” (e.g. Dressler (1987a: 14)). To the extent that markedness can be identified with complexity, this would appear to imply that languages become simpler over time. That languages are not in general maximally natural/unmarked is explained by adherents of Naturalness Theory by appealing to conflicts between different types of naturalness and its application to different components of language (e.g. Wurzel (1994: 2593)). This implies a kind of “balancing” theory, which, however, seems to contradict the principle of a movement from
marked to unmarked. A charitable interpretation is that no change has an increase in markedness as a direct result but that such increases always arise as side-effects of some other markedness-reducing processes. However, it is harder to see the above-quoted preference principle as compatible with a forthright claim that complicating changes are the most common ones in languages. In what was, sadly, to become his last publication, one of the leading proponents of Naturalness Theory, Wolfgang Wurzel, on the one hand, opposes McWhorter (2001a) by saying that “morphological complexity in McWhorter’s sense does not continually increase in normal language history, but effectively returns to the original level”¹³ (Wurzel (2001: 384)), but on the other suggests that “reducing inflectional complexity takes much more time than building it up”. He supports the latter claim by the preservation of morphological complexity over five millennia in the history of Indo-European, compared to the faster growth of fusional morphology in Saamic. If we assume (as appears reasonable) that increase and reduction in complexity take place in small steps, the rate at which a language increases or decreases its complexity will depend directly on the probability of each type of change. Wurzel’s claim that reduction of complexity takes more time than building it up is thus tantamount to saying that reduction events occur more rarely. Wurzel does not himself use the term “cycle” in this connection, although he refers to Gabelentz’ idea of a “typological spiral”. Other papers in the same volume, such as Hagège (2001) and DeGraff (2001), do state more explicitly that languages undergo “cyclic” changes in complexity, and these claims are seemingly endorsed by McWhorter (2001a: 388). Statements of this kind tend to be somewhat lacking in rigour. It is not always clear whether scholars who say things like “linguistic evolution is cyclic, not linear” (Hagège (2001: 173)) intend them to apply to specific phenomena or to languages as wholes. As is argued in other places in this book, individual linguistic patterns may be said to have “life cycles” in that they undergo identifiable stages in their development, although cyclicity in the narrower sense of recurrence is not necessarily found. At the level of a language as a whole, on the other hand, speaking of cyclicity does not make much sense if it is not assumed that the language moves from one state to another and back again. It is reasonable to assume that the level of complexity of a language may fluctuate to a greater or lesser extent. What is not at all clear is whether the authors who speak of cyclicity want to claim that there is a definite patterning in these fluctuations, like the eleven-year “Solar Cycle” determining sunspot activity, with evenly spaced maxima and minima. The null hypothesis here is that fluctuations are random.

13. The formulation on the same page “If overall complexity does not perforce increase in “normal” language development, but rather on the contrary…” certainly suggests that Wurzel saw simplification rather than complication as the normal case.
As noted in 4.5, some social phenomena (in particular, ones having to do with fashion) exhibit true cyclic behaviour, and it is of course not impossible that grammatical complexity behaves in a similar way. So far, however, there has been little evidence to demonstrate that this is the case, and it seems somewhat far-fetched to assume that languages become analytic because speakers are fed up with syntheticity and vice versa. Some formulations in McWhorter (2001a) and McWhorter (2001b) can be interpreted as making the reverse claim: that there is always a balance between simplification and complexification. If the null hypothesis of random fluctuation is correct, we would expect that sooner or later, some language will “hit the bottom” and attain zero complexity. Admittedly, such an event may be quite unlikely. However, McWhorter claims that it is not only unlikely but “formally impossible” (McWhorter (2001a: 154)) outside of creolization: simplification is always complemented by complexification and therefore “an older language retains at all times a degree of complexity alongside the simplifications it is undergoing”. As already mentioned, McWhorter argues earlier in the same paper (p. 129) against the hypothesis that “overall, languages “balance out” in terms of complexity”. It is hard to see that the arguments he provides there would not equally apply to the hypothesis that simplifications and complexifications must balance out in language change.
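The “hit the bottom” scenario can also be made concrete. Under the null hypothesis, complexity behaves like an unbiased random walk with a floor at zero, and such a walk is certain to reach the floor eventually, though possibly only after a very long time. A minimal sketch (the starting level, cap and seed are again arbitrary illustrative choices of mine):

import random

rng = random.Random(42)

def steps_until_zero(start=50, cap=1_000_000):
    # Unbiased +/-1 random walk starting at `start`; returns the number
    # of steps before it first reaches zero, or `cap` if it has not
    # reached zero within the allotted number of steps.
    level = start
    for step in range(1, cap + 1):
        level += rng.choice((-1, 1))
        if level == 0:
            return step
    return cap

trials = sorted(steps_until_zero() for _ in range(100))
print("median steps to zero:", trials[50])
print("runs hitting zero within the cap:", sum(t < 1_000_000 for t in trials))

Typical runs show most walks reaching zero, but with an extremely skewed waiting-time distribution — in line with the wording above: certain in the long run, yet quite unlikely within any given window of observation.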
11.4 The dependence of language change on external factors

There now seems to be a general consensus among linguists that not only the lexicon but also the grammatical structure of a language may be influenced by contacts with other languages, and even that this influence is systematic enough for there to be a significant correlation between degree of contact and parameters such as morphological complexity. One of the things that astonished me in the comments on McWhorter (2001a) in Linguistic Typology 5–2/3 was how many of the contributors seemed to agree on the latter point. Even one of McWhorter’s most vehement critics, Michel DeGraff, agrees that “… the products of (large-scale) language contact do give the impression that they are, to a certain degree and in certain domains, simpler than their corresponding source languages. For example, overt morphological paradigms (e.g., phonetically-realized inflectional affixes on nouns and verbs) tend to decrease in size, morphological irregularities tend to be filtered out, various sorts of semantic transparency tend to increase, etc.” (DeGraff (2001: 256)). Trudgill (1983: 102) suggests that there are two types of linguistic change, one that “may be relatively ‘natural’, in the sense that they are liable to occur in all linguistic systems, at all times, without external stimulus, because of the inherent nature of linguistic systems themselves”, and another that “may be relatively ‘non-natural’, in the sense that they take place mainly as the result of language contact”.
Among the changes of the first type, Trudgill mentions various phonological changes but also grammatical changes such as “the development of case-endings or of personal inflections on verbs” — in other words, straightforward cases of grammatical maturation. “Non-natural”¹⁴ changes, on the other hand, would include the reduction of grammatical systems such as conjugations, declensions and inflected forms, an increase in the use of adpositional and periphrastic constructions, and the development of fixed word order. These, then, are the changes that would be characteristic in particular of pidginization and of other “high-contact” situations. (Notice that on the whole, Trudgill’s “natural changes” lead to a decrease in “naturalness” in the Naturalness Theory understanding of the term.) The term “high-contact” is in need of some qualification, since the notion of contact may cover many different types of situations and many possible channels of influence. In their influential monograph, Thomason & Kaufman (1988) distinguish two main types of contact-induced change:

– borrowing, i.e. “the incorporation of foreign features into a group’s native language by speakers of that language”;
– interference that results from imperfect group learning during a process of language shift, a major sub-type being what is traditionally called “substratum influence”.

14. I follow Trudgill in consistently putting the terms “natural” and “non-natural” between quotes.
These definitions cover only a subset of all possible results of language contact, however. To start with, Thomason & Kaufman seem to equate “contact-induced change” and “interference”. But “imperfect learning” need not involve any direct influence of the other language. Commonly, what happens is simply that the learner fails to pick up some feature of the target language — this is the “filtering effect” of second-language learning. Moreover, imperfect learning of course appears not only during language shift but also in second-language learning in general, including the early acquisition of a non-dominant language in bilingual situations (for instance, when a child learns a language spoken by an isolated parent or grandparent). The presence of large groups of second-language speakers may thus influence the structure of a language without any language shift taking place. Instead of interference in language shift, we should therefore rather speak of suboptimal transmission (see p. 127) in general as the major type of contact-induced change not subsumable under borrowing. Returning now to Trudgill’s “natural” : “non-natural” distinction, it would at first sight appear that “non-natural” changes are primarily those that occur in suboptimal transmission. In fact, however, his examples include changes of two rather different kinds: on the one hand, we have the reduction of grammatical
systems, on the other, the increased use of periphrastic constructions. Of these, the former is indeed naturally linked up with suboptimal transmission; it is much less clear, though, that the latter can be so explained. In fact, as I argued on p. 127, the spread of periphrastic constructions tends to look more like the other kind of contact-induced change, borrowing. Since the spread of patterns is one of the processes that together make up grammatical maturation, it would yield a somewhat confused picture if it were lumped together with the kind of changes characteristic of suboptimal transmission. It is true that the breakdown of inflectional systems is often linked up with the growth of periphrastic constructions, and they are often thought to be causally linked or to be part of a general trend from a synthetic to an analytic language type. Trudgill does mention “a move towards a more analytic structure” as a characteristic of high-contact situations, and he apparently sees the rise of periphrastic constructions as part of that. He also links high-contact situations to a “reduction in redundancy”, whereas, he says, “natural” changes tend to increase redundancy. As an example of the latter, he quotes the double marking of definiteness in Scandinavian, e.g. Norwegian/Swedish den store mann-en ‘the great man’, a “natural” development that leads to greater redundancy. As an illustration of what happens in a low-contact situation the example is rather unfortunate since, as I show in Dahl (2003), double definiteness marking is a prime example of a contact phenomenon in that it shows up in the intersection of two spread areas, one of the preposed and one of the suffixed article. Also, the construction being an analytic one, it illustrates the fact that analytic constructions also introduce redundancy. Indeed, as we have seen elsewhere, redundancy, or verbosity, may be greatest at the initial stages of the life cycle of a construction, decreasing later through phonetic reduction processes. The long-term effects of suboptimal acquisition can be expected to involve the “filtering out” of structures and properties that are sensitive to this kind of acquisition, that is, “difficult” features of languages. We know of course that adult second-language learners often fail to acquire large parts of the language system, and indeed there seems to be a correlation between the difficulties that those learners have and the suboptimal acquisition effects that can be observed in language contact (Trudgill (1983: 106)). As was emphasized above (3.5), “difficult” and “complex” should not be automatically equated here — what an individual finds difficult obviously depends not only on the complexity of the object of learning but also on the individual’s previous knowledge and the learning mechanisms that are available to him/her. As pointed out in 6.2, there are in principle two different ways of explaining the absence of mature structures in creoles: either the lack of time they have had to develop these structures, or the reduction processes that they have undergone as a result of suboptimal acquisition. Although these two kinds of explanation need not contradict each other — one may want to explain both why something has been lost
and why it has not yet been replaced — it should be borne in mind that it is in principle possible that a given feature takes a long time to develop in a language but that it is quite unproblematic for an adult second-language learner and could thus easily be transferred in a contact situation. Furthermore, since maturation involves processes of many different kinds, not all of them need work the same way, not even the components of the same chain of development. A “non-natural” type of change would, according to Trudgill’s definition, be one that “mainly” takes place as the result of language contact. I assume this should be understood as saying that it is not impossible but much less probable in a zero-contact situation. The paradigm example would be the total breakdown of inflectional systems observable in pidginization. Now, the history of languages is replete with changes that lead to greater or smaller reductions (or if you like, simplifications) of inflectional morphology. It is unlikely that those changes can all depend on language contact. On the other hand, if we assume that the probability of this kind of change depends on (at least) two factors — the extent to which non-optimal transmission is involved, and the overall size of the change — the conclusion is that large-scale simplifications of inflectional systems have a relatively low probability in low-contact situations. Everything depends of course on how we delineate “large-scale”, “relatively low” and “low-contact”. To find cases of large-scale simplifications we may consider the languages sometimes called “creoloids” — languages which are not usually regarded as creoles but which have undergone reductions of their grammars large enough for some linguists to have claimed creole status for them. A case in point is Afrikaans, which arose from Dutch under conditions of highly developed multi-ethnicity and a large proportion of non-native speakers of varying linguistic background. If creoles are excluded by definition, we may note, for instance, that Afrikaans is the only Germanic language to have got rid of the strong verb system, and we may further note that it belongs to the small minority (including also English and some Scandinavian vernaculars in Jutland and Finland) that have lost grammatical gender (third person pronouns excepted). Consider now as an example of an extreme low-contact language, a speech community of a few hundred persons living for centuries in an isolated settlement only visited by traders once or twice a year. Given the assumptions specified above, a development analogous to that which turned 17th century Dutch into Afrikaans would be rather unlikely in such an environment. While Afrikaans is extreme among the Germanic languages, there have also been large-scale changes among the other languages in that family (as well as in the Romance languages), such as the breakdown of the old case systems — changes that may be deemed too large to be explicable if we do not assume that they were induced by suboptimal acquisition in some sense. This issue has in fact been the subject of much debate, which I will not go into here. One aspect that is not always realized is
the extent to which these developments are restricted to high-contact areas.

[Map omitted, showing the present-day extent of the Germanic languages in Europe (disregarding diaspora languages), with the dative-loss area in light grey and the dative-preserving area in dark grey.]

As the map shows, the loss of the dative is basically restricted to the coastal areas of the Baltic and the North Sea, which makes an explanation in terms of internal factors rather implausible (see also Johansson (1997)). Neo-grammarian sound change is not infrequently adduced as the reason for morphological simplifications, even in the case of wholesale breakdowns of inflectional subsystems. For instance, Blake (2001: 177) says that “phonological change, which tends to be reductive especially in unstressed syllables, may obliterate case distinctions”, using the downfall of the case system in Latin as an example. Thus, “word-final -m, which was the characteristic marker of the accusative singular, must have been reduced, probably to nasalization”. However, as we saw in 8.4, such a change does not necessarily lead to a loss of the distinction but only to the morphologization of the feature of nasality. Such a development of nasal suffixes is found e.g. in Elfdalian, with pairs of forms such as ruäva ‘turnip:nom.sg.indf’ : ruävø ‘turnip:nom.sg.def’ (corresponding to Swedish rova-n). It does not appear unlikely that the total disappearance of the nasal in Latin was at least “helped” by suboptimal transmission, as were many other changes in the process that led to the Vulgar Latin of the first centuries C.E. It can thus be questioned whether a major reduction of an inflectional system, including the total obliteration of distinctions, can be conditioned exclusively by Neo-grammarian sound change. Clearly, very extensive sound changes may take place without affecting the system of morphological distinctions: cf. the morphological systems of French and Italian, which have almost identical sets of features, in spite of the massive phonological reductions that French has gone through. Notice also that the
type of chain of phonological changes that typically leads to the loss of inflections, viz. the reduction of unstressed syllables due to a major change in stress patterns, is in itself likely to be triggered by suboptimal transmission. It may thus be argued that the large-scale changes that have taken place in West European languages over the last two millennia, the result of which has been a significant reduction of morphological complexity, have contributed to a biased picture of what “typical” language change is like.
11.5 Who is responsible for maturational changes — adults or children?

Two much-debated issues in the theory of language change are (i) whether language change takes place in children or in adults, and (what is not quite the same thing) (ii) whether it takes place in language acquisition (i.e. through imperfect transmission) or in language use (which basically means that language changes after it has been acquired). Typically, generativists take the view that the locus of language change is first language acquisition, whereas functionalists, in particular adherents of “usage-based models”, often think of change as a result of language use, whether in children or adults (see e.g. Newmeyer (1998: 67–77) and Croft (2000: Chapter 3) for different points of view on these issues). What is meant when one says that a language changes is normally that the language of some speech community has changed, which in a larger time perspective means that the speech of one generation is different from that of an earlier generation. But this may take place without either imperfect transmission or post-acquisitional change. The input that a child gets is normally a sampling of the speech of the members of the surrounding speech community, and the variation contained in this “corpus” will usually be larger than what is found in the speech of an individual and must be reduced by children as they form their linguistic competence. Acquiring a language is thus a process of selection. Language change on the community level may take place if more and more individuals acquire a feature that did exist earlier but was a minority option and was perhaps originally introduced by someone entering the community from the outside. It is no doubt the case that the ways people speak change continuously all their lives, as functionalist linguists tend to argue. It has been shown (Harrington et al. (2000)) that Queen Elizabeth II has modified her vowel system over recent decades, towards an accent “that is characteristic of speakers who are younger and/or lower in the social hierarchy”, in accordance with general trends in British English. And if the Queen lets herself be influenced by societal trends, who doesn’t? On the other hand, linguists who would rather place change in the acquisition of language by children have objected that only when an innovation is transmitted to the next generation is its place in the language ensured. This line of argument can be
developed further. If we observe that adults change their language, we must also ask ourselves who adopts whose norms. In the case of the British Queen, it seems fairly plausible that she has let herself be influenced by the speech of younger people, rather than the other way round. And one may question whether English-speaking children care very much about the way people of the Queen’s age speak. One observation here is that in fact, the age difference between a child’s peers and its parents may not always be very great. Especially in pre-industrial societies, childbearing starts in the upper teens. So even supposing that parents have significant influence on the language of their children, what happens later in the parents’ adult life is not necessarily so relevant, and the generation cycle may actually be very short. Furthermore, as previously noted — and perhaps of more obvious relevance to the theme of this book — we must ask whether it is really the case that adults can change their languages without limits. In the light of the well-known limitations of adults’ ability to learn new languages, it is far from self-evident that they could master any imaginable innovation in their own language. Notice that there is a significant overlap between the mature features listed in 6.2 and those linguistic features that are most recalcitrant in second language learning. For instance, lexical gender and prosodic patterns, especially phonemic ones, are areas where second language learners tend to fail most saliently. There is also the observation that very much the same features tend to get “filtered out” in those transmission situations where there is heavy influence from non-native speakers and/or strong interference from another language that is dominant in the environment. This suggests that mature features (or some of them, at least) are unlikely to develop in their full-blown form in adult language, since adults would have difficulty in acquiring them. The conclusion need not be, though, that these features originate in the child’s brain when acquiring language, as generativists prefer to have it; rather, what the argument says is that the active agents in this type of language change will have to be children. To take an example from syntax, consider a phenomenon such as verb-second order, as it appears in most Germanic languages. It is normal for adult learners of a language such as Swedish to miss out more or less completely on this rule, that is, they consistently fail to place the subject after the verb if the first slot in the sentence is filled by something else. It thus appears rather implausible that the verb-second rule could spread as an innovation in an adult population. It must be admitted that this statement should ideally be accompanied by an account of how the change would take place without adults, and I do not have a good story for that. In any case, there may well be no general answer to the question as to where change takes place, in acquisition or post-acquisitionally; we should perhaps rather try to find out what kinds of changes, and in particular, what parts of maturational processes, can be attributed to different stages of acquisition and use. One obvious place to look for empirical evidence concerning the division of labour between adults and children in linguistic maturation processes is “nativization” of
pidgins into creoles, that is, what happens when a pidgin starts being acquired as a native language. Bickerton (1981) proposed that this is accompanied by quite radical changes in the language. Objections against this proposal have built on data from the emergence of Tok Pisin, where it has been argued that many of the features Bickerton ascribed to nativization were already present before the language had any native speakers. On the other hand, the data in question do suggest an active role for children in maturation processes. Sankoff & Laberge (1973) investigated in detail the use of the emerging future marker bai in two generations of Tok Pisin speakers, adult fluent second-language speakers and their children, who were the first generation of native speakers. They found that in the speech of both groups bai (which in itself is a reduced form, deriving from baimbai) seemed to be “redundant and obligatory”. However, children used significantly more reduced variants. The following table summarizes the differences:
(184)
                                                      Children   Adults
Secondary stress: full syllabic weight, analogous
  to stressed syllables in nouns or pronouns            29.1%     51.7%
Tertiary stress: full syllabic weight, analogous
  to unstressed syllable in nouns or pronouns           60.4%     47.3%
Reduction or disappearance of vowel nucleus             10.4%      1.0%
Total no. of cases                                        203       192
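The contrast in (184) is large enough that its statistical significance is easy to verify. The following quick check reconstructs approximate raw counts from the published percentages and column totals — the back-calculated counts are my own approximation, not figures given in the source:

from scipy.stats import chi2_contingency

# Approximate token counts back-calculated from (184); columns are
# secondary stress, tertiary stress, reduced/lost vowel nucleus.
children = [59, 123, 21]   # 29.1%, 60.4%, 10.4% of 203
adults = [99, 91, 2]       # 51.7%, 47.3%, 1.0% of 192

chi2, p, dof, _ = chi2_contingency([children, adults])
print(f"chi-squared = {chi2:.1f}, df = {dof}, p = {p:.1e}")

The test comes out highly significant (p well below 0.001), consistent with the generational difference discussed below.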
Most strikingly, the variant with a reduced vowel nucleus appeared almost only in children’s speech. It would thus appear that children play a significant role at least in reduction processes. On the other hand, it does not appear from these data that children have a monopoly on any type of change; all of them show up in adults as well. In particular, if we are to believe Sankoff and Laberge, the future marker was already obligatory in the speech of the adult fluent second-language speakers. Thus, in spite of the claim to the contrary made by Slobin (1977: 205), approvingly quoted by Newmeyer (1998: 73), and repeated in Slobin (forthcoming), these data cannot be taken as evidence for a division of labour in which adults invent new forms and children make them obligatory and regular — in other words, pattern regulation in the form of obligatorification would be possible without the intervention of native speakers. However, it may be rash to generalize from the Tok Pisin example. Precisely the fact that the adults involved are not changing their native language but one that they have acquired, or are acquiring, as a second language may be significant. Notice that with respect to non-natively spoken languages it is much harder to distinguish between language use and language acquisition (or learning) than for native languages, since there is often no clearly defined learning phase. From this it
follows that it is not always possible to distinguish imperfect learning from language change in such languages. Furthermore, it is likely that transfer of obligatory features will take place from second-language learners’ native languages, and this may even be a pre-condition for the appearance of such features in a non-natively spoken language. Thus, we do not know if a future marker like Tok Pisin bai could become obligatory in a non-natively spoken language if none of its substratum languages has such a marker; nor do we know if such a marker can become obligatory in the speech of adult native speakers, with no help from children who are acquiring the language. Nicaraguan Sign Language has attracted attention as a language which was born under the eyes of linguists when schools for the deaf were created in the 1980s. Quoting data from Senghas (1995) on the emergence of grammatical forms in this language, Slobin (forthcoming) argues that those forms were first introduced by the first “cohort” of children (“second language learners”) but were used more frequently and fluently by the younger members of the second “cohort” (“first language learners”). Again, it is not clear that one can conclude from this that it is children who are responsible for pattern regulation. On the whole, however, it seems that as we proceed from the introduction of new patterns to their spread and regulation and further to adaptive changes in the form of reductive processes, the role of young children acquiring a native language increases and that of adults or older children modifying their first language or acquiring a second one decreases. I think this is consonant with the view of grammatical maturation as the combination of redundancy-increasing and order-decreasing processes on the one hand and redundancy-decreasing and order-increasing processes on the other.
Chapter 12
Final discussion
As was noted in Chapter 6, the idea that linguistic structures evolve by passing through different stages is not at all a new one, but one that goes back at least to the 18th century. Usually, it was assumed that this evolution concerned the language as a whole, and in the ideological climate of those days it was natural to connect linguistic evolution with the evolution of culture in general (which, in its turn, was usually not kept strictly apart from biological evolution). Towards the end of the 19th century, the emphasis shifted towards the evolution of individual linguistic patterns and the idea of a cyclical development came to the fore. When the study of what I have called maturation processes was again taken up (after the period of “amnesia” in the first half of the 20th century) most scholars focused on the development from lexical to grammatical morphemes, under the heading of “grammaticalization”. There is a pervasive tendency in modern linguistics to think of language change primarily in terms of simplification, even if nobody would seriously claim that all changes make languages simpler. In other words, when encountering a case of change, the linguist instinctively looks for a way of explaining it as a simplification of the language. I shall refer to this way of thinking as “simplificationism”. Simplification means reduction of complexity, and as we have seen in previous chapters, there are several different kinds of linguistic complexity — for instance, what I have labelled system complexity, structural complexity and output complexity. When a change is claimed to be a simplifying one, it is often not made clear what kind of complexity it is supposed to relate to. For instance, since output simplification in the form of phonological reduction may obliterate grammatical distinctions, it is perhaps natural to think of system simplification as a collateral result of it, although in fact, output simplification quite often leads to a more complex system. Similarly, when a grammatical marker spreads throughout the lexicon of a language, a simplificationist description in terms of “regularization” may be close at hand, but it is then forgotten that grammatical changes are often only partially implemented, leaving residual areas untouched, a process which results in the grammatical system becoming more complex. Simplificationism is not necessarily simplistic. A sophisticated variant is found in Naturalness Theory, whose concept of “naturalness”, as I argued in 6.3, can largely be equated with “lack of complexity” and “linearity”. According to Mayerthaler
(1987: 38), Naturalness Theory can be seen as a “preference theory” (the term is taken from Vennemann (1983)), which makes claims about “the preferred locations of languages in the space … of possible languages”. Thus, certain possible languages, or perhaps better language states, are more “natural” and thus preferred. Ceteris paribus, language change will tend to increase naturalness, that is, tend to move towards the preferred states. By contrast, the notion of linguistic maturation is not reducible to a preference relation between possible language states. Likewise, “mature” is not equivalent to “unnatural” or “dispreferred”. As we have seen, mature states may be difficult to attain, but once reached, they may be quite stable. A case in point is the ablaut systems discussed in 11.2, where it was argued, among other things, that the tendency towards “regularization” of the Germanic strong verbs is considerably weaker than would be expected if ablaut alternations were really “dispreferred” in languages. I do think, however, that we may identify language states that are truly “dispreferred”, although I would prefer to call them “out of balance”. More precisely, dispreferred cases are those where the phonetic weight of a linguistic pattern (its “cost”) is disproportionate relative to its informational value (its “benefit”) — in other words, the pattern is too verbose. Although, as we have seen in earlier chapters, there is considerable variation in verbosity between equivalent constructions in different languages, and we must reckon with a certain tolerance in this regard, there are situations, in particular in creolization, where the limit for tolerable verbosity seems to be exceeded, leading to a rapid reduction. For instance, Tok Pisin, in the early stages of its development, still partly represented in the written language, displayed some highly verbose constructions, which already seem to have been heavily reduced phonetically. I have already mentioned the development of the future marker baimbai into [bə]. Similarly, the possessive marker bilong is now often [blɔ̃]. It does not appear implausible that a language in which you need seven syllables as in papa bilong mipela to say ‘our father’ rather than three (as in English) is somewhat out of balance. I do not know if there is empirical evidence for stating categorically that the rapid process of change that we can observe in Tok Pisin is a necessary one, but the speed at which it is taking place certainly suggests that it is a highly probable one in the current situation. As noted earlier, in both the cyclical theory and the concerted scales theory, developments move toward a zero point, where the grammaticalizing expression is finally annihilated (or, at least, drops out of the language). “Zero” is also the endpoint in Givón’s schema of the development of grammatical phenomena (see fn. 37). It seems to me that scholars, myself included, have failed to distinguish between different reductive forces that may act upon a grammatical marker. The phonetic reduction involved in pattern adaptation is of a different nature than both Neogrammarian sound change and the failed replication of grammatical categories typical of non-optimal language transmission. Maturation processes, in my view, do
not contain an element of “programmed death”. Once a stable state has been reached, a mature pattern can in principle exist for ever — witness the Afro-Asiatic system of verbal morphology. To travesty the VW slogan from the fifties: nobody knows how old an ablaut pattern can be. The phenomena called maturation processes in this book thus consist of several components — in my account pattern spread, pattern regulation, and pattern adaptation, and perhaps featurization as a separate additional component. Earlier models have cut the cake differently, but more important perhaps is the way in which the relationship between the component processes has been regarded. The cyclical theory sees grammaticalization as the outcome of a struggle between opposing forces, one constructive and one destructive, leading to an eternal cycle. By contrast, the concerted scales theory sees grammaticalization as a harmonic simultaneous movement along a set of parallel scales. It may not be wholly fair to describe the models in this way — they are probably less incompatible with each other than my choice of metaphors would imply. What they definitely have in common is, as I have already said, that the initial and final points of the process are identical — essentially, zero. In the model that I propose, maturation is not a cyclical but rather a “dialectic” process. Thus, pattern spread tends to increase the total redundancy, or verbosity, of the system; this is counteracted through pattern adaptation in the form of redundancy-decreasing reduction, but the result is not simply a return to the initial state, but rather one in which redundancy is exploited gainfully in the system — “smart redundancy”. I do not know if there is any deeper significance hidden in the fact that the phenomenon traditionally called “synthesis” in linguistics — basically, complex word structure — is indeed also a “synthesis” in the sense of dialectics, that is, an emergent result of the interaction between a “thesis” — pattern spread — and an “antithesis” — pattern adaptation. It is thus essential to grammatical maturation processes that they involve both redundancy-increasing changes, i.e. pattern spread, and redundancy-decreasing ones, i.e. pattern adaptation. Together, these have the effect of increasing complexity in the form of non-linearity. In addition, we have at least two other kinds of changes which are basically redundancy-decreasing: Neogrammarian reductive change and the disruptive effects of non-optimal transmission. Of these, the former may add to non-linearity, but the latter definitely decreases it, breaking down the structures built up through maturation processes. By definition these processes are associated with high-contact situations. But language contact may in fact also enhance quite a different type of change, viz. pattern spread — we have seen that especially periphrastic grammatical patterns are easily borrowed. The ideal situation for these changes to take place is, however, rather different from that which conditions disruptive change — we could expect this to happen in the same situation as borrowings in general, that is, from some (usually high-status) neighbouring
language into the native language of the borrowers. Pattern adaptation in the form of reduction and tightening, on the other hand, may well be connected primarily with language-internal developments that are hindered rather than helped by high-contact situations (Trudgill’s “natural” change). Interestingly, then, it may be the case that the different components of maturation are favoured by somewhat different ecological conditions. One may thus speculate that grammatical maturation will be most probable in a phase when external contacts have been at a high level but are decreasing. In several places in this book, we have noted that one and the same set of expressions may be subject to conservative as well as innovative tendencies, and that in both cases the result is an increased differentiation. Thus, a set of expressions may tend to both withstand pattern spread (conservativity) and be phonetically or otherwise reduced (innovativity). This was said to be valid for constructions expressing inalienable possession (7.6.1), for “one-packet” expressions (10.9), and for frequent items in general (8.1). Likewise, we saw that distinctions between main and subordinate clauses may arise through the latter preserving an older form of expression (7.3) but also through the deletion of auxiliaries in subordinate clauses (8.2). In the latter case, it is clear that this is not a question of frequency — incontrovertibly, main clauses are more frequent than subordinate ones. Rather, the explanation must involve other, not entirely well-understood mechanisms that underlie maturation processes. What we can readily observe is that there are scales with a “high” and a “low” end, such that the “high” end is associated with prominence in discourse and a high informational and/or rhetorical value. Grammatical patterns are more or less universally born at the high end and expand during their life towards the low end. As they approach the low end, the pressure towards phonetic reduction and tightening in different respects grows. Two things may now give rise to a differentiation: the expansion of the pattern may halt at an intermediate point, which leaves the low end of the scale as a residual or “conservative” section; or a reduction or tightening process is applied to the lower portions of the scale, which comes out as an “innovative” event. But notice that the reductive or condensing processes need some material to work on: this presupposes an earlier spread of verbosity from the high end. This is again an illustration of a grammatical process of change in several steps. Upon closer observation, the conservative option may also be seen to involve a chain of events. Thus, a receding pattern may in itself be the result of an earlier expansion. In Hindi, the non-progressive present kahtā hai ‘he/she speaks’ is now a donut category which has been driven out from part of its former territory by the progressive construction kah rahā hai ‘he/she is speaking’; however, kahtā hai was originally also a progressive construction and obtained its present domain by pushing the form kahe out of main clauses, which has resulted in kahe becoming a subjunctive (Lienhard (1961: 46–48)).
We saw in 7.1 that McWhorter (2001a) claims that there is a gap in grammatical complexity between creole and “old” languages. I argued that if this is true (and the data in Nichols (1992) give some support to his claim) it suggests that, at least up to a point, complicating changes are more frequent than simplifying changes, under “normal ecolinguistic conditions” — that is, when the vertical chain of transmission is not under threat. I also argued that highly mature grammatical subsystems, such as grammatical gender and ablaut-based verbal inflection, are often astonishingly stable, once they have arisen. It is possible, and in my view quite plausible, that in the state space inhabited by the set of possible human languages, there are certain “attractors”, i.e. states which languages tend to move towards, and where languages tend to stay once they have got there — that is, truly “preferred locations of languages in the space … of possible languages”, in the words of Mayerthaler. The important point is then, again, that these states are not the “simplest” ones or the most “natural” ones from the point of view of Naturalness Theory. If we think of the space of possible languages as a landscape with hills, representing “complex”, “unnatural” or “marked” states, and valleys, representing “simple”, “natural” or “unmarked” states, Naturalness Theory, like simplificationism generally, predicts that languages would gravitate towards the deepest valleys and stay there. My claim is that languages not only tend to move uphill but that they also may come to rest at points at significantly higher altitudes. One may in fact wonder if these points are not what the typological study of language is all about. Typologists commonly find that certain types of grammatical systems are more frequent than other, logically possible ones. These could then be said to be the “attractors” or stable points in the state space of languages. But these preferred states typically have rather long prehistories, that is, they are mature. Crucially, they are also often the result of the convergence of different possible paths of development. In my survey of tense–aspect systems (Dahl (1985)), I found that there was less variation in the more mature parts of those systems, i.e. those with inflectional expression, than in the less mature parts, i.e. those expressed periphrastically. At least in cases such as the Semitic verb system (11.2), it might be speculated that we are dealing with a combination of a high degree of maturity and a relatively low degree of sensitivity to suboptimal transmission effects (p. 274). Trudgill (1999: 149, 2001: 372) ascribes the tenacity of linguistic complexity, as found e.g. in gender systems, to “the amazing language-learning abilities of the human child”, and describes its disappearance in situations of intensive language contact as “a result of the lousy language-learning abilities of the human adult”. These formulations are certainly compatible with what I am saying here, but are really only the first step towards an explanation. Notice that it is hardly the case that we are generally best at learning things at the age when language is acquired, so it would be necessary to specify precisely what the differences consist in. This raises
the question of whether mature structures of language are in some way genetically predetermined. By definition, mature features are non-universal, and it may therefore seem strange to assume that they depend on genetics. But notice that what is at stake here is only whether humans have some kind of genetically determined preparedness for certain linguistic features, which would make it easier for them to arise or be preserved, once they have arisen. Human children indeed seem to have an advantage compared to members of other species and notably to adult members of their own species when it comes to learning those sub-systems of spoken languages that depend on maturation, such as grammatical gender and suprasegmental phonology. It is tempting to ascribe this advantage to a genetic predisposition, which does not have to imply, as McWhorter alleges,¹ that children are born with language-specific systems ingrained, but (continuing the fidelity argument in 4.1) rather that there is a mechanism that makes it possible to pick up the relevant features from the environment in an extremely efficient way. If it can be shown that certain features of language are indeed more “L2-difficult” than others in the sense of having a higher rate of failure in second-language acquisition (and perhaps also in cases of specific language impairment), this certainly makes it more probable that there is a dedicated acquisition mechanism for these features. In addition, it may be argued, at least for suprasegmental features, that they put additional demands on the processing system, which makes a dedicated mechanism more likely. If we are genetically predisposed to acquire mature features, the next question is: if such a predisposition has arisen in evolution, does it have a selectional advantage? The obvious problem here is that if we argue that it is somehow better for a language to have mature features than not to have them, we are forced to admit that some languages are better off than others. Clearly, for any mature feature that we choose, there are numerous languages that lack it, and, as we have seen, it has been claimed that some languages (notably creoles) lack practically anything that could be called mature. We thus cannot argue that mature features are in any way indispensable. It could still be the case that they carry some advantage, though. In my opinion, there are at least two plausible ways in which they might do so: tight structures allow for a higher rate of information transfer, and various kinds of
grammatical and prosodic embellishments provide “smart” redundancy, making information transfer safer. If this is the case, why are mature patterns so unevenly distributed across languages? If we have a genetic capacity for mature patterns, why is it unused in so many languages? The answer, which has already been touched upon in earlier chapters, appears to lie in “ecological” differences between languages: how much they have been exposed to factors that are disruptive of mature patterns — most notably to what extent they have been subject to language contact that involves suboptimal transmission. Note that the incidence of disruptive language contact and suboptimal transmission may well have increased historically (cf. Trudgill (1983)). Certainly the average language of pre-agricultural humankind was low-contact, compared to the languages that most people speak today, or even to those of traditional agricultural societies. Languages spoken by small, nomadic groups in areas with a population density of about 1 person per square kilometre are unlikely to be exposed to large-scale suboptimal transmission. Thus, the incidence of mature patterns may well have been different in paleolithic times. If there is a genetic “language acquisition device”, it is likely to have been adapted to that kind of situation, rather than to the linguistic scene of post-industrial society. I think there is a general feeling among linguists that really complex grammatical systems tend to be found among languages spoken in pre-industrial societies, although it is difficult to verify or falsify such a claim, and there certainly is a fairly large variation among the languages spoken by groups who still live as hunter-gatherers or have recently done so. It may also be noted that the languages which have been claimed to lack mature patterns altogether, creoles, have arisen under circumstances which can have few counterparts in the history of humankind — that is, situations when new speech communities were created by forcibly bringing together adult members of dozens of ethnic groups thousands of kilometres from their homes. The consequence of the somewhat speculative assumptions presented here is that mature language states presuppose the presence of both a specific genetically determined acquisition mechanism and a specific cultural chain of development. For this to come about, gene-culture co-evolution — that is, a mutual attunement of genetic and cultural information — of a rather complex kind must have taken place. Whether such a thing is at all possible is for the time being an open question. It is now commonplace to speak of cultural evolution as a process that works in parallel with biological evolution. Cultural evolution presupposes cultural learning — picking up information from other members of your group rather than from the environment. Cultural learning is also found in some non-human species, but on the whole, it does not appear to be cumulative in those species — that is, the amount of cultural information in a group does not grow systematically over time. Such an accumulation of cultural information, which is a further prerequisite for cultural evolution in the proper sense, would thus be unique to humans. At least in
the realm of technology, it appears fairly uncontroversial to say that the body of cultural knowledge has grown immensely over the last ten thousand years and is growing at an accelerating rate — with consequences that are not always positive. In the 19th century, when genetic and cultural evolution were not yet kept apart by scholars, it was commonly assumed that language was also part of the general evolutionary processes that had led humankind to the pinnacles of civilization. This view has become thoroughly discredited through the insight that “primitive” languages are nowhere to be found. A reasonable conclusion has seemed to be that spoken language is outside cultural evolution as we can observe it in other domains: languages change but do not evolve. But the existence of mature phenomena in language — phenomena that presuppose a prehistory — means that if not languages then at least linguistic patterns do evolve in the sense of going through sequences of stages characterized by growing complexity. Importantly, however, this evolution, referred to in this book as maturation, is by and large independent of the processes that are usually meant when speaking of “cultural evolution”, and may even be negatively correlated with the rise of large-scale societies with highly mobile populations. Thus, there is evolution in language, but not in the way that 19th century scholars thought.²
1. McWhorter (2001b: 390) says, responding to Dahl (2001b): “… by my reading, Dahl’s scenario would require that all human languages evidence the same types of ornament, which would leave us with the question: What mechanism would determine that Fula is festooned with consonant mutations while Vietnamese has six lexical tones?” and that given my scenario, languages such as Hawaiian Creole English and Nicaraguan Sign Language ought to be as complex as Icelandic or Welsh. In addition to the argument given in the main text, one may respond that even if mature structure does indeed exhibit great variation among languages, McWhorter also must explain why, for instance, gender systems with quite specific common properties show up in so many unrelated languages (see 9.4 for discussion).

2. Anyone who sees a Eurocentric element in the idea that languages may vary in grammatical complexity will do well to contemplate the fact that of South Africa’s eleven major languages, the only ones that have ever been suggested to be creoles — the languages with the world’s simplest grammars, according to John McWhorter — are Afrikaans and English.
Appendix A

Regular and irregular verbs
Psycholinguistic research on language production has largely been based on English, and a particularly central role has been played by studies of the acquisition and processing of English verb morphology. Pinker (1999) follows this tradition: when he chooses one phenomenon to be examined in the book “from every angle imaginable,” it is “regular and irregular verbs, the bane of every language student” (1999: xi), and the only languages discussed at any length are English and German. This concentration on one or two Germanic languages is rather unfortunate, in several respects, which I shall now discuss in some detail. Both traditional and modern descriptions of English grammar tend to distinguish only two types of verb paradigms, “regular” and “irregular”. (But cf. Halle & Mohanan (1985).) The irregular verbs include the so-called “strong” verbs found in almost all Germanic languages, which employ ablaut alternations to form past tenses and other forms. In grammatical traditions other than that of English, strong verbs are treated not as irregulars but as regular patterns although usually improductive or only marginally productive ones. Other such minor patterns are also seen as regular and are sometimes treated as “conjugations” in their own right. The choice between seeing a pattern as irregular or as regular but improductive sometimes appears somewhat arbitrary. Cf. the Swedish regular but improductive “second weak conjugation” exemplified by the verb sända ‘to send’ with forms such as imperative sänd!, past tense sände and past participle sänt, with the pattern exemplified by the English “irregular” verb send, with imperative send!, past tense and past participle sent. The only difference here seems to be in the number of verbs in the two patterns: about 300 in Swedish and (counting somewhat generously) 60 in English. The important point, however, is that there may be a difference between verbs that belong to minor patterns and those that are totally anomalous, that is, form patterns with just one member, such as suppletive verbs like go : went or a verb with a unique alternation such as Swedish dö : dog ‘die : died’. In Table 1, I show the type and token frequencies of past tense verb forms in a corpus of spoken Swedish.¹
1. The corpus — identical to the one presented in Dahl 2001 — consists of 65,000 words of spoken Swedish chosen from the larger half-million-word corpus Samtal i Göteborg (‘Conversations in Gothenburg’), originally collected for a sociolinguistic project (Löfström 1988), in which a number of persons (chosen at random from the civic register) were asked to make a 30-minute recording of a conversation between themselves and another person of their choice with whom they were acquainted.
I have there divided the verbs into four classes, according to their treatment in standard grammars. It can be seen that the improductive “regulars” differ in their frequency distribution both from the productive type (the traditional first conjugation, where the simple past ends in -ade) and from the “irregulars”.

But English also differs from many other languages in important ways that cannot be reduced to a matter of description. Every English verb has an endingless base form with a wide use — infinitive, imperative and present tense except 3rd person singular. In addition, English has extensive zero-derivation (alternatively, unmarked conversion) of verbs from other word classes; that is, practically any word can be used as a verb if a suitable interpretation is found. As a consequence, it makes sense to ask an English speaker to produce the past tense of a nonsense verb such as wug — and it is no surprise that the answer given is almost invariably the only productive formation, that is, wugged.

Contrast this with a language such as Russian. In Russian, there are several different conjugation types among verbs, and normally there is no “neutral” form of a verb, that is, a form that does not give information about the conjugation class it belongs to. Furthermore, zero derivation is in general not possible. If a new verb is formed, either from a Russian noun or adjective or from a borrowed stem, it has to be equipped with a derivational suffix before inflectional endings can be added to it. The most productive such suffix is -ov- (which takes the form -u(j)- in the present and the imperative). For instance, the English word start is borrowed into Russian as start-ov-at’ (present tense start-u-et). Now, all verbs in -ov- are conjugated in exactly the same way. Given the infinitive startovat’, a Russian speaker knows that the 3rd person singular of the present tense has to be start-u-et and the masculine singular of the past tense has to be start-ov-al. At first glance, then, this looks like a productive inflectional pattern. However, one may equally well argue that this pattern is dependent on the derivational suffix -ov-/-u(j)- and that it is the latter that is productive, not the inflection. In fact, it may be claimed that in effect Russian has no productive verb inflection, since in all cases one must first assign the verb to a derivational pattern, which then determines the choice of inflectional endings.
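The division of labour between derivation and inflection described here can be made concrete with a toy sketch in Python (my illustration, not the author’s analysis; only the cited forms startovat’, startuet, startoval come from the text): once a borrowed stem has taken the productive derivational suffix, every inflected form follows mechanically.

# Toy sketch: deriving a Russian verb from a borrowed stem with the
# productive suffix -ov- / -u(j)-. The derivational pattern, once chosen,
# fully determines the inflectional endings.
def russify(stem):
    """Return the forms cited in the text for an -ov- verb built on `stem`."""
    return {
        "infinitive": stem + "ovat'",  # start-ov-at' 'to start'
        "prs.3sg":    stem + "uet",    # start-u-et
        "pst.m.sg":   stem + "oval",   # start-ov-al
    }

print(russify("start"))
# {'infinitive': "startovat'", 'prs.3sg': 'startuet', 'pst.m.sg': 'startoval'}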
Table 1. Past tense forms in a corpus of spoken Swedish (columns: token frequency classes)

TYPE FREQUENCIES

Conjugation type        >20    >10    >5    1–4     Sum    Average token frequency
Productive                2      4    10    115     131     2.6
Weak improductive         4      7     6     49      66    10.4
Strong improductive       1      3     2     21      27     5.9
“Irregular”              10      4     3      7      24    66.5
Sum                      17     18    21    192     248    11.2

TYPE FREQUENCIES — PERCENTAGES

Conjugation type        >20    >10    >5     1–4     Sum
Productive              0.8%   1.6%   4.0%   46.4%   52.8%
Weak improductive       1.6%   2.8%   2.4%   19.8%   26.6%
Strong improductive     0.4%   1.2%   0.8%    8.5%   10.9%
“Irregular”             4.0%   1.6%   1.2%    2.8%    9.7%
Sum                     6.9%   7.3%   8.5%   77.4%  100.0%

TOTAL TOKEN FREQUENCIES

Conjugation type        >20    >10    >5    1–4     Sum
Productive                46     60    60    172     338
Weak improductive        463    105    38     82     688
Strong improductive       67     40    14     37     158
“Irregular”             1497     64    20     15    1596
Sum                     2073    269   132    306    2780

TOTAL TOKEN FREQUENCIES — PERCENTAGES

Conjugation type        >20     >10    >5     1–4     Sum
Productive               1.7%   2.2%   2.2%   6.2%   12.2%
Weak improductive       16.7%   3.8%   1.4%   2.9%   24.7%
Strong improductive      2.4%   1.4%   0.5%   1.3%    5.7%
“Irregular”             53.8%   2.3%   0.7%   0.5%   57.4%
Sum                     74.6%   9.7%   4.7%  11.0%  100.0%
Table 2. Past verb forms in the Christine corpus

Verb type               Types   Tokens   Percentage of total   Percentage of full verbs
Regular                   119      321    9.1%                  23.8%
Irregular                  78     1183   29.3%                  76.2%
be, have, and modals        9     2155   61.5%
Sum                       204     3511
Table 3. Most frequent past verb forms in the Christine corpus

Rank   Verb form   Frequency      Rank   Verb form   Frequency
 1.    was           651          11.    thought        90
 2.    did           432          12.    should         82
 3.    said          343          13.    might          66
 4.    were          221          14.    told           42
 5.    had           202          15.    took           31
 6.    +’d           177          16.    wanted         28
 7.    would         173          17.    saw            28
 8.    could         151          18.    came           25
 9.    got           132          19.    knew           20
10.    went           94          20.    put            17
The connection between high (token) frequency and irregularity is well known. On the other hand, the higher type frequency of a regular pattern, e.g. the fact that there are many more regular than irregular verbs in English (that is, in the lexicon of English), may give the impression that the majority of verb forms in discourse are
also regular. Regularity would be the normal2 case, irregularity an exception. In actual fact, however, not only in English but also in other Germanic languages, the regular and productive way of forming past tenses is most definitely in the minority. Among the spoken Swedish past tense forms tabulated in Table 1, the productive “first conjugation” makes up as little as 12.2 per cent.

The availability on the web of the tagged Christine corpus (Sampson 2002) made it possible to obtain comparable material for spoken English. As shown in Table 2, the percentage of regular past tense verb forms in this corpus is even lower — 9.1 per cent. Quite a large proportion of the pasts here are auxiliaries or copulas, which are known to be irregular. It may thus be objected that it would be more proper to consider the proportion of regulars among the “full” verbs. It turns out, though, that even if this yields a larger percentage of regulars, the figure still indicates a minority — 23.8 per cent. Since everyone agrees that irregular past forms have to be retrieved from the lexicon, these figures point to a rather low maximal load for an on-line generating mechanism.

Moreover, the controversy over high-frequency regular and productive forms turns out to be something of a red herring, at least as far as Germanic past tenses are concerned. However we define “high-frequency verb form”, regular and productive high-frequency past tense forms are so low in type frequency that even their total token frequencies are less than impressive. If we assume that the discussion as regards Swedish concerns the most frequent half of the traditional 1st conjugation, the question would be whether the proportion of forms generated on-line is 12 per cent (all 1st conjugation verbs) or 6 per cent (only the less frequent ones). The figures for English will be very much the same.

The question that arises here is in what sense English -ed, or Swedish -ade, is a “default” way of creating past tense forms. Traditionally, it is said that (i) most English verbs take -ed; (ii) all new verbs take -ed; (iii) over time, irregular pasts tend to regularize. We now see that (i) is true only if we are speaking about type frequencies — token-wise, irregular pasts are the rule. It has been noted that productivity is compatible with being a minority, even with respect to type frequency. For instance, only about 5 per cent of all German nouns (most of them recent loans) take the plural ending -s; still, this ending is used “as the default, acting whenever memory retrieval comes up empty-handed” (Pinker 1999: 248). For instance, it is chosen by German speakers if you ask them to supply the plural form of a nonsense word. The label Notpluralendung ‘emergency plural ending’ appears to describe the situation very well. But how different is English past tense -ed really from German plural -s? Pinker does not go so far as to label -ed an “emergency past ending”, but
2. Normality is a tricky concept, however. When e.g. Bittner (1996: 69) claims that the German weak conjugation type has a higher “degree of normality” than the strong type, he apparently regards that as fully compatible with a lower text frequency, understanding “normality” rather in terms of “system-related naturalness” (“systembezogene Natürlichkeit”).
he notes that there are in fact very strong similarities between the two markers, and argues that it is not the high type frequency of regular verbs that makes speakers apply the pattern to new verbs, but the other way around — there are so many regular verbs because the pattern has been productive for such a long time.

To give these statements somewhat greater precision, one might put matters as follows. There are really two ways of understanding the notion of “default” with regard to morphology. The statistics for past tense formation in Germanic suggest that we get past forms by retrieving them from the lexicon most of the time. In this sense, then, lexicon retrieval is the “default”. However, we also need a strategy to apply when “memory retrieval comes up empty-handed”. This strategy, then, is a “default” in the other sense. There isn’t really any good reason why the two senses should coincide.

It may be useful to remember that the original meaning of the word “default” is ‘failure’, in particular ‘failure to perform a legal duty’. A default judgment is made e.g. when a party fails to show up in court. In other words, it is a “last resort” choice that is taken when the normal basis for making a decision is lacking. In connection with the development of computers, the word “default” appears to have shifted in meaning, so that it now means the “normal” choice. This ambiguity parallels the equivocation in the use of the word in linguistic theory.
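The two senses of “default” can be made concrete with a toy dual-route sketch in Python (my illustration, not a formalization found in the book): lexical retrieval is the statistical default, and the productive rule is the fallback default, applying only when retrieval fails.

# Toy dual-route sketch of the two senses of "default" (illustrative only;
# orthographic details such as consonant doubling are ignored).
irregular_pasts = {"go": "went", "take": "took", "say": "said"}

def past_tense(verb):
    stored = irregular_pasts.get(verb)
    if stored is not None:
        return stored       # sense 1: what happens most of the time (retrieval)
    return verb + "ed"      # sense 2: the last-resort rule, used when
                            # "memory retrieval comes up empty-handed"

print(past_tense("take"))   # 'took'    -- retrieved from the lexicon
print(past_tense("start"))  # 'started' -- generated by the productive rule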
References
Abraham, Werner. 1995. Deutsche Syntax im Sprachenvergleich. Grundlegung einer typologischen Syntax des Deutschen. Tübingen: Narr. Allen, Barbara J., Gardiner, Donna B. and Frantz, Donald G. 1984. Noun incorporation in Southern Tiwa. International Journal of American Linguistics, 50.292–311. Allen, Shanley E. M. 1996. Aspects of argument structure acquisition in Inuktitut. Language acquisition & language disorders, 13. Philadelphia: Benjamins. Allwood, Jens. 1975–76. Några oväntade satsflätor. Nysvenska studier, 55–56.177–199. ———. 1976. Linguistic communication as action and cooperation: a study in pragmatics. Gothenburg monographs in linguistics, 2. Göteborg: Dept. of Linguistics, University of Göteborg. Anderson, Stephen R. 1992. A-morphous morphology. Cambridge: Cambridge University Press. Andersson, Anders-Börje. 1993. Second language learners’ acquisition of grammatical gender in Swedish. Gothenburg monographs in linguistics, 10. Göteborg: Dept. of Linguistics, University of Göteborg. Ansaldo, Umberto. 1999. Comparative constructions in Chinese: areal typology and patterns of grammaticalization. Ph.D. thesis. Department of Linguistics, Stockholm University. Anward, Jan and Linell, Per. 1976. Om lexikaliserade fraser i svenskan. Nysvenska studier, 55–56.77–119. Arthur, W. Brian. 1990. Positive feedbacks in the economy. Scientific American, 262.92–99. Asher, R. E. 1982. Tamil. Lingua descriptive studies, 7. Amsterdam: North-Holland. ——— (ed.) 1994. The Encyclopedia of language and linguistics. Oxford: Pergamon. Augst, Gerhard. 1975. Untersuchungen zum Morpheminventar der deutschen Gegenwartssprache. Forschungsberichte des Instituts für deutsche Sprache, 25. Tübingen: Narr. Ball, Philip. 2001. The self-made tapestry: pattern formation in nature. Oxford: Oxford University Press. Barlow, Michael and Kemmer, Suzanne. 2000. Usage-based models of language. Stanford, Calif.: CSLI Publications. Barnes, Janet. 1996. Autosegments with three-way lexical contrasts in Tuyuca. International Journal of American Linguistics, 62.31–58. Bhat, D. N. Shankara. 1994. The adjectival category: criteria for differentiation and identification. Studies in Language Companion Series, 24. Amsterdam: Benjamins. Bichakjian, B. H. 1999. Language evolution and the complexity Criterion. Psycoloquy. http:// psycprints.ecs.soton.ac.uk/archive/00000668/#html. Bickerton, Derek. 1981. Roots of language. Ann Arbor: Karoma. Bierwisch, Manfred. 1967. Some semantic universals of German adjectivals. Foundations of Language, 3.1–36. Bittner, Andreas. 1996. Starke “schwache” Verben, schwache “starke” Verben: deutsche Verbflexion und Natürlichkeit. Studien zur deutschen Grammatik, 51. Tübingen: Stauffenburg. Bloomfield, Leonard. 1933. Language. New York: Holt.
Boas, Franz and Deloria, Ella. 1941. Dakota grammar. National Academy of Sciences. Memoirs. Vol. 23:2. Washington. Booij, Geert. 1993. Against split morphology. Yearbook of Morphology, 1993.27–49. Borer, Hagit, Lowenstamm, Jean, and Shlonsky, Ur. 1994. The construct in review. In Studies in Afroasiatic grammar: Papers from the Second Conference on Afroasiatic Languages, Sophia Antopolis, ed. by Catherine Lecarme, 30–61. The Hague: Holland Academic Graphics. Broderick, George. 1993. Manx. In The Celtic languages, ed. by Martin John Ball and James Fife, 228–285. London: Routledge. Brown, Penelope and Levinson, Stephen C. 1987. Politeness: some universals in language usage. Studies in interactional sociolinguistics, 4. Cambridge: Cambridge University Press. Bybee, Joan. 1994. The grammaticization of zero: asymmetries in tense and aspect systems. In Perspectives on grammaticalization, ed. by William Pagliuca, 235–254. Amsterdam: Benjamins. ———. 2001. Phonology and language use. Cambridge Studies in Linguistics, 94. Cambridge: Cambridge University Press. Bybee, Joan and Hopper, Paul J. 2001. Introduction to Frequency and the emergence of linguistic structure. In Frequency and the emergence of linguistic structure, ed. by Joan Bybee and Paul J. Hopper, 1–24. Amsterdam: Benjamins. Bybee, Joan L. and Slobin, Dan I. 1982. Rules and schemas in the development and use of the English past tense. Language, 58.265–289. Bybee, Joan L. and Dahl, Östen. 1989. The creation of tense and aspect systems in the languages of the world. Studies in Language, 13.51–103. Bybee, Joan L., Perkins, Revere and Pagliuca, William. 1994. The evolution of grammar. Tense, aspect, and modality in the languages of the world. Chicago/London: University of Chicago Press. Bybee, Joan L. Ms. Mechanisms of change in grammaticization: the role of frequency. Börjars, Kersti, Vincent, Nigel and Chapman, Carol. 1996. Paradigms, periphrases and pronominal inflection: a feature-based account. In Yearbook of Morphology 1996, ed. by Geert Booij and Jaap van Marle, 155–177. Camazine, Scott, Deneubourg, Jean-Louis, Franks, Nigel R., Sneyd, James, Theraulaz, Guy and Bonabeau, Eric. 2001. Self-organization in biological systems. Princeton studies in complexity. Princeton, N. J.: Princeton University Press. Caubet, Dominique. 1983. Quantification, négation, interrogation: les emplois de la particule “š i” en arabe marocain. Arabica, 30.227–245. Chafe, Wallace. 1982. Integration and involvement in speaking, writing, and oral literature. In Spoken and written language, ed. by Deborah Tannen, 35–53. Norwood: Ablex. ———. 1987. Cognitive constraints on information flow. In Coherence and grounding in discourse, ed. by Ross Tomlin, 21–51. Amsterdam: Benjamins. Chomsky, Noam. 1957. Syntactic structures. Janua linguarum. Series minor, 4. The Hague: Mouton. ———. 1965. Aspects of the theory of syntax. Cambridge, Mass.: MIT Press. Clark, Herbert H. 1996. Using language. Cambridge: Cambridge University Press. Comrie, Bernard. 1976. Aspect: an introduction to the study of verbal aspect and related problems. Cambridge textbooks in linguistics. Cambridge: Cambridge University Press. ———. 1989. Language universals and linguistic typology: syntax and morphology. Oxford: Blackwell. ———. 1992. Before complexity. In The evolution of human languages: Proceedings of the Workshop on the Evolution of Human Languages, held August 1989 in Santa Fe, New Mexico, ed. by Murray Gell-Mann and John Hawkins, 193–211. Redwood, Calif: Addison-Wesley. Corbett, Greville. 1991. 
Gender. Cambridge: Cambridge University Press.
Corbett, Greville. Forthcoming. Sex-based and non-sex-based gender systems. In World Atlas of Language Structures, ed. by Martin Haspelmath, Matthew Dryer, David Gil, and Bernard Comrie. Oxford: Oxford University Press. Craig, Colette and Hale, Ken. 1988. Relational preverbs in some languages of the Americas: typological and historical perspectives. Language, 64.312–344. Croft, William. 1996. Linguistic selection: An utterance-based evolutionary theory of language change. Nordic Journal of Linguistics, 19.99–139. ———. 2000. Explaining language change: an evolutionary approach. Longman linguistics library. Harlow: Longman. Croft, William and Deligianni, Efrosini. Ms. Asymmetries in NP word order. Curry, Haskell B. 1961. Some logical aspects of grammatical structure. In Structure of language and its mathematical aspects, ed. by Roman Jakobson, 56–68. Providence, Rhode Island: American Mathematical Society. Dahl, Östen. 1985. Tense and aspect systems. Oxford: Blackwell. ———. 2000a. The grammar of future time reference in European languages. In Tense and aspect in the languages of Europe, ed. by Östen Dahl, 309–328. Berlin: Mouton de Gruyter. ———. 2000b. Egophoricity in discourse and syntax. Functions of Language, 7.33–77. ———. 2000c. Elementary gender distinctions. In Gender in grammar and cognition, II: manifestations of gender, ed. by Barbara Unterbeck, Matti Rissanen, Terttu Nevalainen and Mirja Saari, 577–593. Berlin: Mouton de Gruyter. ———. 2000d. Animacy and the notion of semantic gender. In Gender in grammar and cognition, I: approaches to gender, ed. by Barbara Unterbeck, Matti Rissanen, Terttu Nevalainen and Mirja Saari, 99–115. Berlin: Mouton de Gruyter. ———. 2000e. The tense-aspect systems of European languages in a typological perspective. In Tense and aspect in the languages of Europe, ed. by Östen Dahl, 3–25. Berlin: Mouton de Gruyter. ———. 2001a. Grammaticalization and the life cycles of constructions. RASK. Internationalt tidsskrift for sprog og kommunikation, 14.91–133. ———. 2001b. Complexification, erosion, and baroqueness. Linguistic Typology, 5.375–377. ———. 2003. Definite articles in Scandinavian: competing grammaticalization processes in standard and non-standard varieties. In Dialectology Meets Typology, ed. by Bernd Kortmann. Berlin: Mouton de Gruyter. Dahl, Östen and Fraurud, Kari. 1996. Animacy in grammar and discourse. In Reference and referent accessibility, ed. by Thorstein Fretheim and Jeanette K. Gundel, 47–64. Amsterdam: Benjamins. Dahl, Östen and Koptjevskaja-Tamm, Maria. 1998. Alienability splits and the grammaticalization of possessive constructions. In Papers from the 16th Scandinavian Conference of Linguistics, ed. by Timo Haukioja, 38–49. Turku: Department of Finnish and General Linguistics, University of Turku. ———. 2001. Kinship in grammar. In Dimensions of possession, ed. by Irène Baron, Michael Herslund and Finn Sørensen. Amsterdam: Benjamins. Dawkins, Richard. 1976. The selfish gene. Oxford: Oxford University Press. de Reuse, W. J. 1994. Noun incorporation. In The Encyclopedia of language and linguistics, ed. by R. E. Asher, 2842–2847. Oxford: Pergamon. DeGraff, Michel. 2001. On the origin of creoles: A Cartesian critique of “neo”-Darwinian linguistics. Linguistic Typology, 5.213–311. Dennett, Daniel C. 1987. The intentional stance. Cambridge, Mass.: MIT Press. ———. 1991. Consciousness explained. Boston: Little Brown. ———. 1996. Darwin’s dangerous idea: evolution and the meanings of life. London: Penguin. Diakonoff, Igor M. 1988. Afrasian languages. 
Languages of Asia and Africa. Moscow: Nauka. Dixon, Robert M. W. 1977. Where have all the adjectives gone? Studies in Language, 1.19–80.
Dressler, Wolfgang U. 1987a. Leitmotifs in natural morphology. Studies in Language Companion Series, 10. Amsterdam: Benjamins. ———. 1987b. Word formation as part of natural morphology. In Leitmotifs in natural morphology, ed. by Wolfgang U. Dressler, 99–126. Amsterdam: Benjamins. Dyen, Isidore, Kruskal, Joseph B. and Black, Paul. 1992. An Indoeuropean classification: A lexicostatistical experiment. Transactions of the American Philosophical Society, 82. Ebert, Karen. 2000. Progressive markers in Germanic languages. In Tense and aspect in the languages of Europe, ed. by Östen Dahl, 605–653. Berlin: Mouton de Gruyter. Edelman, Gerald M. 1987. Neural Darwinism: the theory of neuronal group selection. New York: Basic Books. Ellegård, Alvar. 1953. The auxiliary do: the establishment and regulation of its use in English. Stockholm: Almqvist & Wiksell. Evans, Nick. 1997. Role or cast? Noun incorporation and complex predicates in Mayali. In Complex predicates, ed. by Joan Bresnan, Peter Sells and Alex Alsina i Keith, 397–430. Stanford, Calif.: CSLI Publications. Fält, Gunnar. 2000. Spansk grammatik för universitet och högskolor. Lund: Studentlitteratur. Field, Fredric W. 2002. Linguistic borrowing in bilingual contexts. Studies in Language Companion Series, v. 62. Amsterdam: Benjamins. Fillmore, Charles J., Kay, Paul and O’Connor, Mary Catherine. 1988. Regularity and idiomaticity in grammatical constructions: the case of let alone. Language, 64.501–538. Fischer Jørgensen, Eli. 1989. Phonetic analysis of the stød in Standard Danish. Phonetica, 46.1–59. Flake, Gary William. 1998. The computational beauty of nature: computer explorations of fractals, chaos, complex systems and adaptation. Cambridge, Mass.: MIT Press. Fredkin, Edward. 1992a. Finite nature. Paper presented at The XXVIIth Rencontre de Moriond. ———. 1992b. A new cosmogony. http:// cvm.msu.edu/~dobrzele/dp/Publications/Fredkin/ New-Cosmogony/. Gabelentz, Georg von der. 1891. Die Sprachwissenschaft: ihre Aufgaben, Methoden und bisherigen Ergebnisse. Leipzig. Gell-Mann, Murray. 1994. The quark and the jaguar: adventures in the simple and the complex. London: Little Brown. George, Ken. 1993. Cornish. In The Celtic languages, ed. by Martin John Ball and James Fife, 410–468. London: Routledge. Geurts, Bart. 2000. Explaining grammaticalization (the standard way). Linguistics, 38.781–798. Gildea, Spike. 1997. Evolution of grammatical relations in Cariban: How functional motivation precedes syntactic change. In Grammatical relations: a functional perspective, ed. by Talmy Givón, 155–198. Amsterdam: Benjamins. Gillies, William. 1993. Scottish Gaelic. In The Celtic languages, ed. by Martin John Ball and James Fife, 201–227. London: Routledge. Givón, Talmy. 1971. Historical syntax and synchronic morphology: an archaeologist’s field trip. In Papers from the Seventh Regional Meeting of the Chicago Linguistics Society, 394–415. Chicago: Chicago Linguistic Society. ———. 1976. Topic, pronoun and grammatical agreement. In Subject and topic, ed. by Charles N. Li, 149–188. New York: Academic Press. ———. 1979. From discourse to syntax: grammar as a processing strategy. In Discourse and syntax, ed. by Talmy Givón, 81–109. New York: Academic Press. ———. 1991. Isomorphism in the grammatical code: cognitive and biological considerations. Studies in Language, 15.85–114. Goddard, Cliff. 2001. Lexico-semantic universals: a critical overview. Linguistic Typology, 5.1–66.
Goertzel, Ben. 1994. Chaotic logic: language, thought, and reality from the perspective of complex systems science. IFSR international series on systems science and engineering, 9. New York: Plenum Press. Goldberg, Adele E. 1995. Constructions: a construction grammar approach to argument structure. Chicago: University of Chicago Press. Goldsmith, John. 1976. Autosegmental phonology. Ph.D. thesis, MIT. Gopnik, M and Crago, M. B. 1990. Familial aggregation of a developmental language disorder. Cognition, 39.1–50. Gould, Stephen Jay. 1991. Bully for Brontosaurus: further reflections in natural history. Harmondsworth: Penguin. Greenberg, Joseph H. 1952. The Afro-Asiatic (Hamito-Semitic) present. Journal of the American Oriental Society, 72.1–9. ———. 1978a. How does a language acquire gender markers? In Universals of human language, ed. by Joseph H. Greenberg, 48–81. Stanford, Calif.: Stanford University Press. ———. 1978b. Diachrony, synchrony, and language universals. In Universals of human language, I: Method and theory, ed. by Joseph H. Greenberg. Stanford: Stanford University Press. Grice, H. P. 1957. Meaning. Philosophical Review, 66.377–388. Gropen, Jess, Pinker, Steven, Hollander, Michelle, Goldberg, Richard and Wilson, Ronald. 1989. The learnability and acquisition of the dative alternation in English. Language, 65.203–257. Guy, Gregory. 1980. Variation in the group and in the individual: the case of final stop deletion. In Locating language in time and space, ed. by William Labov, 1–36. New York: Academic Press. Hagège, Claude. 2001. Creoles and the notion of simplicity in human languages. Linguistic Typology, 5.167–174. Haiman, John. 1994. Ritualization and the development of language. In Perspectives on grammaticalization, ed. by William Pagliuca, 3–28. Amsterdam: Benjamins. Halle, M. and Mohanan, K. P. 1985. Segmental phonology of modern English. Linguistic Inquiry, 16.57–116. Harder, Peter. 1996. Linguistic structure in a functional grammar. In Content, expression, and structure: studies in Danish functional grammar, ed. by Elisabeth Engberg-Pedersen, Michael Fortescue, Peter Harder, Lars Heltoft and Lisbeth Falster Jacobsen, 423–452. Amsterdam: Benjamins. Harrington, Jonathan, Palethorpe, Sallyanne and Watson, Catherine I. 2000. Does the Queen speak the Queen’s English? Nature, 408.927–928. Harris, Alice C. and Campbell, Lyle. 1995. Historical syntax in cross-linguistic perspective. Cambridge studies in linguistics, 74. Cambridge: Cambridge University Press. Harris, Zellig S. 1951. Methods in structural linguistics. Chicago: University of Chicago Press. Harrison, Sheldon P. 1976. Mokilese reference grammar. Pali language texts. Micronesia. Honolulu: University Press of Hawaii. Haspelmath, Martin. 1993. A grammar of Lezgian. Mouton grammar library, 9. Berlin: Mouton de Gruyter. ———. 1997. Indefinite pronouns. Oxford studies in typology and linguistic theory. Oxford: Clarendon. ———. 1998. Does grammaticalization need reanalysis? Studies in Language, 22.315–351. ———. 1999. Why is grammaticalization irreversible? Linguistics, 37.1043–1068. ———. 2000. The relevance of extravagance: a reply to Bart Geurts. Linguistics, 38.789–798. ———. forthcoming. Semantic maps. Hauser, Marc and Marler, Peter. 1999. Animal communication. In The MIT encyclopedia of the cognitive sciences, ed. by Robert A. Wilson and Frank C. Keil, 22–24. Cambridge, Mass.: MIT Press.
Hayward, Richard J. 2000. Afroasiatic. In African languages: an introduction, ed. by Bernd Heine and Derek Nurse, 74–99. New York: Cambridge University Press. Heath, Jeffrey. 1983. Referential tracking in Nunggubuyu (Australia). In Switch-reference and universal grammar. Proceedings of a Symposium on Switch Reference and Universal Grammar, Winnipeg, May 1981, ed. by John Haiman and P Munro. Amsterdam: Benjamins. Heine, Bernd, Claudi, Ulrike and Hünnemeyer, Friederike. 1991. Grammaticalization: A conceptual framework. Chicago: University of Chicago Press. Hemon, Roparz. 1970. Grammaire bretonne. Brest: Al Liamm. Hempen, Ute. 1988. Die starken Verben in Deutschen und Niederländischen. Diachrone Morphologie. Tübingen: Niemeyer. Hermerén, Ingrid, Schlyter, Suzanne and Thelin, Ingrid. 1994. The marking of future time reference in French. Future Time Reference in European Languages III: EUROTYP Working Papers VI.6. Herslund, Michael. 1980. Problèmes de syntaxe de l’ancien français. Compléments datifs et génitifs. Études Romanes de l’Université de Copenhague. Revue Romane numéro spécial 21. Copenhagen. Heylighen, Francis. 2000. Web dictionary of cybernetics and systems. http://pespmc1.vub.ac.be/ ASC/IndexASC.html. Hockett, Charles F. 1958. Two models of grammatical description. Word, 10.210–231. Holm, Gösta. 1942. Lövångersmålet. In Lövånger: en sockenbeskrivning under medverkan av flere fackmän, ed. by Carl Holm. Umeå: Aktiebolaget Nyheternas Tryckeri. Hopper, Paul J. 1987. Emergent grammar. Berkeley Linguistics Society, 13.139–157. ———. 1996. Some recent trends in grammaticalization. Annual Review of Anthropology, 25.217–236. Hopper, Paul J. and Traugott, Elizabeth. 1993. Grammaticalization. Cambridge: Cambridge University Press. Huddleston, Rodney D. and Pullum, Geoffrey K. 2002. The Cambridge grammar of the English language. Cambridge: Cambridge University Press. Hudson, Richard. 2000. Grammar without functional categories. In The nature and function of syntactic categories, ed. by Robert Borsley, 7–35. New York: Academic Press. Hull, David L. 1988. Science as a process: an evolutionary account of the social and conceptual development of science. Science and its conceptual foundations. Chicago: University of Chicago Press. Håkansson, Gisela. 2001. Tense morphology and verb-second in Swedish L1 Children, L2 Children and Children with SLI. Bilingualism: Language and Cognition, 4.85–99. Jackson, Jean E. 1983. The fish people: linguistic exogamy and Tukanoan identity in northwest Amazonia. Cambridge studies in social anthropology, 39. Cambridge: Cambridge University Press. Jakobson, Roman. 1959a. On linguistic aspects of translation. In On translation, ed. by Reuben A. Brower. Cambridge, Mass.: Harvard University Press. ———. 1959b. Boas’ view of grammatical meaning. In The anthropology of Franz Boas, ed. by W. Goldschmidt, 139–145. Jakobson, Roman, Fant, C. Gunnar M. and Halle, Morris. 1963. Preliminaries to speech analysis: the distinctive features and their correlates. Cambridge, Mass.: MIT Press. Jobin, Bettina. 2004. Genus im Wandel. Studien zu Genus und Animatizität im Deutschen und Schwedischen anhand von Personenbezeichnungen im heutigen Deutsch mit Kontrastierungen zum Schwedischen. Acta Universitatis Stockholmiensis, Stockholmer Germanistische Forschungen 64.: Almqvist & Wiksell International. Johanson, Lars. 2000. Viewpoint operators in European languages. In Tense and aspect in the languages of Europe, ed. by Östen Dahl, 27–187. Berlin: Mouton de Gruyter.
Johansson, Christer. 1997. A view from language: growth of language in individuals and populations. Travaux de l’Institut de linguistique de Lund, 34. Lund: Lund University Press. Jurafsky, Daniel, Bell, Alan, Gregory, Michelle and Raymond, William D. 2001. Probabilistic relations between words: evidence from reduction in lexical production. In Frequency and the emergence of linguistic structure, ed. by Joan Bybee and Paul J. Hopper, 229–254. Amsterdam: Benjamins. Juvonen, Päivi. 2000. Grammaticalizing the definite article: a study of definite adnominal determiners in a genre of spoken Finnish. Stockholm: Dept. of Linguistics, Stockholm University. Kammerzell, Frank. 2000. Egyptian possessive constructions. Sprachtypologie und Universalienforschung, 53.97–108. Katz, Jerrold J. 1981. Language and other abstract objects. Oxford: Blackwell. Keenan, Edward L. 1976. Towards a universal definition of “subject”. In Subject and topic, ed. by Charles N. Li, 303–334. New York: Academic Press. Keller, Rudi. 1994. On language change: the invisible hand in language. London: Routledge. Kemmer, Suzanne. 1993. The middle voice. Typological studies in language, 23. Amsterdam: Benjamins. Kiefer, Ferenc. 1990–91. Noun incorporation in Hungarian. Acta Linguistica Hungarica 40.149–177. Kiparsky, Paul. 1992. Analogy. In International encyclopedia of linguistics, ed. by William Bright, 56–61. Oxford: Oxford University Press. Koptjevskaja-Tamm, Maria. 1996. Possessive NPs in Maltese: alienability, iconicity and grammaticalization. Rivista di Linguistica, 8.245–274. Kroeber, Alfred L. 1910. Noun incorporation in American languages. In Verhandlungen der XVI. Internationalen Amerikanisten-Kongress, ed. by F Heger, 569–576. Wien and Leipzig: A. Hartleben. ———. 1911. Incorporation as a linguistic process. American Anthropologist, 13.577–584. Kuryłowicz, Jerzy. 1965. The evolution of grammatical categories. Diogenes, 51.51–71. Kusters, Wouter and Muysken, Pieter. 2001. The complexities of arguing about complexity. Linguistic Typology, 5.182–185. König, Werner. 1978. dtv-Atlas zur deutschen Sprache: Tafeln und Texte. München: dtv. Labov, William. 1994. Principles of linguistic change. Vol. 20: Language in society. Oxford: Blackwell. Lambrecht, Knud. 1984. Formulaicity, frame semantics and pragmatics in German binomial expressions. Language, 60.753–796. Langacker, Ronald W. 1977. Syntactic reanalysis. In Mechanisms of syntactic change, ed. by Charles N. Li, 57–139. Austin, TX: University of Texas Press. ———. 1987. Foundations of Cognitive Grammar. Vol. 1. Stanford: Stanford University Press. ———. 1991. Concept, image, and symbol: the cognitive basis of grammar. Cognitive linguistics research, 1. Berlin; New York: Mouton de Gruyter. ———. 1999. Grammar and conceptualization. Cognitive linguistics research, 14. Berlin: Mouton de Gruyter. ———. 2000. A dynamic usage-based model. In Usage-based models of language, ed. by Michael Barlow and Suzanne Kemmer, 1–65. Stanford, Calif.: CSLI Publications. Lass, Roger. 1980. On explaining language change. Cambridge studies in linguistics, 27. Cambridge: Cambridge University Press. ———. 1990. How to do things with junk: exaptation in linguistic change. Journal of Linguistics, 26.79–102. Launey, Michel. 1999. Compound nouns vs. noun incorporation in Classical Nahuatl. Sprachtypologie und Universalienforschung, 52.347–364. Laury, Ritva. 1997. Demonstratives in interaction: the emergence of a definite article in Finnish. Studies in discourse and grammar, 7. Amsterdam: Benjamins.
Lehmann, Christian. 1982. Thoughts on grammaticalization: a programmatic sketch. Vol. 1: Arbeiten des Kölner Universalien-Projekts 48. ———. 1985. Grammaticalization: synchronic variation and diachronic change. Lingua e Stile, 20.203–218. Lewes, George Henry. 1874. Problems of life and mind. London: Trübner & Co. Lewis, David. 1973. Counterfactuals. Oxford: Blackwell. Lienhard, Siegfried. 1961. Tempusgebrauch und Aktionsartenbildung in der modernen Hindi. Stockholm Oriental studies, 1. Stockholm: Almqvist & Wiksell. Lindblom, Björn, MacNeilage, Peter and Studdert-Kennedy, Michael. 1984. Self-organizing processes and the explanation of phonological universals. In Explanations for language universals, ed. by Brian Butterworth, Bernard Comrie and Östen Dahl, 181–204. Berlin: Mouton. Lindsay, Peter H. and Norman, Donald A. 1977. Human information processing: an introduction to psychology: international edition. New York: Academic Press. Lindstedt, Jouko. 2000. The perfect — aspectual, temporal and evidential. In Tense and aspect in the languages of Europe, ed. by Östen Dahl, 365–384. Berlin: Mouton de Gruyter. Longobardi, Giuseppe. 1994. On the typological unity of Indoeuropean and Semitic genitive case. In Studies in Afroasiatic grammar: Papers from the Second Conference on Afroasiatic Languages, Sophia Antopolis, ed. by Catherine Lecarme, 179–214. The Hague: Holland Academic Graphics. Lopes, Aurise Brandao and Parker, Steve. 1999. Aspects of Yuhup phonology. International Journal of American Linguistics, 65.324–342. Lüdtke, Helmut. 1980. Auf dem Wege zu einer Theorie des Sprachwandels. In Kommunikationstheoretische Grundlagen des Sprachwandels, ed. by Helmut Lüdtke, 182–252. Berlin: de Gruyter. Löfström, Jonas. 1988. Repliker utan gränser: till studiet av syntaktisk struktur i samtal. Göteborg: University of Göteborg. Mac Eoin, Gearóid. 1993. Irish. In The Celtic languages, ed. by Martin John Ball and James Fife, 101–144. London: Routledge. Macdonell, Arthur Anthony. 1916. A Vedic grammar for students: including a chapter on syntax and three appendixes: list of verbs, metre, accent. London: Oxford University Press. MacWhinney, Brian. 2001. Emergentist approaches to language. In Frequency and the emergence of linguistic structure, ed. by Joan Bybee and Paul J. Hopper, 449–470. Amsterdam: Benjamins. ———. 2002. Language emergence. In An integrated view of language development — Papers in honor of Henning Wode, ed. by Petra Burmeister, Thorsten Piske and Andreas Rohde, 17–42. Trier: Wissenschaftlicher Verlag. Malherbe, Michel and Rosenberg, Serge. 1996. Les langages de l’humanité: une encyclopédie des 3000 langues parlées dans le monde. Bouquins. Paris: Laffont. Marcus, Gary F. and Fisher, Simon E. 2003. FOXP2 in focus: what can genes tell us about speech and language? Trends in Cognitive Sciences, 7.257–262. Matisoff, James A. 1991. Areal and universal dimensions of grammatization in Lahu. In Approaches to grammaticalization, ed. by Elizabeth Traugott and Bernd Heine, 383–453. Amsterdam: Benjamins. Matthews, Peter H. 1991. Morphology. Cambridge: Cambridge University Press. Mayerthaler, Willi. 1980. Morphologische Natürlichkeit. Linguistische Forschungen, 28. Wiesbaden: Athenaion. ———. 1987. System-independent morphological naturalness. In Leitmotifs in natural morphology, ed. by Wolfgang U. Dressler. Amsterdam: Benjamins. McCarthy, John J. 1981. A prosodic theory of nonconcatenative morphology. Linguistic Inquiry, 12.373–418.
———. 1994. Nonconcatenative morphology. In The Encyclopedia of language and linguistics, ed. by R. E. Asher, 2598–2600. Oxford: Pergamon. McCawley, James D. 1968. Review of Th. Sebeok [ed.], Current trends in linguistics, Vol. 3: Theoretical foundations. Language, 44.556–593. McLaughlin, Brian P. 1999. Emergentism. In The MIT encyclopedia of the cognitive sciences, ed. by Robert A. Wilson, Frank C. Keil and Massachusetts Institute of Technology, 267–269. Cambridge, Mass.: MIT Press. McLendon, Sally. 1975. A grammar of Eastern Pomo. University of California Publications in Linguistics, 74. Berkeley. McWhorter, John H. 1998. Identifying the creole prototype: Vindicating a typological class. Language, 74.788–817. ———. 2001a. The world’s simplest grammars are creole grammars. Linguistic Typology, 5.125–166. ———. 2001b. What people ask David Gil and why: Rejoinder to the replies. Linguistic Typology, 5.388–412. Meillet, Antoine. 1912. L’évolution des formes grammaticales. Scientia (Rivista di Scienza), 12.6. ———. 1921. Linguistique historique et linguistique générale. Collection linguistique (Paris). Paris. Merlan, Francesca. 1976. Noun incorporation and discourse reference in Modern Nahuatl. International Journal of American Linguistics, 42.177–191. Miner, K. L. 1986. Noun stripping and loose incorporation in Zuni. International Journal of American Linguistics, 52.242–254. Mithun, Marianne. 1984. The evolution of noun incorporation. Language, 60.847–894. ———. 1997. Lexical affixes and morphological typology. In Essays on language function and language type dedicated to T. Givón, ed. by Joan Bybee, John Haiman and Sandra A. Thompson, 357–371. Amsterdam: Benjamins. ———. 1998. The sequencing of grammaticization effects: a twist from North America. In Historical Linguistics 1997, ed. by Monika S. Schmid, Jennifer R. Austin and Dieter Stein, 291–314. Muravyova, Irina A. 1998. Chukchee (Paleo-Siberian). In Handbook of morphology, ed. by Andrew Spencer and Arnold Zwicky, 521–538. Oxford: Blackwell. Nedergaard-Thomsen, Ole. 1992. Unit accentuation as an expression device for predicate formation: the case of syntactic noun incorporation in Danish. In Layered structure and reference in a functional perspective: Papers from the Functional Grammar Conference in Copenhagen 1990, ed. by Michael Fortescue, Peter Harder and Lars Kristoffersen, 173–229. Amsterdam: Benjamins. Newmeyer, Frederick J. 1998. Language form and language function. Language, speech, and communication. Cambridge, Mass.: MIT Press. Nichols, Johanna. 1988. On alienable and inalienable possession. In In Honor of Mary Haas: from the Haas Festival Conference on Native American Linguistics, ed. by W. Shipley, 557–609. Berlin: Mouton de Gruyter. ———. 1992. Linguistic diversity in space and time. Chicago: University of Chicago Press. Nilsson, Birgit. 1985. Case marking semantics in Turkish. Stockholm: Dept. of Linguistics, Stockholm University. Nordström, August. 1925. Luleåkultur. Luleå. Oates, Lynette F. 1964. A tentative description of the Gunwinggu language (of western Arnhem land). Oceania linguistic monographs, 10. Sydney: University of Sydney. Okell, John. 1969. A reference grammar of colloquial Burmese. London: Oxford University Press. Ourn, Noeurng and Haiman, John. 2000. Symmetrical compounds in Khmer. Studies in Language, 24.483–514.
Parker, Steve and Weber, David. 1996. Glottalized and aspirated stops in Cuzco Quechua. International Journal of American Linguistics, 62.70–85. Parkinson, C. Northcote. 1957. Parkinson’s law, and other studies in administration. Boston: Houghton Mifflin. Payne, Doris L. n.d. The Maasai (Maa) language. http://darkwing.uoregon.edu/~dlpayne/maasai/ maling.htm. Pinker, Steven. 1999. Words and rules: the ingredients of language. London: Weidenfeld & Nicolson. Plotkin, Henry C. 1994. The nature of knowledge: concerning adaptations, instinct and the evolution of intelligence. London: Allen Lane. Prior, Arthur N. 1957. Time and modality. John Locke lectures, 1955–56. London: Oxford University Press. ———. 1967. Past, present and future. Oxford: Clarendon. ———. 1968. Papers on time and tense. Oxford: Clarendon. Putnam, Hilary. 1975. The meaning of “meaning”. In Language, mind, and knowledge, ed. by Keith Gunderson, 131–193. Minneapolis: University of Minnesota Press. Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey and Svartvik, Jan. 1985. A comprehensive grammar of the English language. London and New York: Longman. Ramat, Paolo. 1992. Thoughts on degrammaticalization. Linguistics, 30.549–560. Rankin, Robert, Boyle, John, Graczyk, Randolph and Koontz, John. 2002. Synchronic and diachronic perspective on ‘word’ in Siouan. In Word: a cross-linguistic typology, ed. by Robert M. W. Dixon and Alexandra Y. Aikhenvald, 180–204. Cambridge: Cambridge University Press. Reichenbach, Hans. 1947. Elements of symbolic logic. New York: Macmillan. Reynolds, Craig. 1995. Boids. http://www.red3d.com/cwr/boids/. Roberts, Ian G. and Roussou, Anna. 1999. A formal approach to “grammaticalization”. Linguistics, 37.1011–1041. Rohlfs, Gerhard. 1954. Historische Grammatik der italienischen Sprache und ihrer Mundarten. Bibliotheca Romanica. Series I, 7. Bern: Francke. Ronneberger-Sibold, Elke. 1980. Sprachverwendung, Sprachsystem: Ökonomie und Wandel. Linguistische Arbeiten, 87. Tübingen: Niemeyer. ———. 1987. A performance model for a natural theory of linguistic change. In Papers from the 7th International Conference on Historical Linguistics, ed. by Anna Giacalone Ramat, Onofrio Carruba and Giuliano Bernini, 517–544. Amsterdam: Benjamins. Rozental’, D. E. 1968. Prakticˇeskaja stilistika russkogo jazyka. Moskva: Vysshaja shkola. Sadock, Jerrold M. 1991. Autolexical syntax: a theory of parallel grammatical representations. Studies in contemporary linguistics. Chicago: University of Chicago Press. Sapir, Edward. 1911. The problem of noun incorporation in American languages. American Anthropologist, 13.250–282. Schlyter, Suzanne and Sandberg, Vesta. 1994. The marking of future time reference in Spanish. Future time reference in European languages III: EUROTYP Working Papers VI.6. Senghas, Ann. 1995. Children’s contribution to the birth of Nicaraguan Sign Language. Ph.D. thesis MIT. Seuren, Pieter A. M. 1996. Semantic syntax. Oxford: Blackwell. ———. 2001. Simple and transparent. Linguistic Typology, 5.176–180. Shalizi, Cosma Rohilla. 2001. Causal architecture, complexity and self-organization in time series and cellular automata. Ph.D. thesis The University of Wisconsin. Shannon, Claude E. 1949. The mathematical theory of communication [1948]. The Bell System Technical Journal, 27.379–423, 623–656.
Shi, Yuzhi. 2002. The establishment of modern Chinese grammar: the formation of the resultative construction and its effects. Vol. 89: Studies in Language Companion Series. Amsterdam: Benjamins. Slater, Peter James Bramwell. 1999. Essentials of animal behaviour. Cambridge: Cambridge University Press. Slobin, Dan. 1977. Language change in childhood and in history. In Language learning and thought, ed. by John Macnamara, 185–214. New York: Academic Press. ———. forthcoming. From ontogenesis to phylogenesis: what can child language tell us about language evolution? In Biology and knowledge revisited: from neurogenesis to psychogenesis, ed. by J. Langer, S. T. Parker and C. Milbrath. Mahwah, NJ.: Lawrence Erlbaum. Southern Ute Tribe. 1980. Ute reference grammar. Ignacio, Colo.: Ute Press Southern Ute Tribe. Spencer, Andrew. Ms. Does English have productive compounding? Squartini, Mario and Bertinetto, Pier Marco. 2000. The simple and compound past in Romance languages. In Tense and aspect in the languages of Europe, ed. by Östen Dahl, 403–440. Berlin: Mouton de Gruyter. Stowell, Timothy. 1981. Origins of phrase structure. Ph.D. thesis MIT. Teleman, Ulf, Hellberg, Staffan, Andersson, Erik, Christensen, Lisa and Svenska akademien. 1999. Svenska akademiens grammatik. Stockholm: Norstedts ordbok. Thomason, Sarah Grey and Kaufman, Terrence. 1988. Language contact, creolization and genetic linguistics. Berkeley: University of California Press. Thompson, Sandra A. 1988. A discourse approach to the cross-linguistic category “adjective”. In Explaining language universals, ed. by John Hawkins, 167–185. Oxford: Blackwell. Thurneysen, Rudolf. 1909. Handbuch des Alt-Irischen: Grammatik, Texte und Wörterbuch. Heidelberg: Winter. Timberlake, Alan. 1977. Reanalysis and actualization in syntactic change. In Mechanisms of syntactic change, ed. by Charles N. Li, 141–177. Austin, TX: University of Texas Press. Tinbergen, Niko. 1953. Social behaviour in animals. London: Butler and Tanner. Tomasello, Michael. 1998. The return of constructions. Journal of Child Language, 25.431–491. ———. 2000a. The item-based nature of children’s early syntactic development. Trends in Cognitive Sciences, 4.156–163. ———. 2000b. Do young children have adult syntactic competence? Cognition, 74.209–253. ———. 2000c. First steps toward a usage-based theory of language acquisition. Cognitive Linguistics, 11.61–82. Traugott, Elizabeth. 1994. Grammaticalization and lexicalization. In The encyclopedia of language and linguistics, ed. by R E. Asher, 1481–1486. Oxford: Pergamon. Trudgill, Peter. 1983. On dialect: social and geographical perspectives. Oxford: Blackwell. ———. 1999. Language contact and the function of linguistic gender. Poznan studies in contemporary linguistics, 33.133–152. ———. 2001. Contact and simplification: Historical baggage and directionality in linguistic change. Linguistic Typology, 5.372–375. Uriagereka, Juan. 1998. Rhyme and reason: an introduction to minimalist syntax. Cambridge, Mass.: MIT Press. Velazquez Castillo, Maura. 1996. The grammar of possession: inalienability, incorporation, and possessor ascension in Guaraní. Studies in Language Companion Series, 33. Amsterdam: Benjamins. Vennemann, Theo. 1983. Causality in language change: Theories of linguistic preferences as a basis for linguistic explanations. Folia Linguistica Historica, 6.5–26. Veselinova, Ljuba. 2003. Suppletion in verb paradigms: bits and pieces of a puzzle. Ph.D. thesis. Dept. of Linguistics, Stockholm University.
Volodin, Aleksandr Pavlovic. 1976. Itel’menskij jazyk. Leningrad: Nauka. Wälchli, Bernhard. 2003. Co-compounds and natural coordination. Ph.D. thesis. Dept. of Linguistics, Stockholm University. Walsh, Thomas and Parker, Frank. 1983. The duration of morphemic and non-morphemic /s/ in English. Journal of Phonetics, 11.201–206. Watkins, T. Arwyn. 1993. Welsh. In The Celtic languages, ed. by Martin John Ball and James Fife, 289–348. London: Routledge. Wessén, Elias. 1968. Svensk språkhistoria. I: Ljudlära och ordböjningslära. Stockholm: Almqvist & Wiksell. Wierzbicka, Anna. 1996. Semantics: primes and universals. Oxford: Oxford University Press. Willerman, Raquel. 1994. The phonetics of pronouns: articulatory bases of markedness. Ph.D. thesis. The University of Texas at Austin. Williams, Edwin. 1997. Lexical and syntactic complex predicates. In Complex predicates, ed. by Joan Bresnan, Peter Sells and Alex Alsina i Keith, 13–28. Stanford, Calif.: CSLI Publications. Wurzel, Wolfgang Ullrich. 1984. Flexionsmorphologie und Natürlichkeit: ein Beitrag zur morphologischen Theoriebildung. Studia grammatica, 21. Berlin: Akademie-Vlg. ———. 1987. System-dependent morphological naturalness in inflection. In Leitmotifs in natural morphology, ed. by Wolfgang U. Dressler, 59–98. Amsterdam: Benjamins. ———. 1989. Inflectional morphology and naturalness. Studies in natural language and linguistic theory, 9. Dordrecht: Kluwer. ———. 1994. Natural Morphology. In The Encyclopedia of language and linguistics, ed. by R. E. Asher, 2590–2598. Oxford: Pergamon. ———. 2001. Creoles, complexity, and linguistic change. Linguistic Typology, 5.378–387. Zipf, George Kingsley. 1935. The psycho-biology of language: an introduction to dynamic philology. Boston: Houghton Mifflin. Zubin, David M. and Köpcke, Klaus-Michael. 1986. Gender and folk taxonomy: the indexical relation between grammatical and lexical categorization. In Noun classes and categorization: Proceedings of a Symposium on Categorization and Noun Classification, ed. by Colette Craig, 139–180. Amsterdam: Benjamins.
List of abbreviations used in glosses
1       first person
2       second person
3       third person
abs     absolutive
acc     accusative
adess   adessive
all     allative
an      animate
cl:n    noun class n (n a Roman numeral)
coord   coordinator
cs      construct state
dat     dative
def     definite
det     determiner
du      dual
erg     ergative
f       feminine
fut     future
gen     genitive
ind     indicative
indf    indefinite
inf     infinitive
ins     instrumental
intr    intransitive
loc     locative
m       masculine
n       neuter
n-      non- (e.g. nsg nonsingular, npst nonpast)
neg     negation, negative
nom     nominative
par     “participial modality” (Inuktitut)
part    partitive
pl      plural
prs     present
pst     past
ptcp    participle
q       question particle/marker
rat     rational (gender)
rel     relative
rp      relational preverb (Rama)
sbj     subject
sg      singular
sup     supine
tns     tense marker
wk      weak (adjective ending in Germanic)
The above list of abbreviations and the interlinear morpheme-by-morpheme glossing in the examples in the book follow the Leipzig Glossing Rules (http://www.eva.mpg.de/lingua/files/morpheme.html).
Language index
Information given for each language: name used in this book; Ethnologue/EMELD code; genetic affiliation; main location and, if relevant, the time period during which the language was used.
A
Afrikaans [AFK] (IE West Germanic; South Africa) 275, 292, 305
Afro-Asiatic, phylum 201, 238, 239, 280, 281, 283, 300
Akkadian [XAKK] (AF Semitic; Mesopotamia (3rd–1st mill. B. C. E.)) 282, 283
Arabic, Modern Standard [ABV] (AF Semitic; Middle East) 209, 239
Arabic, Moroccan [ARY] (AF Semitic; Morocco) 83
Arjeplog [~SWD] (IE North Germanic; Sweden (Norrbotten County, province of Lappland)) 236, 237
Armenian, branch of IE 83, 135, 201
Armenian, Classical [XARC] (IE Armenian; Armenia (400 C. E.–19th cent.)) 115
Armenian, Modern [ARM] (IE Armenian; Armenia) 115
Asturian [AUB] (IE Italic; Spain (Asturias)) 265

B
Beja (Bedawi) [BEI] (AF North Cushitic; Sudan) 282
Bella Coola [BEL] (Salishan; Canada (British Columbia)) 222, 223
Breton [BRT] (IE Celtic; France) 231
Bulgarian [BLG] (IE Slavic; Bulgaria) 166
Burmese [BMS] (ST Lolo-Burmese; Myanmar) 224, 225, 229

C
Catalan [CLN] (IE Italic; Spain) 155, 239, 265
Celtic, branch of IE 230, 231, 237, 239, 273
Chinese, Mandarin [CHN] (ST Sinitic; China) 124, 125, 137, 139, 141, 145, 245
Chukchi [CKT] (Chukotko-Kamchatkan; Russia) 229, 230
Chukotko-Kamchatkan, phylum 230, 238
Cornish [CRN] (IE Celtic; Great Britain (Cornwall)) 231
Cushitic 282

D
Dalecarlian [DLC] (IE North Germanic; Sweden (Dalarna)) 114, 234, 255, 275
Danish [DNS] (IE North Germanic; Denmark) 14, 124, 133, 165, 166, 207, 228, 258, 275
Dutch [DUT] (IE West Germanic; Netherlands) 275, 292
E
Egyptian, Middle [@XEGY] (AF Semitic; Egypt (2000–1700 B. C. E.)) 240
Egyptian, Old [@XEGY] (AF Semitic; Egypt (3rd millennium B. C. E.)) 240, 249
Elfdalian [~DLC] (IE North Germanic; Dalecarlian as spoken in Älvdalen, Sweden) 83, 220, 221, 233, 234, 237, 245, 293
English [ENG] (IE West Germanic; Great Britain) 5, 1, 2, 11, 14, 15, 38, 46, 51–55, 80, 81, 83, 86, 87, 90, 94, 95, 97–102, 105–107, 109, 114, 120–125, 127, 129, 130, 133, 134, 137, 139, 140, 144, 145, 150, 152, 153, 156, 159, 162, 164, 167–169, 173, 175, 177, 178, 182, 184, 187–189, 191, 193, 195, 196–199, 201, 204, 211, 216, 220, 223, 224, 227, 228, 237–239, 242, 244–246, 248, 250, 251, 258, 259, 275, 279, 284, 292, 295, 299, 303, 305–307, 309, 310
English, American [~ENG] (IE West Germanic; English as spoken in the US) 169, 189
English, British [~ENG] (IE West Germanic; English as spoken in Britain) 154, 294
Estonian [EST] (UR Baltic Finnic; Estonia) 123, 208

F
Faroese [FAE] (IE North Germanic; Faroe Islands) 275
Finnish [FIN] (UR Baltic Finnic; Finland) 104, 123, 163, 173, 207, 208, 209, 244, 245
French [FRN] (IE Italic; France) 53, 83, 101, 102, 120, 130, 131, 133, 140, 159, 160, 170, 173, 177, 178, 183–185, 187, 188, 190, 201, 203, 204, 221, 224, 232, 240, 245, 265, 272, 273, 293
French, Old [@FRN] (IE Italic; France (840–1400 C. E.)) 167
Frisian, North [FRR] (IE West Germanic; Germany (Schleswig-Holstein)) 211, 252, 253
Frysk (West Frisian) [FRI] (IE West Germanic; Netherlands (Friesland)) 253

G
Gaelic, Scots [GLS] (IE Celtic; Great Britain (Scotland)) 83
Galician [GLN] (IE Italic; Spain (Galicia)) 265
Gascon [GSC] (IE Italic; France (Gascogne)) 266
German [GER] (IE West Germanic; Germany) 38, 39, 133, 140, 156, 157, 166, 175, 197, 201, 224, 226, 228, 243, 246, 247, 252, 253, 269, 275–280, 306, 310
German, Low [GEP] (IE West Germanic; Germany (northern)) 275, 279
German, Standard [~GER] (IE West Germanic; Germany) 114, 253
Germanic, branch of IE 2, 38, 39, 93, 107, 114, 124, 130, 144, 159, 201, 208, 224, 227, 228, 234, 239, 243, 247, 252, 269, 275, 277, 279, 280, 283, 292, 295, 299, 306, 310, 311
Gothic [GOF] (IE East Germanic; Eastern Europe (1st mill. C. E.)) 275, 277–279
Greek (Modern) [GRK] (IE Greek; Greece) 71, 114, 176, 223
Greek, Classical [GKO] (IE Greek; Greece (1st mill. B. C. E.)) 192
Greenlandic, West [~ESG] (Eskimo-Aleut; Greenland) 217
Guaraní [GUG] (TU Tupí-Guaraní; Paraguay) 150, 214
Gun-djeihmi [~GUP] (AU Gunwingguan; variety of Mayali (Gunwinggu); Australia (Arnhem Land)) 216

H
Hawaiian [HWI] (AO Polynesian; USA (Hawaii)) 135, 303
Hebrew, Modern [HBR] (AF Semitic; Israel) 239
Hungarian [HNG] (UR Ugric; Hungary) 218–220, 258

I
Icelandic [ICE] (IE North Germanic; Iceland) 114, 275, 303
Indo-European, phylum 106, 113, 133, 144, 184, 192, 193, 201, 268, 272–275, 284, 288
Inuktitut, Eastern Canadian [ESB] (Eskimo-Aleut; Canada (eastern)) 222
Irish, Modern (Irish Gaelic) [GLI] (IE Celtic; Ireland) 231
Italian [ITN] (IE Italic; Italy) 130, 155, 232, 266, 293

J
Japanese [JPN] (isolate?; Japan) 128, 245

K
Kammu [KJG] (AA Northern Mon-Khmer; Laos) 1
Khmer [KMR] (AA Eastern Mon-Khmer; Cambodia) 109, 243
Komi [KPV] (UR Permic; Russia (Komi Republic)) 238
Koryak [KPY] (Chukotko-Kamchatkan; Russia (Kamchatka)) 230
Kunwinjku [~GUP] (AU Gunwingguan; variety of Mayali (Gunwinggu); Australia (Arnhem Land)) 216

L
Lakota [LKT] (Siouan; USA (Nebraska and other states)) 228, 229
Latin [LTN] (IE Italic; Italy (600 B. C. E.–)) 114, 127, 156, 159, 184, 185, 187–190, 193, 194, 196–198, 203, 206, 223, 265, 268, 269, 272, 279, 293
Latin, Medieval [@LTN] (IE Italic; Latin as used in the Middle Ages (400–1400)) 112
Lezgian [LEZ] (North Caucasian; Russia (Dagestan)) 83, 202

M
Maasai [MET] (NS Eastern Nilotic; Kenya) 192, 207
Maltese [MLS] (AF Semitic; Malta) 149, 150, 155, 240
Manx [MJD] (IE Celtic; extinct; Great Britain (Manx)) 231
Maori [MBF] (AO Polynesian; New Zealand) 135
Mayali (Gunwinggu) [GUP] (AU Gunwingguan) 216
Micronesian, branch of AO 174
Migaama [MMY] (AF East Chadic; Chad) 282
Mokilese [MNO] (AO Micronesian; Micronesia) 214

N
Nadëb [MBJ] (Maku; Brazil) 248
Nahuatl, Classical [NCI] (UA Aztecan; Mexico (–16th cent. C. E.)) 213
Nahuatl, Huautla [NAI] (UA Aztecan; Mexico (Hidalgo)) 215
Navajo [NAV] (ND Athapaskan; USA (south-western)) 104, 150, 153
Nicaraguan Sign Language [NCS] (sign language; Nicaragua) 288, 294, 310
Norwegian [NRR & NRN] (IE North Germanic; Norway) 133, 165, 166, 207, 275, 291
Nunggubuyu [NUY] (AU Gunwingguan; Australia (Arnhem Land)) 203 O Öömrang [~FRR] (IE West Germanic; North Frisian as spoken on Amrum, Germany) 211, 253 Ossmol [~DLC] (IE North Germanic; Dalecarlian as spoken in Orsa, Sweden) 237 P Palula [PHL] (IE Indo-Aryan; Pakistan (Chitral Valley)) 207 Persian (Farsi) [PES] (IE Iranian; Iran) 83, 238 Polish [PQL] (IE West Slavic; Poland) 165 Pomo, Eastern [PEB] (Hokan; extinct; USA (California)) 149 Portuguese [POR] (IE Italic; Portugal) 165, 196, 203, 266 Proto-Afro-Asiatic (AA; hypothetical proto-language) 274 Proto-Germanic [——] (IE Germanic; hypothetical proto-language) 38, 39, 275, 277, 278, 279 Proto-Indo-European [——] (IE; hypothetical proto-language) 268 Provençal [PRV] (IE Italic; France (Provence)) 266 Punjabi [PNJ] (IE Indo-Aryan; India (Punjab)) 83 Q Quechua, Cuzco [QUZ] (Quechuan; Peru) 207 R Rama [RMA] (Chibchan; Nicaragua) 248 Romance (branch of IE (=Italic excluding Latin)) 92, 129, 155,
199, 201, 222, 229, 235, 237, 262, 263, 265–268, 272, 283, 310 Romanian [RUM] (IE Italic; Romania) 266 Romansh [RHE] (IE Italic; Switzerland) 267 Russian [RUS] (IE East Slavic; Russia) 54, 66, 81–86, 91, 104, 113, 128, 131, 140, 141, 143, 144, 151, 152, 156, 165, 166, 187, 189, 192, 193, 196, 198, 215, 224, 245, 247, 273, 307 Russian, Old [@RUS] (IE East Slavic; Russia (–17th cent.)) 143 S Sanskrit [SKT] (IE Indo-Aryan; India (500 B. C. E.–)) 228, 242, 244 Sanskrit, Vedic [@SKT] (IE Indo-Aryan; early forms of Sanskrit (2nd mill. B. C. E.)) 241, 242, 244, 249 Sardinian [SDC & SDN & SRD & SRO] (IE Italic; Italy (Sardinia)) 267 Scandinavian (North Germanic), branch of IE Germanic 124, 130, 133, 134, 144, 147, 158, 232, 237, 239, 246, 268, 273, 283, 284 Scandinavian, Common [——] (IE North Germanic; hypothetical proto-language) 273 Scandinavian, Standard Central [DNS & NRR & SWD] (IE North Germanic; cover term for standard varieties of Danish, Norwegian Bokmål, and Swedish) 202, 275 Semitic, branch of AA 209, 238, 241, 280, 282, 283, 302 Sicilian [SCN] (IE Italic; Italy (Sicily)) 267 Sirionó [SRQ] (TU Tupí-Guaraní; Bolivia (Beni)) 127, 150, 246, 254 Slavic 114, 166, 192, 202, 238, 247, 273
Spanish [SPN] (IE Italic; Spain) 13, 102, 118, 127, 128, 130, 133, 145, 156, 159, 165, 168, 169, 202, 203, 205, 206, 231, 232, 245, 267 Swedish [SWD] (IE North Germanic; Sweden) 5, 14, 15, 38, 39, 83, 84, 88, 104, 124, 126, 129, 133, 134, 141, 143–145, 147, 150, 152–156, 165, 166, 168, 178, 187, 189, 195, 198, 202, 207, 208, 209, 219, 220, 223, 224, 226–228, 233–238, 244, 246–249, 252, 254, 256, 257, 269, 275, 277–280, 291, 293, 295, 306, 308, 310 Swedish, Standard Finland [~SWD] (IE North Germanic; Standard Swedish as spoken in Finland) 226 T Tamil [TCV] (DR Southern; India (Tamil Nadu)) 242, 243 Tigrinya [TGN] (AF Semitic; Ethiopia) 282 Tiwa, Southern [TIX] (Kiowa-Tanoan; USA (New Mexico)) 213, 217, 238 Tok Pisin [PDG] (English-based creole; Papua New Guinea) 106, 123, 296, 297, 299 Tsimshian, phylum 222 Tupí-Guaraní, phylum 150 Turkic, branch of AL 83 Turkish [TRK] (AL Southern Turkic; Turkey) 110, 135, 145, 196, 207, 219, 220, 258
Tuyuca [TUE] (TC Eastern; Colombia) 207 U Udi [UDI] (North Caucasian; Azerbaijan) 202 Ugric, branch of UR 247 Ukrainian [UKR] (IE East Slavic; Ukraine) 273 Ute, Southern [UTE] (UA Northern; USA (Utah, Colorado)) 232, 237 V Venetian [VEC] (IE Italic; Italy) 268 W Walloon [FRN] (IE Italic; Belgium) 268 Welsh [WLS] (IE Celtic; Great Britain (Wales)) 83, 231, 303 Y Yoruba [YOR] (NC Defoid; Nigeria) 83, 209 Yuhup [YAB] (Maku; Brazil (Amazonas)) 207 Z Zuñi [ZUN] (isolate; USA (New Mexico)) 218 Züritüütsch [~GSW] (IE West Germanic; Alemannic as spoken in Zurich, Switzerland) 252, 253
Phyla abbreviations
AA Austro-Asiatic
AF Afro-Asiatic
AL Altaic
AO Austronesian
AU Australian
DR Dravidian
IE Indo-European
NC Niger-Congo
ND Na-Dene
NS Nilo-Saharan
PE Penutian
ST Sino-Tibetan
TU Tupí
UA Uto-Aztecan
UR Uralic

Symbols used with codes
~ is a variety of/is closely related to
@ is a historical stage of/is a historical predecessor of
& union of
Author index
A Abraham, Werner 251 Adams, Douglas 8 Aikhenvald, Alexandra Y. 213 Allen, Barbara J. 215 Allen, Shanley E. M. 220 Allwood, Jens 80, 246 Anderson, Stephen R. 184, 185 Andersson, Anders-Börje 144, 153, 200 Ansaldo, Umberto 123, 124 Anward, Jan 254–256 Aristotle 116 Arthur, W. Brian 95, 192 Asher, R. E. 111, 240, 241 Augst, Gerhard 269, 271, 273 B Ball, Philip 32 Barlow, Michael 36, 96 Barnes, Janet 205 Bertinetto, Pier Marco 129 Bhat, D. N. Shankara 236 Bichakjian, B. H. 197 Bickerton, Derek 109, 287 Bierwisch, Manfred 124 Bittner, Andreas 269, 273, 301 Blake, Barry J. 284 Bloomfield, Leonard 15, 266–268 Boas, Franz 81, 226, 227 Booij, Geert 195 Borer, Hagit 237 Börjars, Kersti 195 Broderick, George 229 Brown, Penelope 88 Bybee, Joan L. 36, 37, 88, 114, 159, 161, 164, 188, 230, 273, 275
C Camazine, Scott 31, 32 Campbell, Lyle 171, 172 Caubet, Dominique 83 Chafe, Wallace 233, 257 Chomsky, Noam 49, 53, 92, 249, 255 Clark, Herbert H. 87 Clifford, James 36 Comrie, Bernard 83, 93, 107, 109, 110, 114 Condillac, Étienne 106 Corbett, Greville 132, 197, 198, 199 Crago, M. B. 34 Craig, Colette 30, 245 Croft, William 62, 68, 78, 87, 228, 236, 285 Curry, Haskell B. 49, 65 D Dahl, Östen ix, 1, 83, 84, 123, 128, 134, 152, 153, 164, 178, 193, 198, 201, 211, 238, 266, 275, 282, 293, 294, 297 Dawkins, Richard 57, 68 de Reuse, W. J. 210, 211 DeGraff, Michel 39, 42, 279, 280 Deligianni, Efrosini 228, 236 Deloria, Ella 226, 227 Dennett, Daniel C. 78, 169 Diakonoff, Igor M. 274 Dixon, Robert M. W. 213, 235 Dressler, Wolfgang U. 115, 116, 278 Dyen, Isidore 262 E Ebert, Karen 209, 250, 251
Edelman, Gerald M. 94 Ellegård, Alvar ix, 134 Enger, Hans-Olav ix Eriksson, Gunnar ix Evans, Nick 214 F Fält, Gunnar 229 Field, Fredric W. 127 Fillmore, Charles J. 50 Firth, John Rupert 205 Fischer-Jørgensen, Eli 205 Fisher, Simon E. 34 Flake, Gary William 25, 77, 78 Fraurud, Kari 201 Fredkin, Edward 5 G Gabelentz, Georg von der 134, 135, 161, 279 Gell-Mann, Murray 24 George, Ken 28, 50, 229 Geurts, Bart 135, 136, 141 Giddens, Anthony 36 Gildea, Spike 171 Gillies, William 229 Givón, Talmy 106, 109, 123, 159, 178, 290 Goddard, Cliff 139 Goertzel, Ben 22 Goldberg, Adele E. 50 Goldberg, Richard 50 Goldsmith, John 205 Gopnik, M. 34 Gould, Stephen Jay 169 Greenberg, Joseph H. 85, 117, 132, 264, 274, 275 Grice, H. P. 8, 75 Gropen, Jess 97 Guy, Gregory 167 H Hagège, Claude 279 Haiman, John 88, 241 Håkansson, Gisela 113
Hale, Ken 245 Halle, Morris 297 Harder, Peter 77 Harrington, Jonathan 285 Harris, Alice C. 171, 172 Harris, Zellig S. 92 Harrison, Sheldon P. 212 Haspelmath, Martin 83, 99, 100, 136, 140, 141, 143, 146, 170–174 Hauser, Marc 75 Hayward, Richard J. 274 Heath, Jeffrey 202 Heine, Bernd 116, 125, 176, 177 Hemon, Roparz 229 Hempen, Ute 269 Hermerén, Ingrid 128 Herslund, Michael 165 Heylighen, Francis 65 Hockett, Charles F. 183 Holm, Gösta 239 Hopper, Paul J. 36, 37, 46, 119, 125, 135–138, 140, 141, 145, 161–163, 170–172, 175, 240 Huddleston, Rodney D. 248 Hudson, Richard 176 Hull, David L. 68 Humboldt, Wilhelm von 106 J Jackson, Jean E. 60 Jakobson, Roman 80, 81, 193 Jobin, Bettina 195 Johanson, Lars 83 Johansson, Christer 284 Jurafsky, Daniel 160 Juvonen, Päivi 161, 162 K Kammerzell, Frank 238 Katz, Jerrold J. 67 Kaufman, Terrence 61, 110, 281 Kay, Paul 50 Keenan, Edward L. 93, 94 Keller, Rudi 9, 11, 32, 77, 135, 143, 251 Kemmer, Suzanne 36, 96, 134
Kiefer, Ferenc 216, 217 Kiparsky, Paul 167, 175 König, Werner 263 Köpcke, Klaus-Michael 201 Koptjevskaja-Tamm, Maria ix, 85, 149, 152, 153, 238 Kroeber, Alfred L. 210 Krull, Diana 206 Kusters, Wouter 39 L Laberge, Suzanne 287 Labov, William 167, 168 Lambrecht, Knud 241 Langacker, Ronald W. 28, 40, 43, 44, 65, 66, 89, 91, 96, 170, 171, 203, 204 Lass, Roger 67, 169 Launey, Michel 211 Laury, Ritva 161 Lehmann, Christian 106, 119, 142, 164, 165, 166 Levinson, Stephen C. 88 Lewes, George Henry 28 Lewis, David 76 Lienhard, Siegfried 83, 292 Liljegren, Henrik 205 Lindblom, Björn 31 Lindsay, Peter H. 93 Lindstedt, Jouko 275 Linell, Per 254–256 Löfström, Jonas 297 Longobardi, Giuseppe 237 Lopes, Aurise Brandao 205 Lüdtke, Helmut 9, 135, 143, 159 M Mac Eoin, Gearóid 229 Macdonell, Arthur Anthony 242 MacWhinney, Brian 33–36 Malherbe, Michel 262 Marcus, Gary F. 34 Markowski, Anne ix, 143 Marler, Peter 75 Matisoff, James A. 170
Matthews, Peter H. 183–185, 194 Mayerthaler, Willi 115–117, 289, 293 McCarthy, John J. 207 McCawley, James D. 85 McLaughlin, Brian P. 28 McLendon, Sally 149 McWhorter, John H. 39, 42, 54, 80, 107–110, 113, 114, 117, 179, 197, 277–280, 293, 294, 296 Meillet, Antoine 119, 135, 138, 192 Merlan, Francesca 213, 214 Miner, K. L. 216, 218 Minugh, David ix, 139 Mithun, Marianne 211–214, 216, 220, 221, 241 Mohanan, K. P. 297 Muravyova, Irina A. 227 Muysken, Pieter 39 N Nedergaard-Thomsen, Ole 218, 256 Newmeyer, Frederick J. 85, 159, 160, 170, 175–177, 233, 285, 287 Nichols, Johanna 149, 154, 198, 239, 278, 293 Nilsson, Birgit 217, 218 Nordström, August 239 Norman, Donald A. 93 O Oates, Lynette F. 214 Okell, John 223 Ourn, Noeurng 241 P Parker, Frank 168 Parker, Steve 205 Parkvall, Mikael ix Payne, Doris L. 190, 205 Pinker, Steven 96, 97, 297, 301 Plotkin, Henry C. 26, 27 Prior, Arthur N. 192, 193 Pullum, Geoffrey K. 248 Putnam, Hilary 41, 58
Q Quirk, Randolph 248, 249 R Ramat, Paolo 146 Rankin, Robert 227 Reichenbach, Hans 123 Reynolds, Craig 30–32 Roberts, Ian G. 177 Rohlfs, Gerhard 154 Ronneberger-Sibold, Elke 186 Rosenberg, Serge 262 Roussou, Anna 177 S Sadock, Jerrold M. 215 Sandberg, Vesta 128 Sankoff, Gillian 287 Sapir, Edward 210 Saussure, Ferdinand de 66, 191 Schleicher, August 57 Schlyter, Suzanne 128 Selfridge, Oliver 93 Senghas, Ann 288 Seuren, Pieter A. M. 80, 81 Shakespeare, William 135 Shalizi, Cosma Rohilla 22, 29 Shannon, Claude E. 5, 7 Shi, Yuzhi 123, 243 Slater, P. J. B. 89 Slobin, Dan 109, 273, 287, 288 Smith, Adam 32, 240, 242, 243 Southern Ute Tribe 230 Spencer, Andrew 225 Squartini, Mario 129 Stowell, Timothy 236 Swadesh, Morris 261, 262, 263 T Teleman, Ulf 176
Thomason, Sarah Grey 61, 110, 281 Thompson, Sandra A. 138, 233, 240 Thurneysen, Rudolf 228 Timberlake, Alan 172 Tinbergen, Niko 75 Tolstoy, Lev 20 Tomasello, Michael 98 Traugott, Elizabeth 46, 119, 125, 135–137, 140, 141, 145, 162, 163, 170–172, 175, 254 Trudgill, Peter 110, 116, 197, 280–283, 292, 293, 295 U Uriagereka, Juan 37, 118 V Velázquez-Castillo, Maura 150, 212 Vennemann, Theo 290 Veselinova, Ljuba 186, 187 Volodin, Aleksandr Pavlovič 228 Vrba, Elizabeth 169 W Wälchli, Bernhard 240, 241, 256 Walsh, Thomas 168 Watkins, T. Arwyn 229 Weber, David 205 Wessén, Elias 133 Wierzbicka, Anna 139 Willerman, Raquel 202 Williams, Edwin 218 Wurzel, Wolfgang Ullrich 115, 116, 273, 278, 279 Z Zipf, George Kingsley 160 Zubin, David M. 201
Subject index
A ablaut 38, 243, 271–277, 292, 293, 295, 299 abstract feature 193, 198, 205 actualization 172, 174 adaptation 25–27, 29, 59, 123, 159, 171, 293, 294 adaptive sound change 159, 160, 170 advanced tongue root (ATR) 207 affixation 110, 166 agreement 53, 54, 95, 107, 114, 117, 125, 134, 167, 186, 189, 193, 197–206, 209, 213, 217, 229–232, 237, 280 algorithmic information content 21 alienability 111, 150, 151, 154, 155, 178, 240, 241, 294 analogy 57, 68, 171, 177 analytic incorporation 220, 258 anaphoric island 257 animacy 45, 46, 72, 119, 134, 135, 163, 187, 189, 199–203, 205, 213, 217, 229 apocope 233, 234 apophony, see ablaut Artificial Life 29 aspect 87, 100, 116, 125, 129, 131, 188, 192, 193, 197, 258, 271, 277, 295 attrition 69, 71, 159, 160, 167, 174, 179 attunement 26–27, 88, 297 automation 91 autosegmental 116, 209 auxiliary pattern 103
B binomial expression 243 bit 7 blocking 99, 128, 188 bondedness 166 borrowing 41, 62, 129, 171, 263, 283, 284 butterfly effect 61 C case marking 45, 94, 114, 117, 125, 129, 176, 258, 280 chaos 20, 23 checksum digit 9, 10, 11, 204 Cheshirization 172 chess 66, 97, 103–105 child language 100, 109, 111, 115 choice point 6, 46, 47, 48, 49, 50 — bound 49, 50, 51 — free 49, 51 choice structure 48, 49, 186 clarity 137, 250, 257 classificatory noun incorporation 216 classifier 128, 201, 246 clitic movement 111, 115 coalescence 167, 186, 187 co-compound 242, 244, 249 comparative 51, 127, 146, 250 competence 41, 65, 68, 98, 287 complexity metric 119 complexity 1, 2, 3, 5, 10, 19–26, 29, 34, 39, 40, 42–46, 51, 52, 66, 93, 107, 109, 111, 112, 118, 119, 133, 134, 157, 171, 176, 182, 184, 187, 189, 199, 200, 201, 202, 205, 259, 263, 278–284, 287, 291, 293, 295, 298
— conceptual 45–46 — evolutionary 104–106 — output 184, 291 — signal 43, 44 — structural 44, 93, 106, 170, 171, 181, 183, 184, 209, 223, 291 — system 43, 44, 45, 123, 157, 160, 183, 184, 191, 291 concerted scales model 166 conditions of use 51, 80–84 conservatism 99, 100 construct state 236–239 Construction Grammar 50 construction 50 contact-induced change 112, 148, 283, 284 content requirement 205 content word 178 contextual 96, 197 coreferentiality 216 cost 9, 11, 39, 43, 44, 59, 79, 81, 88, 89, 162, 168, 292 cost-benefit analysis 9, 11, 39, 59, 79, 89, 168 creole prototype 110, 116, 279 creole 109, 110, 111, 112, 113, 115, 116, 119, 271, 276, 279, 284, 285, 289, 295, 296, 297, 298 cross-linguistic dispensability 54, 55, 82 cultural evolution 57, 297, 298 cultural inertia 38, 59, 60, 79, 97 cyclical theory of grammaticalization 136–142, 292, 293 cyclicity 69, 281 D dative 102, 103, 119, 143, 144, 176, 187, 189, 190, 194, 286 default 13, 14, 26, 61, 119, 124, 231, 238, 303, 304 definite article 70, 84, 118, 128, 131, 143, 147, 151, 152, 153, 162, 177, 202, 203, 220, 232, 233, 235, 238, 240, 251, 268
definiteness 85, 187, 216, 239, 284 degrammaticalization 147 demonstrative pronoun 107, 119, 134, 135, 144, 240 dependent choice 47 derivational morphology 110, 113, 116, 196, 256 difficulty 39, 40, 113 direct object 46, 94, 115, 119, 125, 134, 146, 187, 211, 217, 218, 219, 252, 254, 259 directionality 142, 148, 149, 176 disruptive change 293 distributed realization 186, 187 division of linguistic labour 41 donut category 136, 294 double causative 196 double diminutive 196 dvandva 241, 242, 244, 249 E economics 15, 35 economy 11, 28, 32, 33, 43, 44, 118, 137 ecosystem 62, 63, 69, 78, 79, 140, 142 effectiveness 137, 138 efficiency 22, 29, 63, 137, 203 egocentric 153, 154 E-language 65, 66, 93 elementary gender system 200 emancipation 88, 89 emergence 19, 28, 29, 31–37, 93, 259, 289, 290 emergentism 28, 33–35 entrenchment 98, 133, 254, 258 entropy 6, 19, 20 epiphenomenon 36, 37, 121, 172 equipollent 192 ergativity 110 erosion 69, 71, 72, 159, 160, 174, 179 exaptation 171, 172 expandability 225, 250, 258 extravagance 138
F fashion 72, 73, 146, 147, 182, 280 featurization 123, 183, 207, 293 FOXP2 gene 35 frequency effect 162 function word 178 function 83–93 functional explanation 77 functionalism 169, 179, 287 fusion 90, 166, 167, 168, 173, 186, 187, 198 G Game of Life 30, 31, 37 gender 42, 82, 85, 86, 114, 116, 134, 135, 196–202, 205, 206, 209, 230, 288, 295, 296 generativism 26, 119, 149, 172, 177, 179, 287, 288 generic 31, 85, 128, 136, 216, 221 genotype 65, 67, 68, 94 glottalization 207 gram family 268 grammatical opposition 117, 134, 191, 192, 194 grammaticalization 2, 3, 36, 84, 85, 87, 100, 108, 113, 116, 118, 119, 121, 124, 127, 129, 134–148, 155, 160, 163, 164, 166–168, 171–181, 189, 191, 194, 197, 223, 245–247, 258, 268, 270, 291, 293 gram type 277 guessing difficulty 6, 7 H habitual 85, 136, 213, 252 habituation 90, 91 half-life 202, 264, 265, 273 Hierarchy of Borrowability 129 hoop net 141, 271 horizontal 58, 156 I iconicity 117 idiosyncratic case marking 117
I-language 65, 66, 93 imperfective 85, 136, 192, 193, 198, 247, 258, 269, 276 incorporation 3, 113, 116, 142, 152, 164, 174, 176, 182, 211–224, 228–239, 241, 245–259, 283 indirect 80, 81, 82, 86, 90 inflationary phenomenon 125 inflection 114 inflectional class 114, 117 inflectional model 194–198 inflectional morphology 34, 111, 113–117, 128, 196–200, 224, 271, 285 information theory 5–9, 13, 29, 161, 162 information 2–19, 24–26, 29–33, 35, 39, 40, 57, 58, 59, 63, 65–67, 71, 77, 81–88, 92, 93, 95–97, 106, 111, 119, 120, 125, 126, 149, 156, 161, 162, 165, 170, 173, 182, 187, 191, 194, 197, 204, 207, 209, 215, 217, 222, 235, 255, 259, 296, 297, 300 informational autonomy 13, 14 informational value 6, 7, 8, 17, 91, 122, 153, 154, 160, 167, 169, 215, 291 information-theoretic object 3, 66, 67, 77, 97 innovation 62, 135, 287, 288 integrity 53, 164–166, 248 intensifier 125, 126, 127, 138–142, 270 intention 61, 77, 79–83, 86, 88, 122, 131 interactor 68 inverse marking 111 invisible hand 32, 33, 35 irregular verb 5, 99, 299, 302 irreversibility 142 island phenomena 248 Item-and-Arrangement 185, 186 J jokes 68, 69
K kinship term 125, 150, 153–155, 240, 241 L Language Acquisition Device (LAD) 297 language contact 129, 282–285, 293, 295, 297 length of derivational history 44, 106–108 lexical affix 222, 223 lexical idiosyncrasy 103, 114, 117, 133, 160 lexical integration 154 lexicalization 110, 254, 257, 258 lexicon 41–43, 58, 93, 99, 106, 113, 114, 119, 120, 134, 181, 182, 198, 199, 206, 233, 255, 256, 258, 265, 282, 291, 302–304 lexicostatistics 263 lexphrase 257 life cycle 57, 69–74, 105, 108, 122, 136, 154, 159, 221, 223, 281, 284 linearity 19, 20, 28, 53–55, 115, 116, 183, 281, 291, 293 lingueme 68 linguistic exogamy 60 linguistic object 40, 50 linguistic pattern 50–51 locational and directional adverb 5, 246 locational construction 156 M markedness 87, 117, 280, 281 maturation 2, 3, 5, 10, 40, 62, 72, 74, 110–116, 119, 121–123, 129, 154, 159–165, 168, 171, 180, 183, 186, 191, 193, 196, 198, 199, 212, 247, 253, 263, 277, 285, 288, 289–298 maturity, definition of 107 memetics 57, 58 mood 196
morpheme 1, 44, 85, 86, 87, 107, 116, 117, 128, 129, 140, 145, 165, 166, 167, 169, 173, 177, 180, 184–187, 189, 191, 193, 194, 198, 199, 204, 207, 209, 223, 247, 277 morphologization 164, 165, 171, 174, 286 morphology 34, 87, 99, 100, 105, 108, 109, 111, 113, 114, 116–118, 120, 128, 183–185, 192, 194, 196–199, 206, 207, 209, 256, 271, 275, 281, 293, 299, 304 multi-use pattern 100 N nasalization 109, 116, 207, 208, 277, 286 Natural Morphology 117, 280 naturalness 117, 118, 280, 283, 291, 292, 303 negation 101, 102, 121, 145, 149, 196 Neogrammarian sound change 158, 160, 292 Neural Darwinism 96 niche 74, 78, 79, 121, 122, 124, 129, 132, 133, 140, 141, 142, 144, 157, 211, 236 noise 9, 10, 71, 118 non-genetically inherited system 57, 58 non-linearity 53, 54, 55, 108, 116, 118, 123, 181, 183, 186, 189, 207, 208, 209, 276, 277, 293 non-recursivity 196 noun class 111 noun incorporation, “classical” 211–217 noun-stripping 218 NP-internal incorporation 223–246 O obligatoriness 150, 153, 154, 215 obviative 111 on-line processing 100 operator 51, 195, 196 optimal word length 118 order 19–21
organism 69 output structure 48, 49, 186 overspecification 81 P packet 258, 259, 260, 294 Pandemonium model 95, 96 paradigm gap filling 128 paradigmatic variability 167 paradigmaticity 167 parental prototype 153 pattern 29 pattern adaptation 122, 123, 129, 292, 293 pattern regulation 122, 123, 130, 289, 290, 293 pattern spread 121–129, 171, 176, 292, 293 perfect 112, 124, 127, 130, 165, 186, 192, 195, 277 perfective 192, 193, 198, 209, 247, 269, 276, 277 periphrastic 51, 105–107, 116, 128, 129, 131, 137, 144, 151, 166, 173, 197, 198, 224, 240, 241, 277, 278, 283, 284, 293 phenogrammatics 50, 52, 53, 56, 66 phenotype 66, 68, 69, 94 phonemic tone 109, 116 phonetic weight 44, 54, 56, 117, 123, 145, 159, 160, 162, 167, 168, 170, 171, 184, 255, 292 phonology 109, 114–116, 169, 183, 185, 205–207, 295 pidgin 108, 109–111, 114, 275, 288 piecemeal learning 98–103 P-language 66, 77, 97 politeness 90, 128, 129, 161, 245 portmanteau expression 187 possessive construction 150, 154, 155, 167, 227, 238, 239, 241, 250 possessor ascension 215
postmodifiers of prenominal adjectives 250 posture verb 155, 156 poverty of the stimulus 120 pragmatic anchoring 154 predictability 7, 8, 24, 122, 162, 204 preference theory 292 preservation of face 90 priming effect 163 privative 192 Probabilistic Reduction Hypothesis 162 problem solving 127, 128 problems of identity 142, 178 productivity 213, 222, 237, 243, 254, 303 programmed death 69, 293 prominence management 11–15, 124, 145, 160, 164, 165, 254 propagation 62, 67 proper name 91, 153, 217, 241, 244, 245, 257, 260 proprial classifier 5, 244–246, 254 prosody 164, 183, 207, 213, 226, 255, 256 Q Quantity Principle 161 quasi-incorporation 218–221, 223, 224, 254, 255, 258 R reanalysis 92, 123, 133, 172–179 rebracketing 175 recategorization 175 recursivity 196 reductionism 19, 27, 28 reductive change 159–166, 172, 174, 293 redundancy management 9–12, 17, 162, 204 — — system-level 11 — — user-level 11 redundancy 5, 9, 10–13, 17, 53, 65, 66, 93, 114, 137, 145, 160, 162, 172, 187, 190, 204, 284, 290, 293, 296
referentiality constraint 248–250 regrouping 175, 176 regular verb 99, 304 regulation 40–43, 45, 89, 122, 290 relabelling 175 relative 25 renewal 137, 139, 140, 274 replication 65, 67, 68, 292 replicator 68 resource 9, 11, 12, 13, 39, 40, 41, 42, 93, 145 reversibility 176, 225 rhetorical devaluation 123, 125, 127 rhetorical value 14, 161, 165, 294 ritualization 86–90 S scalar predicate 125, 126, 138 schema 47, 93, 94, 100, 108, 176, 205, 206, 211, 275, 292 Second Law of Thermodynamics 20, 72, 160 secondary society 63 second-language acquisition 40, 100, 115, 129, 181, 202, 283–285, 289, 290, 296 self-organization 27, 31, 32, 33 semantic map 102 semantic questionnaire 82, 83 semantic redundancy 149, 157 sign language 6, 91, 109, 111, 113, 116 signal simplicity 43, 44, 45, 174 simplificationism 291, 295 slang 265 Specific Language Impairment 115 stability 5, 263 state space 6, 19, 105, 106, 295 stød 207 stranding 217 strong verb 2, 38, 39, 108, 271–276, 285, 292, 299 structural change 170–173 structural complexity 44, 92, 106, 170, 171, 181, 183, 184, 209, 223, 290 structure 24
subject property 95, 96, 173 subjunctive 111, 115, 136, 184, 294 suboptimal transmission 112, 202, 209, 275, 276, 283, 284, 286, 287, 295, 297 subordinate clause 111, 113–115, 129, 136, 166, 294 suppletion 54, 117, 134, 181, 186–189, 193, 299 suprasegmental 55, 205–207, 295 switch-reference 111 symbolic act 89, 90 syncretism 189, 190 syntactically anchored 154 syntagmatic variability 167 syntax 108, 109, 113, 114, 120, 127, 180, 181, 183–185, 196, 198, 199, 205, 212, 248, 256, 258, 288 system congruity 118 system-level 11 T tectogrammatics 49, 52, 65 temporary lexicalization 258 tense 1, 2, 49, 54, 55, 83–85, 87, 88, 99, 100, 107, 116, 125, 129, 130, 131, 166, 169, 185–188, 191, 192, 194, 195, 197, 268, 271, 274, 275, 277, 278, 295, 299–301, 303, 304 tense-aspect system 125, 192, 271, 295 tense operator 194 tightness 43, 54, 176, 180, 181, 217, 224, 225, 227, 231, 232, 237, 238, 242, 243, 251, 254, 255, 296 tip cycle 126 title 15–17, 36, 42, 244–246 token sandwich 89, 90, 128, 168 tonal word accent 207, 233 tradition 79 transparency 45, 117, 118, 282 trapped 121, 130, 160, 165 trimming 73, 160 typological spiral 281
U unidirectionality 135, 142, 143, 145–149, 175, 253 unit accentuation 256–258 unitary concept constraint 182, 183, 213, 228, 254 univerbation 123, 266–268 Universal Grammar 26, 59, 117, 119, 120 usage-based model 97, 173, 287 user-level 11
V verbosity 53, 54, 55, 123, 284, 290–292 verb-second order 111, 115, 288 vertical 23, 58, 60, 156, 295 vowel harmony 208, 209, 277 W word level feature 116, 117 word order 49, 51, 53, 95, 111, 115, 117, 125, 238, 239, 257, 283 Word-and-Paradigm 185, 186, 194–196, 207 X X-bar theory 237 Z zero marking 187, 190, 194, 242, 278
In the series Studies in Language Companion Series the following titles have been published thus far or are scheduled for publication:
1 ABRAHAM, Werner (ed.): Valence, Semantic Case, and Grammatical Relations. Workshop studies prepared for the 12th International Congress of Linguists, Vienna, August 29th to September 3rd, 1977. 1978. xiv, 729 pp. Out of print
2 ANWAR, Mohamed Sami: BE and Equational Sentences in Egyptian Colloquial Arabic. 1979. vi, 128 pp.
3 MALKIEL, Yakov: From Particular to General Linguistics. Selected Essays 1965–1978. With an introduction by the author, an index rerum and an index nominum. 1983. xxii, 659 pp.
4 LLOYD, Albert L.: Anatomy of the Verb. The Gothic Verb as a Model for a Unified Theory of Aspect, Actional Types, and Verbal Velocity. (Part I: Theory; Part II: Application). 1979. x, 351 pp.
5 HAIMAN, John: Hua: A Papuan Language of the Eastern Highlands of New Guinea. 1980. iv, 550 pp.
6 VAGO, Robert M. (ed.): Issues in Vowel Harmony. Proceedings of the CUNY Linguistics Conference on Vowel Harmony, May 14, 1977. 1980. xx, 340 pp.
7 PARRET, Herman, Marina SBISÀ and Jef VERSCHUEREN (eds.): Possibilities and Limitations of Pragmatics. Proceedings of the Conference on Pragmatics, Urbino, July 8–14, 1979. 1981. x, 854 pp.
8 BARTH, E.M. and J.L. MARTENS (eds.): Argumentation: Approaches to Theory Formation. Containing the Contributions to the Groningen Conference on the Theory of Argumentation, October 1978. 1982. xviii, 333 pp.
9 LANG, Ewald and John PHEBY: The Semantics of Coordination. (English transl. by John Pheby from the German orig. ed. 'Semantik der koordinativen Verknüpfung', Berlin, 1977). 1984. 300 pp.
10 DRESSLER, Wolfgang U., Willi MAYERTHALER, Oswald PANAGL and Wolfgang Ullrich WURZEL: Leitmotifs in Natural Morphology. 1988. ix, 168 pp.
11 PANHUIS, Dirk G.J.: The Communicative Perspective in the Sentence. A study of Latin word order. 1982. viii, 172 pp.
12 PINKSTER, Harm (ed.): Latin Linguistics and Linguistic Theory. Proceedings of the 1st International Colloquium on Latin Linguistics, Amsterdam, April 1981. 1983. xviii, 307 pp.
13 REESINK, Ger P.: Structures and their Functions in Usan. 1987. xviii, 369 pp.
14 BENSON, Morton, Evelyn BENSON and Robert F. ILSON: Lexicographic Description of English. 1986. xiii, 275 pp.
15 JUSTICE, David: The Semantics of Form in Arabic. In the mirror of European languages. 1987. iv, 417 pp.
16 CONTE, Maria-Elisabeth, János Sánder PETÖFI and Emel SÖZER (eds.): Text and Discourse Connectedness. Proceedings of the Conference on Connexity and Coherence, Urbino, July 16–21, 1984. 1989. xxiv, 584 pp.
17 CALBOLI, Gualtiero (ed.): Subordination and Other Topics in Latin. Proceedings of the Third Colloquium on Latin Linguistics, Bologna, 1–5 April 1985. 1989. xxix, 691 pp.
18 WIERZBICKA, Anna: The Semantics of Grammar. 1988. vii, 581 pp.
19 BLUST, Robert A.: Austronesian Root Theory. An essay on the limits of morphology. 1988. xi, 190 pp.
20 VERHAAR, John W.M. (ed.): Melanesian Pidgin and Tok Pisin. Proceedings of the First International Conference on Pidgins and Creoles in Melanesia. 1990. xiv, 409 pp.
21 COLEMAN, Robert (ed.): New Studies in Latin Linguistics. Proceedings of the 4th International Colloquium on Latin Linguistics, Cambridge, April 1987. 1990. x, 480 pp.
22 McGREGOR, William B.: A Functional Grammar of Gooniyandi. 1990. xx, 618 pp.
23 COMRIE, Bernard and Maria POLINSKY (eds.): Causatives and Transitivity. 1993. x, 399 pp.
24 BHAT, D.N.S.: The Adjectival Category. Criteria for differentiation and identification. 1994. xii, 295 pp.
25 GODDARD, Cliff and Anna WIERZBICKA (eds.): Semantic and Lexical Universals. Theory and empirical findings. 1994. viii, 510 pp.
26 LIMA, Susan D., Roberta L. CORRIGAN and Gregory K. IVERSON: The Reality of Linguistic Rules. 1994. xxiii, 480 pp.
27 ABRAHAM, Werner, T. GIVÓN and Sandra A. THOMPSON (eds.): Discourse, Grammar and Typology. Papers in honor of John W.M. Verhaar. 1995. xx, 352 pp.
28 HERMAN, József (ed.): Linguistic Studies on Latin. Selected papers from the 6th International Colloquium on Latin Linguistics (Budapest, 23–27 March 1991). 1994. ix, 421 pp.
29 ENGBERG-PEDERSEN, Elisabeth, Michael FORTESCUE, Peter HARDER, Lars HELTOFT and Lisbeth Falster JAKOBSEN (eds.): Content, Expression and Structure. Studies in Danish functional grammar. 1996. xvi, 510 pp.
30 HUFFMAN, Alan: The Categories of Grammar. French lui and le. 1997. xiv, 379 pp.
31 WANNER, Leo (ed.): Lexical Functions in Lexicography and Natural Language Processing. 1996. xx, 355 pp.
32 FRAJZYNGIER, Zygmunt: Grammaticalization of the Complex Sentence. A case study in Chadic. 1996. xviii, 501 pp.
33 VELÁZQUEZ-CASTILLO, Maura: The Grammar of Possession. Inalienability, incorporation and possessor ascension in Guaraní. 1996. xvi, 274 pp.
34 HATAV, Galia: The Semantics of Aspect and Modality. Evidence from English and Biblical Hebrew. 1997. x, 224 pp.
35 MATSUMOTO, Yoshiko: Noun-Modifying Constructions in Japanese. A frame semantic approach. 1997. viii, 204 pp.
36 KAMIO, Akio (ed.): Directions in Functional Linguistics. 1997. xiii, 259 pp.
37 HARVEY, Mark and Nicholas REID (eds.): Nominal Classification in Aboriginal Australia. 1997. x, 296 pp.
38 HACKING, Jane F.: Coding the Hypothetical. A comparative typology of Russian and Macedonian conditionals. 1998. vi, 156 pp.
39 WANNER, Leo (ed.): Recent Trends in Meaning–Text Theory. 1997. xx, 202 pp.
40 BIRNER, Betty and Gregory WARD: Information Status and Noncanonical Word Order in English. 1998. xiv, 314 pp.
41 DARNELL, Michael, Edith MORAVCSIK, Michael NOONAN, Frederick NEWMEYER and Kathleen M. WHEATLEY (eds.): Functionalism and Formalism in Linguistics. Volume I: General papers. 1999. vi, 486 pp.
42 DARNELL, Michael, Edith MORAVCSIK, Michael NOONAN, Frederick NEWMEYER and Kathleen M. WHEATLEY (eds.): Functionalism and Formalism in Linguistics. Volume II: Case studies. 1999. vi, 407 pp.
43 OLBERTZ, Hella, Kees HENGEVELD and Jesús SÁNCHEZ GARCÍA (eds.): The Structure of the Lexicon in Functional Grammar. 1998. xii, 312 pp.
44 HANNAY, Mike and A. Machtelt BOLKESTEIN (eds.): Functional Grammar and Verbal Interaction. 1998. xii, 304 pp.
45 COLLINS, Peter and David LEE (eds.): The Clause in English. In honour of Rodney Huddleston. 1999. xv, 342 pp.
46 YAMAMOTO, Mutsumi: Animacy and Reference. A cognitive approach to corpus linguistics. 1999. xviii, 278 pp.
47 BRINTON, Laurel J. and Minoji AKIMOTO (eds.): Collocational and Idiomatic Aspects of Composite Predicates in the History of English. 1999. xiv, 283 pp.
48 MANNEY, Linda Joyce: Middle Voice in Modern Greek. Meaning and function of an inflectional category. 2000. xiii, 262 pp.
49 BHAT, D.N.S.: The Prominence of Tense, Aspect and Mood. 1999. xii, 198 pp.
50 ABRAHAM, Werner and Leonid KULIKOV (eds.): Tense-Aspect, Transitivity and Causativity. Essays in honour of Vladimir Nedjalkov. 1999. xxxiv, 359 pp.
51 ZIEGELER, Debra: Hypothetical Modality. Grammaticalisation in an L2 dialect. 2000. xx, 290 pp.
52 TORRES CACOULLOS, Rena: Grammaticization, Synchronic Variation, and Language Contact. A study of Spanish progressive -ndo constructions. 2000. xvi, 255 pp.
53 FISCHER, Olga, Anette ROSENBACH and Dieter STEIN (eds.): Pathways of Change. Grammaticalization in English. 2000. x, 391 pp.
54 DAHL, Östen and Maria KOPTJEVSKAJA-TAMM (eds.): Circum-Baltic Languages. Volume 1: Past and Present. 2001. xx, 382 pp.
55 DAHL, Östen and Maria KOPTJEVSKAJA-TAMM (eds.): Circum-Baltic Languages. Volume 2: Grammar and Typology. 2001. xx, 423 pp.
56 FAARLUND, Jan Terje (ed.): Grammatical Relations in Change. 2001. viii, 326 pp.
57 MEL’ČUK, Igor A.: Communicative Organization in Natural Language. The semantic-communicative structure of sentences. 2001. xii, 393 pp.
58 MAYLOR, B. Roger: Lexical Template Morphology. Change of state and the verbal prefixes in German. 2002. x, 273 pp.
59 SHI, Yuzhi: The Establishment of Modern Chinese Grammar. The formation of the resultative construction and its effects. 2002. xiv, 262 pp.
60 GODDARD, Cliff and Anna WIERZBICKA (eds.): Meaning and Universal Grammar. Theory and empirical findings. Volume 1. 2002. xvi, 337 pp.
61 GODDARD, Cliff and Anna WIERZBICKA (eds.): Meaning and Universal Grammar. Theory and empirical findings. Volume 2. 2002. xvi, 337 pp.
62 FIELD, Fredric W.: Linguistic Borrowing in Bilingual Contexts. With a foreword by Bernard Comrie. 2002. xviii, 255 pp.
63 BUTLER, Christopher S.: Structure and Function – A Guide to Three Major Structural-Functional Theories. Part 1: Approaches to the simplex clause. 2003. xx, 573 pp.
64 BUTLER, Christopher S.: Structure and Function – A Guide to Three Major Structural-Functional Theories. Part 2: From clause to discourse and beyond. 2003. xiv, 579 pp.
65 MATSUMOTO, Kazuko: Intonation Units in Japanese Conversation. Syntactic, informational and functional structures. 2003. xviii, 215 pp.
66 NARIYAMA, Shigeko: Ellipsis and Reference Tracking in Japanese. 2003. xvi, 400 pp.
67 LURAGHI, Silvia: On the Meaning of Prepositions and Cases. The expression of semantic roles in Ancient Greek. 2003. xii, 366 pp.
68 MODER, Carol Lynn and Aida MARTINOVIC-ZIC (eds.): Discourse Across Languages and Cultures. 2004. vi, 349 pp. + index.
69 TANAKA, Lidia: Gender, Language and Culture. A study of Japanese television interview discourse. 2004. xvii, 233 pp.
70 LEFEBVRE, Claire: Issues in the Study of Pidgin and Creole Languages. 2004. xvi, 358 pp.
71 DAHL, Östen: The Growth and Maintenance of Linguistic Complexity. 2004. x, 333 pp.
72 FRAJZYNGIER, Zygmunt, Adam HODGES and David S. ROOD (eds.): Linguistic Diversity and Language Theories. Expected Winter 2004.
73 XIAO, Richard and Anthony Mark MCENERY: Aspect in Mandarin Chinese. A corpus-based study. Expected Winter 2004/05.