volume 365 . number 1559 . pages 3779–3933
Cultural and linguistic diversity: evolutionary approaches
Papers of a Theme issue compiled and edited by James Steele, Peter Jordan and Ethan Cochrane
Registered Charity No 207043
Founded in 1660, the Royal Society is the independent scientific academy of the UK, dedicated to promoting excellence in science
Phil. Trans. R. Soc. B | vol. 365 no. 1559 pp. 3779–3933 | 12 Dec 2010
ISSN 0962-8436
The world’s first science journal
rstb.royalsocietypublishing.org
Published in Great Britain by the Royal Society, 6–9 Carlton House Terrace, London SW1Y 5AG
See further with the Royal Society in 2010 – celebrate 350 years
Editor
Professor Georgina Mace

Publishing Editor
Joanna Bolesworth

Editorial Board
Neuroscience and Cognition: Dr Brian Billups, Dr Andrew Glennerster, Professor Bill Harris, Professor Trevor Lamb, Professor Tetsuro Matsuzawa, Professor Andrew Whiten
Cell and developmental biology: Professor Makoto Asashima, Dr Buzz Baum, Professor Martin Buck, Dr Louise Cramer, Dr Anne Donaldson, Professor Laurence Hurst, Professor Fotis Kafatos, Professor Elliot Meyerowitz, Professor Dale Sanders, Dr Stephen Tucker
Organismal, environmental and evolutionary biology: Professor Spencer Barrett, Professor Nick Barton, Dr Will Cresswell, Professor Georgina Mace, Professor Yadvinder Malhi, Professor Manfred Milinski, Professor Peter Mumby, Professor Karl Sigmund
Health and Disease: Professor Zhu Chen, Professor Mark Enright, Professor Michael Malim, Professor Angela McLean, Professor Nicholas Wald, Professor Joanne Webster

Publishing Editor: Joanna Bolesworth (tel: +44 (0)20 7451 2602; fax: +44 (0)20 7976 1837; [email protected])
Production Editor: Jessica Mnatzaganian
6–9 Carlton House Terrace, London SW1Y 5AG, UK
rstb.royalsocietypublishing.org

GUIDANCE FOR AUTHORS
Publishing format Phil. Trans. R. Soc. B articles are published regularly online and in print issues twice a month. Along with all Royal Society journals, we are committed to archiving and providing perpetual access. The journal also offers the facility for including Electronic Supplementary Material (ESM) to papers. Contents of the ESM might include details of methods, derivations of equations, large tables of data, DNA sequences and computer programs. However, the printed version must include enough detail to satisfy most non-specialist readers. Supplementary data up to 10Mb is placed on the Society's website free of charge. Larger datasets must be deposited in recognised public domain databases by the author.
Conditions of publication Articles must not have been published previously, nor be under consideration for publication elsewhere. The main findings of the article should not have been reported in the mass media. Like many journals, Phil. Trans. R. Soc. B employs a strict embargo policy where the reporting of a scientific article by the media is embargoed until a specific time. The Executive Editor has final authority in all matters relating to publication.
Electronic Submission details For full submission guidelines and access to all journal content please visit the Phil. Trans. R. Soc. B website at rstb.royalsocietypublishing.org.
AIMS AND SCOPE Each issue of Phil. Trans. R. Soc. B is devoted to a specific area of the biological sciences. This area will define a research frontier that is advancing rapidly, often bridging traditional disciplines. Phil. Trans. R. Soc. B is essential reading for scientists working across the biological sciences. In particular, the journal is focused on the following four cluster areas: neuroscience and cognition; organismal and evolutionary biology; cell and developmental biology; and health and disease. As well as theme issues, the journal publishes papers from the Royal Society’s biological discussion meetings. For information on submitting a proposal for a theme issue, consult the journal’s website at rstb.royalsocietypublishing.org.
ISBN: 978-0-85403-854-1
Copyright © 2010 The Royal Society Except as otherwise permitted under the Copyright, Designs and Patents Act, 1988, this publication may only be reproduced, stored or transmitted, in any form or by any other means, with the prior permission in writing of the publisher, or in the case of reprographic reproduction, in accordance with the terms of a licence issued by the Copyright Licensing Agency. In particular, the Society permits the making of a single photocopy of an article from this issue (under Sections 29 and 38 of this Act) for an individual for the purposes of research or private study.
SUBSCRIPTIONS In 2011 Phil. Trans. R. Soc. B (ISSN 0962-8436) will be published twice a month. Full details of subscriptions and single issue sales may be obtained either by contacting our journal fulfilment agent, Portland Customer Services, Commerce Way, Colchester CO2 8HP; tel: +44 (0)1206 796351; fax: +44 (0)1206 799331; email: [email protected] or by visiting our website at http://royalsocietypublishing.org/info/subscriptions. The Royal Society is a Registered Charity No. 207043.
Selection criteria The criteria for selection are scientific excellence, originality and interest across disciplines within biology. The Editors are responsible for all editorial decisions and they make these decisions based on the reports received from the referees and/or Editorial Board members. Many more good proposals and articles are submitted to us than we have space to print; we give preference to those that are of broad interest and of high scientific quality.
The Royal Society, the national academy of science of the UK and the Commonwealth, is at the cutting edge of scientific progress. We support many top young scientists, engineers and technologists, influence science policy, debate scientific issues with the public and much more. We are an independent, charitable body and derive our authoritative status from over 1400 Fellows and Foreign Members. During 2010, we are celebrating the Royal Society’s 350th anniversary. As part of this, there will be an exciting programme of activities – exhibitions, lectures, conferences, a new book, a vast science festival on the South Bank in London, television and radio broadcasting and much more besides. Our mission: to expand knowledge and further the role of science and engineering in making the world a better place.
The Royal Society’s strategic priorities are to:
• invest in future scientific leaders and in innovation,
• influence policymaking with the best scientific advice,
• invigorate science and mathematics education,
• increase access to the best science internationally, and
• inspire an interest in the joy, wonder and excitement of scientific discovery.

Subscription prices, 2011 calendar year

                                         Europe         USA & Canada   All other countries
Electronic access only                   £2145/€2788    $4058          £2317/US$4153
Printed version plus electronic access   £2574/€3345    $4869          £2780/US$4983

For further information on the Society’s activities, please contact the following departments on the extensions listed by dialling +44 (0)20 7839 5561, or visit the Society’s Web site (www.royalsociety.org).
Research Support (UK grants and fellowships)
Research Appointments (Fellowships): 2542
Research Grants: 2223
International travel Grants: 2555
Newton International Fellowships: 2559
Science Advice
Science Policy Centre: 2550
Science Communication
General enquiries: 2573
Library and Information Services
Library/archive enquiries: 2606

Typeset in India by Techset Composition Limited, Salisbury, UK. Printed by Latimer Trend, Plymouth. This paper meets the requirements of ISO 9706:1994(E) and ANSI/NISO Z39.48-1992 (Permanence of Paper) effective with volume 335, issue 1273, 1992. Philosophical Transactions of the Royal Society B (ISSN: 0962-8436) is published twice a month for $4058 per year by the Royal Society, and is distributed in the USA by Agent named Air Business, C/O Worldnet Shipping USA Inc., 149-35 177th Street, Jamaica, New York, NY11434, USA. US Postmaster: Send address changes to Philosophical Transactions of the Royal Society B, C/O Air Business Ltd, C/O Worldnet Shipping USA Inc, 149-35 177th Street Jamaica, New York, NY11414.
Cover image: A split graph showing the results of NeighborNet analyses of the Polynesian lexical data. The network has three main regions: Fijian dialects plus Rotuman, Western Polynesian and Eastern Polynesian. There is substantial conflicting signal within each region consistent with the break-up of a dialect chain. Scale bar, 0.1. (See article by Russell D. Gray, David Bryant and Simon J. Greenhill, pp. 3923–3933.)
Contents
Introduction Evolutionary approaches to cultural and linguistic diversity J. Steele, P. Jordan and E. Cochrane
3781
Articles Transmission coupling mechanisms: cultural group selection R. Boyd and P. J. Richerson
3787
Cultural traits as units of analysis M. J. O’Brien, R. L. Lyman, A. Mesoudi and T. L. VanPool
3797
Simulating trait evolution for cross-cultural comparison C. L. Nunn, C. Arnold, L. Matthews and M. Borgerhoff Mulder
3807
Measuring the diffusion of linguistic change J. Nerbonne
3821
Splits or waves? Trees or webs? How divergence measures and network analysis can unravel language histories P. Heggarty, W. Maguire and A. McMahon
3829
Historical linguistics in Australia: trees, networks and their implications C. Bowern
3845
Language shift, bilingualism and the future of Britain’s Celtic languages A. Kandler, R. Unger and J. Steele
3855
The cophylogeny of populations and cultures: reconstructing the evolution of Iranian tribal craft traditions using trees and jungles J. J. Tehrani, M. Collard and S. J. Shennan
3865
Untangling cultural inheritance: language diversity and long-house architecture on the Pacific northwest coast P. Jordan and S. O’Neill
3875
Phylogenetic analyses of Lapita decoration do not support branching evolution or regional population structure during colonization of Remote Oceania E. E. Cochrane and C. P. Lipo
3889
Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits T. E. Currie, S. J. Greenhill and R. Mace
3903
Your place or mine? A phylogenetic comparative analysis of marital residence in Indo-European and Austronesian societies L. Fortunato and F. Jordan
3913
On the shape and fabric of human history R. D. Gray, D. Bryant and S. J. Greenhill
3923
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
Phil. Trans. R. Soc. B (2010) 365, 3781–3785 doi:10.1098/rstb.2010.0202
Introduction
Evolutionary approaches to cultural and linguistic diversity
James Steele1,*, Peter Jordan1,2 and Ethan Cochrane1,3
1 AHRC Centre for the Evolution of Cultural Diversity, Institute of Archaeology, University College London, 31-34 Gordon Square, London WC1H 0PY, UK
2 Department of Archaeology, University of Aberdeen, St Mary’s Building, Elphinstone Road, Aberdeen AB24 3UF, UK
3 International Archaeological Research Institute, Inc., 2081 Young Street, Honolulu, HI 96826-2231, USA
Evolutionary approaches to cultural change are increasingly influential, and many scientists believe that a ‘grand synthesis’ is now in sight. The papers in this Theme Issue, which derives from a symposium held by the AHRC Centre for the Evolution of Cultural Diversity (University College London) in December 2008, focus on how the phylogenetic tree-building and network-based techniques used to estimate descent relationships in biology can be adapted to reconstruct cultural histories, where some degree of inter-societal diffusion will almost inevitably be superimposed on any deeper signal of a historical branching process. The disciplines represented include the three most purely ‘cultural’ fields from the four-field model of anthropology (cultural anthropology, archaeology and linguistic anthropology). In this short introduction, some context is provided from the history of anthropology, and key issues raised by the papers are highlighted.
Keywords: evolution; cultural change; phylogenetics
1. INTRODUCTION: CULTURAL TRANSMISSION AND EVOLUTION
Evolutionary approaches to cultural change are increasingly influential, and many scientists believe that a ‘grand synthesis’ is now in sight (e.g. Mesoudi, Whiten & Laland 2006). At the ‘microevolutionary’ scale, modern theories of cultural evolution recognize that cultural traditions and innovations are socially transmitted person-to-person between and within generations (respectively, by vertical or oblique and by horizontal transmission routes; Cavalli-Sforza & Feldman 1981), with learners applying generalized rules of thumb in choosing when to engage in independent trial-and-error learning, and in selecting whose example to copy when this is the preferred strategy (transmission biases; Boyd & Richerson 1985). Preservation of a historical signal within the cultural traditions carried by populations depends on traits being consistently selected and replicated, often with some degree of modification, ensuring that they survive from one generation to the next. Cultural ‘macroevolution’ refers to the historical processes that explain cultural similarities and differences between human populations arising from such repeated copying with modification (Mulder et al. 2006). Mesoudi et al. (2006), who propose a multidisciplinary framework for the Darwinian analysis of cultural dynamics, draw an explicit parallel between evolutionary archaeology, cultural anthropology and comparative anthropology (among the cultural sciences), and the macroevolutionary disciplines in biology (respectively, palaeobiology, biogeography and systematics). Historical linguistics should certainly be added to the list of cultural disciplines with a macroevolutionary focus.
This special issue, which derives from a symposium held by the AHRC Centre for the Evolution of Cultural Diversity (University College London) in December 2008, focuses on the latest developments in this rapidly expanding field. The main focus is on how the phylogenetic tree-building and network-based techniques used to estimate descent relationships in biology can be adapted to reconstruct cultural histories, where some degree of inter-societal diffusion will almost inevitably be superimposed on any deeper signal of a historical branching process. The disciplines represented include the three most purely ‘cultural’ fields from the four-field model of anthropology (cultural anthropology, archaeology and linguistic anthropology). Integration with the fourth field, physical or biological anthropology, is being actively pursued elsewhere (e.g. Bellwood & Renfrew 2002), but would have required a separate issue in its own right.

* Author for correspondence ([email protected]).
One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
This journal is © 2010 The Royal Society
It is well known that Darwin saw similarities between the evolution of species and the evolution of languages (van Wyhe 2005), and that the use of genealogical approaches in nineteenth-century historical linguistics (e.g. Schleicher 1863) paralleled their use in zoology. As Sereno (1991) points out, languages share with biological organisms the properties of heritability (transmission to offspring); mutation; deme-based structuring of transmission pathways and allopatric (e.g. geographical) and sympatric (e.g. sociolinguistic) divergence mechanisms. Recent phylogenetic and statistical approaches have explored this analogy further, focusing both on applying novel phylogenetic techniques and on explaining empirical heterogeneity in evolutionary rates for different linguistic traits (e.g. Pagel 2009). Descent-with-modification has also been a precept informing studies of manuscript traditions in the genealogical or stemmatological approach since the nineteenth century (Robins 2007), with recent studies applying formal phylogenetic methods (e.g. Barbrook et al. 1998) and modelling the survivorship of variants in terms of an underlying birth–death process (Weitzman 1987; Cisne 2005).
Similar approaches to the evolution of stylistic attributes of material culture can also be traced to the late nineteenth century, such as Evans’ attempt to reconstruct the descent histories of variants of design of British Iron Age coins (descendants of copying chains originating ultimately with Macedonian exemplars). Reviewing the development of his own thinking on this matter, Evans (1890) noted that it was a prerequisite for cultural descent with variation ‘1st, that the successive issues or generations of coins should resemble each other sufficiently to pass as current together; but 2nd, that, art being imperfect, there must have been more or less important variations and modifications in the successive dies that were engraved’ (p. 422).
He also argued that (other things being equal) there should be a tendency for designs to evolve under unconscious selection for symmetry, and for ease of execution. Much recent experimental work on cultural transmission chains builds on these kinds of early insights and conjectures (e.g. Smith et al. 2008).
The transmission histories of functional aspects of traditional technologies, and of social structure (kinship systems and political organization), have been less often analysed from a phylogenetic perspective, because it is usually assumed that such cultural attributes come under stronger selective pressure (and are therefore more prone to horizontal diffusion and to adaptive convergence). However, such assumptions need to be tested, the locus classicus being Galton’s comment in 1889 on a paper by Tylor purporting to show adaptive convergence (and an evolutionary societal trajectory) based on empirical correlations, in a sample of 350 cultures, between type of kinship system (descent and marriage rules) and other measures of cultural complexity (Galton 1889; Tylor 1889). Galton commented that these cultures could not be assumed to be statistically independent of one another, and that the case for convergent evolution could not be made until commonalities that are simply owing to common historical descent or to cultural borrowing had been controlled for. Galton’s problem is a problem for testing hypotheses of the adaptive cultural evolution of social systems under selection (e.g. Mace & Pagel 1994), but the flipside is that there may be an underlying conservatism in the transmission of social institutional attributes that could enable historical inferences to be made about cultural ancestry (e.g. Jones 2008). The implication is that societies with shared cultural histories may also inherit common social structural features, and this has been supported empirically by Guglielmino et al. (1995). Such aspects of anthropology’s disciplinary history have shaped the content and structure of this special issue.
2. UNITS AND MODELS OF CULTURAL TRANSMISSION
A number of contributors to this issue address general principles of cultural transmission and macroevolutionary dynamics. Most contributors assume the validity of their units of analysis, whether linguistic vocabulary items or material cultural design traits. However, O’Brien et al. (2010) focus explicitly on this matter, examining cultural units of transmission that have some material correlate in the archaeological record in order to reconstruct the evolution of traditions in prehistory. They focus on the hierarchical organization of the underlying ideational units into ‘design recipes’, and their pattern of transmission. Although they have in mind the transmission of recipes for production of artefact designs sampled from a larger design space, their analysis also has implications for the transmission of language-encoded traits (e.g. semantic networks, sociolinguistic conventions). They discuss the circumstances in which cultural traits should be expected to be transferred in a piecemeal fashion, and in which they might be more likely to be transmitted as a coherent package.
Historical reconstruction of cultural descent histories presents all the familiar limitations of inverse problems (cf. Boyd & Richerson 2008). In assessing the fit between a model and a set of data, forward approaches to modelling use the known dynamics of the empirical system to predict outcomes for a given parameter constellation. In inverse problems, the outcomes are known to some degree, but the dynamics of the empirical system and the parameter constellation are unknown and must be estimated by ‘reverse engineering’. Typically in such situations, difficulties arise where parameter values cannot be reliably estimated from observable data, and where it can be shown that alternative models and alternative parameter constellations would yield the same observed outcomes (e.g. Steele et al. 2010).
Simulation provides one solution, by enabling forward modelling of a simplified version of the historical system. Focusing on cultural applications of cladistic methods to estimate historical descent relationships, Nunn et al. (2010) use simulations to explore the effects of rates of inter-societal diffusion and of innovation on the coherence and reliability of statistical indices of a phylogenetic branching signal (cf. Collard et al. 2006).
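To make the notion of a statistical index of branching signal concrete, the following minimal sketch computes the retention index (Farris 1989) for a single character on a fixed rooted binary tree. It is an illustration under stated assumptions, not the simulation code of Nunn et al.: the nested-tuple tree encoding, the function names `fitch_steps` and `retention_index`, and the four-society example are all hypothetical. The index is RI = (g − s)/(g − m), where s is the number of state changes the character requires on the tree (counted here by Fitch parsimony), m is the minimum it could require on any tree and g is the maximum.

```python
from collections import Counter


def fitch_steps(tree, states):
    """Minimum number of state changes for one character on a rooted
    binary tree, counted with the Fitch parsimony algorithm.

    tree   -- nested 2-tuples of leaf names (hypothetical encoding)
    states -- dict mapping each leaf name to its character state
    """
    def visit(node):
        if isinstance(node, str):          # leaf: its own state, no changes
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        shared = left_set & right_set
        if shared:                         # children agree: no extra change
            return shared, left_steps + right_steps
        return left_set | right_set, left_steps + right_steps + 1

    return visit(tree)[1]


def retention_index(tree, states):
    """RI = (g - s) / (g - m) for a single unordered character."""
    counts = Counter(states.values())
    m = len(counts) - 1                       # fewest changes on any tree
    g = len(states) - max(counts.values())    # most changes on any tree
    s = fitch_steps(tree, states)             # changes on this tree
    if g == m:
        return None                           # character is uninformative
    return (g - s) / (g - m)


# Four hypothetical societies; the tree groups A with B and C with D.
tree = (("A", "B"), ("C", "D"))
congruent = {"A": 1, "B": 1, "C": 0, "D": 0}     # trait follows the tree
conflicting = {"A": 1, "B": 0, "C": 1, "D": 0}   # trait cuts across it

print(retention_index(tree, congruent))    # 1.0
print(retention_index(tree, conflicting))  # 0.0
```

A character whose states follow the branching order scores 1, and one whose states cut across it scores 0. For a whole dataset the index is usually computed from the sums of g, s and m over all characters.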
Nunn et al. suggest that a high value for one such index (the Retention Index; Farris 1989; Naylor & Kraus 1995) may be a reliable indicator that such rates were low and had limited influence on observed inter-societal cultural diversity. However, a low value for the Retention Index can have several causes and is therefore not a sufficient indicator of high inter-societal diffusion rates.
Currie et al. (2010) use Nunn et al.’s simulation methodology to explore the robustness of inferences about adaptive convergence in sociocultural evolution, since horizontal transmission between societies only exacerbates ‘Galton’s problem’ for historical interpretation. They find that such inferences are less robust in the presence of either high rates of piecemeal stochastic diffusion of individual traits between societies, or coupled stochastic transfers of the two traits whose correlation is being examined by comparative analysis. Estimating the tempo and mode of inter-societal diffusion therefore becomes crucial for any comparative cross-cultural analysis, if its statistical methodology requires that adaptive convergence be assumed to have been superimposed on a strictly tree-like population history.
Focusing on the more fundamental question of what factors affect rates of inter-societal cultural transfer, Boyd & Richerson (2010) suggest that cultural group selection provides one boundary-enforcing mechanism that may give rise to a coherent cultural phylogenetic signal. They outline the conditions for intergroup selection on cultural traditions using the Price equation (Price 1970), suggesting that social transmission biases can ‘fix’ one of a number of alternative solutions within a group when there are multiple cultural equilibria (e.g. local optima in artefact design space or in a space of possible social structures and social rules), with sorting mechanisms such as competitive group extinction, imitative intergroup copying and selective migration then favouring the group that has converged on a globally optimal solution. Kandler et al. (2010) illustrate such sorting processes using modified Lotka–Volterra competition equations to model language shift, suggesting that this can be seen as a form of selective cultural sorting based on contrasts in the underlying social and economic opportunities afforded by membership of competing linguistic communities. They address the conditions required for preservation of two parallel sets of traditions within a group (in this case, bilingualism and the preservation of the heritage encoded in the usages of an endangered language) by stabilizing multiple, sociolinguistically discrete domains of use.
In the absence of such selective forces, cultural traditions may diverge through drift-like processes affecting individual traits in a piecemeal way. Nerbonne (2010) examines the effects of geographical proximity on dialect similarity in the absence of strong large-scale social boundaries, showing through simulation that a process analogous to isolation-by-distance in genetics can lead to regularities in the sublinear relationships observed between dialect distance and geographical distance. His work emphasizes the importance of spatially localized social interaction biases for the evolution of cultural diversity in such traits.
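The logic of competitive sorting between linguistic communities can be illustrated with the textbook two-variable Lotka–Volterra competition equations, dN1/dt = r1 N1 (1 − (N1 + a12 N2)/K1), with the corresponding equation for N2. The sketch below is not the modified model of Kandler et al., which this introduction does not specify. The function name `simulate_competition`, the parameter values and the forward Euler integration are assumptions chosen only to show how asymmetric competition drives one community to exclusion, whereas weak mutual competition permits stable coexistence.

```python
def simulate_competition(n1, n2, a12, a21, r1=1.0, r2=1.0,
                         k1=1.0, k2=1.0, dt=0.01, steps=20000):
    """Integrate the two-variable Lotka-Volterra competition equations
    with forward Euler steps and return the final (n1, n2).

    n1, n2   -- initial sizes of two competing speech communities
    a12, a21 -- competitive effect of community 2 on 1, and of 1 on 2
    r1, r2   -- intrinsic growth rates; k1, k2 -- carrying capacities
    (all names and default values are illustrative assumptions)
    """
    for _ in range(steps):
        dn1 = r1 * n1 * (1.0 - (n1 + a12 * n2) / k1)
        dn2 = r2 * n2 * (1.0 - (n2 + a21 * n1) / k2)
        n1 += dt * dn1
        n2 += dt * dn2
    return n1, n2


# Asymmetric competition: community 1 suppresses community 2 more
# strongly than the reverse, so community 2 declines towards zero.
shift = simulate_competition(0.5, 0.5, a12=0.5, a21=1.5)

# Weak mutual competition (both coefficients below 1 with equal carrying
# capacities): both communities persist at a shared equilibrium.
coexist = simulate_competition(0.9, 0.1, a12=0.5, a21=0.5)

print("language shift:", shift)
print("coexistence:   ", coexist)
```

With equal carrying capacities, coexistence in this simple form requires both competition coefficients to be below 1; in the symmetric case shown, each community settles at K/(1 + a), here two-thirds of its carrying capacity.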
3. CULTURAL MACROEVOLUTION AS AN INVERSE PROBLEM
Phylogenies (trees) typically describe ancestor–descendant relations between species; can they also be used to describe variation between cultures within the same human species? Do models for describing variation across species work well when looking at human cultural diversification (cf. Mace et al. 2005; Lipo et al. 2006)? Some of the clearest parallels lie between patterns of genetic and linguistic evolution. As with genes, languages are passed between generations and are modified; with enough time linguistic communities may eventually diverge, generating branching trees of historical relatedness. In historical linguistics, cognate occurrence is used to model common ancestry; results are affected by the choice of data, with the core lexicon or basic vocabulary evolving more slowly and consequently giving a stronger phylogenetic signal. There is an inevitable conformist bias reinforcing fidelity of transmission, in that language usage must be co-ordinated and errors corrected if intelligibility is to be maintained. However, even in historical linguistics it is increasingly clear that cases vary in the importance of the signals of branching processes and of areal diffusion.
Heggarty et al. (2010) explore the value of phylogenetic network methods to characterize linguistic relationships at larger time and space scales, where the underlying processes discussed by Nerbonne may have been active. They show that such methods are preferable to methods that force a tree topology onto lexical data, and can bring to light distinct aspects of language history including both large-scale branching episodes, and small-scale local diffusion. They also caution that segregation of local interactions on linguistic grounds (for example, associated with rules defining group membership) can complicate the interpretation of branching signals in such datasets.
Bowern (2010) explores similar issues in the context of Australian linguistic prehistory and the expansion of the Pama-Nyungan languages. Using NeighborNet (Huson & Bryant 2006) and a fractionation of lexical items into more and less borrowable semantic classes (based on the prior assumption that, for example, body part terms diffuse less readily, while words for local plants, animals and locally adapted artefact kinds will be adopted more readily by an incoming group) she is able to distinguish a geographically meaningful branching signal, as well as evidence for continuous areal diffusion.
In an analogous study not of linguistic vocabulary, but of stylistic variation in a particular material cultural tradition, Cochrane & Lipo (2010) explore the early population history of remote Oceania. They analyse design variation in the characteristic ‘Lapita’ pottery traditions of these initial colonists, highlighting some limitations of cladistic approaches if the history of such traditions was characterized by considerable inter-societal diffusion, and exploring network methodologies to identify such vectors of lateral transfer.
The topology of the underlying population history on which specific processes of cultural evolution are superimposed comes up in several other contributions. Revisiting ‘Galton’s problem’, Fortunato & Jordan (2010) use phylogenetic techniques to fit a tree topology to Indo-European and Austronesian language histories (based on core vocabulary), and introduce methods to estimate ancestral states and stable equilibria for kinship systems within each of these language families. They are explicit that their methodology requires testable assumptions about the appropriateness of phylogenetic reconstructive techniques, and point out that their results will enable better-controlled analyses of the effects of social structure (for instance, sex-biased marital dispersal) on patterns of genetic diversity.
Others have meanwhile started to explore statistical techniques that might exploit parallels between the co-transmission of disparate cultural traits, and the biological processes of host–parasite co-speciation. At the heart of this approach is the assumption of some degree of parallel cladogenesis in two or more cultural lineages. Tehrani et al. (2010) introduce cophylogenetic methods from biology to estimate the degree of parallel evolution of distinct traditions (for example, language and material culture), pointing out that in such situations a simple branching signal of perfect coevolution owing to common vertical descent can be confounded not just by inter-societal diffusion of the more borrowable tradition, but also by the loss of localized variants (‘sorting’ events) and by heterogeneous innovation rates. Using the ‘jungles’ algorithm (Charleston 1998) as implemented in TREEMAP2.0 (Charleston & Page 2002), they illustrate the value of co-phylogenetic methods for teasing out such processes in empirical cases.
In a thematically related analysis using different case studies and alternative statistical techniques, Jordan & O’Neill (2010) analyse linguistic and material cultural datasets from the Pacific northwest coast, comparing the results of cladistic reconstructions with those obtained using NeighborNet to estimate the degree of inter-societal diffusion of house-building traditions (and the extent to which such diffusion may have followed ethnolinguistic lines). They also raise the question of how closely inter-societal cultural transfers of house-building techniques may have reflected the marital transfer and residence rules of the gender whose members were most responsible for those techniques’ transmission.
Finally, Gray et al. (2010) use NeighborNet (Huson & Bryant 2006) to analyse linguistic and material cultural datasets, and propose some new statistical indices of the level of reticulation in a phylogenetic network (in contrast with such cladistic measures as the Retention Index). Their approach recognizes that in historical analyses, anthropologists will typically be interested in both branching and diffusive processes and will wish to estimate the importance of each in any given case. Anthropologists will also typically want to estimate the degree of coupled transmission of disparate traits, to assess (for example) the extent to which material cultural attributes fractionate along ethnolinguistic lines. Gray et al.’s worked comparisons of Indo-European and Polynesian language history, and of Polynesian linguistic and material cultural diversity, illustrate the power of these new techniques.
4. FINAL COMMENT
The papers in this special issue illustrate the very significant contributions that evolutionary methods can bring to the cultural sciences, and also some of the key areas in which the greatest innovations are being made in method and theory. These papers also highlight the importance of the continued development of standardized and well-screened comparative datasets of linguistic, material cultural and social structural variation. Online archiving and public availability of both new software and new datasets will be critical for the further development of the field.
We thank the AHRC Centre for the Evolution of Cultural Diversity for sponsoring this symposium, and Manu Davies for coordinating the submission, refereeing and revision timetables. We also thank Claire Rawlinson for editorial guidance and assistance in the final stages of submission.
REFERENCES
Barbrook, A. C., Howe, C. J., Blake, N. & Robinson, P. 1998 The phylogeny of The Canterbury Tales. Nature 394, 839. (doi:10.1038/29667)
Bellwood, P. & Renfrew, C. (eds) 2002 Examining the farming/language dispersal hypothesis. Cambridge, UK: McDonald Institute for Archaeological Research.
Bowern, C. 2010 Historical linguistics in Australia: trees, networks and their implications. Phil. Trans. R. Soc. B 365, 3845–3854. (doi:10.1098/rstb.2010.0013)
Boyd, R. & Richerson, P. J. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.
Boyd, R. & Richerson, P. J. 2008 Response to our critics. Biol. Philos. 23, 301–315. (doi:10.1007/s10539-007-9084-8)
Boyd, R. & Richerson, P. J. 2010 Transmission coupling mechanisms: cultural group selection. Phil. Trans. R. Soc. B 365, 3787–3795. (doi:10.1098/rstb.2010.0046)
Cavalli-Sforza, L. L. & Feldman, M. W. 1981 Cultural transmission and evolution. Princeton, NJ: Princeton University Press.
Charleston, M. A. 1998 Jungles: a new solution to the host/parasite phylogeny reconciliation problem. Math. Biosci. 149, 191–223. (doi:10.1016/S0025-5564(97)10012-8)
Charleston, M. & Page, R. 2002 TREEMAP 2.0. See http://taxonomy.zoology.gla.ac.uk/%7emac/treemap/index.html.
Cisne, J. L. 2005 How science survived: medieval manuscripts’ ‘demography’ and classic texts’ extinction. Science 307, 1305–1307. (doi:10.1126/science.1104718)
Cochrane, E. E. & Lipo, C. P. 2010 Phylogenetic analyses of Lapita decoration do not support branching evolution or regional population structure during colonization of Remote Oceania. Phil. Trans. R. Soc. B 365, 3889–3902. (doi:10.1098/rstb.2010.0091)
Collard, M., Shennan, S. J. & Tehrani, J. 2006 Branching, blending and the evolution of cultural similarities and differences among human populations. Evol. Hum. Behav. 27, 169–184. (doi:10.1016/j.evolhumbehav.2005.07.003)
Currie, T. E., Greenhill, S. J. & Mace, R. 2010 Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits. Phil. Trans. R. Soc. B 365, 3903–3912. (doi:10.1098/rstb.2010.0014)
Evans, J. 1890 The coins of the ancient Britons: supplement. London, UK: B. Quaritch.
Farris, J. 1989 The retention index and the rescaled consistency index. Cladistics 5, 417–419. (doi:10.1111/j.1096-0031.1989.tb00573.x)
Fortunato, L. & Jordan, F. 2010 Your place or mine? A phylogenetic comparative analysis of marital residence in Indo-European and Austronesian societies. Phil. Trans. R. Soc. B 365, 3913–3922. (doi:10.1098/rstb.2010.0017)
Galton, F. 1889 Discussion of Tylor (1889). J. Anthropol. Inst. GB Ireland 18, 270.
Gray, R. D., Bryant, D. & Greenhill, S. J. 2010 On the shape and fabric of human history. Phil. Trans. R. Soc. B 365, 3923–3933. (doi:10.1098/rstb.2010.0162)
Guglielmino, C. R., Viganotti, C., Hewlett, B. & Cavalli-Sforza, L. L. 1995 Cultural variation in Africa: role of mechanisms of transmission and adaptation. Proc. Natl Acad. Sci. USA 92, 7585–7589. (doi:10.1073/pnas.92.16.7585)
Heggarty, P., Maguire, W. & McMahon, A. 2010 Splits or waves? Trees or webs? How divergence measures and network analysis can unravel language histories. Phil. Trans. R. Soc. B 365, 3829–3843. (doi:10.1098/rstb.2010.0099)
Huson, D. H. & Bryant, D. 2006 Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267. (doi:10.1093/molbev/msj030)
Jones, D. 2008 Kinship and deep history: exploring connections between culture areas, genes, and languages. Am. Anthropol. 105, 501–514. (doi:10.1525/aa.2003.105.3.501)
Jordan, P. & O’Neill, S. 2010 Untangling cultural inheritance: language diversity and long-house architecture on the Pacific northwest coast. Phil. Trans. R. Soc. B 365, 3875–3888. (doi:10.1098/rstb.2010.0092)
Kandler, A., Unger, R. & Steele, J. 2010 Language shift, bilingualism and the future of Britain’s Celtic languages. Phil. Trans. R. Soc. B 365, 3855–3864. (doi:10.1098/rstb.2010.0051)
Lipo, C. P., O’Brien, M. J., Collard, M. & Shennan, S. (eds) 2006 Mapping our ancestors: phylogenetic approaches in anthropology and prehistory. New York, NY: Aldine.
Mace, R. & Pagel, M. 1994 The comparative method in anthropology. Curr. Anthropol. 35, 549–564. (doi:10.1086/204317)
Mace, R., Holden, C. J. & Shennan, S. (eds) 2005 The evolution of cultural diversity: a phylogenetic approach. Walnut Creek, US: Left Coast Press.
Mesoudi, A., Whiten, A. & Laland, K. N. 2006 Towards a unified science of cultural evolution. Behav. Brain Sci. 29, 329–383. (doi:10.1017/S0140525X06009083)
Mulder, M. B., Nunn, C. L. & Towner, M. C. 2006 Cultural macroevolution and the transmission of traits. Evol. Anthropol. 15, 52–64. (doi:10.1002/evan.20088)
Naylor, G. & Kraus, F. 1995 The relationship between s and m and the retention index. Syst. Biol. 44, 559–562.
Nerbonne, J. 2010 Measuring the diffusion of linguistic change. Phil. Trans. R. Soc. B 365, 3821–3828. (doi:10.1098/rstb.2010.0048)
Nunn, C. L., Arnold, C., Matthews, L. & Mulder, M. B. 2010 Simulating trait evolution for cross-cultural comparison. Phil. Trans. R. Soc. B 365, 3807–3819. (doi:10.1098/rstb.2010.0009)
O’Brien, M. J., Lyman, R. L., Mesoudi, A. & VanPool, T. L. 2010 Cultural traits as units of analysis. Phil. Trans. R. Soc. B 365, 3797–3806. (doi:10.1098/rstb.2010.0012)
Pagel, M. 2009 Human language as a culturally transmitted replicator. Nat. Rev. Genet. 10, 405–415. (doi:10.1038/nrg2560)
Price, G. R. 1970 Selection and covariance. Nature 227, 520–521. (doi:10.1038/227520a0)
Robins, W. 2007 Editing and evolution. Lit. Compass 4, 89–120. (doi:10.1111/j.1741-4113.2006.00391.x)
Schleicher, A. 1863 Die Darwinsche Theorie und die Sprachwissenschaft. Weimar, Germany: H. Boehlau.
Sereno, M. I. 1991 Four analogies between biological and cultural/linguistic evolution. J. Theor. Biol. 151, 467–507. (doi:10.1016/S0022-5193(05)80366-2)
Smith, K., Kalish, M. L., Griffiths, T. L. & Lewandowsky, S. 2008 Cultural transmission and the evolution of human behaviour. Phil. Trans. R. Soc. B 363, 3469–3603. (doi:10.1098/rstb.2008.0147)
Steele, J., Glatz, C. & Kandler, A. 2010 Ceramic diversity, random copying, and tests for selectivity in ceramic production. J. Archaeol. Sci. 37, 1348–1358. (doi:10.1016/j.jas.2009.12.039)
Tehrani, J. J., Collard, M. & Shennan, S. J. 2010 The cophylogeny of populations and cultures: reconstructing the evolution of Iranian tribal craft traditions using trees and jungles. Phil. Trans. R. Soc. B 365, 3865–3874. (doi:10.1098/rstb.2010.0020)
Tylor, E. B. 1889 On a method of investigating the development of institutions; applied to laws of marriage and descent. J. Anthropol. Inst. GB Ireland 18, 245–272. (doi:10.2307/2842423)
van Wyhe, J. 2005 The descent of words: evolutionary thinking 1780–1880. Endeavour 29, 94–100. (doi:10.1016/j.endeavour.2005.07.002)
Weitzman, M. P. 1987 The evolution of manuscript traditions. J. R. Stat. Soc. Ser. A (General) 150, 287–308. (doi:10.2307/2982040)
Phil. Trans. R. Soc. B (2010) 365, 3787–3795 doi:10.1098/rstb.2010.0046
Transmission coupling mechanisms: cultural group selection
Robert Boyd1,* and Peter J. Richerson2
1 Department of Anthropology, University of California, Los Angeles, CA 90095, USA
2 School of Environmental Science and Policy, University of California, Davis, CA 95616, USA
The application of phylogenetic methods to cultural variation raises questions about how cultural adaptation works and how it is coupled to cultural transmission. Cultural group selection is of particular interest in this context because it depends on the same kinds of mechanisms that lead to tree-like patterns of cultural variation. Here, we review ideas about cultural group selection relevant to cultural phylogenetics. We discuss why group selection among multiple equilibria is not subject to the usual criticisms directed at group selection, why multiple equilibria are common, and why selection among multiple equilibria is not likely to be an important force in genetic evolution. We also discuss three forms of group competition and the processes that cause populations to shift from one equilibrium to another and create a mutation-like process at the group level.
Keywords: cultural transmission; multi-level selection; cultural adaptation
* Author for correspondence ([email protected]).
One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
This journal is © 2010 The Royal Society

1. INTRODUCTION
The application of phylogenetic methods to cultural variation has burgeoned over the past decade. As evidenced by the papers in this issue, this project has led to important new inferences about cultural history, human demography and human migrations. Much of this work applies phylogenetic statistical methods developed in biology to cultural data without reference to an explicit theory of cultural adaptation, and it leaves important unanswered questions that might be illuminated by a better understanding of how cultural evolution works. For example, cultural phylogenies allow the application of new statistical methods developed in evolutionary biology (e.g. Huelsenbeck et al. 2001) to solve Galton’s problem. This approach has proven very useful (e.g. Holden & Mace 2005), but it also raises important questions about the relationship between cultural phylogenies and cultural evolution. There is little gene flow between most biological species, and thus to a first approximation all of the genes within a species share a common history. This is definitely not true of cultural lineages: traits and trait complexes flow from one lineage to another, and thus different trait complexes may have different histories from each other and from the genes in the populations that carry them. As a consequence, it is sometimes not easy to know which phylogeny should be used to constrain phylogenetic inference. Linguistic phylogenies are often used, but, contrary to what is sometimes assumed, they are not histories of biological populations. Language phylogenies are appropriate for traits that are transmitted along with language, but are not appropriate for traits that have different patterns of transmission. Thus, it is important to ask how different processes of cultural adaptation work, and in particular to focus on adaptive processes that cause different trait complexes to have similar cultural histories.
For many traits, cultural evolutionary processes can lead to many distinct steady-state outcomes, or ‘multiple stable equilibria’. Which outcome is reached will then be determined by the accidents of initial conditions, and knowing the adaptive consequences of different traits does not allow us to predict the outcome. However, if such a population is subdivided into partially isolated subpopulations, adaptive processes can maintain different subpopulations near different equilibria. Then, if subpopulations near one equilibrium have lower extinction rates or produce more migrants, the variants that characterize that equilibrium can spread to the population as a whole. This process is not subject to the usual criticisms directed at group selection for altruistic variants because adaptation within groups does not compete with selection among groups. It can work even if populations are very large and migration rates are substantial. The main requirement is that rates of adaptation within groups are high when compared with rates of migration between them, and as a result this process is more likely to be important for cultural evolution than for genetic evolution. When these conditions are satisfied, group selection will lead to the spread of the most group-beneficial equilibrium.
In this paper, we review and discuss cultural group selection, distinguishing it from other group selection processes, and discussing the processes that lead to multiple equilibria, the processes that select among equilibria and the random processes that give rise to new group-level variants. We think that this process is especially relevant to cultural
phylogenetics because some of the same processes that lead to multiple equilibria tend to cause traits to be transmitted vertically within populations rather than horizontally among them.

2. GROUP SELECTION HAS MANY FACES
The modern group selection controversy began in the early 1960s when Wynne-Edwards (1962) proposed that a number of interesting bird behaviours evolved because they promoted group survival. Populations in which the behaviour was common survived and prospered, while those in which it was rare perished. While casual group functionalism was common in those days, Wynne-Edwards was much clearer than his contemporaries that it was selection among groups that gave rise to such group-level adaptations. The book generated a storm of controversy, and luminaries like Williams (1966) and Maynard Smith (1964) penned critiques explaining why this mechanism, then called group selection, was unlikely to be an important evolutionary process. Moreover, they also showed how such traits could evolve owing to individual and kin selection. The result was the beginning of an ongoing and highly successful revolution in our understanding of the evolution of animal behaviour, a revolution that is rooted in carefully thinking about the individual and nepotistic function of behaviours.
In the early 1970s, Price (1970, 1972) developed a powerful new mathematical formalism that describes all natural selection as going on in a series of nested levels: among genes within an individual, among individuals within groups and among groups. While this ‘multi-level’ approach and the older gene-centred approaches are mathematically equivalent, both have proved useful in understanding many evolutionary problems. However, the rise of the multi-level approach also led to confusion about what kinds of evolutionary processes should be called ‘group selection’.
Some authors use group selection to mean the process that Wynne-Edwards envisioned, namely selection between sizable groups made up of mostly genealogically distantly related individuals, while others use group selection to refer to selection involving any kind of group in a multi-level selection analysis, including even pairs of individuals interacting in, say, the hawk–dove game. The real scientific question is always: does the population structure in question lead to selection that favours genetic variants of interest? In the case of the mechanism proposed by Wynne-Edwards, we want to know whether selection among large groups of distantly related individuals, sometimes labelled ‘interdemic group selection’, can lead to the evolution of group-beneficial traits when it is opposed by individual selection. The answer to this question is fairly clear: only when groups are small or there is very little gene flow between them. To see why, it will be useful to introduce Price’s formalism. In a population structured into groups, the change in frequency of a gene undergoing selection, $\Delta p$, is given by

$$\Delta p \propto \underbrace{V_G \beta_G}_{\text{between groups}} + \underbrace{V_W \beta_W}_{\text{within groups}}$$

The first term gives the change owing to selection between groups, and the second term gives the change in frequency owing to selection within groups. The $\beta$s give the effect of the behaviour on the fitness of groups ($\beta_G$) and individuals ($\beta_W$). A behaviour is beneficial to the group when it increases group fitness, or $\beta_G > 0$. If it is costly to the individual, $\beta_W < 0$. The $V$s are the variance in gene frequency between groups ($V_G$) and within groups ($V_W$). Population genetics theory tells us that when groups are large, selection is weak, and there is even a modest amount of migration among them, the variance among individuals within groups ($V_W$) will be much larger than the variance between groups ($V_G$; Rogers 1990). Thus, unless selection within groups is much weaker than selection among groups ($\beta_G \gg |\beta_W|$), group selection cannot overcome opposing individual selection.
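The decomposition above can be checked numerically. The following is a minimal sketch, not the authors' model: it assumes equally sized groups and a simple linear fitness function, 1 + benefit × (group frequency) − cost × (carrier status), where `benefit`, `cost` and the function name `price_terms` are illustrative choices. Under these assumptions the regression coefficients are $\beta_G$ = benefit − cost and $\beta_W$ = −cost, and dividing by mean fitness turns the proportionality into an equality.

```python
def price_terms(group_freqs, benefit, cost):
    """Return (between, within, direct) changes in frequency for equally sized groups.

    Assumed fitness of an individual in group g: 1 + benefit * p_g - cost * a,
    where p_g is the group's frequency of the variant and a is 1 for carriers, else 0.
    """
    k = len(group_freqs)
    p_bar = sum(group_freqs) / k
    # V_G: variance of group frequencies; V_W: mean within-group variance p(1 - p).
    v_g = sum((p - p_bar) ** 2 for p in group_freqs) / k
    v_w = sum(p * (1 - p) for p in group_freqs) / k
    beta_g = benefit - cost  # effect of group frequency on mean group fitness
    beta_w = -cost           # effect of carrying the variant on individual fitness
    # Mean fitness in the whole population (the constant of proportionality).
    w_bar = 1 + (benefit - cost) * p_bar
    between = v_g * beta_g / w_bar
    within = v_w * beta_w / w_bar
    # Direct calculation of the frequency after selection, for comparison.
    carriers = sum(p * (1 + benefit * p - cost) for p in group_freqs)
    total = sum(1 + (benefit - cost) * p for p in group_freqs)
    direct = carriers / total - p_bar
    return between, within, direct


# Strongly differentiated groups versus nearly identical groups.
for freqs in ([0.1, 0.5, 0.9], [0.45, 0.5, 0.55]):
    between, within, direct = price_terms(freqs, benefit=0.3, cost=0.1)
    print(freqs, "between:", between, "within:", within, "total:", direct)
```

With the same `benefit` and `cost`, the group-beneficial variant increases only in the first case, where variation among groups is large relative to variation within them.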
3. INTERDEMIC GROUP SELECTION CAN BE IMPORTANT WHEN THERE ARE MULTIPLE STABLE EQUILIBRIA
This does not mean, however, that interdemic group selection is never important: it can play a crucial role in determining evolutionary outcomes when there are multiple stable equilibria. Interestingly, this idea dates to the early 1930s when the great population geneticist Sewall Wright (1931) first outlined his ‘shifting balance’ theory of evolution. Wright knew from his empirical work that interaction between genes often leads to evolutionary systems with multiple stable equilibria. The simplest case is underdominance at a single locus. Suppose there are two alleles, A and B, and that the fitnesses of the three genotypes are $W_{AA} = 1$, $W_{AB} = 1 - s$ and $W_{BB} = 1 + t$ (where $t, s > 0$). It is easy to see that populations in which either allele is common can resist invasion by the alternative allele. For example, if A is common, most of the A alleles will occur in AA homozygotes and thus have an average fitness of one, while most B alleles are in heterozygotes and have fitness $1 - s$. When B is common, it has higher fitness for the same reason. This means that if A is initially common, individual selection will never lead to the spread of the B allele, even though a population in which B is common has higher fitness.
However, Wright argued that group selection can lead to the spread of the B allele. Suppose that a large population is subdivided into a number of genetically well-mixed demes linked by low rates of gene flow. Selection is strong enough that in any given deme either A is common or B is common. Now, apply the Price equation to this population. Since one or the other allele is common in each deme, $V_W$ is small in all demes, and since selection within different demes pulls in opposite directions, the average value of $\beta_W$ will also be small.
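The two stable equilibria of the underdominance example can be seen in a short numerical sketch. It assumes random mating within a single well-mixed deme and uses the genotype fitnesses given above; the function names and the particular values of `s` and `t` are illustrative. Setting the marginal fitnesses of the two alleles equal gives an unstable threshold frequency of B at s / (2s + t).

```python
def next_freq(p, s, t):
    """One generation of selection on the frequency p of allele B (random mating)."""
    w_a = (1 - p) * 1.0 + p * (1 - s)      # marginal fitness of A: in AA or AB
    w_b = (1 - p) * (1 - s) + p * (1 + t)  # marginal fitness of B: in AB or BB
    w_bar = (1 - p) * w_a + p * w_b        # mean fitness of the deme
    return p * w_b / w_bar


def run(p0, s, t, generations=2000):
    """Iterate selection from an initial frequency p0 of allele B."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s, t)
    return p


def threshold(s, t):
    """Unstable interior equilibrium separating the two basins of attraction."""
    return s / (2 * s + t)


s, t = 0.1, 0.2
print("threshold frequency of B:", threshold(s, t))
print("start just below threshold:", run(0.2, s, t))
print("start just above threshold:", run(0.3, s, t))
```

Demes that start on either side of the threshold end up nearly fixed for one allele or the other, so almost all of the variation lies between demes rather than within them.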
Thus, the within-group component of the Price equation is close to zero—because each deme is near a stable equilibrium, selection within groups has little effect on the frequency of the two alleles. Now, consider the between-group term: because selection is much stronger than migration, there will be lots of variation among demes. Thus, if the fact that the B allele has higher average fitness translates into between-group selection, this process
will lead to the spread of that allele. This can happen in at least three different ways. First, higher average fitness could lead to more out-migration, and this in turn can lead to the spread of the B allele through differential proliferation, the basis of the third phase of Wright’s shifting balance model (Gavrilets 1995). Second, B will spread if groups with higher average fitness have lower extinction rates and new groups are formed by the fissioning of existing ones (Boyd & Richerson 1990). Finally, B will spread if higher average fitness attracts immigrants, and as a consequence the larger group spreads or splits (Boyd & Richerson 2009).

4. MANY PROCESSES LEAD TO MULTIPLE STABLE EQUILIBRIA
We believe that evolutionary systems with many equilibria are very common. Engineering experience suggests that even ordinary adaptive problems like the design of tools or shelters typically have many locally optimal solutions. The frequency dependence introduced by social interaction vastly multiplies the potential for multiple equilibria. For example, coordination systems resulting from communication, group movement and bargaining generate many equilibria. The possibilities for multiple equilibria are further increased by repeated interactions and contingent behaviour. Especially important are systems of moral norms enforced by reputation, retribution or reciprocity, which can stabilize a vast range of behaviours. Finally, a conformist bias in social learning can stabilize virtually any behaviour.

(a) Ordinary adaptive problems often have many solutions
Textbook examples of evolution as an optimization process sometimes portray the adaptive problem as climbing a smooth hill with a single local maximum. However, there are good reasons to believe that real adaptive problems often have vast numbers of locally optimal solutions. Real-world design problems have many dimensions that can interact in a complicated, nonlinear fashion.
Even seemingly simple problems have much hidden complexity. Consider, for example, the design of bows. The overall length of the bow affects how strongly it must be bent to generate a given amount of force. Thus, the optimal construction depends on length. Shorter bows must sustain greater strains, and this affects the best cross-section, the kind of wood that is used, how the wood is cut from the tree, whether the bow is sinew backed, whether the handle is live or static and a host of other attributes. The best choice for any given attribute affects what is best for others. Once most people in a society have converged on a particular solution, trial-and-error often will not generate progress because small changes will make the design worse. However, different groups may come to different solutions, which then can compete either directly, say in warfare, or indirectly to attract imitators.

(b) Coordination games
In many kinds of social interactions, individuals can increase their payoffs if they can coordinate their choices. Game theorists refer to such interactions as coordination games. Bargaining interactions provide a good illustration of why coordination games lead to multiple stable equilibria. Suppose in a particular population there are two cultural variants governing beliefs about inheritance: equal partition among brothers, and primogeniture (only the oldest brother inherits). To keep things simple, let us suppose that all families have exactly two sons, and that the payoffs associated with each combination of beliefs within a family are given in table 1. When brothers agree, they have a higher payoff than when they disagree because disputes are costly. This means that once either variant becomes common, people with the common variant achieve a higher payoff on average, and if cultural evolution is driven by payoffs (for example, because people imitate the successful), then both inheritance institutions will be evolutionarily stable. Also notice that coordination games may involve conflicts of interest. Younger sons prefer partition while older sons prefer primogeniture.

Table 1. Payoff matrix for a simple coordination game. Rows give the older son's belief and columns the younger son's belief; each cell lists the older son's payoff followed by the younger son's payoff.

older son \ younger son    partition    primogeniture
partition                  2,2          0,0
primogeniture              0,0          5,1

A wide range of social interactions give rise to coordination games. Classic examples are social conventions, such as driving on the right versus driving on the left, and matrilocal versus patrilocal post-marital residence. Signalling systems may also have many equilibria, and the behaviour of signaller and receiver has to be coordinated. People could signal their health or wealth in many different ways. Some may be better than others, but once the whole population converges on a given channel, individuals who deviate will lose out.

(c) Conformist social learning
There are good reasons to believe that social learning is often subject to a conformist bias, meaning that individuals are disproportionately likely to imitate the most common variant they observe in their social environment. Conformism makes sense if we think of the psychology of social learning as having been designed to acquire adaptive information.
Adaptive processes will tend to increase the frequency of the locally adaptive behaviour. Transmission and learning errors, and changing environments, will reduce it, but under many circumstances the most adaptive behaviour will, on average, be the most commonly observed behaviour. Thus, by preferentially imitating the most common behaviour, individuals will increase their chance of acquiring the most adaptive behaviour. This intuitive argument is supported by modelling work which indicates that selection favours a conformist psychology in variable environments (Henrich & Boyd 1998; McElreath et al. 2008) and when cultural
transmission is error-prone (Henrich & Boyd 2002). Laboratory experiments support these predictions (Efferson et al. 2008; McElreath et al. 2008).
Conformist social learning creates an evolutionary force that causes common variants to become more common and rare variants to become rarer. If this effect is strong compared with migration, then variation among groups can be maintained. To see why, think of a number of groups linked by migration. Assume that two cultural variants affect religious beliefs: ‘believers’ are convinced that moral people are rewarded after death and the wicked suffer horrible punishment for eternity, while ‘heretics’ do not believe in any afterlife. Because they fear the consequences, believers behave better than heretics: more honestly, charitably and selflessly. As a result, groups in which believers are common are more successful than groups in which heretics are common. Moreover, it is plausible that people’s decision to adopt one cultural variant or the other might not be strongly affected by content bias. True, people seek comfort, pleasure and leisure, and this can cause them to behave wickedly. However, a desire for comfort also causes people to worry about spending an eternity buried in a flaming tomb. Since people are uncertain about the existence of an afterlife, they might not be strongly biased in favour of one cultural variant or the other. As a result, they are strongly influenced by the cultural variant that is common in their society. People who grow up surrounded by believers choose to believe, while those who grow up among worldly atheists do not.
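The claim that conformist learning maintains variation among groups when it is strong relative to migration can be illustrated with a minimal sketch. It assumes two groups that exchange a fraction `migration` of their members each generation, and a commonly used conformist recursion in which the frequency p of a variant changes by `strength` × p(1 − p)(2p − 1); the parameter names and values are illustrative and are not taken from the text.

```python
def conformist_step(p, strength):
    """Conformist transmission: variants above 50 per cent gain, those below lose."""
    return p + strength * p * (1 - p) * (2 * p - 1)


def two_group_difference(strength, migration, generations=500):
    """Difference in the frequency of 'believers' between two groups after many generations."""
    p1, p2 = 0.9, 0.1  # believers start common in group 1 and rare in group 2
    for _ in range(generations):
        q1 = conformist_step(p1, strength)
        q2 = conformist_step(p2, strength)
        # Symmetric migration mixes the two groups after learning.
        p1 = (1 - migration) * q1 + migration * q2
        p2 = (1 - migration) * q2 + migration * q1
    return p1 - p2


print("weak migration:  ", two_group_difference(strength=0.3, migration=0.02))
print("strong migration:", two_group_difference(strength=0.3, migration=0.2))
```

In this sketch the difference between groups persists when migration is low relative to the conformist effect and disappears when migration is high, which is the condition described above.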
(d) Moralistic punishment
Moralistic punishment can also stabilize a very wide range of behaviours. To see why, consider the following simple example. Imagine a population subdivided into a number of groups. Cultural practices spread between groups because either people migrate, or they sometimes adopt ideas from neighbouring groups. Two alternative culturally transmitted moral norms exist in the population, norms that are to be enforced by moralistic punishment. Let us call them norm x and norm y. These could be ‘must wear a business suit at work’ and ‘must wear a dashiki to work’, or ‘a person owes primary loyalty to their kin’ and ‘a person owes primary loyalty to their group’. In groups where one of the two norms is common, people who violate the norm are punished. Suppose that people’s innate psychology causes them to be biased in favour of norm y, and therefore y will tend to spread, all other things being equal. Nonetheless, when norm x is sufficiently common, the effects of punishment overcome this bias and people tend to adopt norm x. In such groups, new immigrants whose beliefs differ from the majority (or people who have adopted ‘foreign’ ideas) rapidly learn that their beliefs get them into trouble and adopt the prevailing norm. When more believers in norm y arrive, they find themselves to be in the minority, rapidly learn the local norms and maintain norm x despite the fact that it is not the norm that fits best with their evolved psychology.
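The verbal example above can be turned into a small numerical sketch. Everything in it is an illustrative assumption rather than a model from the text: holders of each norm are punished in proportion to the frequency of the other norm (`punishment`), norm y has a fixed innate appeal (`bias`), people shift towards the norm with the higher net payoff, and a fraction `immigration` of each generation arrives holding norm y.

```python
def norm_step(q, punishment, bias, immigration):
    """Update q, the local frequency of norm x; immigrants all hold norm y."""
    # Net advantage of holding x: punishment avoided minus the innate pull towards y.
    advantage = punishment * (2 * q - 1) - bias
    q = q + q * (1 - q) * advantage
    return (1 - immigration) * q


def long_run(q0, punishment=0.5, bias=0.1, immigration=0.01, generations=2000):
    """Frequency of norm x after many generations, starting from q0."""
    q = q0
    for _ in range(generations):
        q = norm_step(q, punishment, bias, immigration)
    return q


print("x initially common:        ", long_run(0.95))
print("x initially at 50 per cent:", long_run(0.5))
print("x common, no punishment:   ", long_run(0.95, punishment=0.0))
```

Under these assumptions norm x persists where it starts out common, even with a steady inflow of people who hold norm y, but it is lost where it starts at half the group or where there is no punishment.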
5. INTERDEMIC GROUP SELECTION IS PROBABLY MORE IMPORTANT IN CULTURAL THAN GENETIC EVOLUTION
This mechanism only works when adaptation within groups is a much stronger evolutionary force than migration among groups, and thus it is not likely to be an important force in genetic evolution. Evolutionary biologists normally think of selection as being weak, and, although there are many exceptions to this rule, it is a useful generalization. So, for example, if one genotype had a 5 per cent selective advantage over the alternative genotype, this would be thought to be extremely strong selection. So, suppose that a novel, group-beneficial genotype has arisen, and that it has become common in one local group where it has a 5 per cent advantage over the genotype that predominates in the population as a whole. For group selection to be important, the novel type must remain common long enough to spread by group selection, and this will only be possible if the migration rate per generation is substantially less than 5 per cent. Otherwise, the effects of migration will swamp the effects of natural selection. But this is not very much migration. In most group-living primates, the members of one sex leave at sexual maturity, and there are about two generations present at any moment, and thus the migration rate between neighbouring primate groups is of the order of 25 per cent per generation. While migration rates are notoriously difficult to measure, migration rates are most likely high among small local groups that suffer frequent extinction. Migration rates between larger subdivisions of a population are probably much lower, but so too will be the extinction rates. In contrast, we know that social learning processes are very rapid, and that they can maintain behavioural differences among neighbouring human groups despite substantial flows of people and ideas between them.
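The arithmetic behind the 5 per cent and 25 per cent figures can be sketched with a simple one-way migration model. The assumptions are illustrative: the locally favoured type has relative fitness 1 + s, selection acts on a haploid (or culturally inherited) variant, and every immigrant carries the alternative type. Under these assumptions the local type can persist only if (1 − m)(1 + s) > 1, that is, if m < s / (1 + s).

```python
def local_frequency(s, m, p0=0.95, generations=5000):
    """Long-run frequency of a locally favoured type under selection s and immigration m."""
    p = p0
    for _ in range(generations):
        p = p * (1 + s) / (1 + s * p)  # selection favouring the local type
        p = (1 - m) * p                # immigrants all carry the alternative type
    return p


def max_migration(s):
    """Largest migration rate at which the local type can persist in this sketch."""
    return s / (1 + s)


s = 0.05
print("critical migration rate:", max_migration(s))
print("m = 0.01:", local_frequency(s, 0.01))
print("m = 0.25:", local_frequency(s, 0.25))
```

With a 5 per cent advantage the critical migration rate is a little under 5 per cent, so a primate-like rate of 25 per cent eliminates the local type. Rapid social learning corresponds to a much larger effective s, which raises the amount of migration a group can absorb.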
As a result, human groups are more like different species than populations of the same species, and this may be why phylogenetic methods work so well for cultural variation. If two human groups have different adaptations to the same ecological niche, the dynamics of their evolution often look more like competitive exclusion than conventional multi-level evolution in a metapopulation of the same species. The speed of cultural competitive exclusion is often enhanced because people moving from the losing group can be assimilated into the winning one, or because ideas diffuse from winning to losing groups. Barriers between human ‘species’ are often selective, limiting the effects of migration into the successful competitor but accelerating the flow of successful ideas into the less successful group. Recently, Lehmann, Feldman and colleagues (Lehmann & Feldman 2008; Lehmann et al. 2008) have published several theoretical studies which they claim show that culture does not facilitate the evolution of cooperation by the mechanisms outlined above. In each paper, they present models that they claim (in one case ‘exactly’: Lehmann et al. 2008, p. 22) capture the processes discussed here, and then derive results showing that culture makes it harder for selection to favour cooperative behaviour. These
claims are mistaken. In all of these papers, Lehmann and colleagues assume that selection (or analogous adaptive cultural processes) is weak enough that it can be ignored when calculating the variation among groups (or alternatively, the relatedness within groups). This ‘quasi-equilibrium’ assumption means that neither multiple adaptive equilibria nor conformist social learning maintains variation among groups. Instead, their models assume that groups are small enough that there is a substantial probability that two individuals chosen randomly from within a group acquired their culture from the same model, and, as a consequence, common descent and limited migration can give rise to substantial variation among groups. Because the way that variation is maintained is radically different in these models, they have no relevance to the processes discussed in this paper. Moreover, their explanation of between-group variation is not empirically plausible. In the modern world, there is substantial variation in beliefs and norms among ethnic groups and nation states that number millions of individuals (Bell et al. 2010). It is not plausible that four million Kamba (an East African ethnic group) share a language and many beliefs (Bell et al. 2010) because a substantial fraction of the Kamba acquired their beliefs by imitating the same person. Nor is this account believable for the small-scale societies that dominated most of human history because even in such societies, the scale of cultural variation is larger than the scale of everyday interaction. For example, Australian groups that shared a common language and culture typically numbered between 500 and 5000 (Keen 2004). If we assume that bands numbered between 10 and 100 people, and that everybody in a band imitates a single individual, then the formulae used by Lehmann and colleagues predict that only a small fraction of cultural variation will be between ethnolinguistic units.
6. THREE TYPES OF INTERGROUP COMPETITION HAVE BEEN STUDIED In the Origin of Species, Darwin (1859) famously argued that three conditions are necessary for adaptation by natural selection. First, there must be a ‘struggle for existence’ so that not all individuals survive and reproduce. Second, there must be variation so that some types are more likely to survive and reproduce than others. Finally, variation must be heritable so that the offspring of survivors resemble their parents. While Darwin usually focused on individuals, the same three postulates apply to any reproducing entity: molecules, genes and cultural groups. We have seen that rapid cultural adaptation in human societies combined with multiple equilibria gives rise to stable, between-group differences that are heritable at the group level. Symbolic boundary markers act to limit the flow of ideas from one group to the other. Thus, there will be adaptation at the group level as long as groups compete in such a way that the cultural variants that characterize successful groups spread. We have been able to think of three different mechanisms of intergroup competition.
(a) Variation in extinction rates The simplest mechanism is intergroup competition. The spread of the Nuer at the expense of the Dinka in the nineteenth century Sudan provides a good example. During the nineteenth century, each language group was divided into a number of independent polities. Cultural differences in norms between the two groups meant that the Nuer were able to organize larger war parties than the Dinka. The Nuer, who were driven by the desire for more grazing land, attacked and defeated their Dinka neighbours, occupied their territories and assimilated tens of thousands of Dinka into their communities. This example illustrates the requirements for cultural group selection by intergroup competition. Contrary to some critics (Palmer et al. 1997), there is no need for groups to be strongly bounded, individual-like entities. The only requirement is that there be persistent cultural differences between groups, and these differences must affect the group’s competitive ability (Boyd & Richerson 1990). Losing groups must be replaced by the winning groups. Interestingly, the losers do not have to be killed. The members of losing groups just have to disperse or to be assimilated into the victorious group. Losers will be socialized by conformity or punishment, so even very high rates of physical migration need not result in the erosion of cultural differences. This kind of group selection can be a potent force even if groups are usually very large. Group competition is common in small-scale societies. The best data come from New Guinea, which provides the only large sample of simple societies studied by professional anthropologists before they experienced major changes owing to contact with Europeans. Joseph Soltis (Soltis et al. 1995) assembled data from the reports of early ethnographers in New Guinea. Many studies report appreciable intergroup conflict and about half mention cases of social extinction of local groups. 
Five studies contained enough information to estimate the rates of extinction of neighbouring groups (table 2). The typical pattern is for groups to be weakened over a period of time by conflict with neighbours and finally to suffer a sharp defeat. When enough members become convinced of the group’s vulnerability to further attack, members take shelter with friends and relatives in other groups, and the group becomes socially extinct. At these rates of group extinction, it would take between 20 and 40 generations, or 500–1000 years, for an innovation to spread from one group to most of the other local groups by cultural group selection. These data suggest that cultural group selection is a fairly slow process. But then, so are the actual rates of increase in political and social sophistication we observe in the historical and archaeological records. Change in the cultural traditions that eventually led to large-scale social systems like those that we live in proceeded at a modest rate. The relatively slow rate of evolution of cultural group selection may explain the 5000-year lag between the beginnings of agriculture and the first primitive city-states, and the five millennia that passed between the origins of simple states and modern complex societies.
Table 2. Extinction rates for cultural groups from five regions in New Guinea. Adapted from Soltis et al. (1995).

region         number of groups   number of social extinctions   number of years   % groups extinct every 25 years
Mae Enga       14                 5                              50                17.9
Maring         13                 1                              25                7.7
Mendi          9                  3                              50                16.6
Fore/Usurufa   8–24               1                              10                31.2–10.4
Tor            26                 4                              40                9.6
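The last column of table 2 appears to follow from the other three by simple proportional scaling to a 25-year period. The sketch below checks this under that assumption; the scaling rule is inferred from the numbers, and the function and variable names are illustrative. It also restates the text's conversion of 20–40 generations into 500–1000 years, which implies a 25-year generation.

```python
GENERATION_YEARS = 25  # period used in the table and implied by the text


def percent_extinct_per_25_years(extinctions, groups, years):
    """Scale the observed fraction of extinct groups to a 25-year period."""
    return 100.0 * (extinctions / groups) * (GENERATION_YEARS / years)


# (region, number of groups, social extinctions, years observed) from table 2.
# Fore/Usurufa appears twice because the table gives a range of 8-24 groups.
TABLE_2 = [
    ("Mae Enga", 14, 5, 50),
    ("Maring", 13, 1, 25),
    ("Mendi", 9, 3, 50),
    ("Fore/Usurufa (8 groups)", 8, 1, 10),
    ("Fore/Usurufa (24 groups)", 24, 1, 10),
    ("Tor", 26, 4, 40),
]

if __name__ == "__main__":
    for region, groups, extinctions, years in TABLE_2:
        rate = percent_extinct_per_25_years(extinctions, groups, years)
        print(f"{region:26s} {rate:5.1f}% per {GENERATION_YEARS} years")
    # 20-40 generations expressed in years
    print(20 * GENERATION_YEARS, 40 * GENERATION_YEARS)
```

The computed values match the printed column to within rounding. The Mendi figure comes out at 16.7 rather than the printed 16.6, which is consistent with truncation rather than rounding in the source.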
(b) Imitation of successful neighbours A propensity to imitate the successful can also lead to the spread of group-beneficial variants. People often know about the norms that regulate behaviour in neighbouring groups. They know that we can marry our cousins here, but over there they cannot; or anyone is free to pick fruit here, while individuals own fruit trees there. Suppose different norms are common in neighbouring groups, and that one set of norms causes people to be more successful. Both theory and empirical evidence suggest that people have a strong tendency to imitate the successful (Henrich & Gil-White 2001; Richerson & Boyd 2005; McElreath et al. 2008). Consequently, behaviours can spread from groups at high payoff equilibria to neighbouring groups at lower payoff equilibria because people imitate their more successful neighbours. A mathematical model suggests that this process will spread group-beneficial beliefs from one group to another, resulting in a wave-like advance, and that this occurs over a wide range of conditions (Boyd & Richerson 2002). The model also suggests that such spread can be rapid. Roughly speaking, it takes about twice as long for a group-beneficial trait to spread from one group to another as it does for an individually beneficial trait to spread within a group. This kind of group selection is also likely to be faster than that owing to differential extinction because it readily leads to the recombination of group-beneficial strategies that initially arise in different groups (Boyd & Richerson 2002). The exact combination of strategies necessary to support complex, adaptive social institutions would seem unlikely to arise through a single chance event. It is much more plausible that complex institutions are assembled in numerous small steps. 
Differential extinction models are analogous to the evolution of an asexual population in that they lack any mechanism that allows the recombination of beneficial strategies that arise in different populations, and thus require innovations to occur sequentially in the same lineage. In contrast, the spread of ideas from successful groups allows recombination of different strategies and thus more rapid cumulative change. The rapid spread of Christianity in the Roman Empire may provide an example of this process. Between the death of Christ and the rule of Constantine, a period of about 260 years, the number of Christians increased from only a handful to somewhere between 6 and 30 million people (depending on whose estimate you accept). This sounds like a huge increase, but it turns out that it is equivalent to a 3–4% annual rate of increase, about the same as
the growth rate of the Mormon Church over the past century. According to the sociologist Rodney Stark, many Romans converted to Christianity because they were attracted to what they saw as a better quality of life in the early Christian community. Pagan society had weak traditions of mutual aid, and the poor and sick often went without any help at all. In contrast, in the Christian community norms of charity and mutual aid created ‘a miniature welfare state in an empire which for the most part lacked social services’ (Johnson 1976, p. 75, quoted in Stark 1997). Such mutual aid was particularly important during the several severe epidemics that struck the Roman Empire during the late Imperial period. Unafflicted pagan Romans refused to help the sick or bury the dead. As a result, some cities devolved into anarchy. In Christian communities, strong norms of mutual aid produced solicitous care of the sick, and reduced mortality. Both Christian and Pagan commentators attribute many conversions to the appeal of such aid. For example, the emperor Julian (who detested Christians) wrote in a letter to one of his priests that Pagans needed to emulate the virtuous example of the Christians if they wanted to compete for their souls, citing ‘their moral character even if pretended’ and ‘their benevolence toward strangers’ (Stark 1997, pp. 83–84). Middle-class women were particularly likely to convert to Christianity, probably because they had higher status and greater marital security within the Christian community. Roman norms allowed polygyny, and married men had great freedom to have extramarital affairs. In contrast, Christian norms required faithful monogamy. Pagan widows were required to remarry, and when they did, they lost control of all of their property. Christian widows could retain property, or, if poor, would be sustained by the church community. Demographic factors were also important in the growth of Christianity.
Mutual aid led to substantially lower mortality rates during epidemics, and a norm against infanticide led to substantially higher fertility among Christians. This form of group selection may also explain the spread of moral norms that stigmatize ‘victimless’ crimes, for example, drunkenness or prostitution. There is by now a large literature that indicates that people often have time-inconsistent preferences and as a result, they often make choices in the short run that they know are not in their long-run interest. It is plausible that social norms help people solve these problems by creating short-run incentives to do the right thing. I may not be able to resist a drink when the costs are all in the distant future, but make a different decision if I suffer immediate social disapproval. It is
also easy to see why such norms persist once they are established. If everyone agrees that self-control is proper behaviour and punishes people who disagree, then the norm will persist. The problem is that the same mechanism can stabilize any norm. People could just as easily agree that excessive drinking is proper behaviour and punish teetotalers. If, however, groups in which drinkers are stigmatized achieve better outcomes, and if those outcomes are observable, the norm can spread from one group to another by differential imitation.
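The annual growth rate quoted earlier in this section for early Christianity can be checked with a compound-growth calculation. The text says only that the starting number was 'a handful', so the starting population of 1000 used below is an assumption made purely for illustration. The end points of 6 and 30 million and the period of 260 years are taken from the text.

```python
def annual_growth_rate(start, end, years):
    """Constant annual growth rate that turns `start` into `end` in `years`."""
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    START = 1_000   # hypothetical starting number; the text says 'a handful'
    YEARS = 260     # period given in the text
    for final in (6_000_000, 30_000_000):
        rate = annual_growth_rate(START, final, YEARS)
        print(f"{final:>10,d} Christians -> {100 * rate:.2f}% per year")
```

Under this assumption both end points give rates between 3 and 4.1 per cent per year, in line with the 3–4% stated in the text. A smaller starting number would raise the implied rate only modestly, because the rate depends on the logarithm of the overall increase.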
(c) Selective migration Selective migration, the tendency of people to move from less desirable to more desirable societies, can also lead to the spread of some kinds of equilibria. We are very familiar with this process in the modern world where streams of migrants flow from societies that migrants perceive as offering them fewer opportunities toward ones that appear to offer them more (Martin 2005). The extensive literature on this topic (e.g. Alba & Nee 2003; Borjas 1994) supports two generalizations: (i) that migrants flow from societies where immigrants find their prospects poor to those where they perceive them to be better, and (ii) that most immigrant populations assimilate to the host culture within a few generations. Ethnographic evidence suggests that selective immigration is not limited to industrialized nation states, and thus may be an ancient phenomenon (Knauft 1985; Cronk 2002). The spread of cultural institutions associated with ancient complex societies, such as China, Rome and India, supports the idea that this process is not new. Ancient imperial systems often expanded militarily but the durable ones, such as Rome, succeeded by assimilating conquered peoples and by inducing a flow of migrants across their boundaries. Although the Roman empire eventually faded, its most attractive institutions were adapted by successor polities and persist in modified form to this day. Rome, India, China and Islamic civilization stand in stark contrast to pure conquest empires like that of the Mongols, which expanded but did not assimilate. A simple mathematical model of this process (Boyd & Richerson 2009) indicates that it has two qualitatively different evolutionary outcomes. The model assumes that there are two possible evolutionary equilibria in an isolated population, and one equilibrium leads to higher average welfare than the other. The population is subdivided into two subpopulations linked by migration.
There is more migration from low-payoff to high-payoff subpopulations than the reverse. When local adaptation is strong enough relative to migration to maintain cultural variation among subpopulations, the population as a whole evolves towards a polymorphic equilibrium at which the variant that produces higher average welfare is more common, but the lower payoff variant also persists. Initial subpopulation size and the sizes of the basins of attraction play relatively minor roles. When migration is stronger, however, initial population sizes and sizes of the basins of attraction predominate. The variant that is common in the
larger of the two populations tends to spread and the other variant tends to disappear even if it yields a higher payoff.
7. THREE PROCESSES CAN SHIFT GROUPS TO NOVEL EQUILIBRIA Selection always requires a source of variation, and so it is with all of the group-selection mechanisms described above. If all of the groups are characterized by the same equilibrium behaviour, group selection can have no effect. There must be some process that causes groups to shift from one equilibrium to another, the analogue of a new mutation at the group level. Three different processes may have this effect. First, sampling variation affecting who happens to get copied and who happens to interact with whom will generate random changes in the frequencies of cultural variants analogous to genetic drift, and these will occasionally cause populations to shift from the neighbourhood of one stable equilibrium to a second equilibrium. In large populations, the waiting time until such shifts occur can be very long (e.g. Lande 1985). Second, environments that vary in time so that adaptive forces shift magnitude and direction can also create drift-like forces that lead to shifts from one peak to another, but these forces do not depend on population size (Gillespie 2000). Note that the environmental variation need not directly affect the trait in question if the transmission of different traits is linked, for example, because individuals tend to acquire a suite of traits from the same individual. Finally, the frequency of cultural traits is affected by learning, and chance variation in cues from the environment will lead to drift-like shifts in trait frequencies. Moreover, if the cues available to different individuals in the population are correlated, this could lead groups to shift rapidly from the basin of attraction of one equilibrium to that of a second equilibrium. For example, according to Dower (1999), the experience of losing World War II led many Japanese people to adopt strongly pacifist beliefs.
If things had gone differently at Midway, Japan might not have lost, and the Japanese population might have instead maintained their previously held, strongly militaristic beliefs.
8. GENE–CULTURE COEVOLUTION Over longer time scales, social environments shaped by cultural group selection may have affected the genetic evolution of the human species. The archaeological record suggests that cumulative cultural evolution arose in the human lineage sometime between 250 and 500 thousand years ago. As a consequence, social environments shaped by cultural group selection may have generated novel selection pressures on genes influencing human social behaviour. For example, the existence of group-beneficial norms enforced by moralistic punishment might select for moral emotions like shame, and cognitive mechanisms like cheater detection because such genetically transmitted adaptations reduced the chance that their bearers would be punished (Richerson & Boyd 2005). It has also been suggested that cultural group
selection may explain the low levels of genetic variation within the human species (Premo & Hublin 2009). These authors argue that cultural variation between groups reduces the amount of gene flow among groups, and this in turn increases the fraction of human genetic variation among groups. Then, competition between culturally different groups led to group extinction, and thus reduced the genetic variation in the human species as a whole.
9. CONCLUSION: WHAT COMES NEXT? The theory of cultural group selection is fairly well worked out, and there are a number of convincing examples of the process at work. We believe that three kinds of additional research will be especially valuable. First, there has been little systematic quantitative empirical work that allows an assessment of the relative importance of cultural group selection compared with other processes that shape cultural variation. The main exception is the work of Soltis et al. (1995) estimating group extinction rates, described above. Similar estimates for a wider range of societies would be useful, as would analogous work on group selection by differential imitation and differential migration. Second, group selection predicts that societies should exhibit design at the group level, so that we should be able to understand the structure and variation of norms in terms of how they enhance group welfare (Wilson 2002). Of course, there is a long tradition of functionalist explanation in the social sciences, but for the most part this work takes the form of group-level just-so stories. What is needed are sharp, testable hypotheses about how group-functional behaviours, especially group-functional norms, should vary with ecology, group size and other measurable variables. The field of ‘law and economics’ is a rich source of such hypotheses (e.g. Posner 1980). Finally, we believe that group selection should leave detectable patterns in the ethnographic and archaeological records.
There is a rich body of techniques for detecting individual selection using correlations among traits and biogeographic patterns, and the analogous methods may be useful in detecting group selection.
We would like to thank Joe Henrich, Ruth Mace, Richard McElreath, Luke Premo, Stephen Shennan and James Steele for useful discussions of the ideas reported in this paper.
REFERENCES
Alba, R. & Nee, V. 2003 Remaking the American mainstream: assimilation and the new immigration. Cambridge, MA: Harvard University Press.
Bell, A. V., Richerson, P. J. & McElreath, R. 2010 Culture rather than genes provides greater scope for the evolution of large-scale human prosociality. Proc. Natl Acad. Sci. USA 106, 17 671–17 674. (doi:10.1073/pnas.0903232106)
Borjas, G. J. 1994 The economics of immigration. J. Econ. Lit. 32, 1667–1717.
Boyd, R. & Richerson, P. J. 1990 Group selection among alternative evolutionarily stable strategies. J. Theor. Biol. 145, 331–342. (doi:10.1016/S0022-5193(05)80113-4)
Boyd, R. & Richerson, P. J. 1992 Punishment allows the evolution of cooperation (or anything else) in sizable groups. Ethol. Sociobiol. 13, 171–195. (doi:10.1016/0162-3095(92)90032-Y)
Boyd, R. & Richerson, P. J. 2002 Group beneficial norms spread rapidly in a structured population. J. Theor. Biol. 215, 287–296. (doi:10.1006/jtbi.2001.2515)
Boyd, R. & Richerson, P. J. 2009 Voting with your feet: payoff biased migration and the evolution of group beneficial behavior. J. Theor. Biol. 257, 331–339. (doi:10.1016/j.jtbi.2008.12.007)
Cronk, L. 2002 From true Dorobo to Mukogodo Maasai: contested ethnicity in Kenya. Ethnology 41, 27–49. (doi:10.2307/4153019)
Darwin, C. 1859 On the Origin of Species by Means of Natural Selection, 1st edn. London, UK: John Murray.
Dower, J. W. 1999 Embracing defeat: Japan in the wake of World War II. New York, NY: Blackstone, W. W. Norton Co.
Efferson, C., Lalive, R., Richerson, P. J., McElreath, R. & Lubell, M. 2008 Conformists and mavericks: the empirics of frequency-dependent cultural transmission. Evol. Hum. Behav. 29, 56–64. (doi:10.1016/j.evolhumbehav.2007.08.003)
Gavrilets, S. 1995 On phase three of the shifting balance theory. Evolution 50, 1034–1041. (doi:10.2307/2410644)
Gillespie, J. H. 2000 Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155, 909–919.
Henrich, J. & Boyd, R. 1998 The evolution of conformist transmission and the emergence of between-group differences. Evol. Hum. Behav. 19, 215–241. (doi:10.1016/S1090-5138(98)00018-X)
Henrich, J. & Boyd, R. 2002 On modeling cognition and culture: why cultural evolution does not require replication of representations. Cult. Cogn. 2, 87–112. (doi:10.1163/156853702320281836)
Henrich, J. & Gil-White, F. J. 2001 The evolution of prestige: freely conferred deference as a mechanism for enhancing the benefits of cultural transmission. Evol. Hum. Behav. 22, 165–196. (doi:10.1016/S1090-5138(00)00071-4)
Holden, C. & Mace, R. 2005 The cow is the enemy of matriliny: using phylogenetic methods to study cultural evolution in Africa. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. Holden & S. Shennan). London, UK: UCL Press.
Huelsenbeck, J. P., Ronquist, F., Nielsen, R. & Bollback, J. P. 2001 Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294, 2310–2314. (doi:10.1126/science.1065889)
Johnson, P. 1976 A history of Christianity. London, UK: Weidenfeld & Nicolson.
Keen, I. 2004 Aboriginal economy and society: Australia at the threshold of colonisation. Oxford, UK: Oxford University Press.
Knauft, B. M. 1985 Good company and violence: sorcery and social action in a lowland New Guinea society. In Studies in Melanesian anthropology. Berkeley, CA: University of California Press.
Lande, R. 1985 Expected time for random genetic drift of a population between stable phenotypic states. Proc. Natl Acad. Sci. USA 82, 7641–7645. (doi:10.1073/pnas.82.22.7641)
Lehmann, L. & Feldman, M. W. 2008 Cultural transmission can inhibit the evolution of altruistic helping. Theor. Popul. Biol. 73, 506–515. (doi:10.1016/j.tpb.2008.02.004)
Lehmann, L., Feldman, M. W. & Foster, K. 2008 Cultural transmission can inhibit the evolution of altruistic helping. Am. Nat. 173, 12–24. (doi:10.1086/587851)
Martin, P. 2005 Migrants in the global labor market. pp. 1–57. Global Commission on International Migration (see http://www.gcim.org/en/).
Maynard Smith, J. 1964 Group selection and kin selection. Nature 201, 1145–1147. (doi:10.1038/2011145a0)
McElreath, R., Bell, A., Efferson, C., Lubell, M., Richerson, P. J. & Waring, T. 2008 Beyond existence and aiming outside the laboratory: estimating frequency-dependent and payoff-biased social learning strategies. Phil. Trans. R. Soc. B 363, 3515–3528. (doi:10.1098/rstb.2008.0131)
Palmer, C. T., Fredrickson, B. E. & Tilley, C. F. 1997 Categories and gatherings: group selection and the mythology of cultural anthropology. Evol. Hum. Behav. 18, 291–308. (doi:10.1016/S1090-5138(97)00045-7)
Posner, R. 1980 A theory of primitive society, with special reference to law. J. Law Econ. 23, 1–53. (doi:10.1086/466951)
Premo, L. L. & Hublin, J. J. 2009 Culture, population structure, and low genetic diversity among Pleistocene hominins. Proc. Natl Acad. Sci. USA 106, 33–37. (doi:10.1073/pnas.0809194105)
Price, G. R. 1970 Selection and covariance. Nature 227, 520–521. (doi:10.1038/227520a0)
Price, G. R. 1972 Extensions of covariance selection mathematics. Ann. Human Genet. 35, 485–490. (doi:10.1111/j.1469-1809.1957.tb01874.x)
Richerson, P. J. & Boyd, R. 2005 Not by genes alone: how culture transformed human evolution. Chicago, IL: University of Chicago Press.
Rogers, A. R. 1990 Group selection by selective emigration: the effects of migration and kin structure. Am. Nat. 135, 398–413. (doi:10.1086/285053)
Soltis, J., Boyd, R. & Richerson, P. J. 1995 Can group-functional behaviors evolve by cultural group selection? An empirical test. Curr. Anthropol. 36, 437–494. (doi:10.1086/204381)
Stark, R. 1997 The rise of Christianity: how the obscure, marginal Jesus movement became the dominant religious force in the Western world in a few centuries. San Francisco, CA: Harper/Collins.
Williams, G. C. 1966 Adaptation and natural selection: a critique of some current evolutionary thought. Princeton, NJ: Princeton University Press.
Wilson, D. S. 2002 Darwin’s cathedral: evolution, religion, and the nature of society. Chicago, IL: University of Chicago Press.
Wright, S. 1931 Evolution in Mendelian populations. Genetics 16, 97–159.
Wynne-Edwards, V. C. 1962 Animal dispersion in relation to social behavior. Edinburgh, UK: Oliver and Boyd.
Phil. Trans. R. Soc. B (2010) 365, 3797–3806 doi:10.1098/rstb.2010.0012
Cultural traits as units of analysis
Michael J. O’Brien1,*, R. Lee Lyman1, Alex Mesoudi2 and Todd L. VanPool1
1 Department of Anthropology, University of Missouri, 107 Swallow Hall, Columbia, MO 65211, USA
2 School of Biological and Chemical Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
Cultural traits have long been used in anthropology as units of transmission that ostensibly reflect behavioural characteristics of the individuals or groups exhibiting the traits. After they are transmitted, cultural traits serve as units of replication in that they can be modified as part of an individual’s cultural repertoire through processes such as recombination, loss or partial alteration within an individual’s mind. Cultural traits are analogous to genes in that organisms replicate them, but they are also replicators in their own right. No one has ever seen a unit of transmission, either behavioural or genetic, although we can observe the effects of transmission. Fortunately, such units are manifest in artefacts, features and other components of the archaeological record, and they serve as proxies for studying the transmission (and modification) of cultural traits, provided there is analytical clarity over how to define and measure the units that underlie this inheritance process. Keywords: cultural traits; cultural transmission; ideational units; classes; design space
1. INTRODUCTION Cultural traits are units of transmission that permit diffusion and create traditions—patterned ways of doing things that exist in identifiable form over extended periods of time. As with genes, cultural traits are subject to recombination, copying error, and the like and thus can be the foundation for the production of new traits. In other words, cultural traits can be both inventions—new creations—and innovations— inventions that successfully spread (Schumpeter 1934). Because they can exist at various scales of inclusiveness and can exhibit considerable flexibility, cultural traits have many of the characteristics of Hull’s (1981) ‘replicators’—entities that pass on their structure directly through replication (Williams 2002). Archaeologists and other social scientists often distinguish between biologically based (innate) behavioural traits and cultural traits, the former being a reflection of one’s genotype and the latter the result of learning (e.g. Williams 1992; Boone & Smith 1998). This is a false dichotomy (Shennan 2002; Mesoudi & O’Brien 2009). ‘Biological’ means living; thus, all human behaviour is biological. Further, ‘innate’ behaviours typically include cultural components, both innate and learned. Learning a language, a quintessential cultural trait, requires cultural transmission, but it also requires the appropriate mental facilities, which result from the interaction between an individual’s genes and the environment (Nettle 2006). Thus, language is a cultural trait because it requires the transmission of cultural information in addition to other environmental and
* Author for correspondence ([email protected]). One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
genetic elements. Cultural transmission occurs when both the necessary genes and environmental factors (including cultural traits) are present. Cultural traits that are transmitted through behaviour are a fundamental component of human phenotypes and are one, but clearly not the only, component necessary for cultural transmission. Once transmitted, cultural traits serve as units of replication in that they can be modified as part of an individual’s cultural repertoire through processes such as recombination (new associations with other cultural traits), loss (forgetting) or partial alteration (incomplete learning, personal experience or forgetting select components) within an individual’s mind (Eerkens & Lipo 2005). In this regard, cultural traits are analogous to genes in that organisms replicate them, but they are also replicators in their own right. However, the transmission of these units is behavioural, and it uses mutually understandable spoken or written language, physical imitation or some combination. No one has ever seen a unit of transmission, either behavioural or genetic, although we can observe the effects of transmission. Genes and behavioural traits become units of transmission only in specific environmental contexts, meaning that although one can talk abstractly about them, their definition as an analytically useful unit depends on environmentally specific elements. Fortunately, such units are manifest in artefacts, features and other components of the archaeological record, and they serve as proxies for studying the transmission (and modification) of cultural traits (Leonard & Jones 1987; VanPool 2003). The applicability of an evolutionary framework to these traits has been previously defended (e.g. Lyman & O’Brien 1998; VanPool & VanPool 2003; Mesoudi & O’Brien 2009); here we point out only that these behavioural traits are transmitted between
This journal is © 2010 The Royal Society
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
M. J. O’Brien et al. Cultural traits as units of analysis
people in an evolutionary process of descent with modification. Our concern is with how to define and measure the units that underlie this inheritance process. What sort of unit will be useful for measuring cultural transmission?
[Figure 1 graphic: axes are dimension X (states 1, 2), dimension Y (states I, II, III) and dimension Z (states A, B).]
Phil. Trans. R. Soc. B (2010)
2. KINDS OF UNITS
Evolutionary archaeologists have examined various units that have been proposed to track cultural transmission (e.g. Dunnell 1971, 1986; Lipo et al. 1997; O’Brien & Lyman 2000, 2002; Lipo & Madsen 2001), in the process emphasizing the critical distinction between two kinds of units—ideational and empirical. A projectile point is an empirical unit—we can see and feel it—but its properties are measured using ideational units, which include the characters and the various states in which they reside. The character ‘notch angle’ is an ideational unit, as are its various states (30°, 40–50°, and so on). Ideational units can be descriptional, used merely to characterize a thing (e.g. recording projectile-point colour for descriptive purposes), or they can be theoretical, created for specific analytical purposes (e.g. projectile-point notch-angle units such as 1–30°, 31–60° and 61–90°, each of which corresponds to a functional distinction) (Dunnell 1986). A theoretical unit is a special kind of ideational unit—one that has explanatory significance because of, and only because of, its theoretical relevance to the problem at hand. Colour could be a theoretical unit if we were interested in why prehistoric potters painted their bowls certain colours but not others, but it is unlikely that it would play a role in the functional analysis of projectile points. Ideational units are important in two ways. First, they are essential to defining cultural traits, given that archaeologists study cultural replication indirectly through artefacts and other components of the archaeological record. Second, the transmission of cultural traits is contingent on ideational units, making them an essential component of cultural replicators. Humans use ideational units when learning and communicating behavioural information.
For example, a manufacturer of, say, projectile points, thinks of his intended creation using ideational units: ‘I need a 6-inch-long point that is 2 inches wide and has 60-degree notches instead of the usual 40-degree notches.’ Those units—inches and degrees—cannot be anything else but ideational because we cannot ‘see’ or ‘feel’ them. The manufacturer then uses ideational units to create the object and can also describe the object using ideational units. The actual specimen that he creates—a 6-inch-long projectile point—is an empirical unit in that it can be seen and felt. Ideational units reduce the need for repetitive, and costly, experimentation with, for example, each newly produced atlatl and dart. Our ability to forego repetitive experimentation sets humans apart from other culture-bearing animals and is based on cultural transmission, which itself is based on the ability to think in, as well as transmit and receive, ideational units (Mesoudi & O’Brien 2008c). In fact, behavioural transmission is typically focused on the transmission of specific ideational units in that they allow fidelity
Figure 1. A simple three-dimensional classification system showing the intersection of the character states of each character. Twelve classes are represented (2 × 3 × 2), which collectively define the design space.
in cultural transmission by allowing an individual to copy the ‘intent’ as opposed to simply the ‘object’ of another. Notice that there are two things going on here: ideational units are used by both the person making a stone tool—systemic context—and the person later studying the tool—archaeological context (Schiffer 1972). The only difference between the two contexts is in terms of the ultimate role played by the units—replication in the systemic context, analysis in the archaeological context. Some archaeological studies of transmission employ a particular kind of ideational unit, the class, which is a measurement unit that specifies the necessary and sufficient conditions that a specimen must possess to be classified as a member of that unit (class). The advantage of using this kind of classification system is that various combinations of ideational units that define cultural traits can be specified. We can consider the packages of cultural traits that are transmitted (classes with members), those that are not (classes without members), and the differential persistence of behavioural traits through time or space (changes in the occurrence or frequencies of particular classes). What does a class reflect? If a class has sustained replicative success (Leonard & Jones 1987), the short answer is behaviour that, at least in part, reflects cultural transmission. The class does not reconstruct a cultural trait any more than the distal breadth of a fossil hominid humerus reconstructs the underlying genes; rather, it serves as a proxy for one or more cultural traits. Of considerable analytical interest is the concept of design space, an n-dimensional hyperspace (meaning that it is non-Euclidian) defined by the intersection of all possible character states of mutually exclusive characters. Figure 1 illustrates a three-dimensional space (X, Y and Z). There are 12 positions (2 × 3 × 2) that define our three-dimensional hyperspace—1-II-A,
2-III-B, and so on. Those positions are classes—mutually exclusive units defined by the intersections of character states of the three characters (X, Y and Z). We label these paradigmatic classes (Dunnell 1971; O’Brien & Lyman 2000). Although figure 1 shows a three-dimensional design space, the number of characters and character states included in a particular classification is unrestricted. Importantly, some classes may have no empirical members, meaning that those parts of our design space are empty. Empty design space is just as important analytically as filled design space. We point out that for the sake of simplicity, our treatment here focuses on a monothetic view of design space when in fact people often or mostly think about things as polythetic groups. This will have implications for our upcoming discussion of recipes and hierarchies. Design-space analysis has been a focus of much of our recent work (e.g. O’Brien et al. 2001, 2002; VanPool 2003; Darwent & O’Brien 2006; Lyman et al. 2008, 2009) aimed at tracking the appearance and disappearance of characters and character states over time and space. An ongoing project studying the evolution of projectile points in the southeastern United States that date ca 11 050–10 500 radiocarbon years before the present illustrates our approach. Instead of using traditional artefact types, we used classes defined by eight characters with a variable number of character states (table 1 and figure 2). The selected characters are those that are expected to change the most as a result of cultural transmission (e.g. Beck 1995; Hughes 1998). Our classification gives us the ability to monitor changes in characters through time at the scale of single characters or packages of linked characters. As we discuss below, projectile points (and all other artefacts/features) are higher level traits that comprise any number of lower level traits.
Classification used to define design space allows us to shift back and forth between different levels.
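The paradigmatic classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the dimension names and states follow figure 1, the specimen list is invented for the example, and the number of states per character is read off table 1 as printed.

```python
from itertools import product
from math import prod

# Character states of the three dimensions in figure 1.
DIMENSIONS = {
    "X": ["1", "2"],
    "Y": ["I", "II", "III"],
    "Z": ["A", "B"],
}

def design_space(dimensions):
    """Return every paradigmatic class as a tuple of one state per character."""
    return list(product(*dimensions.values()))

def split_by_membership(classes, specimens):
    """Separate classes with empirical members from empty design space."""
    observed = set(specimens)
    filled = [c for c in classes if c in observed]
    empty = [c for c in classes if c not in observed]
    return filled, empty

classes = design_space(DIMENSIONS)

# Hypothetical specimens, each already coded as a class (assumption for illustration).
specimens = [("1", "II", "A"), ("2", "III", "B"), ("1", "II", "A")]
filled, empty = split_by_membership(classes, specimens)

# Number of states per character I-VIII, as listed in table 1.
TABLE_1_STATE_COUNTS = [4, 4, 3, 6, 6, 3, 2, 6]
table_1_size = prod(TABLE_1_STATE_COUNTS)

print(len(classes), len(filled), len(empty), table_1_size)
```

With the state counts as printed in table 1, the eight characters define 4 × 4 × 3 × 6 × 6 × 3 × 2 × 6 = 62 208 possible classes, against the 12 classes of figure 1.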
3. HIERARCHIES OF UNITS
Just as the human brain is equipped to recognize the difference between ideational and empirical phenomena, it is also equipped to arrange phenomena hierarchically (Atran 1998). This is manifest in the manufacture and use of such things as ceramic vessels, which are cultural traits that comprise hierarchically lower traits such as various kinds of temper or manufacturing techniques and themselves are parts of higher level traits such as diet and cuisine choices, food storage, and so on. Further, traits at the same level may be independent, in that their variation is not directly linked (e.g. temper type and form of painted design), yet may also be dependent on the same higher level traits (e.g. pottery use). Pocklington & Best (1997) argue that appropriate units of selection for tracing cultural adaptation will be the largest units that reliably and repeatedly withstand transmission. These presumably will reflect multiple cultural traits, just as most somatic adaptations typically reflect multiple genetic sites. Why the largest unit? Pocklington and Best see two reasons. First, the evolution of smaller units is probably
Table 1. System used to classify projectile points from the midwestern and southeastern United States.

character: character states
I. location of maximum blade width: 1. proximal quarter; 2. second-most proximal quarter; 3. second-most distal quarter; 4. distal quarter
II. base shape: 1. arc-shaped; 2. normal curve; 3. triangular; 4. Folsomoid
III. basal-indentation ratio (a): 1. no basal indentation; 2. 0.90–0.99 (shallow); 3. 0.80–0.89 (deep)
IV. constriction ratio (b): 1. 1.00; 2. 0.90–0.99; 3. 0.80–0.89; 4. 0.70–0.79; 5. 0.60–0.69; 6. 0.50–0.59
V. outer tang angle: 1. 93°–115°; 2. 88°–92°; 3. 81°–87°; 4. 66°–80°; 5. 51°–65°; 6. ≤50°
VI. tang-tip shape: 1. pointed; 2. round; 3. blunt
VII. fluting: 1. absent; 2. present
VIII. length/width ratio: 1. 1.00–1.99; 2. 2.00–2.99; 3. 3.00–3.99; 4. 4.00–4.99; 5. 5.00–5.99; 6. ≥6.00

(a) The ratio between the medial length of a specimen and its total length; the smaller the ratio, the deeper the indentation.
(b) The ratio between the minimum blade width (proximal to the point of maximum blade width) as a measure of ‘waistedness’; the smaller the ratio, the higher the amount of constriction.
controlled by the transmission of cultural traits defined at a higher level. Second, the parallel transmission of multiple smaller scale units over long periods of time indicates that there is no significant conflict of interest among the sub-components (Bull 1994). From an evolutionary perspective, parallel transmission is the force that initiates the process by which multiple isolated elements begin to cooperate with one another and create larger scale structural integrity, which is the scale at which adaptations form. Our classification produces units that are amenable to hierarchical arrangement, meaning that the units are nested. The evolutionary arrangement of four hypothetical units (taxa) created from eight characters is shown in figure 3a. The ancestral unit, x, undergoes
[Figure 2 graphic: projectile-point outlines marked with the landmark characters A–A′ = maximum blade width; B–B′ = minimum blade width; C–C′ = height of maximum blade width; D–D′ = medial length; E–E′ = maximum length; F = outer tang angle; G = tang tip; H = flute. Basic shapes shown: arc-shaped, normal curve, triangular, Folsomoid.]
Figure 2. Locations of characters used in the analyses of projectile points from the midwestern and southeastern United States (from O’Brien et al. 2001). See table 1 for character states.
one character-state change, in character IV (1 → 2), to produce unit y (represented by 11222324). Unit y undergoes two state changes, in characters VII (2 → 1) and VIII (4 → 3), to produce unit z (11222313). Unit z undergoes one state change in character I (1 → 2) to produce unit 21222313. This arrangement is hierarchical in the sense of a nesting of less-inclusive, lower level units within more-inclusive, higher level units. To simplify, considering only characters that change states—I, IV, VII and VIII—and ranking the characters in the order listed in figure 3a, the hierarchy of possible combinations of character states gives the 16 possible classes as shown in figure 3b. Only four of the classes are actually represented by empirical specimens in figure 3a, but we reiterate that empty design
space—classes without members—can be analytically significant (Gould 1991), especially with respect to adaptation. For example, Henrich & Boyd (1998) ask why the aboriginal peoples of New Guinea do not fletch their arrows, given the likelihood that people in coastal New Guinea have had considerable contact with and have observed others using fletching for centuries. The emptiness of design space raises the question ‘Why not?’ in a manner that lends itself to empirical examination.
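The worked example of figure 3 can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: class codes are treated as eight-character strings with one digit per character I to VIII, the helper names `apply_changes` and `reduced` are ours, and the two states assigned to each varying character are simply those that occur in the four taxa.

```python
from itertools import product

# Position of each character (roman numeral) in the eight-digit class code.
POSITION = {"I": 0, "II": 1, "III": 2, "IV": 3, "V": 4, "VI": 5, "VII": 6, "VIII": 7}

def apply_changes(code, changes):
    """Return a new class code after the given character-state changes.

    `changes` maps a character to an (old_state, new_state) pair; the old
    state is checked so that an inconsistent change raises an error.
    """
    digits = list(code)
    for character, (old, new) in changes.items():
        i = POSITION[character]
        if digits[i] != old:
            raise ValueError(f"character {character} is {digits[i]}, not {old}")
        digits[i] = new
    return "".join(digits)

x = "11212324"
y = apply_changes(x, {"IV": ("1", "2")})
z = apply_changes(y, {"VII": ("2", "1"), "VIII": ("4", "3")})
last = apply_changes(z, {"I": ("1", "2")})
taxa = [x, y, z, last]

# States observed in the four taxa for the characters that change (assumption).
VARYING = {"I": "12", "IV": "12", "VII": "12", "VIII": "34"}

def reduced(code):
    """Keep only the varying characters, in the order I, IV, VII, VIII."""
    return "".join(code[POSITION[c]] for c in VARYING)

all_classes = {"".join(states) for states in product(*VARYING.values())}
filled = {reduced(t) for t in taxa}
empty = all_classes - filled

print(taxa)
print(len(all_classes), sorted(filled), len(empty))
```

Of the 16 combinations, four are occupied by the taxa and the remaining 12 form the empty design space that the text treats as analytically informative.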
4. RECIPES AS UNITS
The successful construction and use of tools—higher order cultural traits—typically involve the execution
[Figure 3 graphic: (a) tree with tips 11212324, 11222324, 11222313 and 21222313, ancestral units x, y and z, and the state changes IV2, VII1, VIII3 and I2 marked on its branches; (b) hierarchy ordered by characters IV, VII, VIII and I, ending in the 16 classes 1113, 2113, 1114, 2114, 1123, 2123, 1124, 2124, 1213, 2213, 1214, 2214, 1223, 2223, 1224 and 2224, with the represented taxa marked.]
Figure 3. Phylogenetic (historical) arrangement of four fictional units (taxa) created from eight characters: (a) tree showing historical progression of character-state changes (roman numerals identify the characters, and arabic numerals identify the states of those characters); (b) nested hierarchical arrangement of character states showing empirically filled design space (the classes labelled ‘represented taxa’) and empty design space; character states common to all classes are not circled (from O’Brien et al. 2002).
of a lengthy sequence of actions (Bleed 2001), from the acquisition and preparation of materials to a tool’s eventual discard, with each action functionally dependent on previous actions. Cognitive psychologists (see Mesoudi & Whiten 2004) have proposed that people represent tools as interlinked, hierarchical knowledge structures, incorporating behavioural scripts governing their construction and use, much like recipes—a concept that has been used on occasion in archaeology (e.g. Krause 1985; Schiffer & Skibo 1987; Neff 1992; Lyman & O’Brien 2003; Mesoudi & O’Brien 2008c). The concept of recipe attends several of the difficulties inherent in the cultural-trait concept. If, as is evident, culture is highly plastic, then ‘the location of the ‘joints’ in a cultural genome appear to be capable of varying from case to case, and perhaps from context to context’ (Wimsatt 1999, p. 282). The seeming arbitrariness of cultural traits as cultural fragments gives them their viability as replicators and provides ‘our ability to re-package and re-articulate cultural products into seemingly arbitrary larger or smaller constructions to be replicated and transmitted as units’ (Wimsatt 1999, p. 283). This means ‘most cultural products are
also compound products’ (Wimsatt 1999, p. 285)—a characteristic not lost on early ethnologists (e.g. Driver & Kroeber 1932). The same can be said for modern perspectives in biology, where it is becoming increasingly clear that ‘the classical molecular concept of a gene as a contiguous stretch of DNA encoding a functional product is inconsistent with the complexity and diversity of genomic organization’ (Prohaska & Stadler 2008, p. 215). In our minds, it is unquestionable that genes are units of function, and we have no issue with defining a gene as a unit that shows ‘stronger cohesion to itself than to other components’ (Prohaska & Stadler 2008, p. 219). We need to keep in mind, however, that for any given experimental protocol, ‘we may be able to distinguish the function of higher level units from those of their components, thus functional units can be nested within each other’ (Prohaska & Stadler 2008, p. 219). This perspective both recognizes that biological products can also be compound products and underscores our earlier discussion of ideational units, which can exist in the minds of the makers of tools just as they can in the minds of the archaeologists studying the tools. In both cases,
units can be manipulated at various scales, from that of small functional units—equivalent to ‘genes’ in the biological world—to that of larger, nested units. Recipes are nothing but large, nested ideational units. We find the term ‘recipe’ to be useful for three reasons. First, the commonsensical meaning of the term captures the essence of most standard anthropological definitions of ‘cultural trait’: behavioural information that can be transmitted between people about how (and when, where, and why, to lesser extents) to produce something (that may or may not leave a material trace). Second, the recipe concept behaviourally links two general structures—ingredients and rules—that can be reconfigured to form different recipes and thus different products (Eerkens & Lipo 2007). As an important aside, the same is true in biology with respect to protein interaction networks, in which the same ‘ingredients’ can be activated in different orders by different rules to form different cellular products. Two concepts employed in evolutionary biology are of interest here (Lyman et al. 2008). The first is constraint, in which some attribute of the phenomena being transmitted places mechanical or structural limitations on future potential variants. This results in channelling (Gould 2002), in which the transmission of a particular trait can be constrained by the transmission of a trait with which it is mechanically linked. Such a trait is said to hitchhike—the second concept—with a trait that is actually being sorted by processes such as selection, drift or infidelity of transmission (Hurt et al. 2001; Ackland et al. 2007). The third useful characteristic of the term recipe is that recipes are ideational, with any given product being a more or less imperfect empirical manifestation of a recipe as a result of variation in raw materials, manufacturing skills, and so on. Given their ideational structure, recipes can be defined (and cultural transmission studied) at different scales. 
Thus they are ordered, encompassing several behavioural subroutines (e.g. preparation of material, production and use), each of which in turn can be subdivided into a sequence of constituent lower level actions required to complete each subroutine. This feature of recipes is helpful from an analytical standpoint in that the scale of units of cultural replication can vary according to analytical needs. That is, one can move back and forth between examining the basic building blocks of a recipe and examining the higher order groupings of those blocks into larger, more-complex blocks. To examine these issues, Mesoudi & O’Brien (2008c) constructed a simple agent-based model designed to explore the conditions under which recipe-like knowledge structures are likely to emerge during cultural evolution. The model considered three types of vertical cultural transmission: hierarchical, holistic and diffusionist (‘piecemeal’ might be a better term to differentiate the process from the catch-all process long used in anthropology to refer to any kind of transmission). Hierarchically organized transmission, where agents could subdivide toolmaking knowledge acquired from their parents into a recipe-like series of constituent subunits or subroutines (figure 4), was favoured over holistic
[Figure 4 graphic: three panels, (a), (b) and (c), each arranging the same sequence of 15 actions, A to O.]
Figure 4. Three different models of behavioural organization (from Mesoudi & O’Brien 2008c). Although Mesoudi and O’Brien use the term ‘diffusionist’ organization, perhaps a better term would be ‘piecemeal’ organization. (a) Hierarchical organization; (b) holistic organization; (c) diffusionist organization.
transmission, where agents learned tool-making from their parents in an all-or-nothing fashion with no stable subunits (figure 4), only when there was some degree of error in the transmission process. This is because, for hierarchical transmission, errors affect only a single subunit; already-completed subunits are unaffected. Where there are no intermediate subunits, as in holistic transmission, errors disrupt the entire learning process (Simon 1962; Dawkins 1976b). The advantage of hierarchical transmission is maintained at equilibrium when transmission is also associated with some cost, which minimizes the amount of time spent learning from parents. Otherwise, holistic learners will eventually acquire the entire behavioural sequence despite the disruptive effect of transmission error. That cultural transmission exhibits both cost and error seems a realistic assumption, given that mastering the skills required to make and use tools typically requires repeated practice over several years (Eerkens 2000). Hierarchical transmission is also favoured over diffusionist transmission, in which actions are acquired from the parent separately in piecemeal fashion (figure 4), only when subunits repeat in one or more recipe. This is because the overall cost of transmission is reduced: once a subunit is learned, it can be repeated in the same or a different artefact at no cost and with no error (Lyman & O’Brien 2003). Hierarchical transmission is therefore more likely to emerge when there are many repeating subunits (e.g. when there are multiple recipes with multiple subunits and few actions per subunit). Finally, the model also explored the advantage of hierarchical cultural transmission of behavioural knowledge from the previous generation relative to individual trial-and-error learning: the former is more likely to replace the latter when the former is less costly and features less error. 
This assumption is consistent with both theoretical predictions (the maximization of inclusive fitness) and ethnographic evidence (Mesoudi & O’Brien 2008c). Some degree of individual learning is retained when the selective environment changes, which vertical transmission alone cannot track (see Boyd & Richerson 1985).
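The effect of transmission error on holistic versus hierarchical learning can be illustrated with a back-of-the-envelope calculation in Python. This is not the agent-based model of Mesoudi & O’Brien (2008c); it is a simplified sketch in the spirit of Simon (1962). It assumes that each action is copied correctly with probability 1 − e, that an error forces the learner to restart the current unit, and that the 15 actions of figure 4 fall into five subunits of three actions (a grouping chosen only for illustration).

```python
def expected_attempts(run_length, error_rate):
    """Expected number of action attempts needed to complete `run_length`
    consecutive correct actions when every attempt fails independently
    with probability `error_rate` and a failure restarts the run."""
    if not 0.0 <= error_rate < 1.0:
        raise ValueError("error_rate must be in [0, 1)")
    if error_rate == 0.0:
        return float(run_length)
    p = 1.0 - error_rate
    # Standard result for the waiting time until n successes in a row.
    return (1.0 - p ** run_length) / (error_rate * p ** run_length)

def holistic_cost(n_actions, error_rate):
    """The whole sequence is one unit: any error restarts everything."""
    return expected_attempts(n_actions, error_rate)

def hierarchical_cost(n_subunits, actions_per_subunit, error_rate):
    """An error restarts only the current subunit; finished subunits are kept."""
    return n_subunits * expected_attempts(actions_per_subunit, error_rate)

for e in (0.0, 0.05, 0.1, 0.2):
    h = holistic_cost(15, e)
    s = hierarchical_cost(5, 3, e)
    print(f"error={e:.2f}  holistic={h:.1f}  hierarchical={s:.1f}")
```

Under these assumptions the two costs coincide at 15 attempts when there is no error, and a gap in favour of the hierarchical learner opens as soon as the error rate is positive, which mirrors the qualitative result reported above.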
A plausible scenario suggested by this model, therefore, is one in which there is an extended period of relatively low cost and relatively accurate vertical cultural transmission, where hierarchically structured behavioural knowledge is learned from the parental generation, along with less-frequent individual learning that is predominantly diffusionist and functions to track novel environmental change (Mesoudi & O’Brien 2008c). Recent advances in evolutionary developmental biology, or ‘evo-devo’ (Carroll 2005), have shown there to be several parallels between the hierarchically structured, recipe-like organization of cultural behavioural knowledge and the manner in which biological organisms develop (Callebaut & Rasskin-Gutman 2005). Phenotypic characters are often modular (Hansen 2003), such that different characters develop as partially self-contained modules, similar to the subunits of a behavioural recipe. These modules are ordered, with a small number of higher level regulatory genes triggering the growth of entire lower level modules, such as the Hox genes that control the growth of limbs or body segments (Carroll 1995). Consequently, bodies can be built by repeating modular body parts, such as limbs, teeth or body segments (Weiss 1990), in the same way that cultural subunits can be repeated in one or more recipes. These parallels suggest that the advantages of hierarchical organization—localization of error and repetition of subunits—are likely to generalize to many or all knowledge-gaining evolutionary systems (Dawkins 1976b).
5. DISCUSSION
Although there is considerable room for debate about cultural units, we find several points indisputable (Lyman et al. 2008) and hope that they will serve as cornerstones of all future evolutionary studies of cultural transmission. First, cultural traits are ideational replicative units composed of behavioural information transmitted through human interaction. Second, cultural traits are part of human phenotypes, but the traits themselves are populational. They can be tracked at an individual level across time and space, but trait evolution is observed at the level of the changing membership of a population and does not predict the life history of any individual trait. Third, traits aggregate into larger linked associations that can be manifest in the archaeological record. However, individual cultural traits cannot be directly reconstructed from the archaeological record because they are replicated behaviour, which is not wholly reflected in even the best-preserved archaeological contexts. Further, the material objects archaeologists recover typically reflect cultural-trait clusters (recipes of action) that can be indirectly traced through the replicative success of like artefacts (Leonard & Jones 1987). Recognizing that theoretical classes reflect, but do not reconstruct, cultural traits frees us from worrying about such things as identifying ‘true’ cultural traits, just as palaeobiologists do not seek to reconstruct specific genetic sequences from the morphological characteristics they study.
Four axioms follow from these premises (Lyman et al. 2008). First, cultural transmission creates lineages of artefacts as cultural traits replicate and change. At a larger scale, groups of phylogenetically related lineages form traditions, or clades (e.g. Jordan & Shennan 2003; Buchanan & Collard 2007; O’Brien et al. 2008; Lycett 2009). Second, the persistence of artefact classes over time monitors cultural transmission but at a scale higher than a single cultural trait (Lyman & O’Brien 1998; Lyman & Harpole 2002). If constructed using attributes that are culturally transmitted, cultural traits reflected in artefacts in the same class are related phylogenetically. Third, copying error, intentional or not, and experimentation create variation (Schiffer 1996; Eerkens & Lipo 2005). Fourth, selection reduces or stabilizes that variation. Understanding the operation of selection and other evolutionary processes is simplified by properly understanding cultural traits as replicative units. To begin with, there probably will not be a one-to-one correspondence between only one cultural trait and its behavioural manifestation. Each cultural trait is linked more or less strongly (depending on selective context) within the transmission environment. Recombination might allow previously linked cultural traits to become independently transmitted, but this is unnecessary and may actually be mechanically impossible. Linked cultural traits further illustrate that cultural traits are replicative units, not just units of inheritance. For example, variants that are superior in one context can be selected against because they lack performance characteristics associated with different recipes of action. A key factor in explaining spatial-temporal patterns visible in culturally transmitted information will consequently be evaluating hierarchical relationships between culture traits and related recipes of action. 
Although variation is continuously generated, we also expect that the rate of change in items such as projectile points will be episodic rather than constant. Studies of modern material culture have found patterned inventive activities, ‘discernible as a clustering in time and space of similar inventions’—literally, a ‘burst of variation,’ termed stimulated variation (Schiffer 1996, p. 656). The analogous process in biological evolution is adaptive radiation, during which organisms enter new niches. We believe a similar temporal dynamic attends stimulated variation (Lyman & O’Brien 2000). Deficiencies in the performance characteristics of an artefact category result in a proliferation of variation (Petroski 1992), perhaps in a cascade effect as culture traits realign into new recipes of action (Schiffer 2005). Subsequently, variation will decrease as less-efficient variants cease to be replicated. We recently began investigating such changes in the tempo of cultural-trait evolution as they are reflected in the replacement of the atlatl and dart by the bow and arrow (Lyman et al. 2008, 2009). Because of mechanical differences, attributes of dart points, especially those related to point size (arrow points are smaller than dart points) and the manner in which the points were fastened to shafts (hafting), had to be experimented with to find an effective combination
of attributes (classes) of points that could serve as arrow points (e.g. Beck 1995; Hughes 1998; Bettinger & Eerkens 1999). These efforts are archaeologically visible as both taxonomic diversity and morphological diversity within classes.
6. CONCLUSION
If Mayr (1973) is correct that behaviour is perhaps the strongest selection pressure operating in the animal kingdom, then we need to take it all the more seriously when the animals are humans. Cultural transmission is a primary determinant of behaviour, and there is little doubt that cultural transmission is one of the most effective means of evolutionary inheritance that nature could ever sculpt. Some (e.g. Gould 1996) argue that culture, through its highly creative transmission processes, has exempted humans from natural selection, and thus from evolution, but a growing number of social scientists are rejecting this myopic view. Instead, they are finding themselves in agreement with Bettinger & Eerkens’ (1999, p. 239) claim that, ‘it seems clear to us that cultural transmission must affect Darwinian fitness—how could it be otherwise? And Darwinian fitness must also bear on cultural transmission. Again, how could that not be true? . . . To deny that would imply that the culturally mediated evolutionary success of anatomically modern humans is merely serendipitous happenstance’. Considerable study has elucidated cultural-transmission processes—individual learning versus social learning, for example—and the strategies/biases that shape the results of transmission—conformist, prestige-based, indirect, content-based, and so on (e.g. Boyd & Richerson 1985; Henrich & Boyd 1998). If our intellectual forebears were able to look into the future, no doubt they would have been amazed at the progress that has been made in understanding transmission processes. But they probably would also be amused to see that the same issues with which they were wrestling in the early twentieth century relative to the units of transmission have a similar cast to them (Lyman 2008). As Shennan (2008, p. 3176) put it, the key question is, ‘to what extent is it possible to identify the action of the various cultural evolutionary processes . . . on the basis of distributions of variation in the (more or less) present . . . or at various points in the past?’ This requires us to understand both the ways in which humans gain cultural information and the structure of that information. We have been able to model the relationships between process and structure for some time (e.g. Cavalli-Sforza & Feldman 1981; Boyd & Richerson 1985), but recent empirical investigations reflect our growing ability to empirically test such models (e.g. Bettinger & Eerkens 1999; Shennan & Wilkinson 2001; Henrich 2004; Kohler et al. 2004; Mesoudi & O’Brien 2008a,b,c). Archaeologists in particular are beginning to take what Dawkins (1976a) referred to as the ‘meme’s eye-view’, or the perspective of the cultural attributes themselves (Shennan 2008). And when we reach down to the level of the artefact, and then down to the level of characters and character states, we begin to notice the incredible variation that exists.
That variation tells us that evolutionary change has taken place. It is our job to construct units that measure the change: its direction, tempo and scale. Here is a closing example. Let us say that our analytical interest lies in understanding how hunters and gatherers negotiate complex fitness topographies containing a variety of peaks, valleys, chasms and plateaus. Slight variations in initial conditions (the starting point on the fitness landscape) can drive two similar populations towards increasingly divergent adaptive ‘peaks’, or solutions (Henrich & Boyd 1998). We might propose that jumping from one optimum to another is difficult, requiring simultaneous alteration of a number of traits in just the right manner so as to land on a superior peak and avoid dropping into fitness valleys (Mesoudi 2008). How could we possibly structure research to address this proposition without detailed knowledge of the small-scale changes that occurred, either singly or as integrated packages (linked characters), in the phenotypic expressions of the actors involved? The answer is, we cannot.
We thank James Steele, Peter Jordan, and Ethan Cochrane for organizing the seminar at which this paper was presented. The paper benefited greatly from the comments of seminar participants, especially Carl Lipo, Mark Collard, Jamie Tehrani, and Stephen Shennan. We also thank two not-so-anonymous reviewers for their extremely helpful advice.
REFERENCES
Ackland, G. J., Signitzer, M., Stratford, K. & Cohen, M. H. 2007 Cultural hitchhiking on the wave of advance of beneficial technologies. Proc. Natl Acad. Sci. USA 104, 8714–8719. (doi:10.1073/pnas.0702469104)
Atran, S. 1998 Folk biology and the anthropology of science: cognitive universals and cultural particulars. Behav. Brain Sci. 21, 547–609. (doi:10.1017/S0140525X98001277)
Beck, C. 1995 Functional analysis and the differential persistence of Great Basin dart forms. J. Calif. Great Basin Anthropol. 17, 222–243.
Bettinger, R. L. & Eerkens, J. 1999 Point typologies, cultural transmission, and the spread of bow-and-arrow technology in the prehistoric Great Basin. Amer. Antiq. 64, 231–242. (doi:10.2307/2694276)
Bleed, P. 2001 Trees or chains, links or branches: conceptual alternatives for consideration of stone tool production and other sequential activities. J. Archaeol. Method Theory 8, 101–127. (doi:10.1023/A:1009526016167)
Boone, J. L. & Smith, E. A. 1998 Is it evolution yet? A critique of evolutionary archaeology. Curr. Anthropol. 39, S141–S173. (doi:10.1086/204693)
Boyd, R. & Richerson, P. J. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.
Buchanan, B. & Collard, M. 2007 Investigating the peopling of North America through cladistic analyses of Early Paleoindian projectile points. J. Archaeol. Sci. 26, 366–393.
Bull, J. J. 1994 Virulence. Evolution 48, 1423–1437. (doi:10.2307/2410237)
Callebaut, W. & Rasskin-Gutman, D. (eds) 2005 Modularity: understanding the development and evolution of natural complex systems. Cambridge, MA: MIT Press.
Carroll, S. B. 1995 Homeotic genes and the evolution of arthropods and chordates. Nature 376, 479–485. (doi:10.1038/376479a0)
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
Carroll, S. B. 2005 Endless forms most beautiful: the new science of evo devo and the making of the animal kingdom. New York, NY: Norton.
Cavalli-Sforza, L. L. & Feldman, M. 1981 Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press.
Darwent, J. & O’Brien, M. J. 2006 Using cladistics to construct lineages of projectile points from northeastern Missouri. In Mapping our ancestors: phylogenetic approaches in anthropology and prehistory (eds C. P. Lipo, M. J. O’Brien, M. Collard & S. J. Shennan), pp. 185–208. New York, NY: Aldine.
Dawkins, R. 1976a The selfish gene. New York, NY: Oxford University Press.
Dawkins, R. 1976b Hierarchical organisation: a candidate principle for ethology. In Growing points in ethology (eds P. P. Bates & R. A. Hinde), pp. 7–54. Cambridge, UK: Cambridge University Press.
Driver, H. E. & Kroeber, A. L. 1932 Quantitative expression of cultural relationships. U. Calif. Pub. Am. Archaeol. Ethnol. 31, 211–256.
Dunnell, R. C. 1971 Systematics in prehistory. New York, NY: Free Press.
Dunnell, R. C. 1986 Methodological issues in Americanist artifact classification. Adv. Archaeol. Method Theory 9, 149–207.
Eerkens, J. W. 2000 Practice makes within 5% of perfect: the role of visual perception, motor skills, and human memory in artifact variation and standardization. Curr. Anthropol. 41, 663–668. (doi:10.1086/317394)
Eerkens, J. W. & Lipo, C. P. 2005 Cultural transmission, copying errors, and the generation of variation in material culture and the archaeological record. J. Anthropol. Archaeol. 24, 316–334. (doi:10.1016/j.jaa.2005.08.001)
Eerkens, J. W. & Lipo, C. P. 2007 Cultural transmission theory and the archaeological record: providing context to understanding variation and temporal changes in material culture. J. Archaeol. Res. 15, 239–274. (doi:10.1007/s10814-007-9013-z)
Gould, S. J. 1991 The disparity of the Burgess Shale arthropod fauna and the limits of cladistic analysis: why we must strive to quantify morphospace. Paleobiology 17, 411–423.
Gould, S. J. 1996 Full house: the spread of excellence from Plato to Darwin. New York, NY: Harmony.
Gould, S. J. 2002 The structure of evolutionary theory. Cambridge, MA: Belknap.
Hansen, T. F. 2003 Is modularity necessary for evolvability? Remarks on the relationship between pleiotropy and evolvability. BioSystems 69, 83–94. (doi:10.1016/S0303-2647(02)00132-6)
Henrich, J. 2004 Demography and cultural evolution: why adaptive cultural processes produced maladaptive losses in Tasmania. Amer. Antiq. 69, 197–214. (doi:10.2307/4128416)
Henrich, J. & Boyd, R. 1998 The evolution of conformist transmission and the emergence of between-group differences. Evol. Hum. Behav. 19, 215–241. (doi:10.1016/S1090-5138(98)00018-X)
Hughes, S. S. 1998 Getting to the point: evolutionary change in prehistoric weaponry. J. Archaeol. Method Theory 5, 345–408. (doi:10.1007/BF02428421)
Hull, D. L. 1981 Units of evolution: a metaphysical essay. In The philosophy of evolution (eds U. J. Jenson & R. Harré), pp. 23–44. New York, NY: St. Martin’s Press.
Hurt, T. D., VanPool, T. L., Leonard, R. D. & Rakita, G. F. M. 2001 Explaining the co-occurrence of attributes in the archaeological record: a further consideration of replicative success. In Style and function: conceptual issues in evolutionary archaeology (eds T. D. Hurt & G. F. M. Rakita), pp. 51–68. Westport, CT: Bergin and Garvey.
Jordan, P. & Shennan, S. J. 2003 Cultural transmission, language, and basketry traditions amongst the California Indians. J. Anthropol. Archaeol. 22, 42–74. (doi:10.1016/S0278-4165(03)00004-7)
Kohler, T. A., VanBuskirk, S. & Ruscavage-Barz, S. 2004 Vessels and villages: evidence for conformist transmission in early village aggregations on the Pajarito Plateau, New Mexico. J. Anthropol. Archaeol. 23, 100–118. (doi:10.1016/j.jaa.2003.12.003)
Krause, R. A. 1985 The clay sleeps. Tuscaloosa, AL: University of Alabama Press.
Leonard, R. D. & Jones, G. T. 1987 Elements of an inclusive evolutionary model for archaeology. J. Anthropol. Archaeol. 6, 199–219. (doi:10.1016/0278-4165(87)90001-8)
Lipo, C. P. & Madsen, M. E. 2001 Neutrality, ‘style,’ and drift: building models for studying cultural transmission in the archaeological record. In Style and function: conceptual issues in evolutionary archaeology (eds T. D. Hurt & G. F. M. Rakita), pp. 91–118. Westport, CT: Bergin and Garvey.
Lipo, C., Madsen, M., Dunnell, R. C. & Hunt, T. 1997 Population structure, cultural transmission, and frequency seriation. J. Anthropol. Archaeol. 16, 301–333. (doi:10.1006/jaar.1997.0314)
Lycett, S. J. 2009 Are Victoria West cores ‘proto-Levallois’? A phylogenetic assessment. J. Hum. Evol. 56, 175–191. (doi:10.1016/j.jhevol.2008.10.001)
Lyman, R. L. 2008 Cultural transmission in North American anthropology and archaeology, ca. 1895–1965. In Cultural transmission and archaeology: issues and case studies (ed. M. J. O’Brien), pp. 10–20. Washington, DC: Society for American Archaeology Press.
Lyman, R. L. & Harpole, J. L. 2002 A. L. Kroeber and the measurement of time’s arrow and time’s cycle. J. Anthropol. Res. 58, 313–338.
Lyman, R. L. & O’Brien, M. J. 1998 The goals of evolutionary archaeology: history and explanation. Curr. Anthropol. 39, 615–652. (doi:10.1086/204786)
Lyman, R. L. & O’Brien, M. J. 2000 Measuring and explaining change in artifact variation with clade-diversity diagrams. J. Anthropol. Archaeol. 19, 39–74. (doi:10.1006/jaar.1999.0339)
Lyman, R. L. & O’Brien, M. J. 2003 Cultural traits: units of analysis in early twentieth-century anthropology. J. Anthropol. Res. 59, 225–250.
Lyman, R. L., VanPool, T. L. & O’Brien, M. J. 2008 Variation in North American dart points and arrow points when one or both are present. J. Archaeol. Sci. 35, 2805–2812. (doi:10.1016/j.jas.2008.05.008)
Lyman, R. L., VanPool, T. L. & O’Brien, M. J. 2009 The diversity of North American projectile-point types, before and after the bow and arrow. J. Anthropol. Archaeol. 28, 1–13. (doi:10.1016/j.jaa.2008.12.002)
Mayr, E. 1973 Populations, species, and evolution. Cambridge, MA: Harvard University Press.
Mesoudi, A. 2008 An experimental simulation of the ‘copy-successful-individuals’ cultural learning strategy: adaptive landscapes, producer–scrounger dynamics, and informational access costs. Evol. Hum. Behav. 29, 350–363. (doi:10.1016/j.evolhumbehav.2008.04.005)
Mesoudi, A. & O’Brien, M. J. 2008a The cultural transmission of Great Basin projectile-point technology I: an experimental simulation. Amer. Antiq. 73, 3–28.
Mesoudi, A. & O’Brien, M. J. 2008b The cultural transmission of Great Basin projectile-point technology II: an agent-based computer simulation. Amer. Antiq. 73, 627–644.
Mesoudi, A. & O’Brien, M. J. 2008c The learning and transmission of hierarchical cultural recipes. Biol. Theory 3, 63–72. (doi:10.1162/biot.2008.3.1.63)
Mesoudi, A. & O’Brien, M. J. 2009 Placing archaeology within a unified science of cultural evolution. In Pattern and process in cultural evolution (ed. S. J. Shennan), pp. 21–32. Berkeley, CA: University of California Press.
Mesoudi, A. & Whiten, A. 2004 The hierarchical transformation of event knowledge in human cultural transmission. J. Cogn. Cult. 4, 1–24. (doi:10.1163/156853704323074732)
Neff, H. 1992 Ceramics and evolution. In Archaeological method and theory (ed. M. B. Schiffer), vol. 4, pp. 141–194. Tucson, AZ: University of Arizona Press.
Nettle, D. 2006 Language: costs and benefits of a specialized system for social information transmission. In Social information transmission and human biology (eds J. K. Wells, S. Strickland & K. Laland), pp. 137–152. Boca Raton, FL: CRC Press.
O’Brien, M. J. & Lyman, R. L. 2000 Applying evolutionary archaeology: a systematic approach. New York, NY: Kluwer Academic/Plenum.
O’Brien, M. J. & Lyman, R. L. 2002 The epistemological nature of archaeological units. Anthropol. Theory 2, 37–57. (doi:10.1177/1463499602002001287)
O’Brien, M. J., Darwent, J. & Lyman, R. L. 2001 Cladistics is useful for reconstructing archaeological phylogenies: Palaeoindian points from the southeastern United States. J. Archaeol. Sci. 28, 1115–1136. (doi:10.1006/jasc.2001.0681)
O’Brien, M. J., Lyman, R. L., Saab, Y., Saab, E., Darwent, J. & Glover, D. S. 2002 Two issues in archaeological phylogenetics: taxon construction and outgroup selection. J. Theor. Biol. 215, 133–150. (doi:10.1006/jtbi.2002.2548)
O’Brien, M. J., Lyman, R. L., Collard, M., Holden, C. J., Gray, R. D. & Shennan, S. J. 2008 Transmission, phylogenetics, and the evolution of cultural diversity. In Cultural transmission and archaeology: issues and case studies (ed. M. J. O’Brien), pp. 77–90. Washington, DC: Society for American Archaeology Press.
Petroski, H. 1992 The evolution of useful things. New York, NY: Vintage Books.
Pocklington, R. & Best, M. L. 1997 Cultural evolution and units of selection in replicating text. J. Theor. Biol. 188, 79–87. (doi:10.1006/jtbi.1997.0460)
Prohaska, J. S. & Stadler, P. F. 2008 Genes. Theory Biosci. 127, 215–221. (doi:10.1007/s12064-008-0025-0)
Schiffer, M. B. 1972 Archaeological context and systemic context. Amer. Antiq. 37, 156–165. (doi:10.2307/278203)
Schiffer, M. B. 1996 Some relationships between behavioral and evolutionary archaeologies. Amer. Antiq. 61, 643–662. (doi:10.2307/282009)
Schiffer, M. B. 2005 The devil is in the details: the cascade model of invention processes. Amer. Antiq. 70, 485–502. (doi:10.2307/40035310)
Schiffer, M. B. & Skibo, J. M. 1987 Theory and experiment in the study of technological change. Curr. Anthropol. 28, 595–622. (doi:10.1086/203601)
Schumpeter, J. A. 1934 The theory of economic development: an inquiry into profits, capital, credit, interest, and the business cycle. Cambridge, MA: Harvard University Press.
Shennan, S. J. 2002 Learning. In Darwin and archaeology: a handbook of key concepts (eds J. P. Hart & J. E. Terrell), pp. 183–200. Westport, CT: Greenwood Press.
Shennan, S. J. 2008 Canoes and cultural evolution. Proc. Natl Acad. Sci. USA 105, 3175–3176. (doi:10.1073/pnas.0800666105)
Shennan, S. J. & Wilkinson, J. R. 2001 Ceramic style change and neutral evolution: a case study from Neolithic Europe. Amer. Antiq. 66, 577–593. (doi:10.2307/2694174)
Simon, H. A. 1962 The architecture of complexity. Proc. Amer. Phil. Soc. 106, 467–482.
VanPool, T. L. 2003 Explaining changes in projectile point morphology: a case study from Ventana Cave, Arizona. PhD thesis, Department of Anthropology, University of New Mexico, Albuquerque, NM.
VanPool, T. L. & VanPool, C. S. 2003 Agency and evolution: the role of intended and unintended consequences of action. In Essential tensions in archaeological method and theory (eds T. L. VanPool & C. S. VanPool), pp. 89–114. Salt Lake City, UT: University of Utah Press.
Weiss, K. M. 1990 Duplication with variation: metameric logic in evolution from genes to morphology. Amer. J. Phys. Anthropol. 33, 1–23. (doi:10.1002/ajpa.1330330503)
Williams, G. C. 1992 Natural selection: domains, levels, and challenges. New York, NY: Oxford University Press.
Williams, P. A. 2002 Of replicators and selectors. Q. Rev. Biol. 77, 302–306. (doi:10.1086/341995)
Wimsatt, W. C. 1999 Genes, memes and cultural heredity. Biol. Phil. 14, 279–310. (doi:10.1023/A:1006646703557)
Phil. Trans. R. Soc. B (2010) 365, 3807–3819 doi:10.1098/rstb.2010.0009
Simulating trait evolution for cross-cultural comparison
Charles L. Nunn1,*, Christian Arnold1,2, Luke Matthews1 and Monique Borgerhoff Mulder3
1 Department of Human Evolutionary Biology, Peabody Museum, Harvard University, 11 Divinity Avenue, Cambridge, MA 02138, USA
2 Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany
3 Department of Anthropology, University of California, Davis, CA 95616-8522, USA
Cross-cultural anthropologists have increasingly used phylogenetic methods to study cultural variation. Because cultural behaviours can be transmitted horizontally among socially defined groups, however, it is important to assess whether phylogeny-based methods, which were developed to study vertically transmitted traits among biological taxa, are appropriate for studying group-level cultural variation. Here, we describe a spatially explicit simulation model that can be used to generate data with known degrees of horizontal donation. We review previous results from this model showing that horizontal transmission increases the type I error rate of phylogenetically independent contrasts in studies of correlated evolution. These conclusions apply to cases in which two traits are transmitted as a pair, but horizontal transmission may be less problematic when traits are unlinked. We also use the simulation model to investigate whether measures of homology (the consistency index and the retention index) can detect horizontal transmission of cultural traits. Higher rates of evolutionary change have a stronger depressive impact on measures of homology than higher rates of horizontal transmission; thus, low consistency or retention indices are not necessarily indicative of ‘ethnogenesis’. Collectively, these studies demonstrate the importance of using simulations to assess the validity of methods in cross-cultural research.
Keywords: cultural traits; cross-cultural comparison; phylogeny; consistency index; correlated evolution; simulation study
* Author for correspondence ([email protected]).
One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
This journal is © 2010 The Royal Society
1. INTRODUCTION
Human cultural traits exhibit a rich kaleidoscope of forms. Across societies, we exhibit variation in marriage systems, the types of shelters that protect us from the elements, the foods that we cook and how we cook them and decorations to our skin, bodies and clothing. Globally, linguists have identified over 6000 languages (Gordon 2005), which is more than the 4500 or so extant mammalian species (Bininda-Emonds et al. 2007), and humans are thought to practise more than 4300 religions (faith groups). Many human cultural traits are likely to be adaptive, such as those related to resource allocation and health practices, and are thus subject to natural selection (Mesoudi et al. 2004). Other cultural traits, such as decorations on pottery, are probably driven less by natural selection, but they may provide social or sexual benefits that indirectly translate to higher reproduction. As with many biological species, cultural diversity is disappearing at a rapid clip (Sutherland 2003), and many of the factors that influence biological diversity also influence cultural diversity (Pagel et al. 1991; Mace & Pagel 1995; Moore et al. 2002).
Anthropologists are interested in documenting differences in the configurations of human cultural traits, and in understanding how and why particular sets of traits arise. Comparison has long played a central role in this endeavour, with the first formalized approach to cross-cultural comparison developed by Tylor (1889). Tylor was interested in developing systematic approaches to investigate cross-cultural variation, including correlations (or what Tylor called ‘adhesions’) among different traits. He was criticized for this approach on statistical grounds by Francis Galton (Naroll 1961), yet his work spawned a number of followers who developed an empirically and theoretically rich approach that flourishes to this day (Murdock & White 1969; Burton & White 1984; Borgerhoff Mulder et al. 2001; Pagel & Mace 2004; Mace & Holden 2005; Mace et al. 2005; Lipo et al. 2006). Systematic comparisons are used, for example, to test hypotheses for why some cultures cook with more spices than others (Billing & Sherman 1998), or why some pass wealth to sons and others to daughters (Hartung 1982). As in biology (Harvey & Pagel 1991; Nunn & Barton 2001), however, a critical
issue in all of these studies involves the non-independence of data points when conducting comparative tests (Mace & Pagel 1994; Mace & Holden 2005). Indeed, this is precisely the criticism levelled by Galton, and thus known as ‘Galton’s problem’ among anthropologists. The essence of the problem can be summarized as follows. Because populations tend to share traits through descent from a common ancestor, the data points that make up a cross-cultural analysis will lack statistical independence. Phylogeny-based methods provide a means to deal with non-independence that arises through ‘vertical’ transmission of traits from ancestral to descendent populations (Mace & Pagel 1994; Mace & Holden 2005; Nunn et al. 2006). Phylogenetic methods assume that cultural traits behave like genetic traits, particularly with regard to the prominence of vertical trait transmission from parent generations to later generations. In contrast to the genetic transmission of most elements in biology, however, cultural traits can spread ‘horizontally’ among unrelated individuals. While ‘vertical’ and ‘horizontal’ most aptly refer to transmission of traits between individuals (whether individuals learn from their parents or other community members), a similar distinction can be made at the population level, namely whether cultures develop by a tree-like splitting process (phylogenesis) or by admixture (ethnogenesis). Thus, under phylogenesis, cultural change results from the transmission of ideas and practices from parental to daughter cultural taxa (White et al. 1981; Moore & Romney 1994; Holden 2002; Tehrani & Collard 2002). Under ethnogenesis, cultural evolution occurs through the borrowing and blending of ideas and practices and the trade of objects among contemporary societies (e.g. Terrell et al. 1997), resulting in a weak phylogenetic signal (e.g. Hurles et al. 2003; Jordan & Shennan 2003; Moylan et al. 2006). 
Ethnogenesis creates its own form of non-independence, one that requires different approaches from those used for investigating phylogenesis (Borgerhoff Mulder et al. 2006; Nunn et al. 2006). This is not because borrowed traits fail to shed light on adaptation; indeed, if pastoralists in arid environments ‘borrow’ camel-keeping from a neighbouring ethnic group, this may be valid evidence for adaptive cultural coevolution (Mace & Pagel 1994). Nor is this because cultural traits within a particular domain are not transmitted vertically; indeed, some domains may exhibit huge conservatism, as with the consistency of kinship traits across language families (Jones 2003). Rather, because of ethnogenesis, the true trees of different domains may be different. Correcting for shared language (as is typically done in the application of phylogenetic methods to human cultures) may not be useful to control for shared origins in a domain that is not structured by language.
At this time, however, we lack a general framework for grappling with the most fundamental aspects of this problem (Borgerhoff Mulder et al. 2006). In particular, we need to address the following three questions: (i) Does horizontal transmission between societies produce a signal that is distinct from vertical transmission in comparative data? (ii) Which methods offer the most powerful means to detect horizontal transmission (or ethnogenesis)? (iii) What methods are most appropriate for investigating correlations between traits when both horizontal and vertical transmission occur? This third question is particularly important, as it forms the basis for applying the comparative method to test adaptive hypotheses. Some efforts have been made to address these questions, particularly with regard to methods developed from within anthropology (Dow 1984, 1993, 2007) and from biology (Borgerhoff Mulder et al. 2001; Mace & Holden 2005; Fortunato et al. 2006). But we also need to evaluate the different methods, which raises a fourth question: (iv) How can we evaluate the strengths and weaknesses of different methods when we usually lack information about the actual patterns of vertical and horizontal transmission in real-world data?
An answer to this last question again comes from comparative biology. Biologists typically investigate the appropriateness of a new comparative method by taking a phylogeny, simulating the evolution of traits down the tree and calculating statistical tests on these data (e.g. Martins & Garland 1991; Purvis et al. 1994; Nunn 1995; Diaz-Uriarte & Garland 1996; Harvey & Rambaut 2000). The statistical test might measure the correlation between traits, the degree to which more closely related species share similar trait values, measures of tree ‘balance’ or any other statistical or phylogenetic measure of interest. By generating many such artificial datasets with known characteristics (such as the model of evolution or the degree to which the traits are correlated), it becomes possible to evaluate the statistical properties of a particular method. Hence, these methods have played a pivotal role in assessing phylogeny-based comparative methods and making decisions about when and how they should be applied to biological data.
This paper has three major sections.
First, we introduce a simulation model that was developed to assess methods for cross-cultural research (Nunn et al. 2006). Second, we review one set of results from the model involving the appropriateness of independent contrasts under conditions of horizontal transmission. Lastly, we apply the simulation model to a new question in which we evaluate whether two commonly used phylogenetic metrics of ‘treeness’—namely, the consistency index (CI) and the retention index (RI)—detect phylogenesis in cross-cultural datasets. These indices have been used in cross-cultural research (e.g. Tehrani & Collard 2002; Collard et al. 2006). Our simulations show that factors other than horizontal trait transmission are significant determinants of the CI and RI. The simulations also reveal that using external information on the historical relationships among societies (e.g. language; Mace & Pagel 1994) can improve the performance of tree-consistency measures in cross-cultural research.
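The evaluation strategy outlined above (simulate traits under a known model, apply a statistical test, and tally how often the test errs) can be illustrated with a minimal Python sketch. It is not the simulation model of Nunn et al. (2006), nor an implementation of independent contrasts: the two-clade tree, the number of tips, the branch variances and all function names are assumptions made purely for illustration. Two traits evolve independently by Brownian motion, so every ‘significant’ Pearson correlation across the tips counts as a type I error.

```python
import math
import random

# Two-tailed 5% critical value of Pearson's r for n = 20 tips (df = 18).
R_CRIT_DF18 = 0.444


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_trait(rng, shared_sd, tips_per_clade=10):
    """One trait evolved by Brownian motion on a hypothetical two-clade tree.

    shared_sd is the standard deviation of change along the branch shared by
    all members of a clade; each tip then adds its own unit-variance change.
    """
    values = []
    for _ in range(2):  # two clades
        clade_value = rng.gauss(0.0, shared_sd)
        values += [clade_value + rng.gauss(0.0, 1.0) for _ in range(tips_per_clade)]
    return values


def type_one_error(shared_sd, n_datasets=2000, seed=0):
    """Share of datasets in which a naive test rejects a true null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_datasets):
        x = simulate_trait(rng, shared_sd)
        y = simulate_trait(rng, shared_sd)  # independent of x: true correlation is zero
        if abs(pearson_r(x, y)) > R_CRIT_DF18:
            rejections += 1
    return rejections / n_datasets


star = type_one_error(shared_sd=0.0)    # tips share no history
clades = type_one_error(shared_sd=2.0)  # clade members share a long ancestral branch
print(star, clades)
```

Under these assumptions, the rejection rate should sit near the nominal 5% when tips share no history and should rise well above it when clade members share a long ancestral branch, which is the non-independence at the heart of Galton’s problem.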
2. THE SIMULATION MODEL
(a) The basic framework
The purpose of a simulation model in phylogenetics research is to generate artificial datasets under known models of trait evolution, and then to assess how well a particular method can recover the parameters of the evolutionary model. To evaluate comparative methods in the context of cross-cultural comparisons, Nunn et al. (2006) developed a spatially explicit simulation approach to investigate trait evolution in relation to phylogeny and geography. Phylogeny in this case refers to the historical relationships among societies, such as a branching pattern indicated with a linguistic tree (e.g. Gray & Jordan 2000; Gray & Atkinson 2003), while geography is represented as a matrix of geographical distances among societies. Nunn et al.’s (2006) simulation approach is derived from previous simulation protocols that have been used to test phylogenetic comparative methods in biology (e.g. Martins & Garland 1991; Purvis et al. 1994; Nunn 1995; Diaz-Uriarte & Garland 1996; Harvey & Rambaut 2000). Nunn et al. (2006) augmented this basic procedure with a spatial context that also allows for horizontal transmission of the simulated traits among neighbouring societies. Moreover, including spatial context required a procedure to generate links between geography and phylogeny, which is important because these two factors will tend to covary in real-world datasets (i.e. closely related societies are often in close spatial proximity). To achieve this, they developed an adaptive radiation model (e.g. Price 1997; Harvey & Rambaut 2000) in which mother populations produce daughter populations that tend to settle in nearby open niches, as described below.
Figure 1. Simulation procedure. A simplified version of the simulation procedure using a 3-row by 4-column spatial matrix partway through a simulation run. The following stochastic processes occur in sequence for each generation in the simulation: (a) step 1: extinction of filled cells (pextinction; no extinction occurs in the first round); (b) step 2: colonization of empty cells (pcolonize); (c) step 3: horizontal transmission among neighbouring societies (horizontal donation, pdonation); and (d) step 4: trait evolution (variance, correlation; Brownian motion or mutation). Empty cells indicate unfilled niches in the spatial model.
Nunn et al. (2006) constructed their spatially explicit model of cultural trait evolution using the computer package MATLAB (v. 6.5, Mathworks, Inc.). The model can be viewed as a metapopulation represented as a two-dimensional lattice (or matrix) with a ‘hard’ edge, such that societies on the edge of the lattice are not connected to societies on the opposite edge.
Thus, the model assumes a geographically delimited area such as an island, a continent or an area bounded by impassable mountain ranges or bodies of water. The model has non-overlapping generations (discrete time) and each cell of the lattice is treated as a distinct society (discrete space). The model examines the evolution of one or more traits, represented as X1, X2, . . . , Xn. These traits can be continuously varying—such as body mass—or they can be discrete—such as categorizations of the mating system. The user must also make assumptions about how traits transfer when they are horizontally donated. Traits can transfer as a group, which might be expected if the traits are functionally linked (and thus show correlated evolution). At the other extreme, traits can transfer independently. An intermediate position is also possible, with sets of traits moving stochastically as a function of the correlation between them (i.e. ‘stochastic yoking’; Nunn et al. 2006). In all cases, when trait X1 moves from society A to society B, the value of X1 in B is replaced by the value of X1 in A. As in most applications of the comparative method to real anthropological data, societies are assumed to exhibit no intra-societal variation; this is an assumption that could be relaxed in future research using the simulation model. Vertical transmission occurs by default when descendent societies inherit trait values of their ancestors across generations, including through speciation events.
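The framework described above, together with the four steps listed in the caption of figure 1, can be sketched as follows. This is a hedged Python illustration rather than the MATLAB code of Nunn et al. (2006): the dictionary representation of the lattice, the separate colonization draw for each empty neighbour, the choice of a random occupied neighbour as recipient and all function and argument names are assumptions. Traits are donated as a group (the linked case), the correlation between traits is ignored and no phylogeny is recorded.

```python
import random


def neighbours(row, col, n_rows, n_cols):
    """Cells sharing a flat side with (row, col); the hard edge means no wrap-around."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < n_rows and 0 <= c < n_cols]


def step(lattice, n_rows, n_cols, p_extinction, p_colonize, p_donation, sd, rng):
    """Advance one generation. `lattice` maps (row, col) -> list of trait values."""
    # Step 1: extinction, never applied to the last remaining society.
    for cell in list(lattice):
        if len(lattice) > 1 and rng.random() < p_extinction:
            del lattice[cell]
    # Step 2: colonization of empty neighbouring cells; the daughter society
    # inherits the mother's trait values (vertical transmission).
    for cell in list(lattice):
        for nb in neighbours(*cell, n_rows, n_cols):
            if nb not in lattice and rng.random() < p_colonize:
                lattice[nb] = list(lattice[cell])
    # Step 3: horizontal donation; the donor's values replace the recipient's.
    for cell in list(lattice):
        occupied = [nb for nb in neighbours(*cell, n_rows, n_cols) if nb in lattice]
        if occupied and rng.random() < p_donation:
            lattice[rng.choice(occupied)] = list(lattice[cell])
    # Step 4: trait evolution as independent Brownian increments
    # (sd is the square root of the per-generation variance).
    for cell in lattice:
        lattice[cell] = [x + rng.gauss(0.0, sd) for x in lattice[cell]]
    return lattice


def simulate(n_rows, n_cols, generations, p_extinction, p_colonize, p_donation,
             sd, n_traits=2, seed=0):
    """Run the sketch from a single founding society and return the final lattice."""
    rng = random.Random(seed)
    # Founder placed in the middle row of the leftmost column.
    lattice = {(n_rows // 2, 0): [0.0] * n_traits}
    for _ in range(generations):
        step(lattice, n_rows, n_cols, p_extinction, p_colonize, p_donation, sd, rng)
    return lattice


final = simulate(3, 4, generations=50, p_extinction=0.05, p_colonize=0.5,
                 p_donation=0.1, sd=1.0, seed=1)
print(len(final), "occupied cells")
```

Storing only occupied cells in a dictionary keeps empty niches implicit, and setting p_donation to zero gives a purely vertical baseline against which runs with horizontal donation could be compared.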
(b) Initializing and running the model along discrete time steps Figure 1 provides an overview of the simulation processes part way through a simulation run on a small
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
3810
C. L. Nunn et al.
Simulating trait evolution
Table 1. Parameters that can be varied in the simulation model.
parameter      description
R, C           number of rows and number of columns in spatial matrix
pextinction    probability that a society goes extinct per time step
pdonation      probability that a society donates a trait to an adjacent society per time step
pcolonize      probability that a society colonizes an adjacent open cell per time step
r              correlation between continuously varying traits
N              number of traits simulated
pevolution     rate of evolutionary change (variance of trait change for continuous traits, mutation rate for discrete traits)
3 × 4 matrix. The parameters used in the simulation are summarized in table 1. The simulation begins with an empty matrix and a single society on the leftmost column in the middle row of the matrix (e.g. row = 2, column = 1 in figure 1). Extinction, colonization of empty cells, horizontal trait donation and trait evolution occur sequentially and stochastically in discrete generations (i.e. time steps). Thus, in each generation, the following events can occur for each society, in the order described below.
Step 1: Extinction. Societies go extinct based on a user-defined probability of extinction (pextinction). Higher extinction rates increase the number of empty cells, and thus provide new niches for neighbouring societies to fill; this is important because it affects the degree to which diversification events occur close to the tips of the tree (see Nee et al. 1994; Nunn et al. 2006). To avoid the possibility of an entirely empty matrix, the probability of extinction is set to zero when only one cell of the matrix is occupied, including the first generation of a simulation run.
Step 2: Colonization. Neighbouring cells available for colonization are identified as empty cells on the ‘flat’ sides of a given society’s cell (rather than cells attached by their corners). Thus, a society may possess a maximum of four neighbours, with societies on the edges of the matrix having fewer neighbours. Societies that colonize adjacent cells are treated as distinct societies in the next generation. Societies colonize neighbouring cells with probability pcolonize, and as this occurs, the evolutionary relationships are recorded as a bifurcating tree. The program updates branch lengths by one unit in each generation of the simulation. When a society colonizes more than one cell in a generation, the relationship among societies is randomly resolved with short branch lengths (= 0.001).
Step 3: Horizontal trait donation.
Trait values spread among neighbours based on a user-defined probability that a society donates traits to one of its neighbours (pdonation). The values of traits in the recipient are replaced by values from the donor. Nunn et al. (2006) focused on results with traits transferred as a pair during horizontal transmission (i.e. if X1 moves, so does X2), applying the probability of trait donation to the paired movement of traits rather than independently
for each trait (an issue that is discussed below and by Currie et al. 2010). For a society with more than one neighbour, more than one horizontal donation could take place in a single generation. To deal with this possibility, the probability that at least one donation event takes place for a potential recipient society was calculated using the binomial theorem based on the number of neighbours. If the condition for trait donation was met for a recipient society, the society donating the trait was randomly assigned from among the recipient’s neighbours. This means that the rate of horizontal transmission increases with the number of neighbours (as expected in real-world data) and that the spatial configuration of societies will impact the overall rate of horizontal transmission (e.g. by influencing the number of societies on edges of the matrix, as these will have fewer neighbours). Transfers of traits among all societies in the matrix were implemented simultaneously in a given generation after all cells were examined for possible trait transfer.
Step 4: Trait evolution. Evolutionary change in the traits occurs at the end of each generation. For continuously varying traits, one common model of trait evolution involves Brownian motion (Felsenstein 1985), with the user identifying variance in trait change per generation and the degree of correlation between characters. These changes are calculated for each society and added to existing trait values (indicated by ‘ΔX’ in figure 1). For discretely varying traits with two states, trait change can be modelled as a mutation rate that reflects the probability of switching between states at each time step. When simulating discrete traits, we assume that the probability of gains (0 to 1) equals the probability of losses (1 to 0).
(c) Output from a simulation run: the ‘true’ tree
When constructing a simulation program, the user is essentially playing ‘god’, and this allows him or her to decide what data to collect.
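The donation probability in Step 3 and the trait changes in Step 4 can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: `p_receive` assumes that each neighbour donates independently, so that the chance of at least one donation is 1 − (1 − pdonation)^k for k neighbours; `brownian_step` assumes normally distributed increments with a common variance for both traits; all function names are hypothetical.

```python
import math
import random


def p_receive(p_donation, n_neighbours):
    """Probability that at least one of n_neighbours donates (assumed independent)."""
    return 1.0 - (1.0 - p_donation) ** n_neighbours


def brownian_step(x1, x2, variance, r, rng):
    """Add normal increments with the given variance and correlation r to two traits."""
    sd = math.sqrt(variance)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    d1 = sd * z1
    # Mixing z1 and z2 gives increments with correlation r between the traits.
    d2 = sd * (r * z1 + math.sqrt(1.0 - r * r) * z2)
    return x1 + d1, x2 + d2


def flip_step(state, mutation_rate, rng):
    """Two-state trait: gains (0 to 1) and losses (1 to 0) share one rate."""
    return 1 - state if rng.random() < mutation_rate else state


rng = random.Random(1)
p_four = p_receive(0.15, 4)                       # four neighbours, pdonation = 0.15
x1, x2 = brownian_step(0.0, 0.0, 0.02, 0.0, rng)  # uncorrelated traits, variance 0.02
```

With pdonation = 0.15 and four neighbours, `p_receive` returns approximately 0.48, which agrees with the value quoted for that setting in the results on horizontal transmission.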
Along with data on the traits, their geographical distribution and the simulation parameters, the simulation program also retains the history of splitting by the societies. This reflects the true phylogeny in the sense of the actual splitting of lineages and their times to last common ancestors. While we acknowledge that the true phylogeny is never known for real data, the true tree from the model can be viewed as analogous to a tree that is generated from independent data, such as linguistic data (Mace & Pagel 1994; Gray & Atkinson 2003; Gray et al. 2010). We will see that having independent information on historical relationships can often provide deeper insights into cultural transmission and evolution.
3. HORIZONTAL TRANSMISSION AND CORRELATED EVOLUTION
(a) Results
Nunn et al. (2006) used the simulation framework to investigate the effects of horizontal transmission on phylogenetically independent contrasts, which is a method that can be used to study whether two or more traits show correlated evolution (Felsenstein 1985; Harvey & Pagel 1991). Briefly, independent contrasts are calculated as differences in trait values
[Figure 2: three columns of panels (6 × 6, 3 × 12 and 1 × 36 matrices) in two rows (pextinction = 0.02, top; pextinction = 0.08, bottom); x-axis: pdonation (0 to 0.16); y-axis: type I error rate (0 to 0.6).]
Figure 2. Horizontal transmission and independent contrasts. Horizontal dotted line indicates expected type I error rate (α = 0.05). Plots show how the probability of trait donation (pdonation) affects type I error rates for independent contrasts (black circles) and in non-phylogenetic analyses (circles) for pextinction = 0.02 (top) and 0.08 (bottom row). There was no correlation between trait changes in these simulations of continuously varying characters (r = 0). Different plots reflect different combinations of spatial configuration (i.e. matrix dimensions) and probability of extinction (pextinction), and values plotted reflect the proportion of results in a given simulation run (n = 1000) in which a significant association was found between traits X and Y. Simulations were run with colonization probability (pcolonize) of 0.96 and a trait change variance of 0.02.
among lineages that share a most recent common ancestor, standardized for evolutionary time. Nunn et al. (2006) investigated whether an increasing probability of horizontal transmission reduces the statistical performance of independent contrasts, focusing on type I errors (incorrectly rejecting true null hypotheses of no association between traits). The authors thus simulated a wide range of values for the probability of trait donation (pdonation) to neighbours (in addition to other parameters, see Nunn et al. 2006). Most of the simulations were concentrated within a range of donation probabilities from 0 to 0.06. By simulating evolution with pdonation = 0, only vertical transmission occurred; thus, under this parameter setting, phylogeny-based methods were expected to produce results that match previous simulation studies that investigated the statistical performance of independent contrasts for biological traits (Martins & Garland 1991; Purvis et al. 1994). At the highest probability of donation (pdonation = 0.15) and four neighbours, the actual probability of receiving a trait from one or more neighbours in a given generation equalled 0.48 (based on the binomial theorem, see above). Such high rates are unlikely in most real-world datasets, but are not unreasonable, as suggested by examples such as the rapid spread of horses among New World peoples (Roe 1955), or the movement of Islam across many parts of west (Trimmingham 1970) and east Africa (Ensminger 1997). Simulations were run for 60 generations, and analyses were conducted only on full matrices of 36 societies (see Nunn et al. 2006 for details). When considering traits that were entirely uncorrelated, Nunn et al. (2006) found that type I error rates
increased with increasing probability of horizontal transmission (figure 2). It is often useful to compare methods that incorporate phylogenetic information with analyses that do not take this information into account. This comparison revealed that non-phylogenetic tests have higher type I errors than when controlling for phylogeny using independent contrasts, especially at low levels of horizontal transmission. In some simulations, type I errors for non-phylogenetic tests were lower than for analyses based on independent contrasts, but error rates were still extremely high (figure 2).
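For readers unfamiliar with the method, the calculation of standardized contrasts can be sketched as below. This follows the standard pruning algorithm of Felsenstein (1985) for a single continuous trait on a rooted bifurcating tree; it is not the software used in the simulations, and the tree encoding (nested tuples), the function name `contrasts` and the example values are assumptions made for illustration.

```python
import math


def contrasts(node, values, out):
    """Append standardized contrasts to out; return (node value, adjusted branch length).

    A tip is (name, branch_length); an internal node is (left, right, branch_length).
    Branch lengths below tips must be positive.
    """
    if len(node) == 2:
        name, length = node
        return values[name], length
    left, right, length = node
    x_l, v_l = contrasts(left, values, out)
    x_r, v_r = contrasts(right, values, out)
    # Difference between sister lineages, standardized by expected variance.
    out.append((x_l - x_r) / math.sqrt(v_l + v_r))
    # Ancestral estimate is a weighted mean; its branch is lengthened to
    # reflect the uncertainty of that estimate.
    x_node = (x_l / v_l + x_r / v_r) / (1.0 / v_l + 1.0 / v_r)
    return x_node, length + v_l * v_r / (v_l + v_r)


# Hypothetical four-society tree ((A,B),(C,D)) with unit branch lengths.
tree = ((("A", 1.0), ("B", 1.0), 1.0), (("C", 1.0), ("D", 1.0), 1.0), 0.0)
trait_x = {"A": 1.0, "B": 3.0, "C": 6.0, "D": 10.0}
result = []
contrasts(tree, trait_x, result)   # one contrast per internal node
```

A tree of n tips yields n − 1 contrasts per trait; correlated evolution is then usually assessed by regressing the contrasts of one trait on those of another through the origin.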
(b) Application and further considerations
Simulation results are dependent on the assumptions that were made in simulating and analysing the data; thus, it is important to critically evaluate these assumptions. Of particular importance in this regard, Nunn et al. (2006) assumed that traits are donated as a pair. This is a useful starting point, as it is analogous to the paired transmission of traits during vertical transmission and it means that both traits have the same evolutionary and geographical history. Currie et al. (2010) used results based on a different assumption, namely that traits are transmitted independently during donation events, which is also a reasonable assumption for many human cultural traits. In contrast to the patterns in figure 2, they found that type I error rates are not elevated by horizontal transmission. Thus, independent contrasts is a valid method when the traits are transmitted vertically or, for situations in which horizontal transmission occurs, when we have reason to believe that the traits are transmitted
[Figure 3: histograms of correlation coefficients for (a) paired and (b) unpaired transmission; x-axis from −1.0 to 1.0; y-axis: frequency (0 to 600).]
Figure 3. Causes of higher type I error rates. Distribution of correlation coefficient statistics for (a) paired and (b) unpaired transmission of continuously varying traits across 5000 simulations. Paired transmission of traits during horizontal transmission events does not result in biased parameter estimates (mean value for paired transmission = 0.0064, versus −0.002 for unpaired transmission). Instead, the elevated type I error rates for paired compared with unpaired trait transmission result from a wider expected distribution of statistics (s.d. of 0.2996 and 0.1724 for paired and unpaired transmission, respectively). Results are from a 6 × 6 matrix with probability of extinction of 0.08, horizontal donation of 0.01, no correlation among the traits, variance of trait change of 0.02 and 60 generations.
independently. The independence of traits will need to be determined on a case-by-case basis. As to why the higher type I error rates emerge under paired transmission of traits, it could be due to one of two factors. First, the paired transmission could artificially create a correlation where none exists simply by the maintenance of traits with similar values across data points. This involves biased estimates of the correlation coefficient, and we would expect the correlation between the traits from the simulation to be centred above zero. Second, it could be that violations of the assumptions of independent contrasts, including an incorrect topology and model of evolution, create distributions of statistics that are too wide, as found in previous phylogenetic simulations (Martins & Garland 1991; Symonds 2002). In fact, we have good reason to believe that the assumptions of independent contrasts are violated with regard to the tree topology. More specifically, horizontal transmission under paired transmission results in a tree for the traits that differs from the true tree of societal splitting; yet the contrasts are calculated on the true tree. Consistent with this expectation, we find that paired horizontal transmission does not create bias, but does alter the distribution of statistics from that expected (figure 3; F4999,4999 = 3.02, p < 0.0001). Thus, it is the underlying assumption violations about the tree that cause the elevation in type I error rates, rather than bias originating from the design of the simulation model.
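As a check on the arithmetic, and assuming that the reported F statistic is the ratio of the two sample variances (an interpretation, since the test is not described further), the standard deviations given in the caption of figure 3 reproduce the reported value:

```latex
F = \frac{s_{\mathrm{paired}}^{2}}{s_{\mathrm{unpaired}}^{2}}
  = \left(\frac{0.2996}{0.1724}\right)^{2}
  \approx 3.02
```

The two distributions have nearly identical means, so the difference between them lies almost entirely in their spread, which is roughly three times greater (in variance) under paired transmission.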
4. INFERRING VERTICAL TRANSMISSION BASED ON LEVELS OF HOMOLOGY
Given the sensitivity of independent contrasts to some forms of horizontal transmission, it is critical to identify whether the degree of vertical transmission in cultural datasets is sufficient for applying phylogenetic methods. In this section, we use the simulation model to assess whether methods designed to detect homology in biological datasets can be used to assess the degree of vertical transmission in cultural data. Importantly, for what follows, the tree topology might be the most parsimonious tree for the cultural data matrix in question (hereafter called the ‘parsimony’ tree), or it could be a tree based upon separately analysed genetic or linguistic data (hereafter called the true tree, subject to the caveats given above). For a simulation study of Mantel tests to quantify vertical and horizontal signal, see Nunn et al. (2006).
(a) Two measures of homology
Episodes of independent evolutionary change (also known as evolutionary convergence or homoplasy) are expected to reduce phylogenetic signal in a dataset. Horizontal transmission is also expected to reduce the phylogenetic signal in the data, effectively resulting in more homoplasy. The consistency index (CI) and the retention index (RI) are two statistics used in analyses of discrete biological data to assess phylogenetic signal. Both can be calculated for individual characters on a given tree, or as used here, an ‘ensemble’ metric calculated across multiple characters. The CI is measured as m/s, with m being the minimum number of parsimony steps possible for a given character on a tree completely congruent with it, and s being the minimum steps required on a given tree topology (Kluge & Farris 1969; Naylor & Kraus 1995). Thus, for a two-state character, m = 1; for a three-state character, m = 2 and so on. Consequently, as the level of homoplasy increases for a character or character set, s goes up but m remains constant, and the CI is reduced. In principle, a CI of 1.0 reflects complete character-state homology on a tree, while a CI of 0.0 indicates a total absence of homology; in reality, however, the CI has an effective lower bound of about 0.38 (Archie 1990). CI values are not a simple reflection of character evolution alone, however, as they are known to decrease dramatically as a function of the number of taxa (Archie 1990). Additionally, the ensemble CI decreases, though less extremely, as character number increases (Archie 1989, 1990; Sanderson & Donoghue 1989). The RI is measured as (g − s)/(g − m). The parameters s and m in this equation are the same as in the CI calculation, while g indicates the highest number of parsimony steps that a character could possibly exhibit on any tree for the number of taxa in question (Farris 1989; Naylor & Kraus 1995). The RI in principle ranges from 0 to 1, with 1 indicating that all state distributions are homologous.
As compared with the CI, the RI is substantially less sensitive to the number of taxa studied (Archie 1990). The logic underlying the application of the RI and CI to cultural data is that if cultural traits are consistently transmitted from parent to daughter populations,
we should find that cultural traits produce good trees with high levels of homology. Thus, a high RI or CI is taken as consistent with some degree of vertical transmission, which in turn is interpreted to indicate the preponderance of phylogenesis over ethnogenesis in the data. These statistics are among the most commonly used in anthropology to infer the degree of vertical transmission in cultural datasets. For example, Collard et al. (2006) calculated the RI for 21 biological datasets and 20 cultural datasets (to which they added one additional published RI statistic, giving 21 cultural datasets). They reasoned that if a bifurcating tree model can account for cultural data, the RIs of cultural data should be similar to those of biological data, and this is exactly what they found: the average RI of the biological data was 0.61, while the average RI of the cultural data was 0.59. In another study that used the CI, Tehrani & Collard (2002) tested whether horizontal or vertical descent best describes decorative characteristics of Turkmen textiles. Their primary goal was to test the assertion that ethnogenesis is the predominant transmission mode for human cultural traits (e.g. Moore 1994). Using design characters from the period before Russian domination of the Turkmen, the CI was calculated at 0.68, suggesting that most of the characters are shared through common descent. Relatively similar patterns (with perhaps slightly more ethnogenesis) were found in the period following conquest of Central Asia by Tsarist Russians (Tehrani & Collard 2002). For additional examples, see Jordan & Shennan (2003) and Lycett et al. (2007).
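To make the definitions of m, s and g concrete, the ensemble CI and RI for unordered discrete characters on a fixed tree can be computed along the following lines. This is a minimal sketch rather than the PAUP* procedure used in published analyses: s is counted with the Fitch algorithm on a rooted bifurcating tree, g is taken as the number of taxa minus the frequency of the most common state, and the tree encoding and function names are illustrative assumptions.

```python
def fitch_steps(node, states):
    """Return (possible state set, parsimony steps) for a tree of nested 2-tuples."""
    if isinstance(node, str):          # a tip, identified by its name
        return {states[node]}, 0
    left_set, left_steps = fitch_steps(node[0], states)
    right_set, right_steps = fitch_steps(node[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # No shared state between the daughters: one extra change is required.
    return left_set | right_set, left_steps + right_steps + 1


def ensemble_ci_ri(tree, characters):
    """Ensemble CI = sum(m)/sum(s); ensemble RI = (sum(g)-sum(s))/(sum(g)-sum(m)).

    Assumes at least one parsimony-informative character, so that neither
    denominator is zero.
    """
    m_sum = s_sum = g_sum = 0
    for states in characters:
        counts = {}
        for value in states.values():
            counts[value] = counts.get(value, 0) + 1
        m_sum += len(counts) - 1                      # minimum conceivable steps
        g_sum += len(states) - max(counts.values())   # maximum steps on any tree
        s_sum += fitch_steps(tree, states)[1]         # steps on this tree
    return m_sum / s_sum, (g_sum - s_sum) / (g_sum - m_sum)


# Hypothetical data: four societies, two binary traits.
tree = (("A", "B"), ("C", "D"))
congruent = {"A": 0, "B": 0, "C": 1, "D": 1}     # fits the tree: s = 1
homoplastic = {"A": 0, "B": 1, "C": 0, "D": 1}   # conflicts with it: s = 2
ci, ri = ensemble_ci_ri(tree, [congruent, homoplastic])
```

In this toy case the sums are m = 2, s = 3 and g = 4, giving a CI of 2/3 and an RI of 0.5.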
(b) Concerns about using the consistency index and retention index with cultural data
How well do the CI and RI detect vertical and horizontal transmission, and can they provide definitive evidence for either? Several important issues concerning the use of the CI and RI have yet to be resolved. First, these statistics were designed to assess the degree of homology in biological data, and increases in rates of evolution can lead to lower homology, i.e. a lower CI or RI. We might thus expect that traits will show less tree-like signal when the rate of evolution is high, even if no horizontal transmission occurs. This means that low values of the CI and RI do not necessarily indicate horizontal transmission; they could instead indicate that evolutionary rates are high. Second, if a large number of traits in the analysis are borrowed as a cultural ‘package’ (Boyd et al. 1997; Jones 2003), then parsimony inference will tend to produce a tree with low homoplasy. Such traits will tend to show a strong tree-like structure, albeit one that differs from the history of other cultural or genetic traits and despite possibly extensive borrowing. Indeed, some authors (e.g. O’Brien et al. 2001) have pointed out that cultural packages would still have a true tree-like structure to themselves that is not eroded by diffusion and cultural ‘hybridization’. If we have an independently derived tree, it becomes possible to evaluate the CI and RI for the traits on this true tree, and this should help to discern instances of borrowing.
Table 2. Parameter ranges investigated in CI and RI simulations. (n.a., not applicable.)
parameter                         parameter values (range)
maximum number of societies(a)    36–100 societies, arranged as square lattices (i.e. number of rows = number of columns)
pextinction                       0.001–0.1
pdonation                         0.001–0.1
pcolonize                         0.1–0.95
r                                 n.a.
N                                 100–1000
pevolution                        0.001–0.1
(a) This is a maximum because some cells may be empty at the end of a simulation. The actual number of taxa ranged from 21 to 100.
Lastly, both statistics lack a firm statistical framework for deciding on the statistical importance of a particular value. For example, is a CI of 0.5 for a set of traits significantly higher or lower than expected, and does it constitute evidence for horizontal or vertical transmission? All that can be done is to compare values with those obtained from biological traits or systems that have better understood properties (Collard et al. 2006). In summary, we can imagine scenarios where a low CI occurs in the absence of horizontal transmission, and other scenarios where a high CI occurs with high levels of horizontal transmission. Taking the low CI first, this could simply reflect homoplasy, i.e. the independent innovation of cultural traits, rather than horizontal transmission, owing to high rates of cultural evolution. A high CI might reflect lower rates of evolution, for example if the number of societies chosen is small and relatively homogeneous with respect to the traits in question. Moreover, horizontal transmission can produce a high CI on a tree inferred from traits that are borrowed as a package (Borgerhoff Mulder et al. 2006); in such a case, the phylogenetic signal will remain, but the true tree will differ from the tree that is reconstructed from cultural traits. Lastly, even if we can overcome these problems, how should we decide whether a given value is statistically significant, and should it be compared with the null hypothesis of maximal homoplasy or perfect homology?
(c) Applying the simulation model to study the consistency index and retention index
In light of these concerns, we altered the simulation model to test whether the RI and CI can detect horizontal transmission (MATLAB 7.0, Mathworks, Inc.). We also investigated other variables that might influence the calculation of the CI and RI (table 2).
Thus, in addition to rates of trait donation to neighbours, we included rates of evolution, as a higher rate of evolution should increase homoplasy in the data (thus reducing the CI and RI). Similarly, we examined variation in the rate of extinction, the number of societies, the number of traits and the rate of colonization into empty niches. In all of these simulations, the discretely varying traits underwent independent horizontal transmission (i.e. unpaired transmission).
In terms of output, we focused on four measures. The first two represent the CI and RI as typically calculated in cross-cultural research. Specifically, we used the trait data from the simulation model to generate phylogenies using parsimony. For this, we modified the simulation program so that it produced NEXUS files (Maddison et al. 1997), which we then analysed in PAUP* v. 4.0 Beta 10 (Swofford 2003). For each NEXUS file, we excluded parsimony-uninformative characters and performed a heuristic search. We then selected at random one of the most parsimonious trees, and we calculated the CI and RI on this tree; these were ensemble statistics, calculated across the entire dataset from a simulation run. For the other two output measures, we calculated the CI and RI on the tree recorded from the simulation (i.e. the true tree). We expected that this might improve our ability to detect the signal of horizontal transmission, which should create differences between the parsimony tree and the true tree recorded in the simulation. We faced two challenges in generating and analysing the data: the first involved effectively exploring the six-dimensional space of parameters that we varied, and the second involved analysing the output in a way that can account for possible interactions among variables. To explore parameter space, we used Latin hypercube sampling, which is a type of stratified Monte Carlo sampling that has been used in epidemiological modelling and is more efficient in this context than random sampling regimes (Seaholm et al. 1988; Blower & Dowlatabadi 1994; Rushton et al. 2000). Our Latin hypercube sample drew values from the ranges in table 2 ten times, with 500 samples in each set, giving a total of 5000 simulation runs. To analyse the output, we used regression trees (De’ath & Fabricius 2000). This statistical method produces graphical output in the form of a decision tree that predicts the outcome of a simulation with particular parameter values.
Methodologically, it repeatedly splits the data into homogeneous groups according to the six parameters used in our simulation. The advantages of regression tree analysis in the context of analysing simulation output are many, including its ability to deal with nonlinear effects and higher order interactions, and the ease of interpreting the graphical output. In addition, it does not rely on p-values, which can be highly significant with the large sample sizes used here, yet give little explanatory value. Regression trees were calculated using the Statistics Toolbox in MATLAB. We split impure nodes when the number of observations for that node was 1000. After creating an initial tree using the simulation output, we used 10-fold cross-validation to identify the pruning level with the minimal cost (De’ath & Fabricius 2000), identified as the tree with the minimum error rate. Using this pruned tree, we calculated the percentage of variance explained by comparing predicted and observed values.
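The sampling scheme can be illustrated with a short sketch. It is not the code used for the simulations: it assumes uniform sampling within each stratum, treats every parameter as continuous (the number of traits and the lattice size would need rounding to integers), and the names `RANGES` and `latin_hypercube` are hypothetical.

```python
import random

# Ranges taken from table 2; lattice size is omitted here for brevity.
RANGES = {
    "p_extinction": (0.001, 0.1),
    "p_donation": (0.001, 0.1),
    "p_colonize": (0.1, 0.95),
    "p_evolution": (0.001, 0.1),
    "n_traits": (100.0, 1000.0),
}


def latin_hypercube(ranges, n_samples, rng):
    """Return n_samples parameter sets, one value per stratum for each parameter."""
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n_samples
        # One uniform draw inside each of n_samples equal-width strata.
        points = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffling pairs the strata at random across parameters.
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


rng = random.Random(0)
# Ten independent sets of 500 samples give 5000 parameter combinations.
runs = [s for _ in range(10) for s in latin_hypercube(RANGES, 500, rng)]
```

Because each stratum of each parameter is used exactly once per set, the full range of every parameter is covered even with a modest number of runs, which is the efficiency gain over simple random sampling noted above.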
(d) Results
We first examined the regression trees for the CI and RI calculated from the simulated data. As shown in
figure 4a, the primary predictor of the CI was the number of societies, with increasing number of societies leading to a lower CI (Archie 1990). The rate of evolution occurs at the second level in the regression tree, with higher rates of evolution leading to a lower CI. The model explained 83 per cent of the variation in CI scores, and horizontal transmission was not included in the regression tree. Figure 4b shows the regression tree for the RI, which is thought to be less sensitive to the number of taxa. In this case, the rate of evolution was the primary factor influencing the RI, with a higher rate of evolution leading to a lower RI (i.e. more homoplasy). The model accounted for 77 per cent of the variation in RI scores, and horizontal transmission was not included in the regression tree. We also examined the simulation parameters from the simulations that yielded the top 1 per cent of RI values (i.e. the 50 highest RIs in our simulation). The highest 1 per cent of RI scores ranged from 0.62 to 0.85. Relative to the values simulated, we found that these were from simulations with low rates of evolution (median = 0.003), a lower probability of donation (median = 0.01) and higher extinction rates (median = 0.08); importantly, however, four of the simulations with the highest RIs had values of horizontal transmission greater than 0.05. Other variables did not show an obvious association with the highest RIs. On the whole, this suggests that an RI above about 0.60 is usually indicative of a high degree of vertical transmission (phylogenesis) and a low degree of horizontal transmission (ethnogenesis). Next, we examined whether the CI and RI are better able to detect horizontal transmission when using the true tree generated in the simulation (as compared with the parsimony tree). The results for the CI revealed that the number of taxa was again the primary variable on the regression tree, with tertiary levels involving rates of evolution.
This model explained 55 per cent of the variation, and the probability of donation was not included in the regression tree. Results for the RI are shown in figure 4c. The regression tree model explained 66 per cent of the variation and revealed that rates of evolution and extinction were the primary factors influencing RI values. Again, horizontal transmission was not included in the regression tree models. The importance of evolutionary rate is further illustrated in the bivariate plots in figure 5, which show that rates of evolution explain more of the variation in the RI than does probability of trait donation. We again examined the parameters of the simulations that yielded the top 1 per cent of RIs calculated from the true tree, which ranged from 0.58 to 0.85. In this case, we found that the top RI values were associated with very low rates of evolution (median = 0.004), and also generally high rates of extinction (median = 0.08) and low rates of horizontal transmission (as compared with the range of values used in our simulations, table 2). In the latter case, the median probability of horizontal donation was 0.009, with only one of 50 simulations having a value greater than 0.05.
[Figure 4: regression tree diagrams, panels (a) to (d); split variables shown include number of societies, rate of evolution, probability of extinction, number of traits and probability of horizontal donation.]
Figure 4. Regression tree analyses. Regression trees for the (a) CI and (b) RI using phylogenies from the parsimony analyses of the simulated cultural trait data; (c) RI using phylogenies from the ‘true tree’ saved in the simulation program; and (d) the difference in the RI between the parsimony and ‘true’ trees.
[Figure 5: two scatter plots of the RI on the ‘true’ tree (y-axis, 0.3 to 0.8) against (a) rate of evolution and (b) probability of donation (x-axes, 0 to 0.1).]
Figure 5. The RI on the true tree. The RI declines with increasing rate of evolution (a), whereas the RI shows only a weak relationship with the probability of horizontal donation (b). Based on 5000 simulations.
[Figure 6: scatter plot of the RI on the parsimony tree (y-axis, 0.3 to 0.8) against the RI on the true tree (x-axis, 0.3 to 0.8).]
Figure 6. The RI on the parsimony tree and true tree. The RI is higher on the parsimony tree, as compared with the true tree. The dashed line indicates equal values and thus goes through the origin. Based on 5000 simulations.
Table 3. Predictors of RI when horizontal transmission is allowed.
term               estimate    t-statistic    p-value
intercept          0.20        89.7           <0.0001
RIno_horizontal    0.47        95.5           <0.0001
pdonation          −0.24       −12.8          <0.0001
Results from a general linear model: RIhorizontal = intercept + b1(RIno_horizontal) + b2(pdonation). R² for the full model is 0.79. Results are based on the RI calculated using the ‘true’ tree.
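As a worked reading of the general linear model in table 3, the fitted equation can be evaluated for a single case. The input values below are chosen for illustration only (an RI of 0.411 without horizontal transmission and a donation probability of 0.05); they are assumptions for the example, not results reported for any particular simulation run.

```latex
\begin{align*}
\widehat{RI}_{\mathrm{horizontal}}
  &= 0.20 + 0.47\,RI_{\mathrm{no\_horizontal}} - 0.24\,p_{\mathrm{donation}} \\
  &= 0.20 + 0.47(0.411) - 0.24(0.05) \\
  &\approx 0.381
\end{align*}
```

The negative coefficient on pdonation means that, for a fixed RI in the absence of horizontal transmission, each increase of 0.01 in the donation probability lowers the predicted RI by about 0.0024.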
We found that the RI for the parsimony tree was always higher than the RI on the true tree, as expected given that the simulated data were used both to generate the parsimony tree and to calculate the RI (figure 6). We also found that increasing horizontal transmission is associated with a greater divergence between the RI calculated from the parsimony tree (RIparsimony_tree) and the RI calculated from the true tree (RItrue_tree), with the RI for the true tree tending to decline relative to the RI for the parsimony tree as horizontal transmission increases (b = −0.251, F1,4998 = 964, p < 0.0001). We therefore re-ran the analyses using the difference in RIs, calculated as RIparsimony_tree − RItrue_tree. The resulting regression tree model explained 57 per cent of the variation in RI differences, and for the first time in any of the models, the probability of horizontal transmission was included in the regression tree (figure 4d). Higher probabilities of donation tended to lead to a greater divergence in the RI scores between the trees, at least when the number of traits exceeded 266. Additional variables were included in the model, and this result emphasizes the importance of having external information on the history of societies, such as a linguistic tree (Mace & Pagel 1994).
Given that the simulations with the highest RI for both the parsimony tree and the true tree were characterized by low rates of evolution and low horizontal transmission, it seems relatively safe to conclude that a high RI is consistent with low horizontal transmission and a high degree of vertical transmission of cultural trait variation. Conversely, it is difficult to conclude that a low RI is indicative of horizontal transmission, as this is also consistent with high rates of evolution. As a last test to probe the effect of horizontal transmission, we compared the distribution of RIs when horizontal transmission takes place with the RIs from simulations in which all other parameters were identical, but no horizontal transmission was allowed. We used the RI from the true tree. On average, the RI was lower when horizontal transmission occurred (mean RI for pdonation = 0 was 0.411; mean RI for pdonation > 0 was 0.381), and in a paired t-test of the data, we found a significant effect (t = −22.4, p < 0.0001). Nonetheless, this difference was in the same direction for only 63 per cent of the pairs. In a general linear model that tested for an independent effect of horizontal transmission on the RI, we found that higher rates of horizontal transmission depressed the calculation of the RI (table 3).

(e) Applying the results to real-world data
In the new results presented here, we found that rates of evolution have a bigger impact on the RI and CI than do rates of horizontal transmission. Because the CI is sensitive to the number of societies, the RI is a better measure to use in cross-cultural research (see also Archie 1989, 1990). On a more positive note,
however, a high RI is almost always associated with relatively low rates of horizontal transmission, especially when it is calculated on a tree that reflects the true evolutionary history of a set of societies. Thus, in future work, it would be worthwhile to use phylogenetic methods to compare trees calculated from cultural traits with trees obtained using other information, such as genetic or linguistic data, with the prediction that these trees will differ to a greater extent as horizontal transmission increases.

Our simulation results have several implications for the use of homology measures in studies of cultural traits. First, our results confirm that high CI and RI values (for example, greater than 0.60) are usually indicative of low horizontal transmission and thus the distribution of cultural variation being due to phylogenesis, which is consistent with previous uses of the metrics (O’Brien et al. 2001; Tehrani & Collard 2002; Collard et al. 2006). Second, low values were not consistently associated with high levels of horizontal transmission because the CI and RI are heavily influenced by other factors, such as the rate of character evolution and the extinction rate among societies. Thus, while high values may indicate phylogenesis, low values are uninformative. Lastly, low rates of horizontal transmission, and thus a high fidelity of cultural inheritance across generations, only rarely produced a high CI or RI. This may indicate that these metrics provide very low power to detect the prevalence of phylogenesis in cultural evolution. Caution should therefore be taken before adopting these metrics as the primary hypothesis test for a study of trait transmission. Our simulation shows that further caution is needed if differences in RI values across the time depth of a tree are to be used to test whether cultural or genetic change is responsible for an observed distribution of traits.
A recent study of chimpanzee behavioural data across populations found lower RI values on a tree with deeper evolutionary divergences than on two trees of more closely related chimpanzee populations (Lycett et al. 2007). While substantial evidence exists for chimpanzee social learning capabilities and traditional behaviours (Whiten et al. 1999), the RI-based test conducted by Lycett et al. (2007) is insufficient to rule out genetic inheritance of the behavioural variations because our simulation shows that the RI is very sensitive to the rate of evolution. If the behavioural variations were genetically inherited rather than learned, a sufficiently high evolutionary rate, which could be produced by selection, would also reduce the RI on a tree with deeper phylogenetic separation. This effect would be analogous to the saturation of a fast-evolving gene with homoplasy when distantly related organisms are compared. While we might agree that genetic inheritance without social learning seems implausible for the behaviours in question, the point remains that a test based solely on the RI is insufficient.
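Because the argument in this section turns on how the CI and RI respond to homoplasy, a minimal sketch of the two indices may help. It uses the standard ensemble definitions from the cladistics literature (Kluge & Farris 1969; Farris 1989): CI = M/S and RI = (G − S)/(G − M), where M, S and G are the minimum possible, observed and maximum possible numbers of character-state changes summed over traits. The function names and the toy step counts are assumptions for illustration and are not taken from the simulations.

```python
def ensemble_ci(characters):
    """Consistency index: total minimum steps / total observed steps.

    characters: list of (min_steps, observed_steps, max_steps), one per trait.
    """
    total_min = sum(m for m, s, g in characters)
    total_obs = sum(s for m, s, g in characters)
    return total_min / total_obs


def ensemble_ri(characters):
    """Retention index: (max - observed) / (max - min), summed over traits."""
    total_min = sum(m for m, s, g in characters)
    total_obs = sum(s for m, s, g in characters)
    total_max = sum(g for m, s, g in characters)
    if total_max == total_min:
        raise ValueError("RI is undefined when no trait can show homoplasy")
    return (total_max - total_obs) / (total_max - total_min)


# Hypothetical step counts for three binary traits on some tree:
# one fits perfectly, one changes twice, one changes as often as possible.
toy_traits = [(1, 1, 3), (1, 2, 3), (1, 3, 3)]
ci = ensemble_ci(toy_traits)
ri = ensemble_ri(toy_traits)
```

Both indices fall as observed steps accumulate, whatever the cause, which is why a low value cannot by itself separate horizontal transmission from a high rate of evolution.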
5. CONCLUSIONS
This is an exciting time for cross-cultural research. Increasing availability of linguistic and genetic data is
providing new opportunities to examine trait evolution in a historical context. Simultaneously, methodological developments in evolutionary biology and anthropology are providing the tools to examine cultural trait data in a more rigorous way. The success of this enterprise will depend on how well the methods work for particular types of data and under different evolutionary conditions. Cross-cultural studies have increasingly relied on phylogenetic methods to study correlated trait evolution, reconstruct ancestral states and detect vertical or horizontal transmission. To date, however, few studies have quantitatively examined whether phylogenetic methods are appropriate for cross-cultural research. We showed how simulation approaches can be used in this endeavour, specifically to test methods, to identify the conditions under which they fail and even to explore new approaches, such as comparing the RI calculated for a parsimony tree with the RI for a tree that reflects the true history of societal splitting.

Sasha Langley played a major role in developing parts of the computer code used in this study, particularly involving the tree structures. We thank two anonymous referees for helpful comments. This paper was supported in part through NSF grants BCS-0132927 and 0323793 and Harvard University.
REFERENCES
Archie, J. W. 1989 Homoplasy excess ratios: new indices for measuring levels of homoplasy in phylogenetic systematics and a critique of the consistency index. Syst. Zool. 38, 253–269. (doi:10.2307/2992286)
Archie, J. W. 1990 Homoplasy excess statistics and retention indices: a reply to Farris. Syst. Zool. 39, 169–174. (doi:10.2307/2992454)
Billing, J. & Sherman, P. W. 1998 Antimicrobial functions of spices: why some like it hot. Q. Rev. Biol. 73, 3–49. (doi:10.1086/420058)
Bininda-Emonds, O. R. P. et al. 2007 The delayed rise of present-day mammals. Nature 446, 507–512. (doi:10.1038/nature05634)
Blower, S. M. & Dowlatabadi, H. 1994 Sensitivity and uncertainty analysis of complex models of disease transmission: an HIV model as an example. Int. Stat. Rev. 62, 229–243. (doi:10.2307/1403510)
Borgerhoff Mulder, M., George-Cramer, M., Eshleman, J. & Ortolani, A. 2001 A study of East African kinship and marriage using phylogenetically controlled comparison. Am. Anthropol. 103, 1059–1082. (doi:10.1525/aa.2001.103.4.1059)
Borgerhoff Mulder, M., Nunn, C. L. & Towner, M. C. 2006 Macroevolutionary studies of cultural trait transmission. Evol. Anthropol. 15, 52–64. (doi:10.1002/evan.20088)
Boyd, R., Borgerhoff Mulder, M., Durham, W. H. & Richerson, P. J. 1997 Are cultural phylogenies possible? In Human by nature: between biology and the social sciences (eds P. Weingart, S. D. Mitchell, P. J. Richerson & S. Maasen), pp. 355–386. Mahwah, NJ: Erlbaum.
Burton, M. L. & White, D. R. 1984 Sexual division of labor in agriculture. Am. Anthropol. 86, 568–583. (doi:10.1525/aa.1984.86.3.02a00020)
Collard, M., Shennan, S. J. & Tehrani, J. J. 2006 Branching, blending, and the evolution of cultural similarities and differences among human populations. Evol. Hum. Behav. 27, 169–184. (doi:10.1016/j.evolhumbehav.2005.07.003)
Currie, T. E., Greenhill, S. J. & Mace, R. 2010 Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits. Phil. Trans. R. Soc. B 365, 3903–3912. (doi:10.1098/rstb.2010.0014)
De’ath, G. & Fabricius, K. E. 2000 Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81, 3178–3192. (doi:10.1890/0012-9658(2000)081[3178:CARTAP]2.0.CO;2)
Diaz-Uriarte, R. & Garland, T. 1996 Testing hypotheses of correlated evolution using phylogenetically independent contrasts: sensitivity to deviations from Brownian motion. Syst. Biol. 45, 27–47. (doi:10.1093/sysbio/45.1.27)
Dow, M. M. 1984 A biparametric approach to network autocorrelation. Sociol. Methods Res. 13, 201–217. (doi:10.1177/0049124184013002002)
Dow, M. M. 1993 Saving the theory: on chi-square tests with cross-cultural survey data. Cross Cult. Res. 27, 247–276. (doi:10.1177/106939719302700305)
Dow, M. M. 2007 Galton’s problem as multiple network autocorrelation effects: cultural trait transmission and ecological constraint. Cross Cult. Res. 41, 336–363.
Ensminger, J. 1997 Transaction costs and Islam: explaining conversion in Africa. J. Inst. Theor. Econ. 153, 4–29.
Farris, J. 1989 The retention index and the rescaled consistency index. Cladistics 5, 417–419. (doi:10.1111/j.1096-0031.1989.tb00573.x)
Felsenstein, J. 1985 Phylogenies and the comparative method. Am. Nat. 125, 1–15. (doi:10.1086/284325)
Fortunato, L., Holden, C. & Mace, R. 2006 From bridewealth to dowry? A Bayesian estimation of ancestral states of marriage transfers in Indo-European groups. Hum. Nat. 17, 355–376. (doi:10.1007/s12110-006-1000-4)
Gordon, R. G. (ed.) 2005 Ethnologue: languages of the world. Dallas, TX: SIL International. See http://www.ethnologue.com.
Gray, R. D., Bryant, D. & Greenhill, S. J. 2010 On the shape and fabric of human history. Phil. Trans. R. Soc. B 365, 3923–3933. (doi:10.1098/rstb.2010.0162)
Gray, R. D. & Atkinson, Q. D. 2003 Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426, 435–439. (doi:10.1038/nature02029)
Gray, R. D. & Jordan, F. M. 2000 Language trees support the express-train sequence of Austronesian expansion. Nature 405, 1052–1055. (doi:10.1038/35016575)
Hartung, J. 1982 Polygyny and the inheritance of wealth. Curr. Anthropol. 23, 1–12. (doi:10.1086/202775)
Harvey, P. H. & Pagel, M. D. 1991 The comparative method in evolutionary biology. Oxford Series in Ecology and Evolution. Oxford, UK: Oxford University Press.
Harvey, P. H. & Rambaut, A. 2000 Comparative analyses for adaptive radiations. Phil. Trans. R. Soc. Lond. B 355, 1599–1605. (doi:10.1098/rstb.2000.0721)
Holden, C. J. 2002 Bantu language trees reflect the spread of farming across sub-Saharan Africa: a maximum-parsimony analysis. Proc. R. Soc. Lond. B 269, 793–799. (doi:10.1098/rspb.2002.1955)
Hurles, M. E., Matisoo-Smith, E., Gray, R. D. & Penny, D. 2003 Untangling oceanic settlement: the edge of the knowable. Trends Ecol. Evol. 18, 531–540. (doi:10.1016/S0169-5347(03)00245-3)
Jones, D. 2003 Kinship and deep history: exploring connections between culture areas, genes, and languages. Am. Anthropol. 105, 501–514. (doi:10.1525/aa.2003.105.3.501)
Jordan, P. & Shennan, S. J. 2003 Cultural transmission, language, and basketry traditions amongst the California Indians. J. Anthropol. Archaeol. 22, 43–74. (doi:10.1016/S0278-4165(03)00004-7)
Kluge, A. G. & Farris, J. S. 1969 Quantitative phyletics and evolution of anurans. Syst. Zool. 18, 1–32.
Lipo, C. P., O’Brien, M. J., Collard, M. & Shennan, S. (eds) 2006 Mapping our ancestors. New Brunswick, NJ: Aldine Transaction.
Lycett, S. J., Collard, M. & McGrew, W. C. 2007 Phylogenetic analyses of behavior support existence of culture among wild chimpanzees. Proc. Natl Acad. Sci. USA 104, 17 588–17 592. (doi:10.1073/pnas.0707930104)
Mace, R. & Holden, C. J. 2005 A phylogenetic approach to cultural evolution. Trends Ecol. Evol. 20, 116–121. (doi:10.1016/j.tree.2004.12.002)
Mace, R. & Pagel, M. 1994 The comparative method in anthropology. Curr. Anthropol. 35, 549–564. (doi:10.1086/204317)
Mace, R. & Pagel, M. 1995 A latitudinal gradient in the density of human languages in North America. Proc. R. Soc. Lond. B 261, 117–121. (doi:10.1098/rspb.1995.0125)
Mace, R., Holden, C. J. & Shennan, S. (eds) 2005 The evolution of cultural diversity: a phylogenetic approach. London, UK: UCL Press.
Maddison, D. R., Swofford, D. L. & Maddison, W. P. 1997 Nexus: an extensible file format for systematic information. Syst. Biol. 46, 590–621. (doi:10.2307/2413497)
Martins, E. P. & Garland, T. 1991 Phylogenetic analyses of the correlated evolution of continuous characters: a simulation study. Evolution 45, 534–557. (doi:10.2307/2409910)
Mesoudi, A., Whiten, A. & Laland, K. N. 2004 Perspective: is human cultural evolution Darwinian? Evidence reviewed from the perspective of the origin of species. Evolution 58, 1–11. (doi:10.1554/03-212)
Moore, J. H. 1994 Putting anthropology back together again: the ethnogenetic critique of cladistic theory. Am. Anthropol. 96, 925–948. (doi:10.1525/aa.1994.96.4.02a00110)
Moore, C. C. & Romney, A. K. 1994 Material culture, geographic propinquity, and linguistic affiliation on the north coast of New Guinea: a reanalysis. Am. Anthropol. 96, 370–392. (doi:10.1525/aa.1994.96.2.02a00050)
Moore, J. L., Manne, L., Brooks, T., Burgess, N. D., Davies, R., Rahbek, C., Williams, P. & Balmford, A. 2002 The distribution of cultural and biological diversity in Africa. Proc. R. Soc. Lond. B 269, 1645–1653. (doi:10.1098/rspb.2002.2075)
Moylan, J. W., Borgerhoff Mulder, M., Graham, C. M., Nunn, C. L. & Hakansson, T. 2006 Cultural traits and linguistic trees: phylogenetic signal in East Africa. In Mapping our ancestors (eds C. P. Lipo, M. J. O’Brien, M. Collard & S. Shennan), pp. 33–52. New Brunswick, NJ: Aldine Transaction.
Murdock, G. P. & White, D. 1969 Standard cross-cultural sample. Ethnology 8, 329–369. (doi:10.2307/3772907)
Naroll, R. 1961 Two solutions to Galton problem. Phil. Sci. 28, 15–39. (doi:10.1086/287778)
Naylor, G. & Kraus, F. 1995 The relationship between s and m and the retention index. Syst. Biol. 44, 559–562. (doi:10.1093/sysbio/44.4.559)
Nee, S., Holmes, E. C., May, R. M. & Harvey, P. H. 1994 Extinction rates can be estimated from molecular phylogenies. Phil. Trans. R. Soc. Lond. B 344, 77–82. (doi:10.1098/rstb.1994.0054)
Nunn, C. L. 1995 A simulation test of Smith’s ‘degrees of freedom’ correction for comparative studies. Am. J. Phys. Anthropol. 98, 355–367. (doi:10.1002/ajpa.1330980308)
Nunn, C. L. & Barton, R. A. 2001 Comparative methods for studying primate adaptation and allometry. Evol. Anthropol. 10, 81–98. (doi:10.1002/evan.1019)
Nunn, C. L., Borgerhoff Mulder, M. & Langley, S. 2006 Comparative methods for studying cultural trait evolution: a simulation study. Cross Cult. Res. 40, 177–209. (doi:10.1177/1069397105283401)
O’Brien, M. J., Darwent, J. & Lyman, R. L. 2001 Cladistics is useful for reconstructing archaeological phylogenies: Palaeoindian points from the southeastern United States. J. Archaeol. Sci. 28, 1115–1136. (doi:10.1006/jasc.2001.0681)
Pagel, M. & Mace, R. 2004 The cultural wealth of nations. Nature 428, 275–278. (doi:10.1038/428275a)
Pagel, M. D., May, R. M. & Collie, A. R. 1991 Ecological aspects of the geographical distribution and diversity of mammalian species. Am. Nat. 137, 791–815. (doi:10.1086/285194)
Price, T. 1997 Correlated evolution and independent contrasts. Phil. Trans. R. Soc. Lond. B 352, 519–529. (doi:10.1098/rstb.1997.0036)
Purvis, A., Gittleman, J. L. & Luh, H. 1994 Truth or consequences: effects of phylogenetic accuracy on two comparative methods. J. Theor. Biol. 167, 293–300. (doi:10.1006/jtbi.1994.1071)
Roe, F. G. 1955 The Indian and the horse. Norman, OK: University of Oklahoma Press.
Rushton, S. P., Lurz, P. W. W., Gurnell, J. & Fuller, R. 2000 Modelling the spatial dynamics of parapoxvirus disease in red and grey squirrels: a possible cause of the decline in the red squirrel in the UK? J. Appl. Ecol. 37, 997–1012. (doi:10.1046/j.1365-2664.2000.00553.x)
Sanderson, M. J. & Donoghue, M. J. 1989 Patterns of variation in levels of homoplasy. Evolution 43, 1781–1795. (doi:10.2307/2409392)
Seaholm, S. K., Ackerman, E. & Wu, S. C. 1988 Latin hypercube sampling and the sensitivity analysis of a Monte–Carlo epidemic model. Int. J. Biomed. Comput. 23, 97–112. (doi:10.1016/0020-7101(88)90067-0)
Sutherland, W. J. 2003 Parallel extinction risk and global distribution of languages and species. Nature 423, 276–279. (doi:10.1038/nature01607)
Swofford, D. L. 2003 PAUP*: phylogenetic analysis using parsimony (*and other methods). Sunderland, MA: Sinauer Associates.
Symonds, M. R. E. 2002 The effects of topological inaccuracy in evolutionary trees on the phylogenetic comparative method of independent contrasts. Syst. Biol. 51, 541–553. (doi:10.1080/10635150290069977)
Tehrani, J. & Collard, M. 2002 Investigating cultural evolution through biological phylogenetic analyses of Turkmen textiles. J. Anthropol. Archaeol. 21, 443–463. (doi:10.1016/S0278-4165(02)00002-8)
Terrell, J. E., Hunt, T. L. & Gosden, C. 1997 The dimensions of social life in the Pacific: human diversity and the myth of the primitive isolate. Curr. Anthropol. 38, 155–195. (doi:10.1086/204604)
Trimmingham, J. S. 1970 A history of Islam in west Africa. London, UK: Oxford University Press.
Tylor, E. B. 1889 On a method of investigating the development of institutions applied to the law of marriage and descent. J. R. Anthropol. Inst. 18, 245–272.
White, D. R., Burton, M. L. & Dow, M. M. 1981 Sexual division of labor in African agriculture: a network autocorrelation analysis. Am. Anthropol. 83, 824–849. (doi:10.1525/aa.1981.83.4.02a00040)
Whiten, A., Goodall, J., McGrew, W. C., Nishida, T., Reynolds, V., Sugiyama, Y., Tutin, C. E. G., Wrangham, R. W. & Boesch, C. 1999 Cultures in chimpanzees. Nature 399, 682–685. (doi:10.1038/21415)
Phil. Trans. R. Soc. B (2010) 365, 3821–3828 doi:10.1098/rstb.2010.0048
Measuring the diffusion of linguistic change
John Nerbonne*
Alfa-informatica, University of Groningen, PO Box 716, 9700 AS Groningen, The Netherlands

We examine situations in which linguistic changes have probably been propagated via normal contact as opposed to via conquest, recent settlement and large-scale migration. We proceed then from two simplifying assumptions: first, that all linguistic variation is the result of either diffusion or independent innovation, and, second, that we may operationalize social contact as geographical distance. It is clear that both of these assumptions are imperfect, but they allow us to examine diffusion via the distribution of linguistic variation as a function of geographical distance. Several studies in quantitative linguistics have examined this relation, starting with Séguy (Séguy 1971 Rev. Linguist. Romane 35, 335–357), and virtually all report a sublinear growth in aggregate linguistic variation as a function of geographical distance. The literature from dialectology and historical linguistics has mostly traced the diffusion of individual features, however, so that it is sensible to ask what sort of dynamic in the diffusion of individual features is compatible with Séguy’s curve. We examine some simulations of diffusion in an effort to shed light on this question.

Keywords: linguistics; dialects; diffusion

1. INTRODUCTION
We summarize our key contributions in this introductory section, and provide a guide for the rest of the paper.

(a) Key contributions
There are two core contributions of the present paper. First, we extend arguments made by Nerbonne & Heeringa (2007) that dialectometric models provide a means for measuring linguistic variation in the aggregate and thence a means for measuring the influence of geography (and other factors) on linguistic variation.
We extend these arguments by recalling Séguy’s early demonstration that there was a sublinear relation between geographical distance and lexical variation, and then by examining six novel datasets, all of which confirm the relationship, which we propose dubbing SÉGUY’S CURVE. But we also ask how the current exercise in measuring linguistic variation might engage the sociolinguistic literature, which has reflected profoundly on the mechanisms of diffusion. We therefore turn secondly to a novel simulation of linguistic diffusion in which we can manipulate the strength of accommodation owing to geography. The results of the simulation suggest that the attractive force owing to gravity decreases linearly with distance, and not quadratically as the gravity model proposes.

(b) Structure
Section 2 reviews some of the linguistic literature on the geographical diffusion of language change, in particular, Trudgill’s GRAVITY MODEL. Our point in this review is to note the need for a way of measuring diffusion and the influence geography has on it. Section 3 provides a very brief introduction to dialectometric techniques for measuring linguistic differences, and introduces SÉGUY’S CURVE of linguistic variation as a function of geography. Section 4 then reviews two recent papers exploring the gravity model using dialectometric techniques and extends their empirical base by examining six other datasets, reporting on the percentage of linguistic variation which can be explained by geography, including datasets which may not represent relatively stable settlements in which we can be sure that diffusion has worked ‘normally’. Section 5 then introduces simulations as a tool to explore the relation between the diffusion patterns of individual lexical items and those of large aggregates, drawing the conclusion that the attractive power of geography is more probably a linear force than an inverse square relation of the sort proposed by the gravity model. Finally, §6 summarizes the findings and suggests why the ideas discussed here may be of more general interest for researchers interested in the determinants of linguistic diffusion.

*[email protected]
One contribution of 14 to a Theme issue ‘Cultural and linguistic diversity: evolutionary approaches’.
2. THE SOCIOLINGUISTICS OF DIFFUSION
There is a substantial linguistic literature on diffusion which has documented and explained a large number of cases where individual features have spread, and most of the recent work has come from sociolinguists. We review this before turning to the aggregate analyses that have arisen in dialectometry.

(a) The wave theory
The locus classicus for linguists’ discussion of diffusion is Schmidt’s (1872) demonstration that there are important features that cut across the hierarchical classification of the Indo-European languages (Bloomfield 1933, pp. 312–319). Bloomfield uses Schmidt’s demonstration to argue that in addition to
cases of sharp divisions between languages demonstrated by regular correspondences and modelled by family trees of relatedness, there must also be regular processes of diffusion even between the branches of the family trees, i.e. even between differentiated varieties (Bloomfield 1933, p. 318). Bloomfield believed that speech habits were modified throughout life each time an individual entered into communication with another (Bloomfield 1933, p. 46). He, therefore, predicted that processes of diffusion would follow the lines of communication DENSITY (Bloomfield 1933, pp. 46, 326) so that lines of dialect differentiation, reflecting processes of diffusion, should ultimately be explained by the density of communication. The idea is that diffusion is enabled and promoted by more frequent communication.
(b) The gravity model
Peter Trudgill’s GRAVITY MODEL effectively recast Bloomfield’s notion of density, focusing on distance and population sizes as predictors of the chance of communication (Trudgill 1974). In these models, inspired by social geography, the spread of linguistic innovation is always via social contact, which is naturally promoted by proximity and population size. The gravity model foresees linguistic innovations not simply radiating from a centre, as they might in a pure version of the wave theory, but rather affecting larger centres first, and from there spreading to smaller ones, and so on. In the special case of landscapes with a few larger-size cities, an innovation may spread from one large population centre directly to another intermediately sized one, often by-passing smaller, geographically intermediate sites. This is owing to the role of population size. Innovations are no longer seen as rolling over the landscape uniformly as waves, but rather as passing over immediate small neighbours in favour of larger, potentially more distant settlements. For this reason it is also referred to as a CASCADE model (Labov 2001, p. 285): linguistic innovations proceed as water falling from larger pools to smaller ones, and thence to smallest. Each population centre may be seen as having a sphere of influence in which further diffusion proceeds locally. The connection to physical gravity may be appreciated if one considers the solar system, i.e. the sun, the nine planets and their moons. In understanding the movements of a given heavenly body, it is best to concentrate on the nearest very massive body. For example, even though the moon is affected by the sun’s mass, its orbit is determined almost entirely by the much closer Earth. The physical theory of gravity accounts for this by postulating a force owing to gravity which is inversely proportional to the square of the distance between bodies.
In this way very distant bodies are predicted to have much less influence than nearby ones.

There have been many reactions to the gravity model, which we cannot elaborate on here for reasons of space. We refer to our earlier paper (Nerbonne & Heeringa 2007) for discussion. In summary, research has been mixed in its reception of Trudgill’s postulation of a gravity-like effect in linguistic diffusion. There have been voices of affirmation, but also of dissent.
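The cascade effect that follows from an inverse-square attraction can be illustrated with a small calculation. The sketch below uses only the physical-gravity analogue described in this section (product of the two 'masses' over squared distance); it is not Trudgill's exact formulation, and the function name, population figures and distances are hypothetical.

```python
def gravity_attraction(pop_a, pop_b, distance_km):
    """Physical-gravity analogue: product of populations over squared distance."""
    return pop_a * pop_b / distance_km ** 2


# Hypothetical landscape: a large source city, a nearby village
# and a more distant but much larger town.
source_pop = 1_000_000
to_village = gravity_attraction(source_pop, 5_000, 10)    # small and near
to_town = gravity_attraction(source_pop, 500_000, 60)     # large and farther

# The larger, more distant town attracts the innovation more strongly,
# so the innovation is predicted to 'jump' over the village.
town_reached_first = to_town > to_village
```

Under a linear rather than quadratic decay (dividing by `distance_km` instead of its square) the advantage of the large town would be greater still, since distance would then penalize it less.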
This essay concentrates on the effect of geography, rather than population size, as both Nerbonne & Heeringa (2007) and Heeringa et al. (2007) show that geography is by far the more important factor. In contrast to most of the literature on this topic, this essay aims to measure the influence of geography on language variation in order to contribute to the discussion. Other contributions attempt no quantitative assessment of the strength of the influence, although such an assessment is possible and worthwhile. They also all require methodologically that the researchers identify one or more ongoing linguistic changes and find a way to track them, which is likewise non-trivial. We shall instead examine the residue of a large range of changes in a number of different language areas.
3. AGGREGATE (DIALECTOMETRIC) VARIATION
The remainder of this paper explores an alternative approach to studying the influence of geography on diffusion. We proceed from techniques for measuring linguistic variation, immediately obtaining the advantage of then being able to measure the influence of geography using standard (regression) designs. We shall aggregate the differences in many linguistic variables in order to strengthen their signals. We also assume that all the variation we encounter is the result of diffusion, even if we cannot identify its source. This makes it easier to apply the techniques without first studying where changes are occurring and in what direction.

But before presenting dialectometric approaches to diffusion, it is worthwhile reviewing how and why linguistic distances are measured. We do not have the time or space to review all of the background or range of techniques here, so the presentation will be sketchy. Fortunately, there are good introductions available (Goebl 1984; Heeringa 2004; Goebl 2006; Nerbonne 2009; Nerbonne & Heeringa 2009). Roughly, dialectometry attempts to distil the aggregate relations from among a set of sites by systematically comparing a large set of corresponding linguistic items (Nerbonne 2009) and measuring differences. By aggregating over a large set of corresponding items, the dialectometric procedure attempts to immunize its work against the dangers of fortuitous or biased selection of material.

The simplest dialectometric procedures analyse linguistic variation at a nominal or categorical level (Goebl 1984), at which linguistic items are either identical or not. Non-identical items contribute to the linguistic distance between sites, while identical elements do not. Various weighting schemes may be employed, as well (Nerbonne & Kleiweg 2007). Dialect similarity (s) is assayed as the fraction of overlap in the sample, and dialect dissimilarity (distance) is simply its complement, d = 1 − s.
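The categorical measure just described, d = 1 − s with s the fraction of identical items, can be sketched as follows. This is an unweighted illustration only; the function name and the toy survey responses are assumptions, and no weighting scheme of the kind mentioned above is applied.

```python
def categorical_distance(site_a, site_b):
    """Unweighted categorical dialect distance d = 1 - s.

    site_a, site_b: dicts mapping a survey item (concept) to the
    variant recorded at that site. Only items recorded at both
    sites are compared.
    """
    shared_items = [item for item in site_a if item in site_b]
    if not shared_items:
        raise ValueError("the two sites share no survey items")
    identical = sum(site_a[item] == site_b[item] for item in shared_items)
    similarity = identical / len(shared_items)
    return 1 - similarity


# Hypothetical responses for four concepts at two sites.
site_1 = {"potato": "aardappel", "girl": "meisje", "small": "klein", "pail": "emmer"}
site_2 = {"potato": "erpel", "girl": "meisje", "small": "klein", "pail": "emmer"}
d = categorical_distance(site_1, site_2)
```

Here three of the four items match, so s = 0.75 and d = 0.25; aggregating over hundreds of items is what makes the measure robust to the choice of any single one.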
Figure 1. Séguy’s (1971) plot of lexical distance, measured categorically, as a sublinear function of geography. The horizontal axis gives geographical distance in km; the fitted mean curve is labelled ‘courbe moyenne y = 36 log (x + 1)’.

Our own developments in dialectometry have emphasized the advantage of applying LEVENSHTEIN DISTANCE, also known as EDIT DISTANCE, to phonetically transcribed data. Heeringa (2004) and Nerbonne (2009) present these techniques in more detail, so that we may summarize here that the techniques enable us to measure differences in pronunciation at
a finer level than merely ‘same’ or ‘different’. Instead the technique ALIGNS the phonetic segments in a pair of word pronunciation transcriptions optimally and sums the differences between the segments in the optimal alignment. We illustrate the procedure via two aligned strings, one a transcription of the pronunciation of the word ‘afternoon’ in the American south, and the other a transcription of its pronunciation in the American north (or Midland):
The optimal alignment reveals three points of mismatch: one substitution of one version of [u] for another, one insertion creating an initial diphthong and one deletion of a syllable-final [r]. Each of these points of mismatch is associated with a cost, and the total cost is regarded as the phonetic distance between the two pronunciations. As Nerbonne (2009) shows, the procedure normally results in a consistent measure of pronunciation difference (Cronbach’s α ≥ 0.80 with at least 30 words in the sample), and the procedure has been validated with respect to dialect speakers’ judgements of dialect distance, with the result that measurements correlated well with judgements (r ≈ 0.7; Gooskens & Heeringa 2004). We shall examine various dialectological situations below, all of which were examined using either the nominal level of analysis pioneered by Séguy (1973) or Goebl (1984) or the numeric level as realized by Levenshtein distance.

(a) Séguy’s curve
The founder of dialectometry, Séguy (1971), examined the distribution of lexical distance, measured categorically (same or different variants), in the very first publication in this direction, ‘La relation entre la distance spatiale et la distance lexicale’, and he compared the resulting curve with one in which lexical distance varied with the square root of the logarithm of geographical distance. The result, as one might imagine, is a curve that shows an initial rise and then becomes quite flat. We show Séguy’s distribution in figure 1. In view of Séguy’s very early work in this direction, I propose that the sublinear curve of linguistic distance versus geographical distance be called SÉGUY’S CURVE.
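A curve of the shape described here, linguistic distance growing with the square root of the logarithm of geographical distance, can be fitted with a one-parameter least-squares estimate. The sketch below is illustrative only: the base-10 logarithm, the absence of an intercept, the function names and the toy data are all assumptions rather than details taken from Séguy's analysis.

```python
import math


def seguy_basis(distance_km):
    """sqrt(log10(x + 1)); the choice of base 10 is an assumption."""
    return math.sqrt(math.log10(distance_km + 1))


def fit_scale(distances_km, linguistic_distances):
    """Least-squares estimate of a in y = a * sqrt(log10(x + 1)), no intercept."""
    basis = [seguy_basis(x) for x in distances_km]
    numerator = sum(f * y for f, y in zip(basis, linguistic_distances))
    denominator = sum(f * f for f in basis)
    return numerator / denominator


# Hypothetical, noise-free toy data generated with a scale of 36.
toy_km = [5, 10, 20, 40, 80]
toy_y = [36 * seguy_basis(x) for x in toy_km]
a_hat = fit_scale(toy_km, toy_y)

# Sublinearity: doubling the distance far less than doubles the prediction.
at_50 = a_hat * seguy_basis(50)
at_100 = a_hat * seguy_basis(100)
```

Comparing `at_100` with `at_50` shows the flattening that characterizes the curve: most of the rise in linguistic distance happens over the first few tens of kilometres.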
Since Séguy (1971) appeared a little before Trudgill (1974), there was apparently never any early attempt to confront the two views of how geography influences linguistic variation until recently. The task of the following sections will be to show what these two views have to do with one another. Séguy’s insight is comparable to that of population geneticists, who had earlier found the same sublinear distribution of genetic diversity when viewed as a function of geography, a phenomenon they have come to call ‘isolation by distance’ (Jobling et al. 2004), tracing the idea back to work in mathematical biology of the 1940s and 1950s (Wright 1943; Malécot 1955). More recently, Holman et al. (2007) have examined the relation between geographical distance and typological distance as assayed by a set of structural features, referring to their results as establishing ‘spatial autocorrelation’.
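The alignment-based measure described at the start of this section can be illustrated with a minimal sketch of Levenshtein distance over phonetic segments. It assumes unit costs for insertion, deletion and substitution, whereas the measurements reported here may use finer-grained segment costs. The function name `levenshtein`, its parameters and the two toy transcriptions are hypothetical illustrations and are not the transcriptions used in the paper.

```python
def levenshtein(a, b, sub_cost=None, indel_cost=1.0):
    """Minimum total cost of aligning segment sequences a and b."""
    if sub_cost is None:
        # Assumption: identical segments cost 0, any other pair costs 1.
        sub_cost = lambda x, y: 0.0 if x == y else 1.0
    # prev[j] holds the cost of aligning the segments seen so far in a
    # with the first j segments of b.
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,          # delete x
                curr[j - 1] + indel_cost,      # insert y
                prev[j - 1] + sub_cost(x, y),  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy transcriptions, one list element per phonetic segment.
south = ["æ", "ə", "f", "t", "ə", "n", "ʉ", "n"]
north = ["æ", "f", "t", "ə", "r", "n", "u", "n"]
distance = levenshtein(south, north)
```

With these toy strings the cheapest alignment needs one inserted vowel, one deleted [r] and one vowel substitution, so `distance` is 3.0. Passing a different `sub_cost` function is one way to make similar segments cheaper to substitute than dissimilar ones.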
4. A DIALECTOMETRIC VIEW OF GRAVITY
Nerbonne & Heeringa (2007) and Heeringa et al. (2007) applied dialectometric designs to questions of diffusion in an attempt to add an aggregate quantitative perspective to the discussion, claiming two advantages for dialectometric approaches in addressing this question. First, some researchers may have relied on fortuitously chosen features which corroborate or contradict the lasting influence of geography and the chance of social contact, but which might be atypical. Dialectometry proceeds from the measurement of a large number of linguistic variables, and thus affords the opportunity to examine Trudgill’s ideas from a more general perspective. Second, dialectometry enables the researcher to quantify the strength of attractive forces at least somewhat, and thus to move beyond cataloguing examples which appear to obey or contradict the predictions of the theory. Nerbonne & Heeringa derived linguistic distances from 52 towns in the Lower Saxon area of the Netherlands using a technique explained above (§3); they then attempted to explain the linguistic distances on the basis of geographical distance and the chance of social contact as reified in population size. Using a multiple regression model, they proceed from Trudgill’s formulation of the gravity model:

$$ I_{ij} = s \cdot \frac{P_i P_j}{(d_{ij})^2}, $$
where $I_{ij}$ represents the mutual influence of centres $i$ and $j$, $P_i$ is the population of centre $i$ (and likewise $P_j$ for centre $j$), and $d_{ij}$ is the distance between $i$ and $j$. The constant $s$ is needed to allow for simple transformations, but it may be viewed as a ‘variable expressing linguistic similarity’. It will of necessity be ignored in what follows. See Nerbonne & Heeringa (2007) for discussion. The model predicts that the ‘attractive’ (accommodating) force should correlate inversely with (the square of) geographical distance, and directly with the product of the population sizes. Reasoning that linguistic distance should reflect this attractive force, but inversely, Nerbonne & Heeringa (2007) examined whether linguistic distance therefore directly correlates with (the square of) geographical distance and inversely with the populations’ product. They found that there appears to be no effect of population size on linguistic distance, but that linguistic distance indeed correlates directly with geographical distance. But Nerbonne & Heeringa also noted that the correlation between geographical and linguistic distance appeared not to be quadratic, as the gravity model predicts, but rather sublinear, i.e. in the same family of relations that Séguy noted in 1971. The best predictor of aggregate linguistic distance was not the square of geographical distance, but rather its logarithm.

Heeringa et al. (2007) criticized the choice of sites in the Nerbonne & Heeringa study, replacing these with a set of sites from the entire Dutch area (Nerbonne & Heeringa had worked exclusively with Lower Saxon) which included rather more settlements of large population size. This study replicated the fact that aggregate linguistic distance depends in a sublinear fashion on geographical distance, but it vindicated the gravity model in showing that population size indeed played the role predicted. In the later study, population product size accounted for six per cent of the variance in linguistic distance.
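To make the contrast between the quadratic prediction and the sublinear finding concrete, the following sketch computes the gravity-model influence and then compares how well the square and the logarithm of distance correlate with a linguistic distance series. The data are invented toy values that are sublinear by construction, so the outcome illustrates the method of comparison only and says nothing about the Dutch results. The names `gravity_influence` and `pearson_r` are hypothetical helpers.

```python
import math


def gravity_influence(pop_i, pop_j, dist, s=1.0):
    """Gravity model as formulated above: I_ij = s * P_i * P_j / d_ij**2."""
    return s * pop_i * pop_j / dist ** 2


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Doubling the distance divides the predicted influence by four.
near = gravity_influence(1000, 2000, 10)
far = gravity_influence(1000, 2000, 20)

# Toy data (an assumption, not measured values): a logarithmic curve
# with a small deterministic wiggle standing in for noise.
dists = list(range(10, 510, 10))
ling = [0.1 * math.log(d) + 0.01 * math.sin(d) for d in dists]

r_log = pearson_r([math.log(d) for d in dists], ling)
r_square = pearson_r([d ** 2 for d in dists], ling)
```

On data of this shape `r_log` exceeds `r_square`. The studies discussed here draw the same kind of comparison on measured distances, using multiple regression with population size as a further predictor.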
(a) Séguy’s law
Given Séguy’s early demonstration that French lexical variation depends sublinearly on geographical distance, and Nerbonne & Heeringa’s (2007) replication of this result for Dutch pronunciation, it is worth examining a range of other studies to see what they may contribute to the discussion.

Alewijnse et al. (2007) obtained pronunciation data from Bantu data collected in Gabon by researchers from the Dynamique du Langage (http://www.ddl.ishlyon.cnrs.fr/) in Lyon. Let us note that since the Gabon Bantu population consisted of migratory farmers until recently, it might not be the right sort of population for this study, as the relative mobility of the population might disturb the traces of ‘normal diffusion’. The data involve broad phonetic transcriptions of 160 concepts taken from 53 sampling sites. Tone was not analysed, as the Bantu experts were skeptical about how reliably it had been recorded and transcribed. The geographical locations recorded were those provided by native speaker respondents, but they should be regarded in some cases as ‘best guesses’ considering that the population has been fairly mobile (over long periods of time). The pronunciation
differences were analysed using the procedure sketched in §3, and these correlate strongly with logarithmic geographical distances (r = 0.469).

Prokić (2007) obtained data on Bulgarian dialectology from Prof. Vladimir Zhobov’s group at St Clement of Ohrid’s University of Sofia. Prokić worked on broad phonetic transcriptions of 156 words from 197 sampling sites in Bulgaria. Palatalized consonants, which are phonemically distinct in Bulgarian, were represented in the data, but stress was not. The pronunciation difference measurement of §3 was applied, where alignments were constrained to respect syllabicity so that vowels aligned only with vowels and consonants only with consonants. Since Bulgaria was occupied by Turkey for several centuries (until 1872), its linguistic variation may display less reliable patterns vis-à-vis geography. The correlation of pronunciation and logarithmic geographical distance was measured at r = 0.488.

Nerbonne & Siedle (2005) obtained data from the Deutscher Sprachatlas in Marburg (http://www.unimarburg.de/fb09/dsa/). The pronunciations of 186 words had been collected at 201 sampling sites for the project Kleiner Deutscher Lautatlas. A team of phoneticians transcribed the data narrowly; each word was transcribed twice independently and disagreements were settled in consultation so that there was consensus about the results. The pronunciation difference measurement of §3 was applied, where alignments were constrained to respect syllabicity so that vowels aligned only with vowels and consonants only with consonants. Logarithmic geographical distance correlates strongly with pronunciation in this dataset (r = 0.566).

Kretzschmar (1994) reports on the LAMSAS project (http://hyde.park.uga.edu/lamsas/), conceived and carried out mainly by Hans Kurath, Guy Lowman and Raven McDavid in the 1930s and again in the 1950s and 1960s. The data are publicly available at that address.
Due to differences in fieldworker/transcriber practices, we analyse only the 826 interviews which Guy Lowman conducted in the 1930s, involving 151 different response items. LAMSAS used its own transcription system, which we converted automatically to X-SAMPA for the purpose of analysis, which was conducted using the measurements described in §3. Nerbonne (in press) describes some aspects of the analysis in more detail, in particular the degree to which phonological structure is present. Since the area of the present USA has only been English speaking for the last several centuries, it may retain traces of migration disturbance in the geographical distribution of linguistic variation. We nonetheless measured a strong correlation between pronunciation and geographical distance after applying a logarithmic correction to the latter (r = 0.511).

Wieling et al. (2007) analyse the data of the projects Morphologische Atlas van Nederlandse Dialecten (MAND) and Fonologische Atlas van Nederlandse Dialecten (FAND; Goeman & Taeldeman 1996). In order to eschew potential confounds owing to transcription differences, Wieling et al. (2007) analyse only the data from the Netherlands, and not that of Flanders. The former included 562 linguistic items from 424 varieties. Since the Netherlands comprises only about 40 000 km², the MAND/FAND is one of the densest dialect samplings ever. The pronunciation differences were assayed using the technique described in §3, where alignments were constrained to respect syllabicity. Pronunciation distance correlates strongly with the logarithm of geographical distance (r = 0.622).

[Figure 2 panels: (a) Bantu; (b) Bulgaria; (c) Germany; (d) LAMSAS/Lowman; (e) The Netherlands; (f) Norway.]
Figure 2. Six examinations of the influence of geography on linguistic variation; a logarithmic curve is drawn in every case. The y-axes vary owing to details of measurements, but all are linear scales. See text for details.

Gooskens & Heeringa (2004) analyse the variation in 15 Norwegian versions of the fable of the International Phonetic Association, ‘The North Wind and the Sun’, making use of material from http://www.ling.hf.ntnu.no/nos/. The material was again analysed using the pronunciation difference measurements in §3. Norwegian distinguishes pronunciations using lexical tone, and Gooskens & Heeringa experimented with measurements which incorporated this, with little distinction in the overall (aggregate) results. Interestingly from a geographical point of view (Britain 2002), Gooskens (2004) compares two geographical explanations of the linguistic differences, one based on ‘as the crow flies’ distances, and
another based on the (logarithmic) travel time estimates of the late nineteenth century, showing an improvement in correlation (from r = 0.41 to r = 0.54). The motivation for examining the two operationalizations was naturally that Gooskens expected travel time to be the better reflection of the chance of social contact (see endnote 1).

We conclude from this section that there is a simple, measurable and normally sublinear influence which geography exerts on aggregate linguistic differences (figure 2). It is an empirical finding, not a theoretical prediction, that geography accounts for 16 per cent to about 37 per cent of the linguistic variation in these datasets (100 × r²). We note that the potential disturbances caused by migration, occupation and recent settlement appear insubstantial enough in the cases examined so as not to disturb the overall tendency first noted by Séguy, namely that variation increases as a sublinear function of geography. We
should also note that Spruit (2006) obtains a slightly better analysis using a linear rather than a sublinear geographical model to explain SYNTACTIC distance, but we shall not pursue the issues this suggests here.

5. INDIVIDUAL VERSUS AGGREGATE DIFFERENCES
Taking stock a little, we note that the substantial sociolinguistic literature on diffusion has on the one hand recognized a major role for social contact and therefore geography, but has concentrated on identifying additional factors, which we might call ‘extra-geographical’. Its data collection and analysis have exclusively concerned the patterns of diffusion found in individual linguistic items such as individual words or sounds. The dialectometric view on the other hand enables the measurement of the influence of geography on aggregate variation. Is there any way to bring these two perspectives to a more rewarding engagement?

We can think of two ways of exploring the relation between the individual variation discussed in the scholarly literature reviewed in §2 and the aggregate dialectal variation presented in §3, one empirical and the other simulation-based. The empirical path is conceptually simple, and involves examining the distributions of a large number of individual items to explore how their distributions are related to the aggregate distance curves of §4. Conceptually simple as that strategy is, it still requires identifying a large range of words, sounds, etc. about which there would be agreement that they constitute units of diffusion. This could be quite difficult. But we can also simulate the diffusion of individual linguistic items to obtain insight about the relation between the diffusion of individual items and the aggregate diffusion curves examined in §4a. We turn now to a description of a simulation.
(a) Simulating diffusion
We wish to examine the effect of the ‘gravity’-like, attractive force which influences diffusion, and we shall restrict our attention to the influence owing to geography, continuing to ignore the influence of population density. Like Holman et al. (2007) we created simulations in order to focus on the contribution of individual factors, in our case the relation between aggregate linguistic distance and the effect of attractive forces of varying strengths on individual linguistic features.

To investigate this process via simulation, we create several thousand sites, each of which is represented by a 100-dimensional binary vector. The sites are at regular distances from a single reference site so that the most distant site is several thousand times more distant from the reference site than the closest is. The 100 dimensions may be thought of as 100 linguistic items, e.g. 100 words or perhaps 100 pronunciation features, such as the pronunciation of the vowel in a word such as ‘night’ (i.e. as [nat] in the American south versus [nait] in standard American). The value ‘0’ indicates that the site is the same as the reference site with respect to a given dimension, and ‘1’ indicates that it is different. We intend to be deliberately vague about the units of diffusion. We proceed from the assumption that we are observing the differentiation of an initially homogeneous community, but we add some noise in the form of 100 random chances at change at every site. Unchanged linguistic items are assumed to be identical to those at the reference site. Since we finally compare each value in a given dimension only with other values in the same dimension, we make no special assumptions about what these values are.

In reality, each settlement in a sample may potentially be influenced by any other settlement, as Holman et al. (2007) note, but we wish to keep the simulation simple, so we shall examine the situation in which all the influence is exerted by a single reference site. The simulation will vary the strength with which that influence is exerted.

We wish to contrast two possibilities concerning the strength with which the reference site influences others. In the first, linear view, the distance of the simulation site predicts directly the chance with which the value of the reference site is adopted. In that case a site that is d distant from the reference site has twice the chance of being like it (with respect to a given linguistic dimension) as a site that is 2d distant. In the second, quadratic view, a site that is d distant from the reference site has four times the chance of being like the reference site when compared with another that is 2d distant. The latter is the view advanced by the gravity model.

To simulate the diffusion of linguistic change we iterate once through the set of sites. At each site, we repeat the process of random change n times, where n depends on the distance of the site from the reference site. In the linear model n depends directly on the distance to the reference site, and in the quadratic model of influence n depends on the square of the distance.

The random change itself is quite simple. We randomly select one dimension i in the 100-element vector, then generate a second random number, this time between 0 and 1. If the number is greater than 0.5, then we set the ith position to 1, indicating that the site differs from the reference site at dimension i. If the number is 0.5 or less, then the value of the ith dimension at the site is set to 0, indicating that it is linguistically the same as the reference site. (We note that it is distinctly possible that the same dimension is randomly chosen more than once when there is a large number of repetitions of the random change. In this case changes may cancel each other out.) In all cases the aggregate distance of the site from the reference site is simply the sum of the vector over all positions.

So the overall effect is that sites near the reference site have few chances of changing, because the influence of the reference site is too strong. Sites twice as distant have twice as many chances to change, and, in the case of the ‘gravity’-inspired simulation, four times as many. In both cases changes are introduced randomly, but while the chance of a change being attempted depends linearly on the geographical distance from the reference point in the linear model, it rises quadratically with the geographical distance in the quadratic model. Furthermore, since the stochastic events of change are competing for the same limited number of linguistic dimensions, the more distant sites are also more liable to change and also change back.
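The procedure just described can be sketched in a few lines. This is a reading of the prose and not the program used for the paper, and it rests on several assumptions: n is taken to equal the distance itself (linear model) or its square (quadratic model), distances are integers, the ‘100 random chances at change’ are treated as a baseline of 100 extra change attempts per site, and the distance ranges are borrowed from the axes of figure 3. The names `simulate_site`, `run` and `base_changes` are hypothetical.

```python
import random


def simulate_site(distance, model="linear", dims=100, base_changes=100, rng=None):
    """Aggregate difference (0..dims) between one site and the reference site."""
    rng = rng or random
    # Assumption: n equals the distance (linear) or its square (quadratic).
    n = distance if model == "linear" else distance ** 2
    site = [0] * dims  # 0 = same as the reference site in that dimension
    for _ in range(base_changes + n):
        i = rng.randrange(dims)  # pick one linguistic item at random
        # A draw above 0.5 marks the item as different, otherwise as the same,
        # so repeated draws on one item can cancel an earlier change.
        site[i] = 1 if rng.random() > 0.5 else 0
    return sum(site)


def run(distances, model, seed=0):
    """One pass through the sites; returns (distance, aggregate difference) pairs."""
    rng = random.Random(seed)
    return [(d, simulate_site(d, model, rng=rng)) for d in distances]


linear = run(range(1, 401), "linear")
quadratic = run(range(1, 41), "quadratic")
```

Because every attempted change sets an item to 1 with probability one half, the aggregate difference of very distant sites levels off at around 50 of the 100 items. Plotting the second element of each pair against the first gives scatterplots of the kind the simulation is meant to produce.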
Figure 3. Two simulations of linguistic diffusion, with an attractive influence that (a) diminishes linearly and (b) diminishes quadratically. In both, we appear to obtain the characteristic sublinear Séguy curve of aggregate linguistic distance. Both panels plot linguistic difference (y-axis) against geographic distance (x-axis).
Figure 4. Local regression lines have been added to the scatterplots of figure 3, revealing, in the case of graph (b), which represents (aggregate) inverse quadratic influences, a sigmoidal shape which is otherwise missing. Graph (a) shows the local regression line for inverse linear influence. This suggests that the inverse quadratic strength of influence attributed to geography in the ‘gravity’ model is bequeathed to the aggregate distribution as well, contrary to facts adduced in §4 (see figure 2). The local regressions were carried out using R with α = 0.4 (each of 8000 steps considered 40% of the data using an inverse tricubic weighting). Both panels plot linguistic difference (y-axis) against geographic distance (x-axis).
(b) Results of simulations
Figure 3 compares the results of two single runs of the simulation, one in which the chance of change rises linearly in the distance to the reference point, and a second in which the same chance of change rises quadratically with respect to distance. In both cases we have drawn the logarithmic regression line, for which we obtain r = 0.66 (linear attraction) and 0.71 (inverse quadratic attraction), and in both cases we appear to obtain the characteristically sublinear Séguy curve of aggregate linguistic distance. The curve in figure 3b appears somewhat sigmoidal, however, a suspicion we examine more closely by applying local regression to the same dataset. The result of the local regression is shown in figure 4, and, indeed, it appears that the quadratic influence results in a different curve in this respect. Local regression lines do not differ significantly from logarithmic lines in the other scatterplots (the left plot of cumulative linear influence in figure 3 or in any of the plots in figure 2).

Although the results clearly point to a linear effect of geography on the likelihood of an individual linguistic item differing from that of another site, we acknowledge that further simulations would be useful to be certain of the influence of some parameters, including the relatively great distance used at initialization, the effective ceiling of 50 per cent on average differences caused by restricting the model to binary choices, and the relatively constant variance in the simulations. Finally, it would be useful to view simulations in which locations interacted with each other and not merely with a single reference point. Holman et al. (2007) have analysed simulations with lattice structures with respect to other research questions.
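The local regression used to check for a sigmoidal shape can be sketched as follows. This is a simplified stand-in and not the R routine used for figure 4: it fits a weighted straight line around each evaluation point, using tricube weights over the nearest 40 per cent of the data, whereas R's `loess` fits local quadratics by default. The function names `tricube` and `local_linear_fit` are hypothetical.

```python
def tricube(u):
    """Tricube weight: (1 - |u|**3)**3 for |u| < 1, otherwise 0."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_linear_fit(xs, ys, x0, span=0.4):
    """Weighted least-squares line through the nearest `span` share of the
    points, evaluated at x0."""
    k = max(2, int(round(span * len(xs))))
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
    w = [tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    if sw == 0:  # degenerate tie: fall back to the overall mean
        return sum(ys) / len(ys)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Usage on toy data: exactly linear input is reproduced by the smoother.
xs = list(range(20))
ys = [2 * x + 1 for x in xs]
smoothed = [local_linear_fit(xs, ys, x) for x in xs]
```

Evaluating the fit at each observed distance and joining the results gives a smoothed curve that can be compared by eye with a logarithmic regression line, which is the comparison drawn in the text.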
6. CONCLUSIONS AND FUTURE WORK
Our foremost conclusion is that we can effectively test models of diffusion quantitatively, and, in particular, that these may be tested on the basis of large aggregates of linguistic material. This avoids the danger of picking material fortuitously, and it obviates the need to find material in the process of change. We likewise conclude on the basis of several empirical studies that the chance of social contact, operationalized through geography, can account for
about one-quarter of the aggregate linguistic variation we find in large collections such as dialect atlases. Our experiment in simulation suggests that the attractive force which tends to resist linguistic change decreases linearly with geographical displacement, while the gravity model suggests an attractive force that would decrease quadratically with geography. It would clearly be valuable to seek empirical data on individual linguistic variables with which the diffusion model might be tested. If this approach to analysing diffusion is sound, and assuming that the relevant variables can be operationalized and that suitable data can be found, then this approach should likewise open the door to studies on the influence of non-geographical factors. These might be compared with geography. As the brief remarks on travel time (§4a) might suggest, it is also possible to examine alternative conceptions of geography, taking a step in the direction urged by Britain (2002).

Peter Kleiweg was responsible for all the programs used in this paper, in particular, the dialectometric package L04 (http://www.let.rug.nl/kleiweg/L04/). Two anonymous referees were generous in their comments.
ENDNOTE
1 Van Gemert (2002) also examined the use of travel time in predicting Dutch dialect distances, but it turned out that travel time correlated nearly perfectly with geographical distance in the Netherlands, which lacks the fjords and mountains that impede direct lines of travel in Norway.
REFERENCES
Alewijnse, B., Nerbonne, J., van der Veen, L. & Manni, F. 2007 A computational analysis of Gabon varieties. In Proc. of the RANLP Workshop on Computational Phonology Workshop at Recent Advances in Natural Language Processing (eds P. Osenova et al.), pp. 3–12. Borovetz, Bulgaria: RANLP.
Bloomfield, L. 1933 Language. New York, NY: Holt, Rinehart and Winston.
Britain, D. 2002 Space and spatial diffusion. In The handbook of language variation and change (eds J. Chambers, P. Trudgill & N. Schilling-Estes), pp. 603–637. Oxford, UK: Blackwell.
Goebl, H. 1984 Dialektometrische Studien: Anhand italoromanischer, rätoromanischer und galloromanischer Sprachmaterialien aus AIS und ALF, vol. 3. Tübingen, Germany: Max Niemeyer.
Goebl, H. 2006 Recent advances in Salzburg dialectometry. Lit. Linguist. Computing 21, 411–435. (doi:10.1093/llc/fql042)
Goeman, A. & Taeldeman, J. 1996 Fonologie en morfologie van de nederlandse dialecten. Een nieuwe materiaalverzameling en twee nieuwe atlasprojecten. Taal en Tongval 48, 38–59.
Gooskens, C. 2004 Norwegian dialect distances geographically explained. In Language variation in Europe: papers from ICLaVE 2 (eds B.-L. Gunnarson, L. Bergström, G. Eklund, S. Fridella, L. H. Hansen, A. Karstadt, B. Nordberg, E. Sundgren & M. Thelander), pp. 195–206. Uppsala, Sweden: Uppsala University.
Gooskens, C. & Heeringa, W. 2004 Perceptual evaluation of Levenshtein dialect distance measurements using Norwegian dialect data. Lang. Variation Change 16, 189–207.
Heeringa, W. 2004 Measuring dialect pronunciation differences using Levenshtein distance. PhD thesis, Rijksuniversiteit Groningen, The Netherlands.
Heeringa, W., Nerbonne, J., van Bezooijen, R. & Spruit, M. R. 2007 Geografie en inwoneraantallen als verklarende factoren voor variatie in het nederlandse dialectgebied. Tijdschrift voor Nederlandse Taal- en Letterkunde 123, 70–82.
Holman, E. W., Schulze, C., Stauffer, D. & Wichmann, S. 2007 On the relation between structural diversity and geographical distance among languages: observations and computer simulations. Linguist. Typology 11, 393–421. (doi:10.1515/LINGTY.2007.027)
Jobling, M. A., Hurles, M. E. & Tyler-Smith, C. 2004 Human evolutionary genetics: origins, peoples and diseases. New York, NY: Garland.
Kretzschmar, W. A. (ed.) 1994 Handbook of the linguistic atlas of the Middle and South Atlantic States. Chicago, IL: The University of Chicago Press.
Labov, W. 2001 Principles of linguistic change: social factors, vol. 2. Malden, MA: Blackwell.
Malécot, G. 1955 The decrease of relationship with distance. Cold Spring Harbor Symp. Quant. Biol. 20, 52–53.
Nerbonne, J. 2009 Data-driven dialectology. Lang. Linguist. Compass 3, 175–198. (doi:10.1111/j.1749-818X.2008.00114.x)
Nerbonne, J. In press. Various variation aggregates in the LAMSAS south. In Language variety in the South III (eds C. Davis & M. Picone). Tuscaloosa, AL: University of Alabama.
Nerbonne, J. & Heeringa, W. 2007 Geographic distributions of linguistic variation reflect dynamics of differentiation. In Roots: linguistics in search of its evidential base (eds S. Featherston & W. Sternefeld), pp. 267–297. Berlin, Germany: Mouton De Gruyter.
Nerbonne, J. & Heeringa, W. 2009 Measuring dialect differences. In Theories and methods, language and space (eds J. E. Schmidt & P. Auer). Berlin, Germany: Mouton De Gruyter.
Nerbonne, J. & Kleiweg, P. 2007 Toward a dialectological yardstick. Quant. Linguist. 14, 148–167. (doi:10.1080/09296170701379260)
Nerbonne, J. & Siedle, C. 2005 Dialektklassifikation auf der Grundlage aggregierter Ausspracheunterschiede. Z. Dialektol. Linguist. 72, 129–147.
Prokić, J. 2007 Identifying linguistic structure in a quantitative analysis of dialect pronunciation. In Proc. of the ACL 2007 Student Research Workshop, pp. 61–66. Prague: Association for Computational Linguistics.
Schmidt, J. 1872 Die Verwandtschaftsverhältnisse der indogermanischen Sprachen. Weimar, Germany: Böhlau.
Séguy, J. 1971 La relation entre la distance spatiale et la distance lexicale. Rev. Linguist. Romane 35, 335–357.
Séguy, J. 1973 La dialectométrie dans l’atlas linguistique de Gascogne. Rev. Linguist. Romane 37, 1–24.
Spruit, M. R. 2006 Measuring syntactic variation in Dutch dialects. Lit. Linguist. Computing 21, 493–506. (doi:10.1093/llc/fql043)
Trudgill, P. 1974 Linguistic change and diffusion: description and explanation in sociolinguistic dialect geography. Lang. Soc. 2, 215–246. (doi:10.1017/S0047404500004358)
van Gemert, I. 2002 Het geografisch verklaren van dialectafstanden met een geografisch informatiesysteem (gis). Master’s thesis, Rijksuniversiteit Groningen, The Netherlands. www.let.rug.nl/alfa/scripties.html.
Wieling, M., Heeringa, W. & Nerbonne, J. 2007 An aggregate analysis of pronunciation in the Goeman-Taeldeman-van Reenen-project data. Taal en Tongval 59, 84–116.
Wright, S. 1943 Isolation by distance. Genetics 28, 114–138.
Phil. Trans. R. Soc. B (2010) 365, 3829–3843 doi:10.1098/rstb.2010.0099
Splits or waves? Trees or webs? How divergence measures and network analysis can unravel language histories
Paul Heggarty1,*, Warren Maguire2 and April McMahon2
1 Linguistics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany
2 Linguistics and English Language, University of Edinburgh, Dugald Stewart Building, 3 Charles Street, Edinburgh EH8 9AD, UK
Linguists have traditionally represented patterns of divergence within a language family in terms of either a ‘splits’ model, corresponding to a branching family tree structure, or the wave model, resulting in a (dialect) continuum. Recent phylogenetic analyses, however, have tended to assume the former as a viable idealization also for the latter. But the contrast matters, for it typically reflects different processes in the real world: speaker populations either separated by migrations, or expanding over continuous territory. Since history often leaves a complex of both patterns within the same language family, ideally we need a single model to capture both, and tease apart the respective contributions of each. The ‘network’ type of phylogenetic method offers this, so we review recent applications to language data. Most have used lexical data, encoded as binary or multi-state characters. We look instead at continuous distance measures of divergence in phonetics. Our output networks combine branch- and continuum-like signals in ways that correspond well to known histories (illustrated for Germanic, and particularly English). We thus challenge the traditional insistence on shared innovations, setting out a new, principled explanation for why complex language histories can emerge correctly from distance measures, despite shared retentions and parallel innovations.
Keywords: tree; network; phylogeny; historical linguistics; language history; language divergence
1. SAME LANGUAGES, DIFFERENT VISIONS
There is no clearer way to illustrate the topic of this paper than by contrasting three different representations of the sub-grouping relationships within the same language family: Indo-European. Figure 1 reproduces one of Ringe et al.’s (2002, p. 90) ‘best trees’ from their search for a ‘perfect phylogeny’ for the family. Figure 2 reproduces, also for Indo-European, the ‘consensus tree’ produced by Gray & Atkinson (2003, p. 437) out of a sample of 1000 possible configurations. Figure 3, meanwhile, is a NeighborNet analysis of Indo-European based on a distance matrix derived from the binary values inherent in the ‘isogloss map’ of the family by Anttila (1989, p. 305), also reproduced here as figure 4. For more on how this NeighborNet was produced, including Anttila’s specification of his data characters, see the electronic supplementary material, www.languagesandpeoples.com/Eng/SupplInfo/AnttilaNeighborNet.htm.

The contrast could hardly be clearer: ‘trees’ in figures 1 and 2; a ‘web’ or network in figure 3. The first two are structured entirely by binary splits; in the third there is almost no such branching, and the relationships between the subgroups take the form of a network of cross-cutting relationships instead.
Most importantly, the difference is by no means merely one of representation, but has implications for our understanding of the relationships between the early Indo-European languages and the real-world context in which they arose. For what moulded the particular constellation of relationships between the dialects and languages of any family was none other than the unfolding relations between the populations who spoke them, as they themselves diverged through (pre-)history. The opposing linguistic patterns in figures 1–3 therefore also imply contrasting visions of what must have happened in the real world, during the early divergence history of Indo-European, to account for how those patterns came about.

Yet there was only one real-world population history, of course. So how is it that for this same language family, different types of phylogenetic analysis can come to such radically different outputs? Is one right and the others wrong? And how might the different types of phylogenetic method (tree-only or network approaches) help us uncover what actually happened in linguistic prehistory?

2. LANGUAGE DIVERGENCE AND THE REAL WORLD: TWO MODELS, ONE REALITY
Linguists have traditionally represented patterns of divergence within a language family in terms of either of two discrete models by which they are assumed to arise:
† the splits model, corresponding to a branching family tree structure;
† the ‘wave’ model, typically yielding a (dialect) continuum.
* Author for correspondence ([email protected]). One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
Phil. Trans. R. Soc. B (2010)
This journal is © 2010 The Royal Society
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
P. Heggarty et al.
Splits or waves? Trees or webs?
Figure 1. ‘One of the best trees with Germanic omitted’. (Reproduced with permission from Ringe et al. 2002, p. 90.) [Tree diagram not reproduced. Taxa shown: Hittite, Tocharian A, Tocharian B, Lycian, Luvian, Albanian, Latin, Old Irish, Welsh, Oscan, Umbrian, Armenian, Greek, Old Church Slavonic, Old Prussian, Lithuanian, Latvian, Vedic, Avestan, Old Persian.]
There has been much debate in theory as to the respective merits and demerits of these models, and how well suited each might be for representing the types of relationship that can obtain between language varieties within a family. What has been all too briefly considered, however, is how and why in practice these two types of relationship come to arise in the first place. How and why should it be in the nature of languages to develop into relationships of these particular and starkly contrasting types? In truth the contrast in models only exists at all because it reflects two very different processes in the real world. Broadly speaking, these two mechanisms are as follows.
† A speaker population divides, prototypically by long-distance migration(s), into two (or more) groups, henceforth physically separated from one another. This leads to the classic language split pattern.
† A speaker population expands (whether suddenly or more progressively) over a continuous territory, across which a degree of contact is maintained, at least at the immediate local level, and all the more strongly the shorter the distance between any two points. This leads to language divergence in a pattern of overlapping, cross-cutting waves. Cases would include, but are certainly not limited to, expansions by the ‘demic diffusion’ model (Renfrew 1989, pp. 126–131).
The nature of any given language family as either more split-like or more wave-like can thus be interpreted as effectively a linguistic record of the past, pointing to one or other of these two different mechanisms as the probable history of its speaker populations, the real-world scenario that moulded that family’s particular pattern of divergence.
Let it be clear from the outset, then, what the nature of the relationship is between the divergence pattern of a language family and the real-world contexts
in which its speakers lived. For this relationship is unambiguously one of cause-and-effect; and in a direction that is equally ineluctable. Whether, how, and which particular languages diverge is not just some natural law of ‘what languages do’ or ‘how languages evolve’. Forces in the real-world context (demographic, socio-political, cultural, and so on) are the cause; they alone determine entirely the linguistic divergence effects.
One must hasten to clarify that what those forces do not determine is the form and nature of whatever particular language changes arise (other than in cases of contact). Changes can be highly idiosyncratic, and generally are either random, or in line with other changes in the language system as a whole (Heggarty 2006, p. 188). Changes arise by natural linguistic processes, then; what external forces determine is only whether those changes (whatever linguistic form they take) either develop independently and differently, or come to be shared, from one region to the next. That is, real-world forces dictate not which particular changes occur, but the patterns of language divergence that they ultimately give rise to.
It is in real-world forces that lies all the difference between, on the one hand, the ‘singletons’ within Indo-European, such as Albanian and Armenian; and on the other hand, the great families like Romance, Germanic or Slavic, born out of vast expansions propelled by the might of Rome and the turmoil that attended its fall.
What many readers may feel is missing from the above discussion of the two divergence mechanisms is the role of borrowing or contact. Certainly, when faced with data on language relationships that cross-cut in ways incompatible with branching trees, advocates of a tree-only analysis typically roll out, as a stock explanation to rescue it, a ‘splits-then-borrowing’ model, seen as a more tree-friendly alternative to the dialect continuum.
It is with very good reason, however, that we leave borrowing out of this section. For there must be no blurring of the distinction between splits-then-borrowing and ‘dialect continuum’ scenarios: the two represent very different visions of real-world language histories. For a start, borrowing and contact are modes of language convergence (of languages either unrelated, or related but already diverged from each other). As such, neither has any real place here, in this discussion of the two basic modes of language divergence out of a common ancestor.
The contrasts in figures 1–3 pose a conundrum for the prehistory of Indo-European in particular, then. How could those three studies come to such radically different results? Which of the two basic mechanisms of language divergence comes closest to what actually happened to the peoples who spoke the earliest Indo-European languages?
Answering those questions calls for a full exploration of the difference between a splits-then-borrowing and a dialect continuum scenario, and how significant it is for how we understand and model language prehistory. These issues have to be left for a more wide-ranging survey than there is space for here, however, in further papers by Heggarty (in preparation a,b).
The treatment here will be limited to only one of the main issues in the trees versus webs debate, and seek instead to establish a more general point of methodology. We start out from a first way to defuse the potential stand-off between the tree and wave models, which is to progress beyond a simplistic vision that sees the two as mutually exclusive either/or alternatives for any one language family. For in reality, neither a branching tree nor a continuum model alone is sufficient to account for the complex relationships observed across many a language family. Nor, indeed, have we any reason to expect either to be. The vagaries of history typically ensure that in the real world, both types of process may act upon the populations speaking any given language family, combining in any manner of ways across time and space. The real-world history of any language family need by no means be a story of uniquely one or the other mechanism, but is very often a complex composite of the two.
This complexity has implications for the tools and data we might look to in order to model and represent relationships between the languages within a family. In principle, to do justice to real language divergence histories, we need a model able to capture both split and wave mechanisms within a single analysis and representation. Indeed ideally we would wish for a model that allows us to tease apart the respective contributions of each to the overall story.
3. NETWORK METHODS: BEST OF BOTH WORLDS?
The ability to do just this is precisely the claim made for one particular type of phylogenetic analysis: those of the network type, in contrast to others of the tree-only type. Though initially developed for applications in the biological sciences, particularly genetics, two network-type methods in particular have also been widely applied to language divergence data: Network and NeighborNet, both of which we shall survey here.
There is in fact another well-known network-type analysis method, namely Split Decomposition by Bandelt & Dress (1992), now integrated into the SPLITSTREE 4 package (Huson & Bryant 2006). It is not considered in detail here, however, firstly because it has been rather less used in linguistic studies of late. A second and more critical objection is that for precisely the task in hand here, that of teasing apart the respective strengths of tree-like and web-like signals, Split Decomposition has an inherent bias: towards the former. As McMahon & McMahon (2005, p. 158) put it, with Split Decomposition, ‘graphs based on bigger and more complex datasets tend to become more tree-like by default’.
It is not the task of this short article to set out in detail the workings of these methods. Useful general sources include the valuable and readable survey of and introduction to network methods, as applied specifically to language studies, in Bryant et al. (2005), and discussion and illustrations for more language families in McMahon & McMahon (2005). Here, we just briefly overview how the two network-type methods we cover here have been received in historical linguistics, and focus on how they relate to the issue of particular interest in this paper.
(a) Network
The Network algorithm (Bandelt et al. 1995, 1999) was developed by a group led by the mathematician Hans-Jürgen Bandelt and geneticist Peter Forster. The programme can be downloaded from www.fluxus-engineering.com, and for a brief explanation of how the algorithm produces its network outputs, see Forster et al. (1998, pp. 182–184). For the purposes of this paper, one of the key defining characteristics of Network, in contrast to NeighborNet, is that it takes as its input format individual state data, not overall measures of distance or similarity between languages (see §3c).
Forster in particular has applied Network to language data, in papers with various colleagues. Forster & Toth’s (2003) study of Celtic languages, however, was rather taken to task by historical linguists. Criticisms of their approach to the language data and certain assumptions in the dating methodology employed are widely felt to invalidate the paper’s conclusions. Rather less problematic are Forster et al. (1998) on Alpine Romance varieties, and Forster et al. (2006) on Germanic. These too, though, are based on rather limited datasets. The authors start out from the Swadesh list of 100 basic word-meanings, but for many of these the data are invariant across all language varieties in the study, or missing, or cause ‘chaotic reticulation’. The authors remove all of these data, which in the case of Germanic leaves an effective dataset of just 28 data-points. Moreover, each of these is open to the established criticisms of traditional lexicostatistics as to the bluntness of its ‘all-or-nothing’ binary approach to how related languages overlap in their lexical semantics. Questions also remain as to the authors’ inferences for what their outputs may mean for the history of the Germanic-speaking populations (Forster et al. 2006, pp. 135–136).
Still, however one might question the handling of the data and ancillary assumptions in any one case, such objections are beside the main methodological point for the purposes of this paper. For criticism on these scores does not impugn the algorithm per se as one that in principle does harbour considerable potential as a means of representing language divergence relationships. One does not have to agree with all aspects of Forster and his colleagues’ approach in order to grant that their Network algorithm does indeed offer one means of differentiating, ‘weighting’ and combining in a single representation both the tree-like and web-like components of the overall complex of relationships within a language family.
As an illustration, figure 5 shows its output for the Germanic languages, reproduced here from Forster et al. (2006, p. 134). Albeit on an imperfect and limited dataset, the pattern does duly reflect that cross-cutting relationships exist within Germanic, whether imputable to shared or parallel innovation, to cross-cutting waves across a dialect continuum, or to contacts between speakers after an earlier split. Such an output format is arguably more realistic and balanced than a tree-only representation. It also has the attraction (and advantage over NeighborNet) of identifying all individual changes in the network diagram itself, i.e. in figure 5 the word-meanings in which a cognate change is ‘reconstructed’.
For other authors’ applications of Network to language data, see McMahon & McMahon (2005, pp. 140–154).
Figure 2. Consensus tree for Indo-European. (Reproduced with permission from Gray & Atkinson 2003, p. 437.) The numbers up to 100 along each branch (in small font) are posterior probability values (see text). The numbers up to 8700 at each node (in large font) are time-depth estimates expressed in years BP. [Tree diagram not reproduced. Labelled clades: Celtic; Italic (including a French/Iberian group); Germanic (West and North); Baltic-Slavic (Baltic and Slavic); Indo-Iranian (Indic and Iranian); Albanian; Greek; Armenian; Tocharian; Anatolian.]
Figure 3. NeighborNet of a distance matrix from Anttila’s (1989, p. 305) Indo-European isogloss map (figure 4). [Network diagram not reproduced. Labelled groups: Germanic West, Germanic North, Germanic East, Slavic A, Slavic B, Baltic, Albanian A, Albanian B, Italic A, Italic B, Armenian, Indic, Celtic A, Celtic B1, Celtic B2, Iranian A, Iranian B, Tocharian, Greek A, Greek B, Hittite.]
Figure 4. A dialect map of the Indo-European languages. (Reproduced with permission from Anttila 1989, p. 305.) [Map not reproduced. Lines numbered 1 to 25 enclose the groups Germanic (N, E, W), Slavic, Baltic, Albanian, Tocharian, Armenian, Italic, Celtic, Hittite, Indic, Greek and Iranian.]
(b) NeighborNet
NeighborNet, of which figures 3, 6 and 7 are illustrative outputs, was developed by Bryant & Moulton (2004), and is now integrated into the SPLITSTREE 4 package (Huson & Bryant 2006). It too was first intended particularly for applications in the biological sciences, but has been enthusiastically advocated for applications to language data by a number of researchers, including April McMahon and Russell Gray, each together with various colleagues. Unlike Network, NeighborNet takes as its input format not state data, but overall measures of distance between languages (see §3c).
Bryant et al. (2005) provide an instructive talk-through of the process by which the method goes about turning its input data into its output representation, as applied to illustrative language data. This includes an application to Indo-European (not, of course, the tree reproduced here in figure 2), while Holden & Gray (2006) apply it to Bantu. In both studies, the authors use traditional lexicostatistical data. Other researchers have also used NeighborNet on traditional lexicostatistical data, applied to Australian and Indo-European languages in McMahon & McMahon (2005, pp. 164–165), for example.
We have also applied it to other, finer-grained measures of language divergence. Heggarty’s study of the Andean language families Quechua and Aymara is based on distance data in lexical semantics, calculated by a new quantification method designed to work at a more refined level than traditional lexicostatistics, set out in Heggarty (2005), and more briefly in McMahon et al. (2005) and McMahon & McMahon (2005, pp. 166–173). We have explored NeighborNet also with divergence ratings in phonetics, for the Romance languages (Heggarty et al. 2005), and for accents and dialects of English and other Germanic languages (see figures 6 and 7 later; McMahon et al. 2007; Maguire et al. 2010).
Figure 5. Unrooted network of 19 Germanic language samples. (Reproduced with permission from Forster et al. 2006, p. 134.) [Network diagram not reproduced. Edges are labelled with the word-meanings in which a cognate change is reconstructed (e.g. ‘big’, ‘mount’, ‘flesh’, ‘say’). Samples shown: Saterfrisian, Low German, High German, Swiss, Bavarian, Dutch, West Frisian, Gothic, Beowulf, Alfred, Heliand, Old Norse, English, Swedish, Danish, Bokmal, Nynorsk, Faroese, Icelandic.]
(c) Measuring network versus tree signals
Both of these network-type methods, then, are able to visualize together the relative strengths of the tree-like and web-like signals within a single dataset. This is indeed how Forster et al. (1998, p. 174) explicitly present their Network method, while Bryant et al. (2005, p. 67) make much the same claim of NeighborNet. As applied to language data, both can effectively represent the split and/or wave elements within the overall divergence pattern of a family. For the purposes of uncovering prehistory, the value of this balance is as a record of the particular mix of the two main real-world processes that underlie the tree-like and web-like sectors of that overall pattern: physical separations, including migrations (yielding branching patterns); and/or expansions over continuous territory (leaving wave patterns).
Moreover, as well as providing a way of representing this balance graphically, network-type methods can also be seen as a means of effectively weighing up, indeed measuring the overall ‘tree-ness’ or ‘net-ness’ of a particular dataset. Together the various network-type methods provide a range of approaches to assessing this numerically, such as the ‘splittable percentage’ ratings produced by the Split Decomposition network method (Bandelt & Dress 1992). For NeighborNet, meanwhile, see in particular the ‘delta scores’ discussed in Gray et al. (2010).
In fact, even methods whose graphical outputs are only in tree format can nonetheless produce similar quantifications of tree-ness, many of them based on samples of large numbers of possible trees. As examples, in their study on Indo-European Gray & Atkinson (2003, p. 436) report that ‘a preliminary parsimony analysis produced a consistency index of 0.48 and a retention index of 0.76’: effectively, measures of how well the data fit on the tree. (There are, however, known problems with consistency indices particularly, with actual scores overly dependent on sample size.) One can also focus on individual branches within the tree. In figure 2, Gray & Atkinson’s consensus tree includes a ‘posterior probability’ value specified above each branch, which can stand as an indication of how strongly supported it is across the set of possible trees, themselves valuable data on Indo-European prehistory (Heggarty in preparation a, §§4.1, 4.2.3, 6).
Other recent work takes existing tree-only approaches as a basis upon which to build what are effectively new forms of network analysis. Dealing with large sets of possible trees allows such collections to be synthesized into a consensus network, for which an algorithm has been devised by Holland & Moulton (2003). Atkinson & Gray (2006, p. 97) illustrate this for Indo-European, though they see it in this case as ‘just [a] useful pictorial summary of the [. . .] fundamental output’, namely the distribution of possible trees that is their core interest for their proposed dating methodology. Nakhleh et al. (2005, pp. 399–400),
Table 1. Comparative lexical data, by cognate set, for four basic meanings in five Romance languages.
                   EAT           SLEEP         GO                    HOUSE
Portuguese         comer A       dormir A      ir A                  casa A
Spanish            comer A       dormir A      ir A                  casa A
French             manger B      dormir A      aller B               maison B
Italian            mangiare B    dormire A     andare B              casa A
Romanian           mânca B       dormi A       merge C               casă A
Cognate sets derived from different Latin variants
A                  comedere      dormīre       īre                   casam
  original sense   eat up        sleep         go                    hut, cabin
B                  mandūcāre     -             ambulāre              mansiōnem
  original sense   chew          -             walk about            place to stay
C                  -             -             mergere               -
  original sense   -             -             immerse, go under     -
Table 2. A table of multi-state data, derived from the data in table 1.
              multi-state datum 1   multi-state datum 2   multi-state datum 3   multi-state datum 4
              (EAT)                 (SLEEP)               (GO)                  (HOUSE)
Portuguese    A                     A                     A                     A
Spanish       A                     A                     A                     A
French        B                     A                     B                     B
Italian       B                     A                     B                     A
Romanian      B                     A                     C                     A
meanwhile, have developed an extension of their original perfect phylogeny approach that yielded figure 1, seeking to produce now perfect phylogenetic networks, which they explore for Indo-European, though they still present their networks essentially as trees, albeit with contact edges.
In practice, network-type methods have often been used just as initial diagnostic tests, to judge whether a particular dataset yields a pattern that is tree-like ‘enough’ in order to justify proceeding to tree-only analyses useful for particular research ends. It was on the strength of this sort of test, for instance, that Gray & Atkinson concluded that the Dyen et al. (1992) lexicostatistical dataset for Indo-European yields a fairly strongly tree-like signal, and thus felt confident in going on to use a tree-only analysis for their approach to dating the family.
And yet, as the contrasts between figures 1–3 here suggest, different datasets and different methodological approaches can nonetheless lead to very different assessments of how tree-like or web-like was the divergence of the same family. For Ringe et al. and Gray et al., their data and analyses give a pattern that is basically tree-like for Indo-European, while the NeighborNet from Anttila’s data suggests quite the contrary. Much of the explanation for this apparent contradiction lies in whether one considers the entire family, throughout its history, or focuses on just its earliest divergence stages. Many later developments in Indo-European were obviously ‘tree-like’: all modern Romance languages clearly derive from a single Proto-Romance ancestor, all Germanic languages from Proto-Germanic, and so on. The impact is to make the overall signal for the family more tree-like, by ‘diluting’ what appears to be the more web-like structure of its earliest divergence, the same characteristic that has for so long frustrated attempts to establish an agreed higher-order branching for the family. This issue is taken up more fully in Heggarty (in preparation a, §4.1).
(d) Data formats: multi-state, binary or distance measures?
We turn now to the key difference between Network and NeighborNet for our purposes in this paper, namely in the format of the input data each takes. Network takes multi-state data; NeighborNet takes distance measures. To illustrate how these represent two very different approaches to the same language data, tables 1, 2 and 3 provide an example for four word-meanings (EAT, SLEEP, GO and HOUSE) in five Romance languages (Portuguese, Spanish, French, Italian and Romanian).
The ‘raw’ language data can be set out as in table 1, giving each language’s principal lexeme in that meaning, and identifying which of the ‘cognate sets’ for that meaning across Romance that lexeme falls into, symbolized here by the letters A, B, C, and so on. By a cognate set is meant a collection of words (technically lexemes), one from each of several different languages within a family, which all derive directly (i.e. without borrowing) from the same original lexeme in that family’s common ancestor language, even if sound changes since may have left them rather different in precise phonetic form.
For the EAT meaning, for example, word-forms in the Romance languages fall into two main cognate sets:
† the set of word-forms derived from Latin comedere (originally eat up), including Spanish comer and Portuguese comer, cognate with each other (and spelt identically, though somewhat different in pronunciation);
† the different set derived instead from Latin mandūcāre (originally chew), including French manger, Catalan menjar, Italian mangiare and Romanian mânca, also all cognate with each other.
Table 3. A triangular matrix of pairwise distances between languages, derived from the data in table 1.
              Portuguese   Spanish   French   Italian   Romanian
Portuguese    -
Spanish       0/4          -
French        3/4          3/4       -
Italian       2/4          2/4       1/4      -
Romanian      2/4          2/4       2/4      1/4       -
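The values in table 3 can be recomputed mechanically from the cognate-set letters in table 1. The following is a minimal Python sketch, not the authors’ own software: the names `CODES`, `lexical_distance` and `distance_matrix` are illustrative assumptions, and the four-letter strings simply restate each language’s cognate sets for EAT, SLEEP, GO and HOUSE.

```python
from fractions import Fraction
from itertools import combinations

# Cognate-set letters per language, in the order EAT, SLEEP, GO, HOUSE (table 1).
# This string encoding is an assumption made for illustration only.
CODES = {
    "Portuguese": "AAAA",
    "Spanish": "AAAA",
    "French": "BABB",
    "Italian": "BABA",
    "Romanian": "BACA",
}


def lexical_distance(a: str, b: str) -> Fraction:
    """Share of meanings for which two languages use non-cognate lexemes."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return Fraction(mismatches, len(a))


def distance_matrix(codes: dict) -> dict:
    """Distance for every unordered pair of languages (one triangle of the matrix)."""
    return {
        (p, q): lexical_distance(codes[p], codes[q])
        for p, q in combinations(codes, 2)
    }


if __name__ == "__main__":
    for (p, q), d in distance_matrix(CODES).items():
        print(f"{p:<11}{q:<11}{float(d):.2f}")
```

`Fraction` reduces 2/4 to 1/2, so the printed values are shown as decimals; subtracting each from 1 would give the corresponding similarity rating. The matrix keeps only the counts, with no record of which meanings differ for a given pair.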
Table 4. A table of binary data, derived from the data in table 1.
binary datum    1          2           3         4       5          6         7         8
cognate set     EAT_A      EAT_B       SLEEP_A   GO_A    GO_B       GO_C      HOUSE_A   HOUSE_B
Latin source    comedere   mandūcāre   dormīre   īre     ambulāre   mergere   casam     mansiōnem
Portuguese      1          0           1         1       0          0         1         0
Spanish         1          0           1         1       0          0         1         0
French          0          1           1         0       1          0         0         1
Italian         0          1           1         0       1          0         1         0
Romanian        0          1           1         0       0          1         1         0
For the SLEEP meaning, meanwhile, the major Romance languages offer just one major cognate set, since all use cognates derived from Latin dormīre. For the meaning GO there are three sets: cognates of Latin īre, ambulāre and mergere; e.g. Spanish ir, French aller and Romanian merge, respectively. Table 1 illustrates such patterns of shared cognates for our four sample word-meanings in five sample Romance languages.
The patterns in table 1 can be converted into a table of multi-state data as in table 2. This is the input format used by Network, in which the particular cognate state for each meaning in each language remains distinguished. Any single change in cognate state in any language can be reconstructed along all edges of the network, as per the meanings attached to them in figure 5.
An alternative approach to the same raw language data is to convert them instead into a triangular grid or matrix of ‘pairwise’ distances, as in table 3. With lexicostatistical data such as these, distance measures are typically calculated by simply counting the meanings for which a given pair of languages use lexemes that are not cognate (e.g. Spanish comer but Italian mangiare), as a proportion of all the (here, four) meanings in the list, including those for which their lexemes are cognate (e.g. Spanish casa and Italian casa). These ratings could equally well be expressed as ‘similarity’ rather than distance ratings, by simply subtracting each of the results shown from 1.
Once converted in this way, it is no longer possible to recover from these conflated overall distance measures the individual data that underlie them, i.e. to tell apart in which particular meanings any two given languages do or do not share cognates. So unlike Network, or even isogloss maps as in figure 4, there is no way for NeighborNet itself to identify which particular aspects of its output diagrams correspond to which particular linguistic differences. (Although in practice this can often be inferred by other means: see for instance the discussion of the rhotic versus non-rhotic division among varieties of English in McMahon et al. 2007, pp. 136–137.) This has often been raised as an objection to the utility of distance measures in linguistics, and of methods such as NeighborNet that rely on them: the key issue we shall consider in the remainder of this paper.
Certainly, the difference between state and distance data is a crucial one that also distinguishes our three Indo-European representations here: figures 1 and 2 are both based on state data; figure 3 on distance data.
Finally, it is important to note that even methods which use state rather than distance data fall into two quite different sub-types, since state data may be either
† multi-state: i.e. more than two states for a given datum, such as A, B and C for multi-state datum 3 in table 2 (the meaning GO); or
† binary: i.e. only two states per datum, presence versus absence, or 1 versus 0.
(The Network package, in fact, initially had only an algorithm for binary data, i.e. Bandelt et al. (1995). Another was later included which takes multi-state data, i.e. Bandelt et al. (1999).)
Again, the same raw language data can be analysed in either of these two ways. With traditional lexicostatistical data as in table 1, the two possible approaches differ in terms of the question(s) asked of the word-form registered in each language, for every meaning in the Swadesh lists.
† A single information question: into which cognate set (A, B, C, D, etc.) does this language’s word-form fall? This yields one multi-state datum per meaning in the list, with all values taken as equally different from one another.
† A series of yes/no questions, one for each cognate set: is cognate A present or absent in this language? This yields one binary datum (yes or no) for each of the various cognate sets found for this meaning across all languages.
Table 2 has already illustrated multi-state analysis by word-meanings in the list; table 4 now shows the alternative binary analysis by cognate sets. In this latter case, each of the various meanings in the list converts into a range of cognate sets, more for some meanings and fewer for others: in our Romance sample, just one cognate set for the meaning SLEEP, two for EAT, three for GO and two for HOUSE. The result across all four of these meanings combined is a total of eight cognate sets (i.e. 1 + 2 + 3 + 2).
For the full Swadesh 200-meaning list for the 87 Indo-European languages in the Dyen et al. (1992) dataset, the first approach gives 200 multi-state data, one for each meaning in the list. The second approach, however, as used by Gray & Atkinson (2003) for the tree-only phylogenetic method that they prefer for their proposed dating technique, yields a total of 2449 binary data points, since across all 200 meanings combined there are a total of 2449 cognate sets. That is, on average there are over 12 cognate sets per meaning, though this masks significant differences between meanings with very few cognate sets (e.g. the numerals), and others with relatively high numbers of cognate sets. Objections have been raised to this approach and its implications: for how independent some of the cognate set data-points are from others; and for how some meanings in the Swadesh list are effectively weighted more or less than others. The authors maintain, however (in Holden & Gray 2006, p. 25, for example), that these make ‘very little difference to the results’, a finding whose logic Pagel & Meade (2006, appendix A) investigate and concur with.
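The recoding from one multi-state datum per meaning (table 2) to one binary datum per cognate set (table 4) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the procedure of any particular package: `MEANINGS`, `CODES`, `binary_characters` and `to_binary` are hypothetical names, and the letter strings restate the cognate sets of table 1.

```python
MEANINGS = ["EAT", "SLEEP", "GO", "HOUSE"]

# Cognate-set letters per language, in the order of MEANINGS (table 2).
# This string encoding is an assumption made for illustration only.
CODES = {
    "Portuguese": "AAAA",
    "Spanish": "AAAA",
    "French": "BABB",
    "Italian": "BABA",
    "Romanian": "BACA",
}


def binary_characters(codes: dict, meanings: list) -> list:
    """List every (meaning, cognate set) pair attested in the data, e.g. ('GO', 'C')."""
    characters = []
    for i, meaning in enumerate(meanings):
        attested = sorted({code[i] for code in codes.values()})
        characters.extend((meaning, state) for state in attested)
    return characters


def to_binary(codes: dict, meanings: list) -> dict:
    """Recode each language as a 0/1 vector, one entry per binary character."""
    characters = binary_characters(codes, meanings)
    return {
        lang: [int(code[meanings.index(m)] == s) for m, s in characters]
        for lang, code in codes.items()
    }


if __name__ == "__main__":
    print(binary_characters(CODES, MEANINGS))
    for lang, row in to_binary(CODES, MEANINGS).items():
        print(f"{lang:<11}", *row)
```

On the four sample meanings this yields the eight binary characters of table 4. A meaning with many cognate sets contributes many columns, which is the weighting effect that the objections mentioned above are concerned with.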
4. MATCHING DIVERGENCE PATTERNS WITH THE REAL-WORLD CONTEXT
(a) State versus distance data: theory and practice
Returning to the contrast between state data and distance data, before one ventures into the relative unknown of early Indo-European divergence, we would do well to see first how distance data and NeighborNet perform in a case where the external history is largely known. Does NeighborNet live up in practice to the claim that within the overall divergence pattern of a single language family, it can tease apart tree-like and web-like components, and thus, according to the logic set out in §2 above, point separately to split-like ‘migrations’ and/or to wave-like dialect continua arising out of more continuous expansions? We illustrate this with figures 6 and 7, our own NeighborNet outputs from a triangular distance matrix of measures of divergence in phonetics between language varieties within the Germanic family. Figure 6 covers 52 traditional regional languages and dialects within Germanic. Figure 7, meanwhile, compares a collection of 11 historical varieties of English against present-day ‘traditional’ local speech in 29 different regions. The recording data were collected by Paul Heggarty (for Germanic) and Warren Maguire
(for English), and transcribed by Maguire (or in some cases, an expert on that particular variety). The transcriptions underlying figure 6 can be accessed, and the modern recording data heard, at www.languagesandpeoples.com/Germanic. Likewise, the database of English varieties in figure 7 can be viewed and heard at www.soundcomparisons.com (websites by Heggarty). The method by which the divergence ratings in phonetics were calculated from these transcriptions was devised and programmed by Heggarty: see www.languagesandpeoples.com/Methods.htm and Heggarty et al. (2005).
Now of course, whether overall distance measures (particularly in phonetics) can stand as a valid or useful indicator of language history is in principle very much a moot point, which we come to shortly. Nor do geographical proximity or separation necessarily match with language descent. And certainly, a first impression of the NeighborNet in figure 6 is that it does not closely resemble the family tree structure by which the Germanic languages are traditionally classified. Aside from the now extinct ‘Eastern’ branch (Gothic) not included in figure 6, that traditional tree sees a primary North versus West split, with English classified within West Germanic, indeed inside a further ‘Anglo-Frisian’ sub-branch. On the other hand, the rather narrow and exclusive criteria applied in order to come to a binary branching tree are not the only way in which one can conceive of the real-world history of a language family, and do not necessarily make for an answer to precisely the question we are asking here. For our somewhat different perspective of language speakers’ history, set out in §2, a look at our Germanic NeighborNet reveals that the split-like versus web-like patterns within it do in fact reflect real-world contexts and histories of speaker populations rather well.
Varieties which since an early stage were geographically isolated from each other into essentially separate speaker communities duly emerge at opposite ends of the few clear splits in the overall pattern: all English (and Scots) varieties in the British Isles stand as a group off to the left, separated from all other (‘continental’) Germanic varieties by the one very sharp split running through the overall picture. To the right, meanwhile, is a second relatively clear split, between the Scandinavian varieties at the top, and those of continental West Germanic. Within these major groups, however—that is, across the unitary territories settled by largely contemporaneous or progressive expansions—the pattern is very different. The English and Scots varieties, even very marked dialectal variants such as Holy Island (Northumbria), Buckie (a ‘Doric’ form of North-Eastern Scots) and Ulster Scots, all relate to each other in essentially web-like patterns with no particularly sharp splits. Across continental West Germanic, where traditional dialects still survive rather more strongly, the web-like picture is an even more striking reflection of a progressive dialect continuum across the entire region, incrementally proceeding in fairly close step with geography, from Flanders to the Alps. Not that the match with geography is perfect, of course; among the various reasons is one that goes back to certain limitations inherent in NeighborNet for
Figure 6. NeighborNet from a matrix of distances in phonetics between traditional dialects in Germanic. (Region labels in the figure: Scandinavia; Netherlands and Belgium; Germany North and Central; Germany S., Switz., Austria; Britain and N. Ireland.)
representing language and particularly dialect relationships, to which we return in §4b. (Within Scandinavia, and over the dialect zone between southern Denmark and North Germany that links the two ‘blocs’, coverage in our dataset is as yet rather too sparse to draw any sound judgements. Further recordings, transcriptions and calculations for this region are underway to fill in these gaps.)
The distance-based approach used here also makes it possible to include comparisons against reconstructed historical forms, albeit with all the necessary provisos as to quite how certain and precise we can be in the phonetic transcriptions assumed for them. In this respect too, distance measures can in practice yield outputs eminently coherent with known history, as demonstrated in figure 7 which compares varieties of English from a number of different historical stages. Closest to the (nonetheless distant) Proto-Germanic, and first to ‘branch off’ from it, are the two Old English forms; then the four Middle English; then Early and Middle Scots. (Note that these two divisions of Scots refer to periods that actually correspond most closely to the Middle English and Early Modern English periods, respectively.)
By this stage in the history of English, clear patterns are beginning to emerge in geography too, in that while the historical Scots varieties do remain closest to the broadly contemporary English ones, vis-à-vis present-day varieties they side clearly with those of Scotland. Similarly, the positions of the Early Modern and Late Modern English of England, with respect to modern regional varieties both within Britain and overseas, reflect the respective time-depths at which English began expanding and diversifying into worldwide patterns too: first to the New World and, more recently, to the Southern Hemisphere. These aspects of the NeighborNet can but underline a further principle critical to the correct interpretation of any distance measurements: that degrees of divergence between language varieties are a function not just of separation time but also of the degree of cohesiveness of a speaker community, for which geographical space is often a fairly close proxy, especially within a dialect continuum. This principle, and the serious consequences it entails for attempts to date language divergence from distance measurements, are explored more fully in Heggarty (in preparation b, §6, §7).
Figure 7. NeighborNet from a matrix of measures of divergence in phonetics between traditional dialects and reconstructed historical varieties of English. (Group labels in the figure: England S.; England N.; S. Hemisphere; North America; N. Ireland; Modern Scotland; Early and Late Modern English; Historical Scots = Middle and E.M. English Periods; Middle English; Old English; to Proto-Germanic (distant).)
(b) Limitations of NeighborNet
The two figures (6 and 7) also give an inkling of one potentially quite serious limitation that NeighborNet faces as a means of representing language divergence patterns, especially in dialect continua. This is that a language taxon cannot appear in the midst of a network of reticulations, but only around the perimeter. For in dialect continua, of course, it is not only possible but positively expected that one dialect may be intermediate between many others all around it. NeighborNet cannot place a language taxon in such an intermediate position graphically. As our coverage of English and other Germanic language varieties has expanded, it has become clear that certain intermediate language varieties can be quite unstable in where they appear in the NeighborNet outputs, depending on how one filters the wider set of language taxa. Examples include certain ‘variably rhotic’ varieties of English, and Frisian. The latter is of course much debated in Germanic linguistics in any case, given how the traditional branching tree ranks it as the one variety within
Germanic that is closest to English; but in the face of objections that it is certainly today closer to neighbouring varieties normally considered dialects of Dutch or of Low Saxon. The latter position is the one that emerges more strongly from these distance measures, although Frisian does also appear as something of an outlier relative to continental West Germanic, in the direction of both English (especially its rhotic varieties) and Scandinavian. In a NeighborNet which includes all of these, Frisian’s position necessarily emerges as a compromise of all of these relationships (though not necessarily a happy one). If one wishes to focus only on the relationships within traditional West Germanic, one can filter out the Scandinavian varieties and this duly isolates an even stronger signal of Frisian as intermediate between continental West Germanic and English (though still closer to the former). Furthermore, from the perspective of continental West Germanic, in a NeighborNet like figure 6 that includes ‘external’ varieties like English and Scandinavian, these take up the left-hand side of the network,
leaving the continental West Germanic varieties all in a group to the right. The result is that any relationships within this group are effectively limited to being expressed by the unilinear sequence they branch off in down this right-hand edge of the NeighborNet: any one variety can only be set between two others. In practice this cannot faithfully reproduce even two-dimensional geographical space on a map. Frisian, for example, would rank at a far extreme of the continuum in the north, but so too would a west Belgium variety such as Ostend. These two also need to be quite distant from one another, with other Dutch varieties in between. But those Dutch varieties are also intermediate between both of these and the Low and Central German varieties. Such triangular relationships cannot be captured in what is effectively just a unilinear sequence around the perimeter of the network, which necessarily cannot do full justice to the multi-dimensional relationships between varieties in a dialect continuum. Again, some improvement can be gained by focusing on those relationships only: if one filters out known ‘external’ varieties, the relationships across West Germanic then duly form a two-dimensional space within which the Frisian, Belgian and other Dutch varieties stand in rather more natural positions relative to each other. A final factor influencing outputs and stability to filtering is how many taxa are included, and how smoothly they are sampled across geographical space. Adding multiple very close varieties effectively gives greater ‘weight’ to their region and forces NeighborNet to treat them more stably, while varieties from less heavily sampled regions become relatively less stable to filtering. These can be quite serious difficulties when applying NeighborNet to language data.
In ongoing research we are therefore exploring alternative representation and analysis methods that do not suffer from this limitation, including multi-dimensional scaling and alternatives to traditional isogloss maps. These have their own limitations, however, and provided one is well aware of the inherent limitations and potential instabilities of NeighborNet, and careful in one’s interpretations and in using filtering to focus on specific questions, it remains a powerful (and very fast) algorithm. Indeed, its limitations notwithstanding, figures 6 and 7 confirm that NeighborNet analysis can indeed distinguish split-like and wavelike patterns in the divergence of languages and dialects, in ways that can offer an impressively close match with the known external history of their speakers. And of course it achieves this from simple distance data (in this case in phonetics), i.e. without a specification of ancestral states.
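The point made in this subsection, that a single unilinear sequence cannot reproduce relationships spread over two dimensions, can be illustrated with a small sketch in Python. This is not an implementation of NeighborNet, and the coordinates are invented purely for illustration; the names distance_table and line_misfit are hypothetical. The sketch relies only on the fact that if varieties lie along one line, then for any variety b placed between a and c the distances add up, d(a, c) = d(a, b) + d(b, c).

```python
from itertools import combinations, permutations
from math import dist


def distance_table(points):
    """Pairwise Euclidean distances between named points."""
    return {a: {b: dist(pa, pb) for b, pb in points.items()}
            for a, pa in points.items()}


def line_misfit(table):
    """Smallest total violation of d(a, c) = d(a, b) + d(b, c) over all orderings.

    A value of 0 means that some ordering places every variety on one line.
    Brute force over all orderings, so suitable only for a handful of varieties.
    """
    best = float("inf")
    for order in permutations(table):
        # combinations() keeps the order of `order`, so b always sits between a and c.
        total = sum(abs(table[a][c] - (table[a][b] + table[b][c]))
                    for a, b, c in combinations(order, 3))
        best = min(best, total)
    return best


# Invented coordinates: four varieties strung out along a single line ...
chain = {"v1": (0, 0), "v2": (1, 0), "v3": (2, 0), "v4": (3, 0)}
# ... and four varieties spread over a plane, one of them in the middle.
spread = {"north": (0, 3), "west": (-3, 0), "east": (3, 0), "centre": (0, 1)}

chain_misfit = line_misfit(distance_table(chain))
spread_misfit = line_misfit(distance_table(spread))
print(chain_misfit, spread_misfit)
```

The chain fits a linear sequence exactly, whereas no ordering of the spread-out varieties does: the central variety is intermediate between all three of the others, which is the kind of triangular relationship discussed above.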
(c) ‘Diagnostic power’, ancestral states and parallel change
At first sight this can seem unexpected, since such data do not obey the common insistence from historical linguists that only those correspondences known to be shared innovations can inform us of language histories. Indeed, from the traditional viewpoint, an expected response to figure 3 here is that, especially once converted to distances, Anttila’s data may be
insufficiently reliable to be at all probative of history. The strongest view holds that correspondences between languages that reflect shared retentions, survivals since a common ancestor, tell us nothing, and that for any given comparative datum, if we cannot tell which of the states is ancestral, then the datum cannot be useful. Similarly, independent parallel innovations tell us nothing about common histories; so we should also remove from our dataset all correspondences that seem to be instances of them. The only probative characters assumed to be diagnostic of (tree) history are therefore taken to be known shared innovations. On these grounds, this school of thought might simply rule out some of Anttila’s data. Ringe et al. (2002) do not appear to sign up fully to this strongest view, though they do generally take a rather dim view of the value of distance data. Presumably not all of Anttila’s data would have passed their own stringent data screening criteria; though as argued in Heggarty (in preparation a, §4.3.2), their screening itself is hardly without impact on their own database. Certainly, distance data call for caution in interpreting what they can tell us of language relationships and especially histories. Nonetheless, views have been clouded by certain misconceptions as to exactly how the various phylogenetic analysis methods put state and/or distance data to use in order to generate their outputs (see Heggarty 2005, pp. 37– 38, and of course the explanations of how each method actually works). More generally, other misconceptions inherent in the traditional ‘strongest view’, as set out above, certainly seem to overstate the limitations of distance data as informative on language history, if we are to judge from figures 6 and 7.
(d) Why distance data work in practice: a new proposal
For if the strong principles are so sacrosanct, how is it that our Germanic database (including the subset of varieties of English), which neither specifies ancestral states nor filters out known cases of parallel innovations, nonetheless produces NeighborNets that do indeed reflect the decisive historical factors that shaped the divergence patterns across the family? Did our study just ‘get lucky’ with the particular dataset in hand? Or is there in fact a stronger, principled way to account for why distance measures might reflect language history after all? I (Heggarty) would propose precisely that. The explanation starts back at some first principles of how changes pattern across languages. At any given stage in any language’s history, a vast number of changes are possible, indeed more or less ‘natural’. But other than in cases of direct contact, we usually cannot explain why in any one language a given change occurred precisely when it did, nor why that change did occur while other possible ones did not. Or, to see it from the alternative perspective of all language lineages, the subset of them in which a given change occurs is effectively random. Consider the sound change of the vocalization of post-vocalic [l], by which (usually first via dark [ł]) it becomes [w], [u] or even [o]. European language varieties that
exhibit this change make for an eclectic, random collection ranging from modern Glaswegian and Estuary English (milk as [mɪwk]) to the Portuguese of [bɾaziw], Old French (castle versus chateau [ʃato]), Serbo-Croat (Belgrade versus Beograd), Polish, some accents of Bulgarian, and so on. That any individual sound change like this occurs in two or more language varieties is clearly no necessary indicator whatsoever that it happened during a period of common history, for the innovation so often occurs independently anyway, as all these known cases show. And just as the languages partaking in this parallel innovation form a subset that is essentially random, so too is its converse: the remaining subset of those languages that have not undergone this change, i.e. those characterized by the ‘shared retention’ of post-vocalic consonantal [l]. So it is both expected in principle, and the case in practice, that our dataset on phonetic distances for Germanic in part reflects changes that are not shared innovations. Examples include the parallel innovations of the vocalization/loss of post-vocalic /r/ in many English varieties and in Standard German, e.g. in English here and German hier, respectively. Likewise, they reflect the shared retention of ancestral dental fricatives [θ] and [ð] in Icelandic and (most varieties of) English, as in þing, thing. Like any other phonetic correspondence or difference, these register significantly in our distance data. So wherever these particular sounds occur in our database, in this characteristic the varieties concerned are indeed duly measured as less distant to each other than to all other varieties. How is it, then, that these parallel innovations and shared retentions do not seem to disturb the overall pattern? The explanation is potentially twofold.
Firstly, one might surmise that these parallel innovations and shared retentions must simply be far outweighed in practice by those changes that are indeed shared innovations. That may well be part of the explanation, but even to see things in terms of ‘outweighing’ is to fail to follow through on the first principle of language change discussed above: the effective randomness in the patterning of which language changes occur when, in which language lineages. For the very same characteristic that makes a particular language change susceptible to occurring independently in parallel also largely guarantees that it is effectively ‘neutral’ as to which other language varieties happen to develop it too. Indeed, the more susceptible a given change is to develop independently, the more random its distribution across language lineages should be. So for any one independent change, the patterns of languages in which it happens to occur in parallel, and the remainder in which it does not occur, will be random. This randomness entails that taken together, all parallel changes and all shared retentions over all varieties will, in net effect, largely balance each other out. This will leave a background level of random (i.e. star-like, not net-like) divergence between language varieties, in both parallel changes and shared retentions. On top of this indistinct background, clear non-random patterns do emerge in how varieties differ from each other, of course; so what principally determines them must be the only
changes that remain, namely the shared innovations. These are the changes that come to be shared in different regions not by multiple independent chance occurrences, but by being propagated, from just a single occurrence, across a wider speech community. They come to be shared, that is, not by randomness, but by the forces in the real-world context that determine the extent and nature of those communities, their degrees of coherence, and also their external boundaries that the propagation wave may not cross. In the strongest case, almost all changes will be either adopted or rejected right across a speech community, maintaining it as a single, coherent language albeit changing through time. In the weaker but arguably historically more common case, many changes will spread widely, but in waves overlapping in different patterns, leading to a dialect continuum. It is these factors that account, in a perfectly principled way, for why—despite the contribution of parallel innovations and shared retentions to our overall distance measures—our Germanic phonetics database can nonetheless reproduce such a close match with known historical population splits, and real-world geographical patternings across the dialect continua zones. More generally, I put them forward here as a principled argument that undermines the objection that it is only identifiable shared innovations, and thus only state and not distance data, that are useful for determining ancestry by means of phylogenetic and network analysis. On the strength of this, it transpires that distance data can in fact be perfectly relevant—provided that certain criteria are met in how they are calculated. For in order for the above principles to apply, the dataset must be a balanced, global and representative sample of the level of language in question: lexical semantics, phonetics, etc. (and ideally, of all levels together). 
Furthermore, the various types of linguistic difference within any one level need to be taken together and weighted for their relative significance with respect to each other, in some principled, balanced way (see Heggarty et al. 2005). These provisos stress once more the critical importance that attempts to put numbers on language data must attach to meaningful weightings (see also Heggarty 2006, p. 185). But if a quantification method can achieve these key requirements, its distance measures can indeed present a picture of the degrees of difference between languages which in practice can be highly informative of language history too—even if not entirely consistent with traditional branching tree representations. Indeed, the insistence on a suitably weighted ‘holistic’, unfiltered database can be considered a particularly healthy aspect of distance-based approaches. For they not only allow, but ideally call for, datasets that are more complete and thus more balanced; more so, certainly, than any approach that hand-picks or applies heavy a priori screening or filtering to the real-language dataset. We return here to the observation in §4a above that figure 6 is certainly no perfect match with the traditional family tree for Germanic, for which the primary split is between North versus West, with English classified within the latter ‘branch’. And yet figure 6 does show coherence with
key aspects of the real-world contexts in the population history of the speakers of Germanic languages. Those realities, as set out in §2, dictate that tree-only representations do not, indeed cannot, necessarily tell us everything significant about language histories. By prioritizing certain ‘diagnostic’ changes and overriding others, the tree-only approach risks misrepresenting the overall story. If our real goal is to uncover the histories of the populations that spoke given languages, rather than abstract schemas intellectually satisfying for their binary purity, then it is served by using language data to arrive at a picture of the nature and degree of cohesion (or otherwise) of speech communities within a language family, through the story of its divergence. To this end, we must represent the historical and linguistic truth that English ultimately underwent a longer and more total isolation than did most continental varieties from each other. The approach that best uncovers this signal is to measure and represent the full impact of all those far-reaching changes that came to mark it out as so distinct from all other Germanic varieties in so many ways, rather than to limit our representation of that history simply to the earliest few isoglosses assumed to identify an initial, radical and ‘exclusive’ North versus West (versus East) split. It is only right that the impact of both of these different determiners of Germanic language history should show up clearly in any representation that seeks to reflect that history. Overall distance measures and network analyses are by definition better placed to achieve this balance than tree-only analyses based only on selected ‘screened’ data. Furthermore, while English is widely assumed to have derived from a mixture of Germanic dialects (eminently logical also in terms of population history), this too cannot be represented in a tree model. That structure is inherently forced to oversimplify the most plausible history.
Nor could it capture the clear possibility that of the dialectal mix that went to make up English, a greater part was of more western than northern Germanic character and provenience, but not exclusively so, especially at a time when the difference between those two groups may well have been very much a continuum still. It is such matters of degree, rather than mutually exclusive binary alternatives, that speak in favour of distance and network analyses for language history, and against tree-only ones. (And in the case of Germanic there are, besides these methodological objections, other grounds to criticise the traditional tree representation of its history: see Robinson’s (1992, pp. 247–263) discussion of how even the same data are often interpreted very differently by different authorities.)
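The balancing-out argument of this subsection can be illustrated with a toy simulation in Python. It is a sketch under stated assumptions, not the method or the data behind figures 6 and 7: the two speech communities, the numbers of features, the probability of an independent change and the use of a simple unweighted Hamming distance are all invented for illustration, and the function names simulate and mean_distances are hypothetical. Shared innovations are modelled as changes that arise once and spread through one whole community; parallel innovations are modelled as changes that each variety undergoes, or not, independently of all the others.

```python
import random
from itertools import combinations


def simulate(n_per_group=4, n_shared=40, n_parallel=200, p_parallel=0.3, seed=1):
    """Return {variety: (community, list of 0/1 features)} for two communities."""
    rng = random.Random(seed)
    varieties = {f"{g}{i}": (g, []) for g in "AB" for i in range(n_per_group)}
    # Shared innovations: each arises once and spreads through one whole community.
    for innovating_group in "AB":
        for _ in range(n_shared):
            for group, features in varieties.values():
                features.append(int(group == innovating_group))
    # Parallel innovations: each variety undergoes the change independently.
    for _ in range(n_parallel):
        for _group, features in varieties.values():
            features.append(int(rng.random() < p_parallel))
    return varieties


def mean_distances(varieties):
    """Mean Hamming distance within communities and between communities."""
    within, between = [], []
    for (ga, fa), (gb, fb) in combinations(varieties.values(), 2):
        d = sum(x != y for x, y in zip(fa, fb))
        (within if ga == gb else between).append(d)
    return sum(within) / len(within), sum(between) / len(between)


varieties = simulate()
within, between = mean_distances(varieties)
print(f"within communities: {within:.1f}  between communities: {between:.1f}")
```

In this toy setting the independent changes add a similar amount of distance to every pair of varieties, whichever community each belongs to, so they form an undifferentiated background. The difference between the within-community and between-community means is carried by the shared innovations, even though these make up a minority of the features and none is labelled as such in the distance calculation.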
(e) Lessons for Indo-European, and for historical linguistic methodology?
To conclude, let us recall the due balance that is needed when assessing the respective utility, for uncovering language history, of distance versus state data, and of tree-only versus network-type phylogenetic methods. Certainly, our intent in contrasting figures 1, 2 and 3 here is by no means to underestimate the value
of Gray & Atkinson’s or Ringe et al.’s pioneering studies. For many purposes, the tree idealization undoubtedly has immense practical utility. To echo Atkinson et al.’s (2005, p. 209) point, when it comes to devising models, known lies are indeed permissible, if they are the sort that can help lead us to the truth. Among these valuable idealizations is at least the possibility of a dating mechanism as put forward by Gray & Atkinson. To be sure, questions remain as to their method, but it is precisely one of its attractions that the authors are at pains to limit the impact of the tree idealization, by ‘dating’ not from a single tree but from a distribution of many ‘most plausible’ trees, and their various respective time-depths (as their method calculates them). The point here is hardly to claim, then, that distance data and network analyses are uniquely valuable, and state data and branching-tree analyses necessarily less so. But it most definitely is to redress the balance. We should check the traditional historical linguist’s instinct that all data that cannot be confirmed as shared innovations are to be discarded as valueless. For an equally principled case can be made, as here, for why the supposed limitations of such data turn out to be far less serious than has generally been assumed. In our search to uncover language prehistory, we are only the poorer if we overlook the value of distance data (provided, of course, that the methods we use to measure language divergence are suitably weighted and balanced). For certain specific purposes, the tree idealization may be valid, indeed indispensable. But it is above all when it comes to representing what actually happened as a given family of languages diverged, in which configurations, and in which real-world scenarios of their speaker populations, that the tree idealization will not do. 
Not least when we look to phylogenetic tools, let us not allow our visions of language prehistory to become detached from the real-world forces that shape how languages diverge in the first place, as they act upon the populations that speak them. Cross-cutting relationships are nothing if not entirely normal ‘facts of life’ of how languages naturally diverge. Nor, in seeking to account for them, does the contrast between a branching tree with later contacts, versus a dialect continuum, lie only on the level of abstract models. Rather, viewed in terms of population prehistory, it corresponds to two quite different scenarios that the different models effectively argue for. The utility of Germanic as a case-study is that it provides a (reasonably) known external history against which to assess our methodological approaches. On the strength of the findings here, a similar logic can now be extended to probing the unknown of how the early divergence history of Indo-European unfolded. In the full exploration in Heggarty (in preparation a), it transpires that even the data underlying figures 1 and 2 here suggest an early divergence along the lines of a dialect continuum. And for all the purported analytical elegance of binary branches, as a real-world demographic scenario it is this Indo-European continuum that offers the more straightforward and economical explanation. A splits-then-borrowing scenario has instead to invoke not just a complex series of
divergent migrations, but then later movements to attenuate this by bringing certain groups back into contact again. This in turn entails consequences for which of the main rival hypotheses (the migratory Kurgan ‘horse culture’, or the progressive demic diffusion of agriculture) best fits as the driving force that shaped the pattern of the earliest Indo-European expansion.
This research was made possible thanks to funding from the Leverhulme Trust, for the multidisciplinary Languages and Origins project in Cambridge (grant F/09757/A to P.H.); and from the Arts and Humanities Research Council (grant 112 229), for the project Sound Comparisons: Dialect and Language Comparison and Classification by Phonetic Similarity, based in Edinburgh (to P.H., W.M. and A.M.).
REFERENCES
Anttila, R. 1989 Historical and comparative linguistics, 2nd edn. Amsterdam, The Netherlands: John Benjamins.
Atkinson, Q. D. & Gray, R. D. 2006 How old is the Indo-European language family? Progress or more moths to the flame? In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew), pp. 91–109. Cambridge, UK: McDonald Institute for Archaeological Research.
Atkinson, Q. D., Nicholls, G., Welch, D. & Gray, R. D. 2005 From words to dates: water into wine, mathemagic or phylogenetic inference? Trans. Philol. Soc. 103, 193–219. (doi:10.1111/j.1467-968X.2005.00151.x)
Bandelt, H. J. & Dress, A. W. 1992 Split decomposition: a new and useful approach to phylogenetic analysis of distance data. Mol. Phylogenet. Evol. 1, 242–252. (doi:10.1016/1055-7903(92)90021-8)
Bandelt, H. J., Forster, P., Sykes, B. C. & Richards, M. B. 1995 Mitochondrial portraits of human populations. Genetics 141, 743–753.
Bandelt, H. J., Forster, P. & Röhl, A. 1999 Median-joining networks for inferring intraspecific phylogenies. Mol. Biol. Evol. 16, 37–48.
Bryant, D. & Moulton, V. 2004 NeighborNet: an agglomerative algorithm for the construction of phylogenetic networks. Mol. Biol. Evol. 21, 255–265. See www.ab.informatik.uni-tuebingen.de/software/jsplits/. (doi:10.1093/molbev/msh018)
Bryant, D., Filimon, F. & Gray, R. D. 2005 Untangling our past: languages, trees, splits and networks. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. Holden & S. Shennan), pp. 67–84. London, UK: UCL Press.
Dyen, I., Kruskal, J. B. & Black, P. 1992 An Indo-European classification: a lexicostatistical experiment. Trans. Am. Phil. Soc. 82. See www.wordgumbo.com/ie/cmp/iedata.txt.
Forster, P. & Toth, A. 2003 Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European. Proc. Natl Acad. Sci. USA 100, 9079–9084. (doi:10.1073/pnas.1331158100)
Forster, P., Toth, A. & Bandelt, H.-J. 1998 Evolutionary network analysis of word lists: visualizing the relationships between alpine Romance languages. J. Quant. Linguist. 5, 174–187. (doi:10.1080/09296179808590125)
Forster, P., Polzin, T. & Röhl, A. 2006 Evolution of English basic vocabulary within the network of Germanic languages. In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew), pp. 131–138. Cambridge, UK: McDonald Institute for Archaeological Research.
Gray, R. D. & Atkinson, Q. D. 2003 Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426, 435–439. (doi:10.1038/nature02029)
Gray, R. D., Bryant, D. & Greenhill, S. J. 2010 On the shape and fabric of human history. Phil. Trans. R. Soc. B 365, 3923–3933. (doi:10.1098/rstb.2010.0162)
Heggarty, P. 2005 Enigmas en el origen de las lenguas andinas: aplicando nuevas técnicas a las incógnitas por resolver. Rev. Andina 40, 9–57.
Heggarty, P. 2006 Interdisciplinary indiscipline? Can phylogenetic methods meaningfully be applied to language data—and to dating language? In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew), pp. 183–194. Cambridge, UK: McDonald Institute for Archaeological Research.
Heggarty, P. In preparation a. Barking up the wrong Indo-European tree?
Heggarty, P. In preparation b. How language lineages diverge: models vs. the real world.
Heggarty, P., McMahon, A. & McMahon, R. 2005 From phonetic similarity to dialect classification: a principled approach. In Perspectives on variation (eds N. Delbecque, J. van der Auwera & D. Geeraerts), pp. 43–91. Amsterdam, The Netherlands: Mouton de Gruyter.
Holden, C. J. & Gray, R. D. 2006 Rapid radiation, borrowing and dialect continua in the Bantu languages. In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew), pp. 19–31. Cambridge, UK: McDonald Institute for Archaeological Research.
Holland, B. & Moulton, V. 2003 Consensus networks: a method for visualizing incompatibilities in collections of trees. In Algorithms in bioinformatics (eds G. Benson & R. Page), pp. 165–176. Berlin, Germany: Springer.
Huson, D. H. & Bryant, D. 2006 Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267. See www.ab.informatik.uni.tuebingen.de/software/jsplits/. (doi:10.1093/molbev/msj030)
Maguire, W., McMahon, A., Heggarty, P. & Dediu, D. 2010 The past, present and future of English dialects: quantifying convergence, divergence and dynamic equilibrium. Lang. Variation Change 22, 69–104. (doi:10.1017/S0954394510000013)
McMahon, A. & McMahon, R. 2005 Language classification by numbers. Oxford, UK: Oxford University Press.
McMahon, A., Heggarty, P., McMahon, R. & Slaska, N. 2005 Swadesh sublists and the benefits of borrowing: an Andean case study. In Quantitative methods in language comparison—Transactions of the Philological Society, vol. 103 (ed. A. McMahon), pp. 147–169. Oxford, UK: Blackwell.
McMahon, A., Heggarty, P., McMahon, R. & Maguire, W. 2007 The sound patterns of Englishes: representing phonetic similarity. Engl. Lang. Linguist. 11, 113–142. (doi:10.1017/S1360674306002139)
Nakhleh, L., Ringe, D. & Warnow, T. 2005 Perfect phylogenetic networks: a new methodology for reconstructing the evolutionary history of natural languages. Language 81, 382–420. (doi:10.1353/lan.2005.0078)
Pagel, M. & Meade, A. 2006 Estimating rates of lexical replacement on phylogenetic trees of languages. In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew), pp. 173–182. Cambridge, UK: McDonald Institute for Archaeological Research.
Renfrew, C. 1989 Archaeology and language: the puzzle of Indo-European origins. London, UK: Penguin.
Ringe, D. A., Warnow, T. & Taylor, A. 2002 Indo-European and computational cladistics. Trans. Philol. Soc. 100, 59–129. See www.cs.rice.edu/~nakhleh/CPHL. (doi:10.1111/1467-968X.00091)
Robinson, O. W. 1992 Old English and its closest relatives. London, UK: Routledge.
Phil. Trans. R. Soc. B (2010) 365, 3845–3854 doi:10.1098/rstb.2010.0013
Historical linguistics in Australia: trees, networks and their implications
Claire Bowern*
Department of Linguistics, 370 Temple Street, Yale University, New Haven, CT 06511 USA

This paper presents an overview of the current state of historical linguistics in Australian languages. Australian languages have been important in theoretical debates about the nature of language change and the possibilities for reconstruction and classification in areas of intensive diffusion. I summarize the most important outstanding questions for Australian linguistic prehistory and present a case study of the Karnic subgroup of Pama–Nyungan, which illustrates the problems for classification in Australian languages and potential approaches using phylogenetic methods.

Keywords: Australian languages; phylogenetics; historical linguistics, reconstruction
*[email protected]
Electronic supplementary material is available at http://dx.doi.org/10.1098/rstb.2010.0013 or via http://rstb.royalsocietypublishing.org.
One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.

1. INTRODUCTION
In the historical linguistics literature, Australian languages stand out as unusual in more than one way. For example, the Pama–Nyungan family appears to be an exception to generalizations regarding language family size and hunter–gatherer communities (Wichmann et al. 2008). The family itself has been the subject of debate, although the consensus view within Australian linguistics has been, for some time, that alternative models of change (such as punctuated equilibrium; see Dixon 1997) are problematic and premature before further work has been done using the comparative method (Bowern 2006). Elsewhere (Bowern & Koch 2004a; Bowern 2006), I have argued forcefully against models of language change that highlight areal diffusion at the expense of other types of change. However, it is clear that a lot remains to be said about language change in Australia, particularly with respect to areal patterns. There has been a tendency in some recent work (e.g. Dench 2001) to privilege areal spread over genetic descent, or to argue that an areal explanation for any given change is just as probable as shared descent. However, a claim that makes areal diffusion the primary mechanism of language change contradicts what we know about more general processes of language change. It is a testable claim about how languages are acquired and spread, and how changes are spread through communities and beyond. Linguistic diffusion thus correlates with social interactions; however, to my knowledge no one who has made an areal claim of this type (Dench 2001; Dixon 2001, 2002; Clendon 2006) has supported that claim with appropriately detailed sociolinguistic and anthropological data. Furthermore, many of these arguments confuse the family tree and comparative method. The comparative
method can be used to reconstruct extensive diffusion as well as shared innovation through other types of descent (pace Thomason & Kaufman 1988; Labov 2007). An areal reconstruction does not represent a failure of the comparative method. Finally, there is a need for greater transparency in our assumptions which relate area, linguistic diffusion, shared innovations and the reconstruction of language history. For example, in Bowern (2008), I argue in response to Breen (2007) that subgrouping failures (or subgrouping difficulties) may themselves be indicative of certain types of change. That is, our failure to find neat subgrouping is not necessarily a failure of our methods; it is indicative of a type of language splitting produced by certain types of population prehistory.¹

Australia is thus an important region for historical reconstruction theory, especially as it relates to small populations. There are currently four outstanding issues in the study of Australian linguistic prehistory:
— What is the subgrouping of Pama–Nyungan languages? (What does it mean for our interpretation of prehistory when we cannot draw a neat tree?)
— Where did Pama–Nyungan spread from?
— How do we account for the spread of Pama–Nyungan?
— What is the relationship between the Pama–Nyungan family and other languages in Australia?

We do not have good answers to any of these questions at present, although we have hypotheses for all of them. In this article, I discuss previous work on the above four questions and outline a programme for research in this area.

(a) Background to the languages of Australia
Some basic information about the languages in question is in order. At the time of European settlement in Australia in 1788, there were approximately 250 distinct languages spoken by people who lived in social units varying in size from fewer than 100 to several
This journal is © 2010 The Royal Society
thousand people. Aboriginal people lived in all parts of Australia, including the arid central desert regions. The languages have been grouped into approximately 28 families (O’Grady et al. 1966a; Wurm 1972; Wurm & Hattori 1981; Bowern & Koch 2004a). Initial classifications were completed using lexicostatistics (O’Grady et al. 1966b) and these classifications provided us with approximately 20 primary subgroups of the Pama–Nyungan family, along with the remaining 27 non-Pama–Nyungan families, which are clustered in the far north of the country.

Currently, more than 90 per cent of Australia’s indigenous languages are endangered, 60 per cent of Aboriginal people live in urban or regional centres, and fewer than 10 per cent of Aboriginal people speak a traditional language. The largest languages have about 5000 speakers, and only 20 languages are being acquired by children. For those languages without detailed 20th-century records, primary data collection is therefore either extremely urgent or too late in most cases. However, there is a considerable amount of primary material from the 19th century, as well as unpublished fieldnotes (and increasingly these are being published).

Although Australian speech populations are currently small, there is no reason to assume that languages would have been larger in the precolonial period; in fact, some community lingua francas, such as Dhuwal and Burarra, probably have more speakers now than they did before European settlement. Multi-lingualism was widespread in precontact times but not universal by any means. The picture from work such as Heath (1978, 1981) suggests linguistically diverse communities where speakers were all fluent in each other’s languages; while that is accurate for Arnhem Land, other parts of the country show a variety of patterns, including monolingualism, asymmetrical bilingualism and multi-lectalism.

Australia is the only continent where agriculture did not develop before the colonial period.
Several different subsistence methods which broadly fall under the label ‘hunter–gatherer’ were practised. Social organization also varied, from nomadic groups of perhaps 50 people to sedentary clan groups comprising several hundred individuals. Some groups were monolingual; in others exogamy was the norm and society-wide multi-lingualism in several unrelated languages was found. (For an overview, see Hiscock (2008) and for detailed case studies, see Keen (2002, 2004).) I stress this because it is important to remember that Australia is not a homogeneous area, either geographically, socially or linguistically.
(b) Pama–Nyungan languages and subgrouping
While earlier work identified differences between northern and southern languages in Australia (e.g. Schmidt 1919; Kroeber 1923), the identification of the Pama–Nyungan family is due to work by Ken Hale and colleagues (Hale 1964, 1966; O’Grady et al. 1966a,b).² The work of Schmidt (1919) was the first large-scale classification attempt using all available materials (much of Schmidt’s data came from the wordlists in Curr (1886)). He used a list of
44 vocabulary items, personal pronouns, interrogatives and some phonotactic and syntactic information. From this, he identified a set of languages which he called ‘South Australian languages’. He also posited some intermediate-level major groupings within Southern Australian and some of these had further groups within them. Schmidt misclassifies the Pama–Nyungan languages that have undergone extensive sound change.

The next major classification is due to Capell (1941, 1956, 1979) and is based largely on typology (i.e. it is a phenetic classification based on shared structural features rather than shared innovations). His groups are partially areal and partially typological.

The lexicostatistical classification of O’Grady et al. (1966a) used a modified form of the Swadesh word list. They aimed to cover the whole continent, and the classification project included fieldwork as well as existing materials. The process for the lexicostatistical classification was described in O’Grady & Klokeid (1969). Since then there have been a number of classifications based broadly on O’Grady et al.’s (1966a), including Wurm & Hattori (1981). The classification was never meant to stand as anything other than a first effort, to be refined as our knowledge of the languages grew. There has been subsequent comparative work by O’Grady and his students (Hendrie 1990; O’Grady 1990, 1998; O’Grady & Fitzgerald 1997) at the level of Pama–Nyungan.

Dixon (2002) offered a new subgrouping of a rather different type. It is not a genetic subgrouping: that is, Dixon’s approach is not cladistic. It is partly genetic and partly areal. He combines claimed linguistic areas and families and subgroups in the same classification.³ The data on which certain groups are decided as areal or cladistic have not been published.

Bowern & Koch (2004a) is a collection of subgrouping studies, including Alpher (2004), which is a first-principles demonstration of Pama–Nyungan as a family.
The papers in this volume demonstrate the comparative method for nine subgroups of Pama–Nyungan; the remaining papers discuss wider relations among non-Pama–Nyungan languages (e.g. Baker 2004; Bowern 2004; Green & Nordlinger 2004).

There are problems in the reconstruction of Pama–Nyungan subgrouping, some of which I will mention briefly here. First is that there are not many people working in this area. There are no full-time historical linguists working on Australian languages: there are few active scholars in this area and everyone has an alternative speciality, such as typology or language documentation. While it is an active and close-knit group of scholars, there is only so much that can be done, and this is relevant when we compare it with the number of people working on, for example, the history of French. Second is a data problem. There is a great deal more data for Australian languages than there was 30 years ago, but so many languages have disappeared that classification data for some areas is extremely degraded. In the Karnic example we will see below, for example, the northeastern fringe is represented entirely by 19th-century wordlist sources. The third problem is that we do not have a very good picture of the language contact situation for
much of the country, and so we assume that it was equivalent everywhere to the best-studied cases in Arnhem Land. However, from the data that I have been able to collect opportunistically, it is clear that there were multiple patterns. Not only was there widespread community-wide multi-lingualism in multiple languages as we find in current Arnhem Land communities (see Bowern 2008), there were also asymmetric bilingual interactions (where one community learnt the language of its neighbours, but the neighbours would not learn the other language⁴), and there were monolingual populations. In some communities, there seem to have been key language people who knew many surrounding languages and would have acted as interpreters for their communities. Until we get a better idea of the social interaction among groups we will not have an adequate idea about the role of language contact and how big a role it should play in our theories.
(c) The spread of Pama–Nyungan
There are two primary competing theories regarding the origin of the current distribution of Pama–Nyungan languages. The first is that they spread in the early Holocene, probably from somewhere south of the modern Gulf of Carpentaria. Variants of this model have been proposed by Sutton (1990, 1997), McConvell & Evans (1997), Evans & Jones (1997) and others. The location of the putative homeland is based primarily on methods in linguistic geography, such as the area of greatest diversity within Pama–Nyungan. It is also noteworthy that the current distribution of Pama–Nyungan languages and a spread from a north-eastern homeland correlates well with the distribution of backed artefacts (Hiscock 2002).

The second theory is that Pama–Nyungan is not a clade, but a remnant diffusion area created from the initial (Pleistocene era) colonization of Australia, with subsequent intense diffusion. This is essentially the view of Dixon (1997) and subsequent work; a modified view (Clendon 2006) has Pama–Nyungan as a bottleneck linguistic area which expanded from the south following climatic amelioration in the arid centre after the last glacial maximum. As appealing as these ideas might be to archaeology, they are implausible linguistically. A linguistic area maintained over such a large area for 40 000 years is highly implausible. It requires types of changes which have not been attested elsewhere, or which are rare elsewhere but would have to be exceedingly common in Australia (such as the borrowing of pronouns). I have written extensively on problems with the idea of Pama–Nyungan as a diffusion area, and will not repeat those arguments here (see Bowern 2006, 2007).

The Holocene expansion theory is itself also not without problems, however. It is unclear what the trigger for the spread of Pama–Nyungan would have been.
There is no punctuation or other event in the archaeological record which we can associate with significantly better technology or superior warfare. However, as noted above there is some evidence for significant small tool expansion at the relevant time period (see also Evans & Jones 1997). Large-scale conquests are unknown in hunter–gatherer communities for obvious demographic reasons. The archaeological record is patchy but does show habitation of the southern regions from the Pleistocene period (e.g. Devil’s Lair in South Western Australia from 36 000 BP and Willandra Lakes in New South Wales; see Mulvaney & Kamminga 1999; Hiscock 2008). A late Holocene expansion from the north would most probably have meant the acquisition by Pama–Nyungan speakers of lands belonging to non-Pama–Nyungan speakers. Why would speakers have shifted languages? It is well known that newcomers to communities tend to adopt the languages of the hosts, and not vice versa; moreover, people do not generally switch languages without good reason. Exceptions to this pattern of language shift seem to be largely confined to the colonial and post-colonial period (although see McConvell 2001; McConvell & Alpher 2002 for some situations where migration may lead to language shift towards immigrant languages). Finally, given that there is an obvious climatic change in the late Pleistocene, it is tempting to link the expansion of the family to it and assume a scenario of spread with climatic amelioration and greater access to resources driving population increase and therefore expansion. We therefore need a principled reason why this would not be appropriate. In summary, Pama–Nyungan is obviously a challenging problem for those who maintain that the only causes of widespread expansion are either technological or expansion into uninhabited territory (e.g. Renfrew 1989; Bellwood 2001).

Figure 1. Putative Proto-Australian family tree (after Evans 2005). Node labels: Proto-Australian; other Australian; Gunwinyguan; Tangkic; Garwan; Pama–Nyungan.
(d) Relationships between Pama–Nyungan and other Australian languages
Given that there is little work in the reconstruction of the non-Pama–Nyungan families, and that reconstruction of Pama–Nyungan itself is at an early stage, discussion of more remote relationships is premature. Evans (2005), Evans & Jones (1997) and other work have Pama–Nyungan as one of a series of families, as schematized in figure 1. This is not the only hypothesis concerning Pama–Nyungan relationships but it is the most widely accepted currently. This tree is based on very little evidence, however. A single shared innovation defines each of these nodes. Moreover, the status of Gunwinyguan is unclear at present; the composition of the family is disputed and is not established by the traditional methods used in linguistic reconstruction. On the basis of
shared irregularities in verb morphology, Green (2003) suggests that Gunwinyguan in fact belongs to a macrofamily containing a number of languages in Arnhem Land that have so far not figured in close relationships to Pama–Nyungan. This must remain an open question for the present.

An alternative tree is presented by Heath (1990), and used implicitly by Clendon (2006). In this tree, Proto-Australian has two daughters: Proto-Pama–Nyungan and Proto-Non-Pama–Nyungan. Evidence here is scanty and inconclusive, and based mostly on pronominal data.

It is tempting to compare Australian languages to those of Papua New Guinea. After all, the two countries were joined by a land bridge until the end of the Pleistocene and Australia must have been settled via New Guinea. Comparisons have so far failed to reveal any relatives.

2. CASE STUDY: THE LAKE EYRE LANGUAGES
In this case study, I present work in progress using a combination of established historical methods and computational phylogenetic analysis. Karnic languages are an excellent case study for the problems in Pama–Nyungan, simply because the same problems we see at a larger scale with 150 languages are also found in the subgroup. We find a similar set of problems with the interplay between areal and non-areal features, difficulty in distinguishing archaic shared features from shared innovations, and little evidence for higher-order structure.

(a) Overview
(i) Geographical area
The languages which form the basis of this case study are those formerly spoken in the Lake Eyre Basin of Eastern Central Australia, straddling the Queensland, Northern Territory, New South Wales and South Australian borders. The area where Karnic languages are spoken broadly comprises the Lake Eyre drainage basin; mostly rather flat semi-arid country, subject to occasional seasonal inundation. A map of the area can be seen in the electronic supplementary material, figure S1.
(ii) Prior classification
The classification of the Lake Eyre languages has not been stable. Researchers have vacillated between recognizing a series of low-level groups with no closer higher relations between them, and grouping the languages into a larger family; the composition of this family, however, has also varied over time. This section briefly surveys the most widely known classifications.

Schmidt (1919, pp. 43–44) defines a Karna group, which refers to Pitta-Pitta, Mithaka, Kunggari, and related dialects, as part of his Süd-Zentral-gruppe. He also defines two Untergruppen ‘subgroups’: the Nulla-Untergruppe (Arabana–Wangkangurru) and the Dieri-Yarrawurka-Wonkamarra-Evelyn Creek-Untergruppe, the name of which is self-explanatory. O’Grady et al. (1966a) also recognized Karnic, although their Karnic was considerably smaller than
the current Karnic classifications. Other publications identify several independent groups, implicationally no more closely related to each other than to any other Pama–Nyungan subgroup. Breen (1971) recognized the wider relations of O’Grady, Voegelin and Voegelin’s ‘Karnic’ group and related it to ‘Mitakudic’ and others in the Lake Eyre Basin.

Almost all the subgroupings have been based primarily on lexicostatistical data, whether as part of a wider preliminary survey of languages (O’Grady et al. 1966a; Wurm 1972) or a more detailed comparison (Breen 1971). Two of the later classifications, Austin (1990a) and Bowern (2001), also take morphological and lexical reconstruction into account, but because the reconstructions of the two authors are different they come to rather different conclusions regarding subgrouping. Trees of these previous classifications are provided in the electronic supplementary material, figure S2.

The number of classification claims for the subgroup makes it one of the better studied in the family. However, there is little agreement beyond the lowest level groups. A summary is given in Breen (2007) and further discussion in Bowern (2009). I see the following issues as being most important for the study of the subgroup and its theoretical historical implications:
— Are there any higher-level groupings beyond the lower-level ones identified by early lexicostatistics, and about which all classifications are in agreement?
— If so, what are they?
— How far do the borders of the family extend? Does the family include Arabana–Wangkangurru? The eastern languages Garlali and Badjiri? The northern languages Yanda and Guwa?
— Is there a northern Karnic subgroup comprising Arabana and Pitta-Pitta (and associated dialects)?
— Which subgroup does Mithaka belong to? What are the innovations which would characterize each of these groups?⁵

As I showed in Bowern (1998), there are innovations in morphology which provide conflicting evidence for Karnic subgrouping.
Evidence for a Northern subgroup includes shared vocabulary, innovations in pronouns (e.g. Proto-Karnic *ngantya 1sg dative > nominative and *nhuka 3sg nominative > uka (Arabana), nhuwa (Pitta-Pitta)) and the use of one of the allomorphs of the locative case marker as a causal. However, there are also innovations that group Pitta-Pitta and associated dialects with other Karnic languages, implying that there is no common northern Karnic clade. These include lexical items, a change of *-nga locative allomorph > dative (+ more general locative > dative change) and a second person singular accusative *nyuna (although this may be shared archaism and therefore useless for subgrouping).

In other work (e.g. Bowern 2009), I have argued that the existence of contradictory subgroupings such as this may have more than one explanation. In much work in historical linguistics, it is assumed implicitly (and sometimes stated explicitly) that conflicting subgrouping is primarily (only?) owing to
language contact that has obscured clear tree branching (Thomason & Kaufman 1988; Labov 2007). However, it seems clear that there are cases where the same processes of language change that produce tree-like splits may also produce networks. In particular, a large area that was settled fairly quickly and where speech communities retained alliances with one another for some time is highly likely to produce a complex dialect area, and relics of the conflicting isoglosses in the earlier dialect area may persist after the languages lose mutual intelligibility (and cease to be regarded as dialects of one another). Arguments of this type rely on relative chronology of shared innovations. In the Karnic case, they are further complicated by subsequent diffusion.⁶

The conflicting claims for subgrouping could have several other explanations, however. It could be that the authors’ reliance on different types of data has resulted in different trees because of unidentified shared retention or borrowing in different areas of grammar and lexicon. In the following section, I consider a computational phylogenetic analysis of the problem.
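The notion of conflicting evidence for subgrouping can be made concrete with the standard split-compatibility test: two bipartitions of the same set of languages can sit on a single tree only if at least one of the four intersections of their sides is empty. The following is a minimal sketch in Python, not part of the original analysis. The taxon set is a hypothetical, simplified selection of language names from the discussion above, and the three splits are illustrative stand-ins for the kinds of groupings at issue (a northern group, Pitta-Pitta with other Karnic languages, and Karnic as a whole).

```python
from itertools import combinations


def is_compatible(split_a, split_b, taxa):
    """Return True if two splits can both appear on one tree.

    Each split is given as one side of a bipartition of `taxa`.
    Two splits are compatible iff at least one of the four
    pairwise intersections of their sides is empty.
    """
    a, b = frozenset(split_a), frozenset(split_b)
    a_rest, b_rest = taxa - a, taxa - b
    return any(not (x & y) for x in (a, a_rest) for y in (b, b_rest))


# Hypothetical, simplified taxon set (illustration only, not the paper's data).
TAXA = frozenset({"Arabana", "Pitta-Pitta", "Diyari", "Wangkumara", "Outgroup"})

# One side of each split, as a putative shared innovation might define it.
SPLITS = {
    "northern": {"Arabana", "Pitta-Pitta"},
    "pitta_with_rest": {"Pitta-Pitta", "Diyari", "Wangkumara"},
    "all_karnic": {"Arabana", "Pitta-Pitta", "Diyari", "Wangkumara"},
}


def conflicting_pairs(splits, taxa):
    """List the pairs of named splits that cannot fit on a single tree."""
    return sorted(
        (name_a, name_b)
        for (name_a, a), (name_b, b) in combinations(sorted(splits.items()), 2)
        if not is_compatible(a, b, taxa)
    )


print(conflicting_pairs(SPLITS, TAXA))
```

In this toy setting only the northern grouping and the grouping of Pitta-Pitta with the other Karnic languages are mutually incompatible; both are compatible with Karnic as a whole. A network method displays such incompatible splits side by side instead of forcing a choice between them.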
(b) Method
Information on the language sources used is given in Bowern (2001, table 1, p. 248) and is summarized in the electronic supplementary material, S3. Data for 770 mostly lexical character sets were coded as multi-state and then converted to 5487 binary characters. The data were then analysed with SPLITSTREE 4.0 (Huson & Bryant 2006) using the NeighborNet algorithm (Bryant et al. 2005; Huson & Bryant 2006).⁷ When using the comparative method in linguistics, loanwords are normally excluded. In this coding, however, I did not treat identified loans differently because of the high likelihood of substantial undetected borrowing. Moreover, in §2c, I use variable borrowing rates to identify potential loan paths, and this would not be possible if borrowings were filtered.

Originally, 40 taxa were sampled, including dialects of the better attested languages and languages outside the Lake Eyre Basin. Some of these languages are very poorly attested, and lack of data resulted in such a small number of informative characters that they were excluded.⁸ The Thura–Yura language Adnyamathanha (Simpson & Hercus 2004) was used as an outgroup. Others were excluded because of clear data contamination.⁹ The results were then compared with those obtained in previous classification studies.

It has long been known that certain types of words are more susceptible to borrowing than others. Flora, fauna and artefact terms are universally borrowed at higher rates than body part terms, for example. McConvell (2010) finds that more than 50 per cent of animal terms are loans in the Pama–Nyungan language Gurindji.¹⁰ Other semantic fields are more variable. For example, kinship terms are often treated as basic vocabulary in Indo-European studies, but in groups that practise exogamy kinship terms are often subject to borrowing; more generally, certain kinship terms are subject to replacement by
C. Bowern
3849
baby-talk terms and are therefore perhaps less reliable than other basic vocabulary items. See further Haspelmath (2008) and Haspelmath & Tadmor (2009).
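The recoding step described above (770 multi-state character sets converted to 5487 binary characters, or roughly seven binary characters per set) can be illustrated with a minimal sketch. The text does not spell out the exact coding scheme, so the sketch assumes the common presence/absence recoding, in which each state observed in a character set becomes its own binary character. The function name `binarize`, the `?` missing-data marker and the toy data are hypothetical and are not taken from the Karnic dataset.

```python
from typing import Dict, List, Optional, Tuple

MISSING = "?"  # assumed marker for taxa with no attested form


def binarize(
    char_sets: Dict[str, Dict[str, Optional[str]]],
    taxa: List[str],
) -> List[Tuple[str, Dict[str, str]]]:
    """Recode multi-state characters as binary presence/absence characters.

    Each state observed in a character set becomes one binary column:
    '1' if the taxon shows that state, '0' if it shows another state,
    and '?' if the taxon has no data for the character set at all.
    """
    columns = []
    for name, state_by_taxon in char_sets.items():
        observed = sorted({s for s in state_by_taxon.values() if s is not None})
        for state in observed:
            column = {}
            for taxon in taxa:
                value = state_by_taxon.get(taxon)
                if value is None:
                    column[taxon] = MISSING
                else:
                    column[taxon] = "1" if value == state else "0"
            columns.append((f"{name}:{state}", column))
    return columns


# Hypothetical toy data: two meanings, three placeholder taxa.
taxa = ["LangA", "LangB", "LangC"]
char_sets = {
    "water": {"LangA": "a", "LangB": "a", "LangC": "b"},
    "fire": {"LangA": "a", "LangB": None, "LangC": "a"},
}
binary = binarize(char_sets, taxa)
for label, column in binary:
    print(label, "".join(column[t] for t in taxa))
```

In this toy case two multi-state character sets yield three binary characters, and the taxon with no attested form for 'fire' is carried through as missing rather than as an absence.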
(c) Results and discussion
The NNet for 25 well-attested taxa, using all data points, is given in figure 2. All the lower-level groupings identified by previous classification studies also appear in this network: Eastern Karnic (Wangkumara dialects and Punthamara), Western Karnic (Diyari, Ngamini and Yarluyandi), and Central Karnic (those languages plus Yandruwandha, Yawarrawarrka and Nhirrpi). Mithaka and Karuwali are in Central Karnic but not Western Karnic, and some splits group Arabana and Pitta-Pitta (and associated dialects) together. Since NeighborNets are known to be somewhat sensitive to missing data, a subset of 1211 characters from 23 taxa with the best attestations was studied. The network from this dataset is given in the electronic supplementary material, S4. Missing data here are slightly over 30 per cent. The structure is consistent with figure 2 except that splits are ambiguous as to Mithaka’s placement with Western Karnic or with Yandruwandha’s group. There is also less network-like splitting in the Eastern Karnic groups. These varieties approach the similarity of dialects, and because it is very common to find overlapping isoglosses among mutually intelligible varieties (at any given point changes may have differing ranges), this should be unsurprising. Since the full dataset and the well-attested data give the same groupings, missing data are unlikely to be an issue for the languages under consideration here. We find some ambiguity in the placement of the Western Lake Eyre languages Arabana and Wangkangurru. One set of splits groups the languages with Pitta-Pitta and Wangkayutyuru; this is consistent with Hercus’s (1994) ‘northern Karnic’ group. A second set of splits groups these languages with those immediately to the east of Lake Eyre, in particular, Western Karnic.
This has not been proposed in any of the previous classifications of Karnic languages, although it has long been noted that languages on either side of Lake Eyre exhibit loanwords and grammatical borrowings, such as the use of pronouns inflected for kinship information. Since the Western Karnic languages have undergone changes that Arabana–Wangkangurru has not (Hercus 1994; Bowern 1998), these splits are likely to reflect loans.

NeighborNets allow us to schematize ambiguity in classification, but they do not by themselves allow us to pinpoint the source of the ambiguity. For example, if some varieties share extensive archaic features, this will place them closer together, even though shared archaisms are unrevealing for subgrouping. (This is true, of course, for all distance-based classification methods.) However, we can try to identify potential borrowings and shared archaisms by considering subdomains of vocabulary. The Karnic character sets were coded for semantic domain and then further divided into borrowability hierarchies.11 Figure 3 shows the NNet diagram for characters for flora and fauna items and other items that are commonly borrowed. Figure 4, in contrast, shows items that are less likely to be borrowed.12 The high-borrowing category includes 1336 characters, while the low-borrowing category includes 3027 characters.

Figure 2. Karnic NeighborNet: 25 taxa, 5487 binary characters. Boldface indicates languages whose classification we are particularly considering.
Figure 3. Karnic NNet: characters that are more susceptible to borrowing. Boldface indicates languages whose classification we are particularly considering.
Figure 4. Karnic NNet: characters that are less susceptible to borrowing. Boldface indicates languages whose classification we are particularly considering.

Let us first consider the similarities between the two networks. In both cases, a group of Eastern languages is identified, although the internal structure of the group is different. We find a certain amount of network-like structure in Central Karnic in both the high-borrowing and low-borrowing datasets. This is unsurprising since speakers of the languages were in contact and a thriving trade network in grind-stone tools and other items existed in precontact times (McBryde 1987) and linked these groups closely together.

Let us now consider points of difference between the two networks. As one would expect, the high-borrowing network in figure 3 is considerably messier than the low-borrowing network in figure 4. This is presumably because loans are affected by geography, with languages borrowing from more than one neighbour.13 In the low-borrowing network, as in the aggregate data, we find ambiguous clustering. The Western Karnic languages in the high-borrowing network ambiguously group with the other Central Karnic languages (as identified by Austin 1990b; Bowern 1998) and with the languages to the west of Lake Eyre, Arabana and Wangkangurru. Although in the aggregate data there were conflicting splits grouping Arabana and Wangkangurru variably with Pitta-Pitta and Wangkayutyuru and with Western Karnic (although not with Central Karnic), the signal for grouping with Pitta-Pitta and Wangkayutyuru is much weaker in the high-borrowing network, and weaker in the highly stable vocabulary. This implies that a Northern Karnic grouping is neither an artefact of loans, nor purely of shared retentions. While it does not conclusively demonstrate the existence of a Northern Karnic family, it is suggestive of such a grouping.14 Central and Eastern Karnic are differentiated in both networks, although in the high-borrowing network Ngamini is ambiguously grouped with both Diyari and Yarluyandi.
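The comparison of high-borrowing and low-borrowing subnetworks can be sketched in code. The following is a minimal illustration under stated assumptions, not the procedure used for the Karnic data: the domain labels, the set `HIGH_BORROWING`, the function names and the three placeholder taxa are all hypothetical. It partitions binary characters by semantic domain and computes, for each subset, pairwise distances as the proportion of differing characters among those attested in both taxa. A distance matrix of this kind is the sort of input a NeighborNet implementation works from; no network is built here.

```python
from itertools import combinations

# Assumed classification: flora, fauna and artefact terms are treated as
# high-borrowing domains; everything else (e.g. body parts) as low-borrowing.
HIGH_BORROWING = {"flora", "fauna", "artefact"}


def split_by_borrowability(characters):
    """Partition characters into (high, low) borrowing subsets by domain."""
    high, low = [], []
    for ch in characters:
        (high if ch["domain"] in HIGH_BORROWING else low).append(ch)
    return high, low


def hamming_distance(characters, taxon_a, taxon_b):
    """Proportion of differing characters, skipping any with missing data."""
    compared = differing = 0
    for ch in characters:
        a = ch["states"].get(taxon_a, "?")
        b = ch["states"].get(taxon_b, "?")
        if a == "?" or b == "?":
            continue  # not attested in both taxa, so uninformative here
        compared += 1
        differing += a != b
    return differing / compared if compared else None


def distance_matrix(characters, taxa):
    """Pairwise distances for every unordered pair of taxa."""
    return {
        (a, b): hamming_distance(characters, a, b)
        for a, b in combinations(taxa, 2)
    }


# Hypothetical toy data with placeholder taxa A, B and C.
taxa = ["A", "B", "C"]
characters = [
    {"name": "kangaroo:1", "domain": "fauna", "states": {"A": "1", "B": "1", "C": "0"}},
    {"name": "gum_tree:1", "domain": "flora", "states": {"A": "1", "B": "1", "C": "?"}},
    {"name": "hand:1", "domain": "body", "states": {"A": "1", "B": "0", "C": "0"}},
    {"name": "eye:1", "domain": "body", "states": {"A": "0", "B": "1", "C": "1"}},
]

high, low = split_by_borrowability(characters)
d_high = distance_matrix(high, taxa)
d_low = distance_matrix(low, taxa)
print("high-borrowing:", d_high)
print("low-borrowing: ", d_low)
```

In the toy data the high-borrowing characters place A closest to B, while the low-borrowing characters place B closest to C. Conflicting signal of this kind between subsets is what the comparison of figures 3 and 4 is designed to expose, with the caveat noted in the text that shared retentions can also bring taxa together in any distance-based method.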
Included in the dataset are five languages that are not usually discussed in Karnic classification. Karuwali was suggested as Karnic as far back as Breen (1971) but did not feature much in subsequent classifications that dealt mostly with grammatical data, as there are only wordlist data for the language. Aggregate data place the language as a sister to Mithaka; however, both the high- and low-borrowing networks in figures 3 and 4 have Karuwali showing conflicting splits, between the out-group Adnyamathanha and Central Karnic in the first instance, and the Eastern and Karnic groups in the second. More research is required. Two other doubtfully Karnic languages, Yanda and Guwa, group with Pitta-Pitta and Wangkayutyuru in all cases. Pirriya and Kungkari group with Eastern Karnic in the high-borrowing data, but not clearly with any particular Karnic group in the aggregate and low-borrowing data. This implies either that they are a further primary subgroup within Karnic or that they are not Karnic.

The final language that warrants discussion is Garlali. Breen (2007) argues against Bowern’s (2001) classification of Garlali as Karnic. Data here are from Breen’s fieldnotes; Garlali is grouped more or less strongly with Eastern Karnic in all networks. However, given the problematic history of the language description and the likelihood of borrowings in the data, it is possible that if detailed Maric subgroup data were included Garlali would not group with Eastern Karnic in the same way.

In summary, there is weak evidence for a Northern Karnic group. The grouping of Central and Western Karnic is solid, although within Central Karnic the placement of Mithaka is still ambiguous. Regarding the borders of the family, we have evidence for including Arabana–Wangkangurru as Karnic, some evidence for Garlali, good evidence for Karuwali and some evidence for the Northern languages Yanda and Guwa and the Eastern fringe languages Pirriya and Kungkari. The ambiguous grouping of languages such as Mithaka in all networks lends further support to the analysis of Karnic as being an area of long-standing dialect breakup, supporting analysis using the comparative method.
3. CONCLUSIONS AND FUTURE DIRECTIONS
I see several areas where work is urgent for Australian prehistory and reconstruction. One is the compilation of site stratigraphy in archaeology. Data from many parts of the country show a pattern of site settlement and abandonment. It would be extremely useful to have these data tabulated and plotted using GIS. To my knowledge there is no amalgamation of these data.

Within linguistics there are some urgent tasks. Perhaps the most urgent is lexical reconstruction of low-level subgroups. Some areas are better studied than others but all are in need of more work. Australia would benefit from less interdisciplinary work, particularly where linguistics is concerned. At the same time that Australia has been a leader in interdisciplinary work in linguistics, anthropology and archaeology, we have lagged in the detailed linguistic work required to build solid interdisciplinary hypotheses. For example, we cannot at present use data from flora and fauna to shed light on the Pama–Nyungan homeland because we do not have the relevant linguistic reconstructions. Linguistics has also suffered from an attempt to fit as much of the archaeological data as possible into a model, where it is not clear what is relevant. For example, the length of settlement in Australia is irrelevant to the reconstruction of Pama–Nyungan, just as the original settlement of Europe by Homo sapiens is irrelevant to the reconstruction of Proto-Indo-European.

In this article, I have presented a method for comparing subgrouping hypotheses by considering character sets broken down by semantic domain. This method relies on compensating for our inability to distinguish shared innovation from shared archaism by considering subnetworks. In areas where the history is not well known and relative chronology cannot be inferred because of high rates of lexical replacement, this method provides an alternative that does not rely solely on the linguists’ impression of the most common groupings. This method was tested on the Karnic subgroup of Pama–Nyungan. This revealed splits that differed between frequently borrowed lexical items and infrequently borrowed ones, and both of these produced different networks from an aggregate character set. This implies both that borrowing has occurred, and that shared archaisms may be affecting subgrouping. Removing those characters produced a network in which Karnic languages fell into three primary groups: a northern group (confirming the work of Hercus 1994), a Central group (confirming Austin 1990a and Bowern 1998) and an Eastern group, which is recognized by all the relevant previous work.

Many thanks to Luise Hercus and Gavan Breen for access to unpublished data for Karnic languages. This research is funded by NSF grant 844550 ‘Pama–Nyungan reconstruction and Australian prehistory’ to Yale University (Claire Bowern PI) and 902114 ‘Dynamics of Hunter–gatherer Language Change’. Thanks also to Barry Alpher, Russell Gray, Simon Greenhill, Patrick McConvell, David Nash and an anonymous reviewer for discussion on these topics and feedback.

ENDNOTES
1 For example, Gray et al. (2009) find that the areas of least tree-like signal in Austronesian coincide with ‘pauses’ in migrations where groups are diverging in situ.
2 This section is based on information in Koch (2004) and the reader is referred to this paper and Bowern & Koch (2004b) for more information.
3 Note that this is not the same as the social network model used for Oceanic languages by Ross (1988): these are not linkages in the sense that Ross uses the term.
4 This is similar to the current situation in Belgium, where native Flemish speakers usually also speak French, but native speakers of French almost never speak Flemish.
5 From early on in historical work, linguists have avoided distance-based methods relying on similarity of forms in favour of innovation-based parsimony methods using data from both the lexicon and grammar.
6 This model has certain elements in common with Schmidt’s (1872) wave theory, although there are other aspects of that model which are inappropriate here, such as the lack of differentiation between changes between dialects and those that affect languages.
7 For the use in linguistics of NNet and NJ techniques, see amongst others Gray et al. (2007), McMahon & McMahon (2006) and Bryant et al. (2005).
8 In the case of the Yarli language Wadikali, for example, less than 10 per cent of the character data could be supplied because the language is known only from a wordlist of 70 items (Hercus & Austin 2004).
9 The sources in Reuther (1883), for example, show extensive influence from the languages to the west of Lake Eyre. Comparison with varieties recorded later (such as Austin 1981) shows that this contamination is not likely to be borrowing into that variety alone.
10 McConvell (2010) suggests that flora/fauna borrowings might be particularly associated with territorial expansions of speakers.
11 Note that my procedure here is not identical to that used in McMahon & McMahon (2006): they divided the Swadesh word list into degrees of borrowability; my dataset is much larger (the high-borrowing set, for example, has 1336 binary characters).
12 I do not know of any work that quantifies the relative probability at which certain semantic classes are borrowed apart from the preliminary findings in Haspelmath (2008). The statement is based on the fact that in many parts of the world speakers borrow words from neighbouring languages when they come into contact with new items, and these new items tend to be artefacts, plants and animals, and technology, and tend not to be items such as body parts. In areas where there is a high degree of exogamy, kinship terms show instability. I assume this is because in such areas speakers have several language terms to choose from when referring to different relatives.
13 A small supporting point is that the arrangement of taxa in the high-borrowing NeighborNet is very close to the geographical distribution of the languages, with ambiguous splits often reflecting geographically adjacent languages.
14 Remember that the grammatical reconstruction and limited innovation-based lexical data were ambiguous in this respect.
REFERENCES
Alpher, B. 2004 Pama–Nyungan: phonological reconstruction and status as a phylogenetic group. In Australian languages: classification and the comparative method (eds C. Bowern & H. Koch), ch. 5, pp. 105–142. Amsterdam, The Netherlands: John Benjamins. Austin, P. 1981 A grammar of Diyari, South Australia. Cambridge, UK: Cambridge University Press. Austin, P. 1990a Classification of Lake Eyre languages. La Trobe Univ. Working Papers Linguist. 3, 171–201. Austin, P. 1990b The Karangura. Records of the South Australian Museum, vol. 25, pp. 129–137. Baker, B. 2004 Stem forms and paradigm reshaping in Gunwinyguan. In Australian languages: classification and the comparative method, vol. 249 (eds C. Bowern & H. Koch). Current Issues in Linguistic Theory, ch. 13, pp. 343–373. Amsterdam, The Netherlands: John Benjamins. Bellwood, P. 2001 Archaeology and the historical determinants of punctuation in language-family origins. In Areal diffusion and genetic inheritance: problems in comparative linguistics (eds A. Aikhenvald & R. M. W. Dixon), ch. 2, pp. 27–42. Oxford, UK: Oxford University Press. Bowern, C. 1998 The case of Proto-Karnic. BA Honours thesis, Australian National University, Canberra, Australia. Bowern, C. 2001 Karnic classification revisited. In Forty years on: Ken Hale and Australian languages (eds J. Simpson, D. Nash, M. Laughren, P. Austin & B. Alpher), ch. 17, pp. 245–261. Canberra, Australia: Pacific Linguistics. Bowern, C. 2004 Diagnostic similarities and differences between Nyulnyulan and neighbouring languages. In Australian languages: classification and the comparative method, vol. 249 (eds C. Bowern & H. Koch). Current Issues in Linguistic Theory, ch. 11, pp. 295–318. Amsterdam, The Netherlands: John Benjamins. Bowern, C. 2006 Another look at Australia as a linguistic area. In Linguistic areas (eds Y. Matras, A. McMahon & N. Vincent), pp. 244–265. Basingstoke, UK: Palgrave Macmillan. Bowern, C. 2007 Review of ‘areal diffusion and genetic inheritance: problems in comparative linguistics’. Anthropol. Linguist. 49, 429–434. Bowern, C.
2008 Linguistic fieldwork: a practical guide. Basingstoke, UK: Palgrave Macmillan. Bowern, C. 2009 Reassessing Karnic: a reply to Breen 2007. Aust. J. Linguistics 29, 337–348. (doi:10.1080/ 07268600903232733) Bowern, C. & Koch, H. (eds) 2004a Australian languages: classification and the comparative method. Amsterdam, The Netherlands: John Benjamins. Bowern, C. & Koch, H. 2004b Introduction: subgrouping methodology in historical linguistics. In Australian languages: classification and the comparative method, vol. 249 (eds C. Bowern & H. Koch). Current Issues in Linguistic Theory, ch. 1, pp. 1– 16. Amsterdam, The Netherlands: John Benjamins. Breen, G. 1971 Aboriginal languages of Western Queensland. Linguist. Commun. 5, 1–88. Breen, G. 2007 Reassessing Karnic. Austral. J. Linguist. 27, 175 –199. (doi:10.1080/07268600701522780) Bryant, D., Filimon, F. & Gray, R. 2005 Untangling our past: languages, trees, splits and networks. In The evolution of cultural diversity: phylogenetic approaches (eds R. Mace & C. Holden & S. Shennan), pp. 69–85. London, UK: UCL Press. Capell, A. 1941 The structure of Australian languages. In Studies in Australian linguistics, vol. 3 (ed A. P. Elkin). The Oceania Monographs, pp. 46– 80. Sydney, Australia: AIAS. Capell, A. 1956 A new approach to Australian linguistics (Handbook of Australian languages part I ), Oceania
Linguistic Monographs, vol. 1. Sydney, Australia: University of Sydney. Capell, A. 1979 The history of Australian languages: a first approach. In Australian linguistic studies, vol. C54 (ed. Stephan A. Wurm), pp. 419– 619. Canberra, Australia: Pacific Linguistics. Clendon, M. 2006 Reassessing Australia’s linguistic prehistory. Curr. Anthropol. 47, 39–61. (doi:10.1086/497671) Curr, E. M. (ed.) 1886 The Australian race, vol. 4. Melbourne, Australia: Government Printers. Dench, A. 2001 Descent and diffusion: the complexity of the Pilbara situation. In Areal diffusion and genetic inheritance: problems in comparative linguistics (eds A. Aikhenvald & R. M. W. Dixon), ch. 5, pp. 105– 133. Oxford, UK: Oxford University Press. Dixon, R. M. W. 1997 The rise and fall of languages. Cambridge, UK: Cambridge University Press. Dixon, R. M. W. 2001 The Australian linguistic area. In Areal diffusion and genetic inheritance: problems in comparative linguistics (eds A. Aikhenvald & R. M. W. Dixon), ch. 4, pp. 64–104. Oxford, UK: Oxford University Press. Dixon, R. M. W. 2002 Australian languages: their nature and development. Cambridge, UK: Cambridge University Press. Evans, N. 2005 Australian languages reconsidered: a review of Dixon (2002). Oceanic Linguist. 44, 242 –286. Evans, N. & Jones, R. 1997 The cradle of the Pama– Nyungans: archaeological and linguistic speculations. In Archaeology and linguistics (eds P. McConvell & N. Evans), ch. 22, pp. 385–417. Melbourne, Australia: Melbourne University Press. Gray, R. D., Greenhill, S. J. & Ross, R. M. 2007 The pleasures and perils of Darwinizing culture (with phylogenies). Biol. Theory 2, 360 –375. (doi:10.1162/biot. 2007.2.4.360) Gray, R. D., Drummond, A. J. & Greenhill, S. J. 2009 Language phylogenies reveal expansion pulses and pauses in pacific settlement. Science 323, 479– 483. (doi:10.1126/science.1166858) Green, R. 2003 Proto Maningrida within Proto Arnhem: evidence from verbal inflectional suffixes. 
In The Non-Pama–Nyungan languages of Northern Australia: comparative studies of the continent’s most linguistically complex region, 369. Canberra, Australia: Pacific Linguistics. Green, I. & Nordlinger, R. 2004 Revisiting Proto-Mirndi. In Australian languages: classification and the comparative method, vol. 249 (eds C. Bowern & H. Koch). Current Issues in Linguistic Theory, ch. 12, pp. 319–342. Amsterdam, The Netherlands: John Benjamins. Hale, K. L. 1964 Classification of the Northern Paman languages, Cape York Peninsula, Australia: a research report. Oceanic Linguist. 3, 248–265. (doi:10.2307/3622881) Hale, K. L. 1966 The Paman group of the Pama–Nyungan family. Languages of the World: Indo-Pacific fascicle 6. Anthropol. Linguist. 8, 162–197. Haspelmath, M. 2008 Loanword typology: steps toward a systematic cross-linguistic study of lexical borrowability. Empirical Approaches to Language Typology 35, 43. Haspelmath, M. & Tadmor, U. 2009 Loanwords in the world’s languages: a comparative handbook. Berlin, Germany: Mouton de Gruyter. Heath, J. 1978 Linguistic diffusion in Arnhem Land. Canberra, Australia: Australian Institute of Aboriginal Studies. Heath, J. 1981 A case of intensive lexical diffusion: Arnhem Land, Australia. Language 57, 335–367. Heath, J. 1990 Verbal inflection and macro-subgroupings of Australian languages: the search for conjugation markers in non-Pama–Nyungan. In Linguistic change and
reconstruction methodology, vol. 45 (ed. P. Baldi). Trends in Linguistics: Studies and Monographs, pp. 403–417. Berlin, Germany: Mouton. Hendrie, T. R. 1990 Initial apicals in Nuclear Pama–Nyungan. In (eds N. G. O’Grady & T. D. Tryon) Studies in comparative Pama–Nyungan, vol. C-111 1990, pp. 15–77. Canberra, Australia: Pacific Linguistics. Hercus, L. A. 1994 A grammar of the Arabana–Wangkangurru Language. Lake Eyre Basin, South Australia, vol. C-128. Canberra, Australia: Pacific Linguistics. Hercus, L. & Austin, P. 2004 The Yarli languages. In Australian languages: classification and the comparative method, vol. 249 (eds C. Bowern & H. Koch). Current Issues in Linguistic Theory, ch. 9, pp. 227–244. Amsterdam, The Netherlands: John Benjamins. Hiscock, P. 2002 Pattern and context in the Holocene proliferation of backed artifacts in Australia. Archeol. Papers Am. Anthropol. Assoc. 12, 163–177. (doi:10.1525/ap3a.2002.12.1.163) Hiscock, P. 2008 Archaeology of ancient Australia. London, UK: Routledge. Huson, D. H. & Bryant, D. 2006 Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267. (doi:10.1093/molbev/msj030) Keen, I. 2002 Seven aboriginal marriage systems and their correlates. Anthropol. Forum 12, 145–157. (doi:10.1080/006646702320622770) Keen, I. 2004 Aboriginal economy and society: Australia at the threshold of colonisation. Oxford, UK: Oxford University Press. Koch, H. 2004 A methodological history of Australian linguistic classification. In Australian languages: classification and the comparative method, vol. 249 (eds C. Bowern & H. Koch). Current Issues in Linguistic Theory, ch. 2, pp. 17–66. Amsterdam, The Netherlands: John Benjamins. Kroeber, A. L. 1923 Relationship of the Australian languages. J. R. Soc. New South Wales 57, 101–117. Labov, W. 2007 Transmission and diffusion. Language 83, 344–387. (doi:10.1353/lan.2007.0082) McBryde, I. 1987 Goods from another country: exchange networks in the Lake Eyre Basin.
In Australians to 1788, Australians: a historical library vol. III, pp. 252 –273. Sydney, Australia: Fairfax, Syme and Weldon Associates. McConvell, P. 2001 Language shift and language spread among hunter– gatherers. In Hunter –gatherers: social and biological perspectives (eds C. Panter-Brick, P. RowleyConwy & R. Layton), pp. 143 –169. Cambridge, UK: Cambridge University Press. McConvell, P. 2010 Loanwords in Gurindji. Loanword typology. The Hague, The Netherlands: Mouton. McConvell, P. & Alpher, B. 2002 On the Omaha trail in Australia: tracking skewing from east to west. Anthropol. Forum 12, 159 –175. (doi:10.1080/ 006646702320622789) McConvell, P. & Evans, N. 1997 Archaeology and linguistics. Melbourne, Australia: Melbourne University Press. McMahon, A. & McMahon, R. 2006 Language classification by numbers. Oxford, UK: Oxford University Press. Mulvaney, J. & Kamminga, J. 1999 The prehistory of Australia. Sydney, Australia: Allen and Unwin. O’Grady, G. N. 1990 Umpila and Wadjuk: a long-slot approach to Pama–Nyungan. In (eds N. G. O’Grady &
T. D. Tryon) Studies in comparative Pama–Nyungan, vol. C-111 1990, pp. 1–17. Canberra, Australia: Pacific Linguistics. O’Grady, G. N. 1998 Toward a proto-Pama–Nyungan stem list, part I: Sets j1-j25. Oceanic Linguist. 37, 209–233. (doi:10.2307/3623409) O’Grady, G. N. & Fitzgerald, S. 1997 Cognate search in the Pama–Nyungan language family. In Archaeology and linguistics (eds P. McConvell & N. Evans), ch. 19, pp. 341–355. Melbourne, Australia: Melbourne University Press. O’Grady, G. N. & Klokeid, T. 1969 Australian linguistic classification: a plea for coordination of effort. Oceania 39, 298–311. O’Grady, G., Voegelin, C. & Voegelin, F. 1966a Languages of the world: Indo-Pacific fascicle 6. Anthropol. Linguist. 6, 1–106. O’Grady, G. N., Wurm, S. A. & Hale, K. L. 1966b Map of Aboriginal languages of Australia. Victoria, Canada: Department of Linguistics, University of Victoria. Renfrew, C. 1989 Models of change in language and archaeology. Trans. Philol. Soc. 87, 103–155. (doi:10.1111/j.1467-968X.1989.tb00622.x) Reuther, J. G. 1883 Three central Australian grammars (diari, jandruwanta, wonkanguru) (eds L. A. Hercus & J. G. Breen). Manuscript translated by T. Schwarzschild, L.A. Hercus. Ross, M. D. 1988 Proto-Oceanic and the Austronesian languages of Western Melanesia, vol. C, 98. Canberra, Australia: Pacific Linguistics. Schmidt, J. 1872 Die Verwandtschaftsverhältnisse der indogermanischen Sprachen. Weimar, Germany: Böhlau. Schmidt, W. 1919 Die Gliederung der Australischen Sprachen. Vienna, Austria: Mechenaristen-Buchdruckerei. Simpson, J. & Hercus, L. 2004 Thura-Yura as a subgroup. In Australian languages: classification and the comparative method, vol. 249 (eds C. Bowern & H. Koch). Current Issues in Linguistic Theory, ch. 8, pp. 197–226. Amsterdam, The Netherlands: John Benjamins. Sutton, P. 1990 The pulsating heart: large-scale cultural and demographic processes in Aboriginal Australia. In Hunter–gatherer demography, vol. 39 (eds B. Meehan & N.
White). Oceania Monographs, pp. 71–80. Sydney, Australia: University of Sydney. Sutton, P. 1997 Materialism, sacred myth and pluralism: competing theories of the origin of Australian languages. In Scholar and sceptic: Australian Aboriginal studies in Honour of L.R. Hiatt, pp. 211–242. Canberra, Australia: Aboriginal Studies Press. Thomason, S. & Kaufman, T. 1988 Language contact, creolization and genetic linguistics. Berkeley, CA: University of California Press. Wichmann, S., Stauffer, D., Schulze, C. & Holman, E. W. 2008 Do language change rates depend on population size? Adv. Complex Systems 11, 357–369. (doi:10.1142/S0219525908001684) Wurm, S. A. 1972 Languages of Australia and Tasmania. The Hague, The Netherlands: Mouton de Gruyter. Wurm, S. A. & Hattori, S. (eds) 1981 Language Atlas of the Pacific area part I: New Guinea Area, Oceania, Australia. Canberra, Australia: Australian Academy of the Humanities (maps 20–23 compiled by M. Walsh).
Phil. Trans. R. Soc. B (2010) 365, 3855–3864 doi:10.1098/rstb.2010.0051
Language shift, bilingualism and the future of Britain’s Celtic languages
Anne Kandler1,*, Roman Unger2 and James Steele1
1 AHRC Centre for the Evolution of Cultural Diversity, Institute of Archaeology, University College London, 31–34 Gordon Square, London WC1H 0PY, UK
2 Chemnitz University of Technology, Faculty of Mathematics, 09107 Chemnitz, Germany
‘Language shift’ is the process whereby members of a community in which more than one language is spoken abandon their original vernacular language in favour of another. The historical shifts to English by Celtic language speakers of Britain and Ireland are particularly well-studied examples for which good census data exist for the most recent 100–120 years in many areas where Celtic languages were once the prevailing vernaculars. We model the dynamics of language shift as a competition process in which the numbers of speakers of each language (both monolingual and bilingual) vary as a function both of internal recruitment (as the net outcome of birth, death, immigration and emigration rates of native speakers), and of gains and losses owing to language shift. We examine two models: a basic model in which bilingualism is simply the transitional state for households moving between alternative monolingual states, and a diglossia model in which there is an additional demand for the endangered language as the preferred medium of communication in some restricted sociolinguistic domain, superimposed on the basic shift dynamics. Fitting our models to census data, we successfully reproduce the demographic trajectories of both languages over the past century. We estimate the rates of recruitment of new Scottish Gaelic speakers that would be required each year (for instance, through school education) to counteract the ‘natural wastage’ as households with one or more Gaelic speakers fail to transmit the language to the next generation informally, for different rates of loss during informal intergenerational transmission.
Keywords: language competition; Celtic; Gaelic; Welsh; reaction–diffusion; intergenerational transmission
1. INTRODUCTION ‘Language shift’ is the process whereby members of a community in which more than one language is spoken abandon their original vernacular language in favour of another. Membership of a community defined by its language selectively facilitates and inhibits interaction, enables entry into social contracts and cooperative exchange and gives access to a reservoir of accumulated and linguistically encoded knowledge. In cases of language contact, therefore, people are inevitably confronted with difficult choices about which language they wish or need to speak. The major driver of language shift is the decision to abandon a more local or less prestigious language, typically because the target of the shift is a language seen as more modern, useful or giving access to greater social mobility and economic opportunities (McMahon 1994; Mufwene 2001; Brenzinger 2006). In the modern era, nation states, globalization and selective migration (Boyd & Richerson 2009) have been potent forces of language standardization and
* Author for correspondence ([email protected]). Electronic supplementary material is available at http://dx.doi.org/ 10.1098/rstb.2010.0051 or via http://rstb.royalsocietypublishing.org. One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
of minority language endangerment or extinction. The expected scale of global loss of contemporary linguistic diversity over the next 50 – 100 years is immense (Krauss 1992; Nettle & Romaine 1999). The basis of the phylogenetic explanation in historical linguistics is that human populations have in the past undergone expansions, with the mechanism of expansion being local population increase, fissioning and spatial relocation of some fraction of that population. Subsequent divergence from a common linguistic root is driven by the natural tendency of languages to diversify under the combined effects of inherited mutation and isolation by distance, with the diversification accelerated by physical barriers to interaction and by effective population size-related sampling effects (drift). If the fissioning is kinstructured, with sub-populations splitting off who already share idiosyncratic linguistic features by virtue of membership of the same part of the larger interaction network (for instance, because of kinship ties), then these effects will be accelerated (Croft 2003). There is a substantial body of recent scientific literature on the large-scale correlations between genetic and linguistic variation, much of it influenced by the integrative approach of Cavalli-Sforza and his collaborators who see the two systems as coevolving as a result of population expansion and splitting, geographical isolation and parental transmission (the
This journal is © 2010 The Royal Society
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
A. Kandler et al.
Language shift
[Figure 1: one panel per census year: 1891, 1901, 1911, 1921, 1931, 1951, 1961, 1971, 1981, 1991, 2001.]
Figure 1. Percentages of Gaelic speakers (mono- and bilingual) in Scotland in successive census years, 1891–2001. Data for civil parishes: 1891–1971 from Withers (1984, pp. 227–234); 1981 from Withers (1988, p. 40); 1991–2001 from General Register Office for Scotland (2005, table 3). Red, 75–100% Gaelic speaking; orange, 50–74.9% Gaelic speaking; yellow, 25–49.9% Gaelic speaking; white, less than 25% Gaelic speaking.
latter being the sole mechanism of genetic inheritance and, they would argue, the predominant mechanism of linguistic inheritance in small-scale societies; e.g. Cavalli-Sforza et al. 1988, 1992). In prehistoric archaeology, such demographic interpretations of cultural macroevolution are familiar from the much-debated farming/language dispersal hypothesis for the spatial spread and diversification of languages such as those of the Bantu, Austronesian or Indo-European groups (Diamond & Bellwood 2003). However, accepting a role for the dispersal of its speakers in the initial spread of these major linguistic groupings does not preclude contact-induced language change and recruitment into the speaker population by language shift, either at the time of initial spread or subsequently. In fact, Campbell (2006, p. 2) suggests that empirically, in terms of the likelihood of finding complete gene–language congruence in language contact situations, ‘All of the following are attested (‘no’ here means ‘little or no’):
(1) no linguistic admixture – no genetic admixture
(2) no linguistic admixture – genetic admixture
(3) linguistic admixture – no genetic admixture
(4) linguistic admixture – genetic admixture
where much work in language–gene correlation has tended to privilege (1) [. . .], linguists expect (1) least, with (4) perhaps the most common.’ In this paper, we focus on the social processes underlying Campbell’s scenarios (2) and (4) to frame the following questions: when does branch pruning on a linguistic phylogeny (language death) reflect local population extinction, and when does it reflect a purely cultural extinction process with the descendants of its speakers simply transferring to a different branch of the language tree (language shift)? Thomason (2001; cf. McMahon & McMahon 2005, pp. 78–79) has suggested that the effects of language contact can be arranged on a continuum from
Phil. Trans. R. Soc. B (2010)
contact-induced language change (which may involve just non-basic vocabulary elements, or basic vocabulary and structural features, depending on the level of contact and of bilingual interaction), to extreme language mixture (involving pidgins, creoles and mixed languages; cf. Mufwene 2008), to language death, with people abandoning one language outright and shifting to adopt another. Tree-building methods attempt to reconstruct the aspects of similarity and divergence that are due to conservative transmission with mutation-based modification. However, the phylogenetic approach ignores the important role of selective cultural migration (or shifting between branches) in determining the extinction rates of different branches of such trees. In this paper, we will describe our recent work on language competition and language shift using the example of the recent history of Britain’s major Celtic languages. We emphasize the extreme lack of congruence between genetic and linguistic trees that results from language shift, and stress that the frequent shift of individuals between branches of a linguistic tree is not only a contemporary phenomenon (for discussion, see Steele & Kandler 2010). The historical shifts to English by Celtic language speakers of Britain and Ireland are particularly well-studied examples of language competition for which good census data exist for the most recent 100–120 years in many areas where Celtic languages were once the prevailing vernaculars (see figure 1 for a visualization of the Scottish Gaelic census data). Some of the earliest fieldwork on language death was done in communities where Scottish Gaelic was endangered or dying out (MacKinnon 1977; Dorian 1981). The last monolingual speakers of Cornish died in the late seventeenth century, although their language survived locally among Cornish–English bilinguals until the end of the nineteenth century. On the Isle of Man, the last native speaker bilingual in Manx died in the 1970s.
Following the extinction of these informal
[Figure 2: panels (a) to (d); vertical axis: proportion of speakers (0 to 1.0); horizontal axis: time (in years), 1900–2000.]
Figure 2. Frequencies of the three sub-populations in the four Scottish Highland counties for the time period 1891–2010. Empirical data (solid lines) and predictions of model (5.1) under the assumptions c31 = c32 and c13 = c12 (dotted lines) and c31 ≠ c32 and c13 ≠ c12 (dashed lines) of the frequencies of Gaelic (black), bilingual (light grey) and English (grey) speakers in (a) Argyll, (b) Inverness, (c) Ross and Cromarty and (d) Sutherland over time.
within-household transmission pathways, Cornish and Manx are now subjects of local revival efforts to bring these languages back into the community via schools, print and broadcast media, the arts and traditional community events. In Scotland and Wales, the original Gaelic- and Welsh-speaking populations were more numerous and the pattern of decline has been more influenced by local geographical factors. During the twentieth century, Welsh remained widely spoken, and even in 1961 it was still possible to traverse Wales from north to south without leaving a parish in which 80 per cent or more of the residents spoke Welsh (Aitchison & Carter 1985). This is despite long-term pressures for Anglicization owing to interventions such as the Act of Union of 1536, which incorporated Wales into the realm of the English monarch and included a stipulation that ‘no Person or Persons that use the Welsh Speech or Language shall have or enjoy any Manor Office or Fees within the Realm of England, Wales or Other the King’s Dominion’ (Bowen 1908), and much later, promotion of the use of English in schools to eradicate Welsh from the industrial heartlands after rural–urban migration had created self-contained Welsh-speaking communities in the coalfields (Commissioners of Inquiry into the State of Education in Wales 1847). However, in the last 50 years, monolingual Welsh speakers declined towards extinction (in 1981 there were 21 283 Welsh
monolinguals recorded in the official census, 0.8% of the total population), and a vigorous programme of Welsh language revitalization since the 1970s has been targeted at creating the conditions for stable bilingualism1 (Jones 1993). In Scotland, by late mediaeval times, Gaelic was the main language of the Highlands and western islands, with Scots (descended from the Old Northumbrian dialect of Old English) and English prevailing in the Lowlands. This division appears to have been reinforced by a contrast between these two regions in their social structure, marriage and migration patterns (with the clan system predominating in the Highlands): the subsequent breakdown of the geographical ‘niche’ for Scottish Gaelic is closely linked to the political and economic dominance of actors to the south, and their interference with the Highlands’ political and economic systems. Drastic demographic changes (the eighteenth–nineteenth century ‘Highland clearances’) and the accompanying establishment of English as the language of education and advancement were associated with increasing rates of Gaelic-to-English language shift (Murdoch 1996). The late stages of this shift process can be reconstructed from census records. Figure 2 (solid lines) shows the change in the proportions of monolingual English and Gaelic speakers and bilinguals for the counties of Argyll, Inverness, Ross and Cromarty
and Sutherland during the time period 1891–1971. These four counties are seen as the ‘core land’ of the Gaelic language (‘Gaidhealtachd’): in 1891, 73 per cent of all Scotland’s Gaelic speakers were located among the 8 per cent of Scotland’s population that lived in these ‘Highland Counties’, covering the mainland Highlands and the Western Isles. By 2001, economic adversity in Highland areas, the ‘pull’ factor of economic opportunity in urban, industrial areas and Gaelic revivalism in the Lowlands had produced a substantial Gaelic presence in the Lowlands, with only 52 per cent of all Gaelic speakers resident in the wider Gaidhealtachd (where only 6.5% of Scotland’s population now live), and 48 per cent residing in the rest of Scotland (figure 1). The absolute numbers of Gaelic speakers in Scotland have, however, declined through this period, from about 250 000 in the 1891 census of Scotland to about 65 000 in the most recent (2001) census. Of these, the majority were always bilingual in Gaelic and English, with the last census record of Gaelic monolinguals finding fewer than 1000 still alive in 1961. Recent revitalization efforts have included the establishing of Gaelic-medium pre-school and primary school units (MacKinnon 1993) and the development of Gaelic-medium broadcasting (Murdoch 1996). In 2005, the Gaelic Language (Scotland) Act was passed by the Scottish Parliament, providing a planning framework for a number of additional shift-reversal measures, while Comhairle nan Eilean Siar, the Western Isles Council, has adopted Gaelic as its primary language.
2. MATHEMATICAL MODELS OF LANGUAGE SHIFT (a) Basic model We model the dynamics of language shift as a competition process in which the numbers of speakers of each language vary as a function both of internal recruitment (as the net outcome of birth, death, immigration and emigration rates of native speakers), and of gains and losses owing to language shift. Mathematical work on language shift dynamics2 has been stimulated by Abrams & Strogatz (2003), who proposed a simple two-language competition model in which the outcome (extinction of one or other language) is determined by the strength of innate attraction to the higher status language and by the initial conditions (with preferential attachment, the nonlinear effect of initial concentrations on shift rates, capable of driving the higher status language to extinction when its speakers are rare). Our own basic model is very different. In addition to the status-related shift term, we model the changing sizes of speaker sub-populations as the balance of births and deaths, and of immigration and emigration, and we model a bilingual transition state.3 There is no process of preferential attachment: absolute rates of shift are a simple linear function of sub-population sizes. In our basic model of the shift process, the variables u1 and u3 represent the sizes of the two monolingual sub-populations and u2 the size of the bilingual sub-population, and the parameter cij represents the strength of the innate attraction of language i to speakers currently situated in
sub-population j (for a graphical representation of the shift process, see electronic supplementary material, figure S2a). Each sub-population also recruits internally by reproduction, spatial dispersal and long-distance migration, which is modelled as a reaction–diffusion process with logistic growth to a carrying capacity K and (in spatially explicit formulations) with diffusion of speakers between adjacent locations (§5). This model of language shift leads inevitably to the extinction of one or other monolingual sub-population, followed by the extinction of the language itself in the bilingual community. In the absence of the bilingual transition state, extinction would always be the fate of the lower status language; however, including the bilingual transition state fundamentally changes the dynamics. The less attractive or lower status language can now prevail, provided that its speakers have an initial numerical advantage that outweighs their language’s intrinsic status disadvantage. In formal terms, and if overall population size is stable, this outcome requires that there are initially few enough monolinguals in the high-status language, and therewith enough pressure on them to become bilingual, for it to always hold that c12u2 < c31u3 (where u1 defines the frequency of the sub-population speaking the high-status language). These dynamics are analysed in more detail in the electronic supplementary material.
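The inequality c12u2 < c31u3 can be read off the non-spatial form of the basic system given in §5. As a sketch, assuming a stable total population at carrying capacity (so that the logistic growth terms vanish) and no diffusion, the equation for the high-status monolingual sub-population u1 reduces to the shift terms alone:

```latex
\frac{du_1}{dt} = c_{12}\,u_2\,u_1 - c_{31}\,u_3\,u_1
                = u_1\left(c_{12}\,u_2 - c_{31}\,u_3\right),
\qquad\text{so that, for } u_1 > 0,\qquad
\frac{du_1}{dt} < 0 \iff c_{12}\,u_2 < c_{31}\,u_3 .
```

Under these assumptions the high-status monolingual group shrinks for as long as the inequality holds, which is the sense in which the condition must hold at all times for the lower status language to prevail.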
(b) Diglossia model Many advocates of the preservation of endangered languages as living languages have promoted strategies in which the objective is stable societal bilingualism, by creating or preserving essential social domains (perhaps quite prestigious domains, such as political fora) in which the endangered language is the preferred or only acceptable medium of communication. Although such reversal strategies all require some measure of planned intervention to revive demand for skill in the endangered language, language planners typically cite as precedent the apparent stability of language coexistence in cases of diglossia (e.g. Fishman 1991). Diglossia, in the strict sense, refers to situations where the mother tongue of the community is used in everyday (low status) settings, but another language (or another form of the vernacular language) is used in certain high-status domains typically involving religious ceremonies, or written transactions in societies with low levels of literacy (Ferguson 1959; Hudson 2002). Language coexistence is possible in such diglossic situations because the demand for the high-status language is specific to social context. Language shift, in the sense of our basic model, relates instead to situations where the high-status language is associated with entire social identities that are seen as desirable and worthy of emulation. Such situations are not compatible with stable language coexistence, because the languages are competing as the medium of communication in all social contexts. To consider the effects of the creation and maintenance of segregated and complementary sociolinguistic domains, in each of which both languages are differentially preferred as the medium of communication, we have examined a second model in which bilingualism
Table 1. Fitted shift coefficients for the basic model with c31 = c32 and c13 = c12 (top two rows) and with c31 ≠ c32 and c13 ≠ c12 (bottom four rows).
coefficient | Wales | Scottish Highlands | Argyll | Inverness | Ross and Cromarty | Sutherland
shift from Celtic to bilingual and/or to monolingual English (c13) | 0.025 | 0.03 | 0.03 | 0.035 | 0.03 | 0.035
shift from English to bilingual and/or to monolingual Celtic (c31) | 0.005 | 0 | 0 | 0 | 0 | 0
shift from Celtic-only to bilingual (c13) | 0.06 | 0.07 | 0.115 | 0.1 | 0.12 | 0.075
shift from bilingual to English-only (c12) | 0.02 | 0.025 | 0.03 | 0.03 | 0.025 | 0.035
shift from English-only to bilingual (c31) | 0 | 0 | 0.005 | 0 | 0.005 | 0
shift from bilingual to Celtic-only (c32) | 0 | 0 | 0 | 0 | 0.005 | 0
is no longer simply the transitional state for households moving between alternative monolingual states. Superimposed on the basic shift dynamics, there is an additional demand for the endangered language as the preferred medium of communication in some restricted sociolinguistic domain, and this demand persists regardless of the numbers of speakers of the endangered language until that number becomes very small (at which point the demand ceases; for a graphical representation of the shift process, see electronic supplementary material, figure S2b). This additional dynamics creates a steady reverse flow of monolingual speakers in the dominant language who enter or re-enter the bilingual sub-population. Because this second model allows for demand for both languages, each in its own preferred domain, bilingualism is now a stable final state; we now find that a wider range of extinction and coexistence states is possible, depending on the strength of the various in- and outfluxes between the three sub-populations. We can now model—for any given case of well-advanced language shift—the rates of acquisition of skills in the endangered language that would be required from monolingual households fluent in the dominant language for shift reversal to take off. These dynamics are also analysed in more detail in the electronic supplementary material.
3. RESULTS Using our basic model, we have estimated the strengths of the competitive advantage driving language shift from Scottish Gaelic to English in Highland Scotland (1891–2001), and from Welsh to English in Wales (1901–2001). We fitted4 the model to official census data (see electronic supplementary material, S1 ‘Data’ for more details) for these time periods. Historical census data on language use will include some ‘noise’ owing to inaccurate answers (for instance, owing to the perceived social status implications of self-classification into a particular category), and to changes in the phrasing of the questions in successive censuses. To avoid over-fitting (where the model fits the noise in the data as well as the significant trends), we initially reduced the model’s degrees of freedom by assuming the parameter constellation c31 = c32 and c13 = c12. The results are shown in table 1 and figures 2 and 3 (dotted lines). Our basic model captures well the general dynamics of the language shift process (the decrease in
the Gaelic and Welsh monolingual and bilingual sub-populations and the increase in the English monolingual sub-population). Table 1 (top two rows) gives the estimated values for the shift coefficients. These show that while the Celtic monolingual sub-populations were not able to attract a significant number of English speakers or bilinguals (cf. c31 = c32 = 0–0.005), the shift from the Celtic monolingual to the bilingual sub-populations and from the bilingual to the English monolingual sub-populations happened at high rates owing to the competitive advantage of the dominant language (cf. c13 = c12 = 0.025–0.035). Additionally, the competitive advantage for English speakers in Highland Scotland was greater than in Wales. However, the fitted curves in figures 2 and 3 also suggest that the parameter constellations given in table 1 generally overestimate the Celtic monolingual sub-population and slightly underestimate the bilingual sub-population. Therefore, we also fitted the basic model with constellations in which c31 ≠ c32 and c13 ≠ c12 (so that, for example, the balance of competitive advantage driving the shift from monolingual Welsh to bilingualism can be different from that driving the shift from bilingualism to monolingual English). The results are illustrated by the dashed lines in figures 2 and 3 and by the values for the competition coefficients in table 1 (bottom four rows). The fit is improved,5 and table 1 (bottom four rows) shows that the key to improvement in fit lies in the increase in the shift parameter from Celtic-only to bilingual (c13). All other coefficients stay roughly constant. Celtic monolinguals were more affected by the status difference between English and Celtic than were bilinguals. This implies that the priority was to learn the high-status language and not to abandon the Celtic language. Bilinguals tended to stay bilingual longer than Celtic-language monolinguals stayed monolingual.
We also found that the fit of the basic shift model is most sensitive to changes in the coefficient c12, implying that small changes in the rate of shift from bilingualism to monolingual English may result in significantly changed competition dynamics. Figure 3 highlights a further deviation of the basic model’s predictions from the census data for the very recent period in Wales. In the above results, we have assumed constant shift coefficients over time (i.e. that the ‘environment’ for language competition does not change within the considered time period). However, political, social and/or economic changes can
[Figure 3: panels (a) and (b); vertical axis: proportion of speakers (0 to 1.0); horizontal axis: time (in years), 1920–2000 in (a) and 1920–2020 in (b).]
Figure 3. Empirical and projected frequencies of the three sub-populations in Wales for the time period 1901–2001. Empirical data (solid lines) and predictions of model (5.1) under the assumptions c31 = c32 and c13 = c12 (dotted lines) and c31 ≠ c32 and c13 ≠ c12 (dashed lines) of the frequencies of Welsh (black), bilingual (light grey) and English (grey) speakers. (a) Prediction of model (5.1) with parameters given in table 1 (bottom four rows) and (b) prediction of model (5.2) with the same c-values and w1 = 0.005 and w3 = 0 for the time period 1901–1971 and w1 = 0.01 and w3 = 0 for the time period 1971–2001.
lead to a change in the sociolinguistic environment and consequently to a change in the competition dynamics. Figure 3a shows that the basic model with time-independent shift coefficients captures well the dynamics of Welsh–English language competition until about 1971, but not the change in the competition dynamics that is observable more recently. During the last 40 years, Welsh language-planning initiatives and legislation have led to several maintenance interventions that were able to alter the shift dynamics. The decline in the bilingual sub-population appears to have been reduced or halted, leading to a stable coexistence condition. This new situation must be explained using our diglossia model, since it is inconsistent with the basic model (in which bilingualism is assumed to be a transitional state and not a final stable state). Figure 3b shows fitted curves for the diglossia model in relation to the Welsh census data. Here, time-dependent coefficients are crucial to capture the change in the competition dynamics. With the same shift coefficients as in the basic model, the fitted values of w1 for the two time periods before and after 1971 show that the language-planning initiatives resulted in an increased ‘force’ for English speakers to learn Welsh (w1 being a measure of the strength of this force, which is larger for the time period 1971–2001), which means that sociolinguistic domains in which Welsh is the advantageous language have been created and supported. Figure 3b also projects ahead the fate of Welsh–English bilingualism if the ‘environment’ stays the same, and indicates that Welsh is then preserved in the bilingual sub-population. However, at present, this is due to the maintenance activities of the planners creating an influx of English monolinguals into the bilingual sub-population that balances the continuing ‘organic’ loss of bilingual households to English monolingualism owing to low levels of intergenerational transmission of Welsh within the home.
How might the experiences of language planners intervening to limit the shift from Welsh to English be used to ‘save’ the Gaelic language in the Scottish Highlands? We applied the diglossia model to the
Gaelic–English situation and asked how strong an intervention would need to be (in other words, how many English monolinguals have to learn Gaelic per year) in order to alter the shift dynamics. We note that the number of Gaelic monolinguals is now effectively zero, so that the term w3u3 does not play a role in the competition dynamics: therefore, we set w3 = 0. We find that w1 = 0.0035 is sufficient to stabilize the bilingual population at its current level (cf. figure 4). This implies that roughly 860 English speakers have to become bilingual every year (based on a Highland population of about 315 000 individuals). However, the coexistence between the bilingual and the English-speaking sub-populations depends in this case entirely on the planners’ initiatives and on legislation. Intervention strategies may prove much more successful if the rate of intergenerational transmission of the bilingual strategy could be increased as well. Thus, for example, the number of English monolinguals required to learn Gaelic each year could drop to roughly 440 if the rate of intergenerational transmission of Gaelic at home could be increased (c12 reduced from 0.025 to 0.0125). This means that besides the 440 new recruits to bilingualism, roughly 340 more children who live in bilingual households would have to be raised in both languages to stabilize the bilingual population at the current level. These numbers indicate that an increase in the rate of intergenerational transmission is a highly effective language maintenance strategy, although one that is also harder to achieve in practice.
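The balance behind these figures can be sketched numerically. The block below is a back-of-envelope illustration and not the authors' fitting procedure: it assumes a reduced, non-spatial diglossia model with no Gaelic monolinguals (u3 = 0), a stable total population scaled to 1 and a demand cutoff r(u2) equal to 1, so that the bilingual share obeys du2/dt = u1 (w1 − c12 u2). The function names, the starting share and the time step are hypothetical choices made for the example.

```python
"""Back-of-envelope sketch of the diglossia balance (illustrative assumptions only)."""


def simulate_bilingual_share(u2_start, w1, c12, years, dt=0.1):
    """Euler-integrate du2/dt = u1 * (w1 - c12 * u2) with u1 = 1 - u2.

    Assumes u3 = 0, a stable total population scaled to 1 and r(u2) = 1.
    """
    u2 = u2_start
    for _ in range(int(round(years / dt))):
        u1 = 1.0 - u2
        u2 += dt * u1 * (w1 - c12 * u2)
    return u2


def stationary_bilingual_share(w1, c12):
    """Share at which recruitment (w1 * u1) balances loss (c12 * u1 * u2)."""
    return w1 / c12


def annual_recruits(w1, effective_share, population):
    """New bilinguals per year if recruitment is w1 * effective_share * population."""
    return w1 * effective_share * population


# Values quoted in the text.
W1, C12, POPULATION, RECRUITS = 0.0035, 0.025, 315_000, 860

# Back-calculated factor r(u2) * u1 implied by the quoted 860 recruits per year
# (an inference from the quoted numbers, not a census figure).
implied_share = RECRUITS / (W1 * POPULATION)

# Halving c12 halves the w1 needed to hold the same bilingual share.
u2_star = stationary_bilingual_share(W1, C12)
w1_halved = (C12 / 2) * u2_star
recruits_halved = annual_recruits(w1_halved, implied_share, POPULATION)

if __name__ == "__main__":
    print(f"stationary bilingual share: {u2_star:.3f}")
    print(f"implied effective English share: {implied_share:.2f}")
    print(f"recruits per year with c12 halved: {recruits_halved:.0f}")
    later = simulate_bilingual_share(0.05, W1, C12, 400)
    print(f"share after 400 years from 0.05: {later:.3f}")
```

Under these simplifying assumptions the stationary bilingual share is w1/c12, and halving c12 halves the recruitment needed, to about 430 per year. That is in the same range as the roughly 440 quoted above; the difference presumably reflects terms of the full model that are omitted here.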
4. DISCUSSION The current linguistic ‘extinction crisis’ is expected to decimate global cultural diversity. As outlined in this paper in the Gaelic–English example, most of the recent language extinction events are caused by language shift rather than by the extinction of the population speaking this language. This inevitably results in an increasing divergence between the transmission histories represented in genetic and in linguistic trees. What provokes shift is not cultural selection acting on grammatical or prosodic potential, but people shifting between two competing languages because of their associated social ecologies. There may of course be some associated variation in expressive potential relating to those ecologies (for example, in terms of specialized vocabulary); in the case of the Gaelic-speaking fishing communities of East Sutherland, the death of whose language was studied closely by Dorian, problems arose when their niche was irrevocably altered:

Now I should stress here that fisherfolk Gaelic was not lexically impoverished. The trouble was, that like any other strictly local speech form deeply associated with a traditional lifestyle, the richness of the lexicon was chiefly connected with their own specialized way of life. There wasn’t much connected with the sea or with boats that they didn’t have a word for, and they had a lot of weather terms that reflected the importance of decisions about whether to put to sea or not. When I acquired the dialect I learned the names of more varieties of seaweed than I had ever known existed, the names for parts of a rabbit snare, and the term for an egg that emerged from the hen without an exterior shell. But, not surprisingly, there were no local words for the parts of a car or for the national health service (Dorian 2006, p. 7).

The solution used by Gaelic speakers was to adopt the English words as loanwords; what drives the shift process is not the available specialized lexicon, but the wider contrast in social and economic potential that participation in one or other linguistic community opens up.
We have not considered here the reasons why a phylogenetic model might explain the historical evolution of languages in terms of their basic vocabularies; rather, we have shown that language shift (seen as selective migration between branches of a language tree) is another significant force in cultural evolution, one which may also, in some circumstances, serve as a mechanism of cultural selection acting on alternative systems of economic practices and social norms. Language planners are active in many situations attempting to reverse or modify this shift process, while academic linguists increase their effort to record details of representative samples of these endangered languages (most of which have no or minimal written corpora) before they disappear. With the English–Gaelic and the English–Welsh case studies, we analysed two different scenarios. While the 2001 census showed that the decline in numbers of Scottish Gaelic speakers had not yet been halted, census data for Wales in the same year showed that Welsh seemed to be being maintained at stable levels in a bilingual sub-population. Analysis of our diglossia model has shown that the key language-planning issues for maintenance of an endangered language are (i) to create or support social domains in which the endangered language is the preferred or only acceptable medium of communication and (ii) to increase the rate of intergenerational transmission of the endangered language. Other important dimensions of language maintenance are the creation of economic incentives (e.g. jobs created to implement language-planning-related initiatives and which themselves require skills in the endangered language), and the establishment of corpora of written texts in the endangered language as a cultural archive and as a medium of continuing cultural self-expression.
Without stabilizing a sustainable level of intergenerational transmission, language planners will have to rely on constant interventions in formal public domains (e.g. in the school curriculum) to counter the continuing outflux from bilingualism by individual households. An indication of one cause of this background outflux from Gaelic-speaking bilingualism can be found in the 2001 Scottish census data (General Register Office for Scotland 2005): 70 per cent of children aged 3–15 years speak Gaelic in households in which a married or co-habiting couple both speak Gaelic, while the percentages are only 18 per cent if the male partner alone speaks Gaelic, and 27 per cent if the female partner alone speaks Gaelic. This is the current reality of intergenerational transmission in an environment where languages compete with very unequal external advantages. The success of current planning interventions in reversing language shift and preserving Welsh and Gaelic as living languages will be assessed when results are available from the next Welsh and Scottish censuses in 2011.
[Figure 4: vertical axis: proportion of speakers (0 to 1.0); horizontal axis: time (in years), 1900–2020.]
Figure 4. Empirical and projected frequencies of the three sub-populations in the Scottish Highlands for the time period 1901–2030 with assumed intervention after 2009. Empirical data (solid lines) and predictions of model (5.1) until 2009 and model (5.2) after 2009 (dashed lines) of the frequencies of Gaelic (black), bilingual (light grey) and English (grey) speakers. Parameter values for model (5.1) are given in table 1 (bottom rows); after 2009, a diglossic model with the same c-values and w1 = 0.0035 and w3 = 0 is assumed.
5. MODEL AND METHODS (a) Basic model We examine the dynamics of language shift as a spatially dependent competitive process using the
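reaction–diffusion system:

$$
\begin{aligned}
\frac{\partial u_1}{\partial t} &= d_1 \Delta u_1 + a_1 u_1\left(1 - \frac{u_1}{K - (u_2 + u_3)}\right) - c_{31} u_3 u_1 + c_{12} u_2 u_1,\\
\frac{\partial u_2}{\partial t} &= d_2 \Delta u_2 + a_2 u_2\left(1 - \frac{u_2}{K - (u_1 + u_3)}\right) + (c_{13} + c_{31})\, u_1 u_3 - (c_{12} u_1 + c_{32} u_3)\, u_2,\\
\frac{\partial u_3}{\partial t} &= d_3 \Delta u_3 + a_3 u_3\left(1 - \frac{u_3}{K - (u_1 + u_2)}\right) - c_{13} u_1 u_3 + c_{32} u_2 u_3,
\end{aligned}
\tag{5.1}
$$
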
with the boundary conditions ∂ui/∂n = 0, x ∈ ∂D, i = 1, 2, 3, where ∂/∂n is the outer normal derivative. The time- and space-dependent variables u1 and u3 stand for the frequencies of monolingual speakers of Language A and Language B, respectively, whereas u2 describes the frequency of bilingual speakers of both languages. The terms ∂ui/∂t, i = 1, 2, 3, indicate the rate of change in these frequencies over time. The terms on the right-hand side of the equations in system (5.1) describe the changes in the frequency of speakers in each of the three sub-populations u1, u2 and u3. The components ai ui (1 − ui/(K − (uj + uk))) define the internal reproductive rates, which represent coupled biological and cultural reproduction within each sub-population. This is usually modelled (as shown here) as a logistic process with intrinsic rate of increase ai. The variable K stands for the carrying capacity of the environment and defines an upper limit to the size of the whole population regardless of the languages spoken, which imposes the condition u1 + u2 + u3 ≤ K for any time t (i.e. we assume that our human sub-populations must compete for a common resource base). For a detailed analysis of the relevance of this self-limiting term, see Kandler & Steele (2008). The mobility of speakers of each sub-population in space within the modelled region is modelled by the diffusion terms diΔui. The language shift dynamics is modelled in system (5.1) by the frequency-dependent conversion term cij ui uj. The coefficients c13 and c31 represent the likelihood of language shift causing speakers to become bilingual based on the differential prestige or attractiveness of the two competing languages. Following Minett & Wang (2008), we assume c31 = c̃31 s and c13 = c̃13 (1 − s), where the variable s describes the social status differences between the two languages on a scale from 0 to 1. The higher the status of a language, the higher is the likelihood of being the preferred target of shifting.
The coefficients $\tilde{c}_{13}$ and $\tilde{c}_{31}$ model the likelihood that monolinguals will respond to these status differences by learning the other language. Language shift cannot happen by passing directly from being monolingual in one language to being monolingual in the other language, but must involve a bilingual transition state. The bilingual sub-population therefore recruits from both monolingual sub-populations at a rate $(c_{13} + c_{31}) u_1 u_3$. In turn, bilinguals shift to being monolingual in one or other language at rates $c_{12} u_1 u_2$ (representing the loss to monolingualism in Language A) and $c_{32} u_3 u_2$ (representing the loss to monolingualism in Language B). The coefficients $c_{12}$ and $c_{32}$ represent the likelihood of bilingual speakers then becoming monolingual in each of the two languages. In real life, this transition back to monolingualism happens when bilingual parents choose to raise their children monolingually, or when speakers reared as bilinguals in bilingual households abandon one of their languages during their lifetime. We define the overall balance of competitive advantage to speaking each language on the basis of the conversion rates: for example, fluency in Language A can be assumed to be more advantageous if $c_{31} < c_{13}$ and $c_{12} > c_{32}$. This implies that when the monolingual sub-populations are compared, monolinguals of Language A are less likely to become bilingual, and bilinguals are more likely to shift to speaking only Language A.
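The shift terms described above can be illustrated with a minimal, non-spatial sketch. It is not the finite-element implementation used for the results reported here: the diffusion terms are dropped, time stepping is by explicit Euler, and the function names (`shift_rates`, `basic_rhs`, `integrate`) and all parameter values are hypothetical, chosen only for illustration. The growth rates are set to zero in the demonstration so that the conversion terms can be seen in isolation.

```python
def shift_rates(c13_tilde, c31_tilde, s):
    """Status-scaled coefficients: c13 = c~13 * (1 - s), c31 = c~31 * s."""
    return c13_tilde * (1.0 - s), c31_tilde * s


def basic_rhs(u, a, K, c13, c31, c12, c32):
    """Right-hand side of system (5.1) with the diffusion terms dropped."""
    u1, u2, u3 = u
    # Logistic growth, each sub-population limited by the shared capacity K.
    g1 = a[0] * u1 * (1.0 - u1 / (K - (u2 + u3)))
    g2 = a[1] * u2 * (1.0 - u2 / (K - (u1 + u3)))
    g3 = a[2] * u3 * (1.0 - u3 / (K - (u1 + u2)))
    # Frequency-dependent conversion; these terms sum to zero across groups.
    du1 = g1 - c31 * u3 * u1 + c12 * u2 * u1
    du2 = g2 + (c13 + c31) * u1 * u3 - (c12 * u1 + c32 * u3) * u2
    du3 = g3 - c13 * u1 * u3 + c32 * u2 * u3
    return (du1, du2, du3)


def integrate(u0, steps, dt, **params):
    """Explicit Euler time stepping (an assumed, deliberately simple scheme)."""
    u = tuple(u0)
    for _ in range(steps):
        du = basic_rhs(u, **params)
        u = tuple(x + dt * dx for x, dx in zip(u, du))
    return u


# Hypothetical parameter values, not estimates from the census data.
c13, c31 = shift_rates(c13_tilde=0.4, c31_tilde=0.2, s=0.25)
params = dict(a=(0.0, 0.0, 0.0), K=1.0, c13=c13, c31=c31, c12=0.3, c32=0.1)
u_start = (0.3, 0.1, 0.3)
u_end = integrate(u_start, steps=2000, dt=0.01, **params)
```

With these values $c_{31} < c_{13}$ and $c_{12} > c_{32}$, so Language A holds the competitive advantage: the monolingual Language B group shrinks while the total population is unchanged, because the conversion terms only move speakers between sub-populations.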
(b) Diglossia model
To model the effects of planned interventions on stable societal bilingualism, we generalize the basic language shift model (5.1) by incorporating a simplified concept of (extended) diglossia. While in the majority of social domains the shift mechanisms of the basic model apply, diglossia pertains to some restricted social domain in which the balance of competitive advantage differs from that which drives the main shift process. We now assume that the language that tends to lose its speakers in the majority of social domains can nonetheless be the preferred language in a more restrictive domain or set of domains. We therefore generalize the basic model (5.1) by allowing for the possibility that in such domains language use is determined by an alternative set of social norms or prescriptions. This assumption results in a change of the shift dynamics. In our basic model, the reason for monolinguals becoming bilingual is simply that it is a required transition state on the way to being monolingual in the other language. In the diglossia model, we now also allow people to become bilingual as the preferred ‘end state’. If monolinguals of the disadvantaged language want to participate in domains where the advantaged language is required (such as higher education or ‘global’ businesses), they need to learn that second language. This is modelled by the term $w_3 u_3$, where $w_3$ measures the demand for participation in these domains. However, as long as the low-status language is still used, there is also the possibility that the low-status language is the required language in some domains (such as small ‘local’ businesses or service encounters). Such domains might also be created by political interventions (e.g. legislation that requires use of the endangered local language in a specific set of contexts). These dynamics are incorporated into model (5.2) by the term $w_1 r(u_2) u_1$.
Again, $w_1$ models the demand for participation of monolinguals of the high-status language in these domains, and the function $r(u_2)$ controls for the frequent use of the low-status language (see endnote 6). These considerations lead to our second model:

\[
\begin{aligned}
\frac{\partial u_1}{\partial t} &= d_1 \Delta u_1 + a_1 u_1 \left(1 - \frac{u_1}{K - (u_2 + u_3)}\right) - w_1 r(u_2)\,u_1 - c_{31} u_3 u_1 + c_{12} u_2 u_1, \\
\frac{\partial u_2}{\partial t} &= d_2 \Delta u_2 + a_2 u_2 \left(1 - \frac{u_2}{K - (u_1 + u_3)}\right) + w_3 u_3 + w_1 r(u_2)\,u_1 + (c_{13} + c_{31})\,u_1 u_3 - (c_{12} u_1 + c_{32} u_3)\,u_2, \\
\frac{\partial u_3}{\partial t} &= d_3 \Delta u_3 + a_3 u_3 \left(1 - \frac{u_3}{K - (u_1 + u_2)}\right) - w_3 u_3 - c_{13} u_1 u_3 + c_{32} u_2 u_3.
\end{aligned}
\tag{5.2}
\]
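The terms that distinguish system (5.2) from system (5.1) can be sketched separately. The following is only an illustration under stated assumptions: $r(u_2)$ is taken to be a step function, as endnote 6 suggests it can be, and the threshold and the values of `w1` and `w3` are hypothetical rather than fitted. The function names are likewise introduced here for illustration.

```python
def r_step(u2, threshold):
    """Step-function form of r(u2): 1 while bilinguals are common enough, else 0."""
    return 1.0 if u2 >= threshold else 0.0


def diglossia_terms(u, w1, w3, threshold):
    """Extra terms that system (5.2) adds to each equation of system (5.1)."""
    u1, u2, u3 = u
    # High-status monolinguals learning the low-status language for its domains.
    from_group_1 = w1 * r_step(u2, threshold) * u1
    # Low-status monolinguals learning the high-status language for its domains.
    from_group_3 = w3 * u3
    return (-from_group_1, from_group_1 + from_group_3, -from_group_3)


# Hypothetical values: w1, w3 and the threshold are not taken from the paper.
with_domain = diglossia_terms((0.6, 0.1, 0.2), w1=0.02, w3=0.05, threshold=0.05)
collapsed = diglossia_terms((0.6, 0.01, 0.2), w1=0.02, w3=0.05, threshold=0.05)
```

In the second call the bilingual frequency lies below the assumed threshold, so recruitment of high-status monolinguals stops while low-status monolinguals continue to become bilingual at rate $w_3 u_3$. As with the conversion terms, the three contributions sum to zero.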
Both systems of partial differential equations are implemented in C++ and solved numerically using the finite-element method.
(c) Data
Data for Scottish Gaelic speakers are from the decennial census of Scotland (see electronic supplementary material). The first census to enumerate Gaelic speakers was that of 1881, but only from 1891 were data gathered separately on numbers of Gaelic monolinguals and Gaelic–English bilinguals (in all cases, among those aged 3 years or older). After 1961, no data were collected on the incidence of Gaelic monolinguals, as these were assumed by that time to be approaching extinction. From 1891 until 1971, the census enumerations were collated and analysed on the basis of the old county divisions (the Highland counties of the Gaidhealtachd included Argyll, Inverness, Ross and Cromarty and Sutherland). From 1981 onwards, these counties were subsumed into new administrative units. The new Highland region includes most of Inverness, the majority of Ross and Cromarty, Sutherland and a small portion of Argyll; it also includes Caithness and Nairn, and a small portion (5%) of Moray. The remaining portions of Inverness and Ross and Cromarty make up the new Western Isles region, while the remainder of Argyll and Bute is included in the new Strathclyde region. To document trends in Gaelic speaking in the wider Gaidhealtachd as a single area, we have therefore also collated data from the pre-1981 counties of Argyll, Bute, Caithness, Inverness, Nairn, Ross and Cromarty and Sutherland, and compared this with collated data from the most recent three censuses for the administrative regions of Highlands and Western Isles, and for the unitary council area of Argyll and Bute within the modern Strathclyde administrative region.
In the most recent 2001 census, enumeration was extended to include those who stated that they could understand Gaelic but not speak it; we have excluded these instances in order to retain comparability with the earlier records (which enumerate only those with Gaelic-speaking skills).
Data for Welsh speakers are from the decennial census of England and Wales. The first census to enumerate Welsh speakers (monolinguals and Welsh–English bilinguals) was that of 1891, but there was some dissatisfaction with the phrasing of the language question and with the definition of an age cut-off for young children. From 1901, the enumeration was limited to those aged 3 years or older. After 1981, no data were collected on the incidence of Welsh monolinguals, as these were assumed by that time to be approaching extinction. In the most recent 2001 census, enumeration was extended to include those who stated that they could understand Welsh but not speak it; we have excluded these instances in order to retain comparability with the earlier records (which enumerate only those with Welsh-speaking skills).
The model-fitting procedure is described in the electronic supplementary material, S3.
We would like to thank Peter Austin and April McMahon for discussions of this work during its development. We also thank our colleagues at the AHRC Centre for the Evolution of Cultural Diversity (www.cecd.ucl.ac.uk) for creating a hospitable interdisciplinary environment in which to develop models of cultural dynamics. This work was funded by an AHRC Phase Two Research Centre grant, and by a Leverhulme Early Career Fellowship for A.K.
ENDNOTES
1 In this context, Jones studied two Welsh dialects and pointed out that ‘dialect death in Wales may involve the divesting of regional features and an approximation to a commonly accepted uniform variety that is being proliferated throughout the speech community’ (Jones 1998, p. 2).
2 See Patriarca & Leppänen (2004), Mira & Paredes (2005), Stauffer & Schulze (2005), Pinasco & Romanelli (2006), Castelló et al. (2007), Kandler & Steele (2008), Schulze et al. (2008), Minett & Wang (2008), Kandler (2009) and Patriarca & Heinsalu (2009).
3 See also Baggs & Freedman (1990, 1993) and Wyburn & Hayward (2008, 2009).
4 We estimate the growth and diffusion parameters $a_i$ and $d_i$ from demographic data. In order to determine the shift rates $c_{ij}$, we calculate the best fit (in a quadratic sense) of model (5.1) to the empirical census data, using the pre-estimated parameters $a_i$ and $d_i$ and leaving the competition terms free to vary (see electronic supplementary material, S3, ‘Model fitting’, for further information).
5 Improvement is quantified in terms of a smaller quadratic distance between the model outcome and the empirical data.
6 The function $r(u_2)$ is assumed to be 1 if $u_2$ is sufficiently large but tends to zero if the frequency of the bilingual population becomes too small (e.g. $r(u_2)$ can be modelled as a step function).
REFERENCES
Abrams, D. M. & Strogatz, S. H. 2003 Modelling the dynamics of language death. Nature 424, 900. (doi:10.1038/424900a)
Aitchison, J. & Carter, H. 1985 The Welsh language, 1961–1981: an interpretative atlas. Cardiff, UK: University of Wales Press.
Baggs, I. & Freedman, H. I. 1990 A mathematical model for the dynamics of interactions between a unilingual and a bilingual population: persistence versus extinction. Math. Sociol. 16, 51–75. (doi:10.1080/0022250X.1990.9990078)
Baggs, I. & Freedman, H. I. 1993 Can the speakers of a dominated language survive as unilinguals? A mathematical model of bilingualism. Math. Comput. Model. 18, 9–18. (doi:10.1016/0895-7177(93)90122-F)
Bowen, I. 1908 The statutes of Wales. London, UK: T. Fisher.
Boyd, R. & Richerson, P. J. 2009 Voting with your feet: payoff-biased migration and the evolution of group beneficial behavior. J. Theoret. Biol. 257, 331–339. (doi:10.1016/j.jtbi.2008.12.007)
Brenzinger, M. 2006 Language maintenance and shift. In The encyclopedia of language and linguistics (ed. K. Brown), pp. 542–548, 2nd edn. In Society and language, vol. 6 (ed. R. Mesthrie). Oxford, UK: Elsevier.
Campbell, L. 2006 Languages and genes in collaboration: some practical matters. In Language and Genes: an Interdisciplinary Conf., Santa Barbara, 8–10 September 2006. Berkeley, CA: University of California.
Castelló, X., Loureiro, L., Eguíluz, V. M. & Miguel, M. 2007 The fate of bilingualism in a model of language competition. In Advancing social simulation: the first World Congress (eds S. Takahashi, D. Sallach & J. Rouchier), pp. 83–94. New York, NY: Springer.
Cavalli-Sforza, L. L., Piazza, A., Menozzi, P. & Mountain, J. 1988 Reconstruction of human evolution: bringing together genetic, archaeological, and linguistic data. Proc. Natl Acad. Sci. USA 85, 6002–6006. (doi:10.1073/pnas.85.16.6002)
Cavalli-Sforza, L. L., Minch, E. & Mountain, J. 1992 Coevolution of genes and languages revisited. Proc. Natl Acad. Sci. USA 89, 5620–5624. (doi:10.1073/pnas.89.12.5620)
Commissioners of Inquiry into the State of Education in Wales 1847 Reports of the Commissioners of Inquiry into the state of education in Wales, vols 1–3. London, UK: William Clowes and Sons / HMSO.
Croft, W. 2003 Social evolution and language change. See http://www.unm.edu/~wcroft/Papers/SocLing.pdf.
Diamond, J. & Bellwood, P. 2003 Farmers and their languages: the first expansions. Science 300, 597–603. (doi:10.1126/science.1078208)
Dorian, N. C. 1981 Language death: the life cycle of a Scottish Gaelic dialect. Philadelphia, PA: University of Pennsylvania Press.
Dorian, N. 2006 Using a private-sphere language for a public-sphere purpose: some hard lessons from making a television documentary in a dying dialect. Paper presented at Bryn Mawr College, 16 March. See www.brynmawr.edu/emeritus/gather/Dorian.doc.
Ferguson, C. A. 1959 Diglossia. Word 15, 325–340.
Fishman, J. A. 1991 Reversing language shift: theoretical and empirical foundations of assistance to threatened languages. Clevedon, UK: Multilingual Matters Ltd.
General Register Office for Scotland. 2005 Scotland’s Census 2001, Gaelic Report. Edinburgh, UK: General Register Office for Scotland.
Hudson, A. 2002 Outline of a theory of diglossia. Int. J. Sociol. Lang. 157, 1–48.
Jones, R. O. 1993 The sociolinguistics of Welsh. In The Celtic languages (ed. M. J. Ball), pp. 536–605. London, UK: Routledge.
Jones, M. C. 1998 Language obsolescence and revitalisation: linguistic change in two sociolinguistically contrasting Welsh communities. Oxford, UK: Oxford University Press.
Kandler, A. 2009 Demography and language competition. Hum. Biol. 81, 181–210.
Kandler, A. & Steele, J. 2008 Ecological models of language competition. Biol. Theor. 3, 164–173. (doi:10.1162/biot.2008.3.2.164)
Krauss, M. 1992 The world’s languages in crisis. Language 68, 4–10.
MacKinnon, K. 1977 Language, education and social processes in a Gaelic community. London, UK: Routledge and Kegan Paul.
MacKinnon, K. 1993 Scottish Gaelic today: social history and contemporary status. In The Celtic languages (ed. M. J. Ball), pp. 491–535. London, UK: Routledge.
McMahon, A. 1994 Understanding language change. Cambridge, UK: Cambridge University Press.
McMahon, A. & McMahon, R. 2005 Language classification by numbers. Oxford, UK: Oxford University Press.
Minett, J. W. & Wang, W. S.-Y. 2008 Modelling endangered languages: the effects of bilingualism and social structure. Lingua 118, 19–45.
Mira, J. & Paredes, Á. 2005 Interlinguistic similarity and language death dynamics. Europhys. Lett. 69, 1031–1034. (doi:10.1209/epl/i2004-10438-4)
Mufwene, S. S. 2001 The ecology of language evolution. Cambridge, UK: Cambridge University Press.
Mufwene, S. S. 2008 Language evolution. London, UK: Continuum.
Murdoch, S. 1996 Language politics in Scotland. Aberdeen, UK: Aberdeen Universitie Scots Leid Quorum.
Nettle, D. & Romaine, S. 1999 Vanishing voices: the extinction of the world’s languages. Oxford, UK: Oxford University Press.
Patriarca, M. & Heinsalu, E. 2009 Influence of geography on language competition. Physica A 388, 174–186. (doi:10.1016/j.physa.2008.09.034)
Patriarca, M. & Leppänen, T. 2004 Modeling language competition. Physica A 338, 296–299. (doi:10.1016/j.physa.2004.02.056)
Pinasco, J. P. & Romanelli, L. 2006 Coexistence of language is possible. Physica A 361, 355–360. (doi:10.1016/j.physa.2005.06.068)
Schulze, C., Stauffer, D. & Wichmann, S. 2008 Birth, survival, and death of languages by Monte Carlo simulation. Commun. Comput. Phys. 3, 271–294.
Stauffer, D. & Schulze, C. 2005 Microscopic and macroscopic simulation of competition between languages. Phys. Life Rev. 2, 89–116. (doi:10.1016/j.plrev.2005.03.001)
Steele, J. & Kandler, A. 2010 Language trees ≠ gene trees. Theor. Biosci. 129, 223–233. (doi:10.1007/s12064-010-0096-6)
Thomason, S. G. 2001 An introduction to language contact. Edinburgh, UK: Edinburgh University Press.
Withers, C. W. J. 1984 Gaelic in Scotland, 1698–1981: the geographical history of a language. Edinburgh, UK: John Donald.
Withers, C. W. J. 1988 Gaelic Scotland: the transformation of a cultural region. London, UK: Routledge.
Wyburn, J. & Hayward, J. 2008 The future of bilingualism: an application of the Baggs and Freedman model. J. Math. Soc. 32, 267–284. (doi:10.1080/00222500802352634)
Wyburn, J. & Hayward, J. 2009 OR and language planning: modeling the interaction between unilingual and bilingual populations. J. Operat. Res. Soc. 60, 626–636. (doi:10.1057/palgrave.jors.2602600)
Phil. Trans. R. Soc. B (2010) 365, 3865–3874 doi:10.1098/rstb.2010.0020
The cophylogeny of populations and cultures: reconstructing the evolution of Iranian tribal craft traditions using trees and jungles
Jamshid J. Tehrani1,2,*, Mark Collard2,3 and Stephen J. Shennan2,4
1 Evolutionary Anthropology Research Group, Department of Anthropology, Science Site, South Road, Durham University, Durham DH1 3LE, UK
2 AHRC Centre for the Evolution of Cultural Diversity, and
3 Laboratory of Human Evolutionary Studies, Department of Archaeology, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, V5A 1S6, Canada
4 Institute of Archaeology, University College London, London, WC1H 0PY, UK
Phylogenetic approaches to culture have shed new light on the role played by population dispersals in the spread and diversification of cultural traditions. However, the fact that cultural inheritance is based on separate mechanisms from genetic inheritance means that socially transmitted traditions have the potential to diverge from population histories. Here, we suggest that associations between these two systems can be reconstructed using techniques developed to study cospeciation between hosts and parasites and related problems in biology. Relationships among the latter are patterned by four main processes: co-divergence, intra-host speciation (duplication), intra-host extinction (sorting) and horizontal transfers. We show that patterns of cultural inheritance are structured by analogous processes, and then demonstrate the applicability of the host–parasite model to culture using empirical data on Iranian tribal populations.
Keywords: cultural phylogenies; population history; coevolution; cophylogeny; cultural evolution; Iranian tribes
* Author for correspondence ([email protected]).
Electronic supplementary material is available at http://dx.doi.org/10.1098/rstb.2010.0020 or via http://rstb.royalsocietypublishing.org.
One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
1. INTRODUCTION
The extent to which cultural traditions track the descent histories of populations has long been debated. For most of the last century, the consensus among anthropologists and archaeologists has been that any evidence relating to the historical origins of cultural assemblages would probably be swamped by the rapid rate of cultural evolution, and by the effects of trade, intermarriage and exchange among neighbouring groups (e.g. Boas 1940; Kroeber 1948; Moore 1994). However, recent applications of techniques of phylogenetic analysis borrowed from biology have succeeded in reconstructing coherent and long-lasting lineages of cultural inheritance across a number of domains (e.g. Mace et al. 2005; Lipo et al. 2006). For instance, analyses of relationships among languages suggest that resemblances among word forms can often be traced back to ancestral speech communities that existed many thousands of years ago (e.g. Gray & Jordan 2000; Gray & Atkinson 2003; Kitchen et al. 2009). Similarly, it would appear that many craft styles and technologies are handed down from generation to generation, eventually giving rise to new forms that are recognizably derived from their parent tradition (e.g. Tehrani & Collard 2002, 2009a,b; O’Brien & Lyman 2003; Buchanan & Collard 2007, 2008; Lycett 2007, 2009). The reconstruction of such lineages can provide useful evidence about the origins and dispersal of populations, especially in cases where genetic data are scarce or noisy. For example, phylogenies derived from cultural traits have been used to test competing hypotheses about the colonization of the Pacific (Gray & Jordan 2000; Gray et al. 2009), the Bantu expansions in Africa (Holden 2002), the origins of the Indo-Europeans (Gray & Atkinson 2003) and the peopling of the Americas (Buchanan & Collard 2007). However, while most studies indicate that cultural phylogenies and population histories are usually highly correlated (e.g. Gray & Jordan 2000; Holden 2002; Tehrani & Collard 2002, 2009a,b), the match is not always perfect. For example, Tehrani & Collard (2002) noted that some of the relationships among Turkmen and rural Iranian (Tehrani & Collard 2009a,b) weaving traditions contradict written and oral histories about the tribes’ origins. Similar inconsistencies have been reported in reconstructions of indigenous Californian basketry assemblages (Jordan & Shennan 2003), Siberian material culture (Jordan & Mace 2006), Baltic stringed instruments (Temkin & Eldredge 2007) and Polynesian canoes (Rogers et al. 2009).
This journal is © 2010 The Royal Society
[Figure 1: four panels, (a) to (d), each showing a dependent phylogeny mapped onto an independent phylogeny.]
Figure 1. Terminology of historical associations between a dependent (parasite) phylogeny and independent (host) phylogeny. (a) Co-divergence, (b) sorting event, (c) duplication and (d) horizontal transfer.
To shed more light on these issues, we draw on ideas from dual-inheritance theory or gene–culture coevolutionary theory (e.g. Cavalli-Sforza & Feldman 1981; Boyd & Richerson 1985; Durham 1991; Richerson & Boyd 2005). Dual-inheritance theory views culture and genes as separate but coevolving systems of heritable variation, each based on autonomous mechanisms of information transmission (i.e. imitation and teaching versus biological reproduction). At the individual level, this requires models that can account for the interactions between genetic traits, which can only be transmitted ‘vertically’ from parents to offspring, and learned behaviours that can be acquired vertically, ‘obliquely’ from other adults, or ‘horizontally’ among members of the same generation. Similar models are needed at the group level. These would recognize that, while cultural traditions and populations may be closely linked, the processes involved in their propagation, dispersal and extinction are ultimately independent of one another. The main aim would then be to understand what kinds of processes lead to correlations between cultural phylogenies and population histories, and what kinds of processes lead to divergences. Following the suggestions of Jordan & Mace (2006), Gray et al. (2008) and Riede (2009), we argue that such a model can be developed from the study of long-term co-evolutionary, or ‘cophylogenetic’, relationships in biology.
2. THE COPHYLOGENETIC FRAMEWORK
The study of cophylogeny spans several domains in biology, including cospeciation in host and parasite organisms, the reconciliation of species trees and gene trees, and associations between species histories and area histories in vicariance biogeography (e.g. Brooks & McLennan 1991; Page 2003). The key issue in each of these endeavours is essentially identical to the one we face here. It concerns how far the history of one group of entities (i.e.
the parasites, genes, organisms or cultural traditions) is determined by the history of another group (i.e. the hosts, species,
geographical areas or populations). This is addressed by mapping a dependent phylogeny (i.e. the parasite, gene, or organism tree) onto an independent phylogeny (the host, species, or geographical tree). Historical relationships between the two systems of interest can then be described in relation to four generic processes: co-divergence, sorting, duplication and horizontal transfer (e.g. Page 2003; figure 1). Each of these processes can be readily identified in cultural evolution.
(a) Co-divergence
In co-divergence, the dependent lineage splits as a result of the independent lineage splitting. In the case of hosts and parasites, co-divergence is equivalent to cospeciation, and typically occurs when the speciation of a host organism results in the speciation of associated parasites (e.g. Hafner & Nadler 1988). In molecular phylogenetics, co-divergence occurs when a genetic lineage diverges into daughter lineages coincident with a speciation event (interspecific coalescence), while in biogeography a co-divergence takes place when a new species arises in geographic isolation as a result of a geological event (vicariance) (e.g. Hafner & Page 1995; Ronquist 1998). In all these instances, co-divergence results in a direct correspondence between the dependent and independent phylogenies. In the case of cultural evolution, co-divergence is equivalent to the division of cultural traditions resulting from population splits, which is often associated with the demographic expansion of populations. The impact of co-divergence in generating cultural patterns is exemplified by the spread of agriculture. The Neolithic expansions in Europe, Oceania and Africa not only left strong genetic signatures, but also were associated with the growth and spread of distinct language families (e.g. Gray & Jordan 2000; Holden 2002; Gray & Atkinson 2003). In each of these cases, new languages appear to have evolved primarily as a result of population dispersals.
(b) Sorting
In host–parasite studies, sorting refers to the extinction of a parasite lineage within a host lineage. Sorting events can also occur as a result of a parasite ‘missing the boat’ when a descendant of the host species does not inherit all the latter’s parasites (e.g. Paterson et al. 1999). The extinction of a genetic lineage within a species or of a species in a habitat is also classified as a sorting event (e.g. Hafner & Page 1995; Page & Charleston 1998). Sorting can be thought of as the pruning of some branches on the dependent phylogeny, which results in mismatches with the tips of the independent phylogeny. Sorting events are likely to be common in cultural and linguistic evolution. Globalized capitalism and the spread of modern communications systems have caused (or at least coincided with) the decline of innumerable dialects, technologies and other cultural practices associated with indigenous peoples around the world. For example, Ohamgari & Berkes (1997) found that traditional bush skills are in decline among the Cree of James Bay, Canada, because their communities no longer depend on hunting and fishing for subsistence. Instances of cultural loss are also known from historical evidence. One of the most dramatic of these occurred in Tasmania. Archaeological evidence suggests that the first humans to arrive in Australia possessed a relatively sophisticated set of weapons, tools and crafts. While many of these were maintained by mainland groups, in the 10 000 years prior to the arrival of the first Europeans, native Tasmanians appear to have lost techniques required to fish, prepare furs, make bone tools, arrows and boomerangs, and even the knowledge required to make fire (Henrich 2004).
(c) Duplication
In duplication, the branches of the dependent phylogeny split but the branches of the independent phylogeny do not. In other words, duplication events create mismatches between the dependent phylogeny and independent phylogeny by adding branches to the dependent phylogeny. In the host–parasite case, this equates to the intra-host speciation of a parasite species. In genetics, duplication results in an organism carrying two copies of the same gene. In the case of organism–area associations, duplication is equivalent to sympatric speciation, which occurs within an undivided geographical area or habitat range (Page & Charleston 1998). The history of sport is replete with examples of cultural duplication. For instance, modern football and rugby are descended from ball games played in nineteenth-century England that were not recognizably distinct from one another. It was only after the establishment of separate governing bodies, which formally codified the rules, that the two sports diverged. A later schism gave rise to separate codes of Rugby League and Rugby Union. Like the earlier split from football, the diversification of these sports occurred within an undivided population and can therefore be classed as a duplication. Duplication can also be seen in the diversification of religious sects and denominations. Although ideological disputes can result in congregations dividing into separate communities of worship, this does not usually result in the formation of genetically, ethnically or linguistically distinct populations. In modern societies, members of different religious communities frequently intermarry and may even change their faith several times over their lives. These examples show how cultural lineages can diversify independently of the populations with which they are associated.
(d) Horizontal transfer
Some parasite species colonize new hosts via a process known as ‘switching’. Switches are described as horizontal transfers because they involve a host acquiring a parasite from a non-ancestral species that they have come into contact with. This process can lead to major discrepancies between the phylogenies of the two groups of species (Page 2003). Horizontal transfers can be similarly problematic in other areas. In molecular evolution, horizontal transfers, or ‘reticulations’, are considered rare but are known to occur in some organisms, such as the exchange of plasmid DNA in bacteria. This can greatly complicate the reconstruction of these organisms’ phylogenies (Doolittle 1999). In biogeography, horizontal transfers are equivalent to the dispersal of a species from one region to another. In this context, the phylogeny of a group of species may not map well onto the geological histories of the territories in which they are found (e.g. Ronquist 1998). Horizontal transfers are likely to be a significant problem in reconciling cultural traditions with population histories. There is considerable evidence that horizontal transfers can occur across a variety of domains. One such domain is technology, where useful innovations can spread far from their original point of origin through trade and contact among populations. This phenomenon has been extensively studied by anthropologists and archaeologists since the nineteenth century. For example, Balfour (1889) carried out detailed analyses of composite bows from the Pitt Rivers collection, literally dissecting them to examine their shared ‘anatomical’ characteristics. Balfour (1889) proposed a Central Asian origin for the bow, which was then adopted and successively modified by populations as it spread north to the Arctic regions and then west into Siberia and across the Bering Strait into America, west to Persia and Europe, and south to the Indian subcontinent.
Similar kinds of processes have been documented in the spread of doctrinal religions as populations are converted by other populations with whom they have contact. Buddhism is an excellent example. Buddhism emerged in India in the sixth century BCE. Within 200 years it underwent a massive expansion, spreading south to Sri Lanka, east into Indochina and northwest into Central Asia, eventually reaching China via the Silk Route (Conze 1980). While the central tenets of Buddhism remained more-or-less the same, specific doctrines and rituals were adapted by the various populations who adopted it. This gave rise to new traditions of Buddhism that are phylogenetically derived from India, even though many of their respective adherents are not.
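The distinction between co-divergence and duplication can be made concrete with a small sketch of how a dependent tree is mapped onto an independent tree. The sketch uses least-common-ancestor (LCA) reconciliation, a standard technique for reconciling gene trees with species trees, and is offered only as an illustration: it explains mismatches by duplication followed by sorting and does not model horizontal transfer at all. The tree encoding (nested tuples with string tips), the function names and the toy labels are all assumptions introduced here.

```python
def leaves(tree):
    """Set of tip labels below a node; a tip is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def lca(host_tree, targets):
    """Smallest clade of host_tree that contains every label in targets."""
    node = host_tree
    while not isinstance(node, str):
        for child in node:
            if targets <= leaves(child):
                node = child  # all targets sit inside this child: descend
                break
        else:
            return node  # targets span several children: this is the LCA
    return node


def reconcile(dep_tree, host_tree, assoc):
    """Label each internal node of the dependent tree by event type."""
    events = []

    def visit(node):
        if isinstance(node, str):
            return assoc[node]
        child_hosts = [visit(child) for child in node]
        here = lca(host_tree, frozenset(assoc[tip] for tip in leaves(node)))
        # A node mapped to the same host clade as one of its children split
        # inside that clade (duplication); otherwise it tracks a host split.
        kind = "duplication" if here in child_hosts else "co-divergence"
        events.append((sorted(leaves(node)), kind))
        return here

    visit(dep_tree)
    return events


# Toy data: three populations A, B, C and traditions a, b, c (hypothetical).
hosts = (("A", "B"), "C")
assoc = {"a": "A", "b": "B", "c": "C"}
congruent = reconcile((("a", "b"), "c"), hosts, assoc)
mismatch = reconcile((("a", "c"), "b"), hosts, assoc)
doubled = reconcile((("a1", "a2"), "b"), ("A", "B"),
                    {"a1": "A", "a2": "A", "b": "B"})
```

In the congruent case every split in the dependent tree tracks a split in the independent tree. In the mismatched case the method can account for the conflict only by placing a duplication at the root, with the extra lineages later lost through sorting. The `doubled` case corresponds to two traditions diversifying within a single undivided population.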
[Figure 2: map of Iran and neighbouring countries (Turkmenistan, Afghanistan, Iraq, Kuwait, Saudi Arabia, Qatar), with the Caspian Sea and Persian Gulf, showing the Shahsevan, Tekke, Yomut, Bakhtiari, Boyer Ahmad, Qashqa’i and Baluch; scale bar 200 km.]
Figure 2. Map showing locations of Iranian tribal populations included in the case study and their approximate migration histories. Dashed line, Turkic migrations; dotted lines, Iranian migrations.
3. CASE STUDY: THE SPREAD OF WEAVING IN IRANIAN TRIBAL GROUPS
The generic nature of the processes described above means that there has been considerable cross-over in the methods used to study cophylogeny in different biological contexts. We are not the first researchers to realize the potential value of extending them to cultural evolution. For example, Gray et al. (2008) have suggested that techniques used to reconcile gene trees with species trees could be useful for studying the ways in which word histories are embedded in language histories. Jordan & Mace (2006) and Riede (2009) have used methods to test for cospeciation in host and parasite lineages to explore historical correlations among different components of material culture assemblages. In this section, we present a case study that applies a comprehensive cophylogenetic framework to reconstruct historical relationships between cultural traditions and populations. The study focused on weaving traditions in seven Iranian tribal populations, whose geographical distributions are shown in the map in figure 2. Unfortunately, there are currently no genetic data on the population histories of the tribes. However, it is possible to draw inferences about their origins and relationships to one another from linguistic affiliations and oral history (e.g. Barthold 1962; Oberling 1974; Amanolahi 1988; Grimes 2002; Windfuhr 2007). These suggest that the populations can be divided into two main lineages. The first lineage comprises Iranian-speaking groups that are believed to have originated in western Iran (Amanolahi 1988). The groups
are the Baluch, the Boyer Ahmad and Bakhtiari. Members of this lineage can be further divided into the Baluch on the one side, and the Boyer Ahmad and the Bakhtiari on the other. The latter two groups speak Lori and inhabit the Zagros Mountains of western Iran. The ancestors of the Baluch are believed to have migrated from western Iran to the desert regions of southeastern Iran, western Afghanistan and northwest Pakistan some 900 years ago (Frye 1960; Thompson 2002), splitting from the ancestral population that gave rise to the Lors (Amanolahi 1988). The second main lineage comprises the Qashqa’i, Shahsevan, Tekke and Yomut. These populations claim descent from Oguz Turkic hordes that invaded Iran between the tenth and twelfth centuries (e.g. Barthold 1962; Oberling 1974; Beck 1986). All four of these groups speak Turkic languages. They can be subdivided into two sub-groups, one that speaks Turkmani, which belongs to the eastern branch of Oguz Turkic languages, and the other Azeri, which belongs to the western branch (Grimes 2002). The Yomut and the Tekke speak Turkmani. Both groups inhabit the northeastern region of Iran and Turkmenistan. The Shahsevan and the Qashqa’i speak Azeri. The Shahsevan are located in northwestern Iran close to the Caspian Sea. The ancestors of the Qashqa’i are believed to also have originated near the Caspian Sea, but migrated south to the Zagros Mountains about 500 years ago (Oberling 1974), where they are now neighbours of two of the Iranianspeaking groups, the Bakhtiari and Boyer Ahmad. The hypothesized migration histories of the tribes are shown in the map in figure 2.
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
Cophylogeny of populations and cultures J. J. Tehrani et al.
There are several reasons to suspect that the history of weaving traditions is likely to be strongly correlated with population histories. The first is that textile weaving is intimately connected to the nomadic-pastoralist mode of subsistence pursued by members of these communities until recently. Unlike objects made from other materials such as wood and metal, woven rugs, bags and bands can be folded or rolled and are therefore much easier to carry on long and physically challenging migrations between seasonal camps, which in some cases covered distances of hundreds of miles across difficult, mountainous terrain. Furthermore, the raw materials and equipment for weaving were easy to obtain locally: wool can be sheared from sheep and goats, while in the past dyes were extracted from plants, insects and fruits. The second reason is that weaving skills are transmitted in a highly vertical and conservative fashion from mothers to their daughters (Tehrani & Collard 2009a). Endogamous marriage norms mean that females do not usually marry males from other tribes. This in turn implies that daughters do not generally inherit from their mothers craft traits that are foreign in origin. Lastly, even when weavers do adopt traits from non-maternal sources, they usually copy members of their immediate community. Social norms prevent women from travelling far from their father’s or husband’s household, with the result that they have few opportunities to interact with weavers from other tribes.
To reconstruct the history of the tribes’ weaving traditions, we carried out a cladistic analysis of 150 characters in each of the seven tribes’ assemblages. The weavings of the Qashqa’i, Bakhtiari and Boyer Ahmad were sampled by J.J.T. during two field surveys carried out in southwestern Iran in May 2001 and September–December 2002.
Data on the weavings of the Baluch, Shahsevan, Yomut and Tekke were gathered from published catalogues (Baluch: Konieczny 1979; Yomut and Tekke: Thompson 1980; Tzavera 1984; Shahsevan: Tanavoli 1985). The characters consisted of textile traits, including techniques of preparation and fabrication (e.g. spinning, knotting, etc.), the use of different materials (e.g. wool, goat hair, dyes, etc.) and variation in decorative features (e.g. carpet designs, border patterns, etc.). We used a prehistoric archaeological textile assemblage as an outgroup for the analysis. The assemblage comprised rugs, mats and decorative felts excavated from the ice-filled tombs of nomadic people who inhabited the Pazyryk valley in the Altai Mountains of Siberia in the fourth to fifth century BCE (Rudenko 1970). These artefacts provide the best available information on the roots of weaving among Central and Western Asian nomadic pastoralists and, as such, are a useful means of inferring the likely ancestral states of the characters used in the present study. The data matrix is provided in electronic supplementary material, S1.
The analysis was carried out in the software program PAUP* 4.0 (Swofford 1998). A branch-and-bound search of the data returned a single most parsimonious cladogram, which is shown in figure 3. The relationships shown in the cladogram are compatible with those reported by Tehrani & Collard (2009b)
[Figure 3: cladogram with tips Outgroup, Yomut, Tekke, Shahsevan, Qashqa’i, Boyer Ahmad, Bakhtiari and Baluch; bootstrap values printed beside the nodes: 99, 100, 78, 89 and 62.]
Figure 3. Cladogram for the woven assemblages, with bootstrap support percentages for individual clades shown beside each node.
in a previous analysis of these data, in which a different outgroup was used (Arab Bedouin).
The fit between the cladogram and the data was measured using the Retention Index (RI) and bootstrapping. RI is a measure of the number of homoplastic changes that a cladogram requires independent of its length (Farris 1989a,b). An RI of 1, the maximum, indicates that the cladogram fits the dataset perfectly; the worse the fit, the closer the RI is to 0. The RI of this cladogram was 0.62. Simulation work (see Nunn et al. 2010) suggests that an RI as high as this provides strong evidence that these assemblages evolved by descent with modification from ancestral assemblages. The phylogenetic bootstrap is a technique for measuring support for individual clades (Felsenstein 1985). It involves creating ‘pseudo’ datasets of the same size as the original by randomly re-sampling characters from the original dataset with replacement, repeating this a large number of times (in this case, 10 000), generating a cladogram from each pseudo dataset and calculating the percentage of replicates that support a given clade. As can be seen in figure 3, all of the relationships were supported by a large percentage of the bootstrap replicates.
Several of the relationships indicated in the cladogram are consistent with ethnohistorical and linguistic evidence about the relationships among the populations, while several others are not. The finding that the weavings of the Yomut and Tekke are descended from an exclusive common ancestor is compatible with the fact that both populations speak the same Turkic language, Turkmani. Similarly, the finding that the assemblages of the Bakhtiari and Boyer Ahmad are more closely related to each other than they are to those of any other group is supported by the fact that they both speak closely related dialects of Lori and inhabit the same area.
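As a rough illustration of the two fit measures described above, the following Python sketch computes an ensemble Retention Index from per-character step counts, using the standard definition RI = (G - S) / (G - M), and builds one bootstrap pseudo-dataset by resampling characters with replacement. It is not the PAUP* implementation: the function names, the toy step counts and the toy matrix are hypothetical, and the tree search that would be run on each pseudo-dataset is left out.

```python
import random


def retention_index(min_steps, max_steps, observed_steps):
    """Ensemble Retention Index, RI = (G - S) / (G - M).

    For each character: M is the fewest steps possible on any tree, G the most
    steps possible on any tree, and S the steps required on the tree being
    assessed. Each list is summed over characters.
    """
    g, s, m = sum(max_steps), sum(observed_steps), sum(min_steps)
    if g == m:
        raise ValueError("RI is undefined when G equals M")
    return (g - s) / (g - m)


def bootstrap_pseudo_dataset(matrix, rng):
    """Resample characters (columns) with replacement, keeping all taxa (rows)."""
    n_chars = len(matrix[0])
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    return [[row[c] for c in columns] for row in matrix]


def clade_support(replicate_clades, clade):
    """Percentage of bootstrap replicates whose tree contains `clade`."""
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)


# Hypothetical step counts for three characters (not the study's data).
toy_ri = retention_index(
    min_steps=[1, 1, 1], max_steps=[3, 3, 2], observed_steps=[1, 2, 2]
)

# Hypothetical matrix: 3 taxa (rows) by 4 binary characters (columns).
toy_matrix = [[0, 1, 1, 0], [0, 1, 0, 0], [1, 0, 0, 1]]
pseudo = bootstrap_pseudo_dataset(toy_matrix, random.Random(1))
```

In a full analysis, each pseudo-dataset would be passed to a parsimony search, and `clade_support` would then be applied to the clades found in the resulting replicate trees.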
However, contrary to ethnohistorical and linguistic evidence, the assemblages of the two Lor groups appear to be more closely related to those of Turkic-speaking groups (the Yomut, Tekke, Qashqa’i and Shahsevan) than they are to the other Indo-Iranian-speaking group, the Baluch. Furthermore, the cladogram suggests that the Qashqa’i and Shahsevan share a more recent common ancestor with the two Lor-speaking groups than they do with the Yomut and Tekke, which again contradicts linguistic groupings. Finally, the Qashqa’i assemblage appears to be more closely
related to those of the Boyer Ahmad and Bakhtiari than it is to the Shahsevan, even though the latter speak a closely related dialect of the same language.
To assess the importance of these differences, we compared the number of changes required by each character on the most parsimonious tree with the number of changes required by a tree in which the relationships among the assemblages were forced to reflect the tribes’ population histories. The difference in the character lengths was then evaluated using a one-tailed Wilcoxon signed-ranks test, as described by Templeton (1983). The analysis found that the population tree required significantly more steps than the most parsimonious tree (total number of extra steps = 38, p < 0.01). Thus, the strong phylogenetic signature recovered from the textile data cannot be accounted for purely in terms of descent with modification from common ancestral populations.
To shed more light on the relationships between the population history of the tribes and their weaving traditions, we carried out a cophylogenetic analysis in which the best estimate of tribal population history was treated as the independent phylogeny and the cladogram derived from the weaving data was treated as the dependent phylogeny. Previous efforts to apply cophylogenetic techniques to cultural evolution (e.g. Jordan & Mace 2006; Riede 2009) were limited by methods that only mapped three types of relationships between the independent and dependent phylogenies: codivergences, sorting events and duplications. They were therefore unable to address the potential role played by horizontal transfers in generating mismatches between the compared trees. Here, we were able to overcome this constraint using the program TREEMAP v. 2.0 (Page & Charleston 2002), which implements an algorithm called ‘jungles’ (Charleston 1998).
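The tree comparison described above (the Wilcoxon signed-ranks test applied to per-character step counts, following Templeton 1983) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the procedure as run in PAUP*: the function name and toy step counts are hypothetical, tied absolute differences receive average ranks, characters with no difference are dropped, and the one-tailed p-value is computed exactly by counting sign assignments.

```python
def templeton_test(steps_mp, steps_constrained):
    """One-tailed signed-ranks comparison of two trees, character by character.

    Returns (net extra steps on the constrained tree, one-tailed p-value) for
    the alternative that the constrained tree requires more steps.
    """
    diffs = [c - m for m, c in zip(steps_mp, steps_constrained) if c != m]
    n = len(diffs)
    if n == 0:
        return 0, 1.0

    # Average ranks of |difference|, stored doubled so they stay integers.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    doubled_rank = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            doubled_rank[order[k]] = i + j + 2  # 2 * mean of ranks i+1 .. j+1
        i = j + 1

    # Doubled rank sum of the characters that favour the constrained tree.
    w_minus = sum(r for d, r in zip(diffs, doubled_rank) if d < 0)

    # Exact null distribution: count the sign assignments giving each rank sum.
    total = sum(doubled_rank)
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in doubled_rank:
        for k in range(total, r - 1, -1):
            counts[k] += counts[k - r]

    p_value = sum(counts[: w_minus + 1]) / 2 ** n
    return sum(diffs), p_value


# Hypothetical step counts for six characters (not the study's data).
extra_steps, p = templeton_test([1, 1, 1, 1, 1, 2], [2, 2, 2, 2, 2, 2])
```

In this toy case five characters each need one extra step on the constrained tree and none favours it, so only one of the 32 possible sign assignments is as extreme as the observed one.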
Jungles is an advancement on previous tree reconciliation methods because it considers all four cophylogenetic processes, including horizontal transfer. First, a jungles analysis generated all the possible solutions to the cophylogeny of the craft tree and population tree. The total cost of each solution was then estimated according to the number of events other than co-divergences that it hypothesized. Solutions with lower costs are considered preferable to those with higher costs, since the latter require a greater number of independent evolutionary events to explain how the observed patterns of association between the two sets of entities arose. This approach is known as ‘event-based parsimony’ (Ronquist 1996). In principle, it is possible to impose additional optimality criteria by assigning different costs to each type of event. However, for the purposes of this study, we assumed that there is an equal likelihood of horizontal transfers, duplications and sorting events and therefore assigned the same cost (1) to each of them (with a cost of 0 for co-divergences).
Figure 4 shows three different solutions to the cophylogeny of the craft tree and population tree returned by TREEMAP. Figure 4a hypothesizes four co-divergences and two horizontal transfers. Thus, the total cost of the reconciliation between the two
trees is 2. Figure 4b also has a reconciliation cost of 2. It hypothesizes five co-divergences, one horizontal transfer and one sorting event. Figure 4c hypothesizes a reconstruction of events that involves no horizontal transfers. Instead, it suggests that there were three duplications early in the history of weaving that gave rise to several distinct craft lineages. All the lineages subsequently underwent extensive pruning as a result of sorting events that occurred at each juncture where ancestral populations split into new ones. In total, the jungle proposes three duplications and nine sorting events, with a total reconciliation cost of 12.
To test the validity of these various explanations, a further analysis was carried out that involved randomizing the associate tree and measuring how often the randomized trees fit the population tree as well as the original tree does. The results of this analysis suggested that the number of events hypothesized by each of the first two jungles was significantly fewer (p < 0.05) than the number of events that would be required to explain associations between the population tree and random trees. In contrast, the number of events hypothesized by the third jungle was not less than what would be expected by chance. We can therefore reject the hypothesis shown in figure 4c.
The analyses were unable to distinguish which of the other two reconstructions represents a better explanation for associations among the tribes’ weaving traditions and population histories. Both explanations were found to be statistically significant and both had the same cost (2). Since we currently lack convincing reasons to assume that horizontal transfers are either more or less costly than sorting events, we cannot reject a priori an explanation that requires two horizontal transfers (figure 4a) in favour of one that requires only one horizontal transfer but also one sorting event (figure 4b) or vice versa.
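The cost scheme and the randomization check described above can be sketched in Python. The event counts are those reported in the text for figure 4a–c, and the unit costs follow the stated assumption (0 for co-divergences, 1 for every other event). The class and function names are hypothetical, the number of co-divergences in figure 4c is not stated and is left at 0 here (it carries no cost), and the way the p-value is computed is an assumption about the general logic of such tests, not a description of TREEMAP’s internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reconstruction:
    """Event counts for one reconciliation of the craft and population trees."""
    codivergences: int = 0
    duplications: int = 0
    sorting_events: int = 0
    horizontal_transfers: int = 0


# Equal unit costs for all events except co-divergence, as assumed in the text.
UNIT_COSTS = {
    "codivergences": 0,
    "duplications": 1,
    "sorting_events": 1,
    "horizontal_transfers": 1,
}


def total_cost(rec, costs=UNIT_COSTS):
    """Event-based parsimony cost: sum over event types of count times cost."""
    return sum(costs[name] * getattr(rec, name) for name in costs)


def randomization_p_value(observed_cost, randomized_costs):
    """Share of randomized associate trees reconciled at least as cheaply."""
    as_good = sum(1 for c in randomized_costs if c <= observed_cost)
    return as_good / len(randomized_costs)


fig_4a = Reconstruction(codivergences=4, horizontal_transfers=2)
fig_4b = Reconstruction(codivergences=5, horizontal_transfers=1, sorting_events=1)
fig_4c = Reconstruction(duplications=3, sorting_events=9)  # co-divergences not stated

costs = {name: total_cost(rec)
         for name, rec in [("4a", fig_4a), ("4b", fig_4b), ("4c", fig_4c)]}
```

Passing a different `costs` dictionary to `total_cost` would show how the ranking of figure 4a and figure 4b changes if horizontal transfers and sorting events were weighted unequally, which is the choice the text declines to make a priori.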
We can, however, judge the merits of each reconstruction against other existing lines of evidence. The horizontal transfers hypothesized in figure 4a are compatible with geographical evidence and historical records. The hypothesis that the ancestor of the Bakhtiari and Boyer Ahmad acquired weaving from the ancestor of the Qashqa’i is consistent with the fact that the groups are close neighbours. It is also compatible with ethnohistorical data suggesting that the ancestors of the Qashqa’i arrived in the region prior to the divergence of the Bakhtiari and Boyer Ahmad. As noted earlier, whereas the ancestors of the Qashqa’i are believed to have migrated to their present-day territories in southwestern Iran some 500 years ago (Oberling 1974), the Bakhtiari and Boyer Ahmad did not emerge as distinct tribal entities until the eighteenth or nineteenth century (Garthwaite 1983; Amanolahi 1988). It is therefore plausible that the Bakhtiari and Boyer Ahmad inherited weaving from a common ancestor that had adopted it as a result of contact with the ancestors of the Qashqa’i. The other horizontal transfer hypothesized in figure 4a occurs between the ancestor of the Shahsevan, Qashqa’i, Tekke and Yomut and the ancestor of the Baluch. As mentioned previously, the Baluch are thought to be descended from a tribe that migrated from the southern Caspian Sea to southeastern Iran
[Figure 4: three panels, (a), (b) and (c), each with tips Baluch, Boyer Ahmad, Bakhtiari, Qashqa’i, Shahsevan, Yomut and Tekke; key symbols: co-divergence, sorting event, duplication, horizontal transfer.]
Figure 4. Three solutions to the cophylogeny of the tribes’ weaving traditions and population histories, as reconstructed in TREEMAP v. 2.0. The independent tree (hollow cladogram) represents the populations’ histories, while the dependent tree (solid lines) represents the history of their craft traditions. The different events hypothesized in each reconstruction are indicated by symbols that are explained in the key.
some 900 years ago. This event is roughly contemporaneous with (and may perhaps have even been caused by) the expansion of Oguz Turks into western Iran in the eleventh and twelfth centuries (Thompson 2002), from whom the Shahsevan and Qashqa’i are descended (Oberling 1974). It is certainly possible, therefore, that the Baluch split from the Boyer Ahmad and Bakhtiari before the Shahsevan and Qashqa’i split from the Tekke and Yomut, and that all five groups acquired their weaving traditions from a common Oguz Turkic source.
The explanation in figure 4b also hypothesizes a horizontal transfer from the ancestor of the Qashqa’i to the ancestor of the Bakhtiari and Boyer Ahmad. As pointed out above, this scenario is plausible in the light of the historical evidence. However, instead of assuming that the Baluch acquired weaving from the ancestor of the Tekke, Yomut, Qashqa’i and
Shahsevan, figure 4b suggests that the weavings of the Baluch are derived from an ancestral Iranian tradition that went extinct in the Boyer Ahmad–Bakhtiari lineage. It further indicates that the relationship between the weavings of the Baluch and those of the Shahsevan, Qashqa’i, Tekke and Yomut can be explained by descent from a common ancestor of both Turkic and Iranian-speaking groups. However, given that the best estimate from historical linguistics (e.g. Gray & Atkinson 2003) is that the relationship between Turkic and Iranian languages probably predates the origins of agriculture and therefore the keeping of animals for wool, this hypothesis seems unrealistic. It is more plausible that Baluchi weaving traditions, like those of the Boyer Ahmad and Bakhtiari, were originally borrowed from Turkic peoples with whom their ancestors came into contact. On that basis, we believe that the reconstruction of events as shown
in figure 4a represents the best explanation for the origins and spread of weaving among the populations.
4. DISCUSSION AND CONCLUSIONS
Phylogenetic approaches to cultural diversity have shown that the diversification and spread of cultural traditions are often closely linked to the dispersal histories of populations (e.g. Mace et al. 2005; Collard et al. 2006; Lipo et al. 2006). The findings of our case study lend further weight to this evidence. Borrowing techniques from biology that are designed to study coevolutionary relationships, we found that relationships between Iranian tribal craft traditions and population histories could be largely accounted for in terms of ‘co-divergence’, the parallel cladogenesis of one lineage with another. Thus, in the two best reconstructions returned by the analyses, all of the relationships among the Turkic tribal assemblages could be explained by population phylogenesis, as could the relationship between the assemblages of two of the Iranian-speaking groups, the Boyer Ahmad and Bakhtiari.
Nevertheless, it was also clear that some of the relationships between textile assemblages were incompatible with data on the groups’ population histories. Following other researchers (e.g. Jordan & Mace 2006), we have suggested that such anomalies can be explained in relation to dual inheritance theory whereby, just as individuals can copy cultural behaviours from role models other than their parents, populations may sometimes acquire traditions from sources other than their immediate ancestors. However, as biologists have long known, horizontal transfers are not the only cause of discrepancies between co-evolving systems. In order to estimate horizontal transfers accurately, it is crucial to consider the possible roles played by sorting events and duplications, both of which have direct analogues in cultural evolution. Using the jungles algorithm, we were able to evaluate the likely role played by each of these processes in generating the conflicts between the textile phylogeny and population tree.
Two of the reconstructions returned by the analysis involved horizontal transfers, while a third did not. Since the latter required a significantly greater number of events than the other two, it was rejected. The two remaining reconstructions were equally parsimonious. One required two horizontal transfers, while the other required one horizontal transfer and one sorting event. By comparing both reconstructions with other sources of evidence, we concluded that the former was the more realistic scenario. Thus, having considered and ruled out the alternatives, we can be reasonably confident that in this case, horizontal transfers are likely to be the major source of inconsistencies between the textile phylogeny and the population phylogeny reconstructed from linguistic data and oral histories.
Of course, like weaving, both language and oral histories are socially transmitted, and as such cannot be regarded as unproblematic guides to population history. Some studies suggest that mismatches between language and genetic history are common among
pastoralist populations in the Middle East (Nettle & Harriss 2003), and that oral accounts of group origins can be ambiguous or misleading. As Barth (1961) explained in his classic study of nomads of South Persia, linguistic and ethnic identities are often based on a group’s political affiliations, rather than its actual historical origins. Barth (1961) describes several cases where groups are known to have adopted the language of politically dominant groups, initially becoming bilingual but ultimately switching completely to their new tongue. Thus, in the absence of genetic data, we cannot be certain that language and oral history provide an accurate reflection of group histories. Instead, they and the weaving traditions may represent different ‘packages’ of cultural inheritance (e.g. Boyd et al. 1997), whose descent histories differ from each other and from the ‘true’ population history of the tribes. An even more intriguing possibility is that these traditions are all tracking population histories, but different aspects of population history. Thus, whereas weaving is transmitted down the female line, oral history and ethno-linguistic affiliations are usually traced via males. Studies of population genetics in other patrilineal pastoralist groups in the region suggest that there are often differences in the migration histories of males and females in these populations, which can occur as a result of some patrilines expanding into others’ territories and then marrying with local females (e.g. Pérez-Lezaun et al. 1999; Chaix et al. 2007).
The complexities of human genetic and cultural histories here and elsewhere mean that in most cases there will not be a single phylogeny for either populations or their traditions. Reconciling these diverse lineages of inheritance is likely to present us with significant challenges. Fortunately, the progress that has been made in addressing similar problems in biology means that we are well-equipped to face them.
We thank Fiona Jordan, Jeremy Kendal, Kenny Smith, Robert Layton, Emma Flynn and Rachel Kendal for their input into the development of this article. We are also grateful for the feedback provided by participants at the CECD Theme B workshop on Cultural and Linguistic Diversity, where we presented this research. During the period in which most of the data were collected J.J.T. was funded by the Economic and Social Research Council, the Arts and Humanities Research Council and the Wenner-Gren Foundation for Anthropological Research. During the period in which the paper was written J.J.T. was supported by the Research Councils UK’s Academic Fellowship scheme. M.C. is supported by the Social Sciences and Humanities Research Council, the Canada Research Chairs Program, the Canada Foundation for Innovation, the British Columbia Knowledge Development Fund and Simon Fraser University.
REFERENCES Amanolahi, S. 1988 Tribes of Iran—vol. 1: Luristan, Bakhtiari, Kuh Gilu and Mamsani. New Haven, CT: Human Relations Area Files. Balfour, H. 1889 On the structure and affinities of the composite bow. J. Anthropol. Inst. Great Britain and Ireland. 19, 220–250. (doi:10.2307/2842074)
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
Cophylogeny of populations and cultures J. J. Tehrani et al. Barth, F. 1961 Nomads of south Persia. Oslo, Norway: Oslo University Press. Barthold, W. 1962 Four studies on the history of Central Asia. Leiden, The Netherlands: E.J.Brill. Beck, L. 1986 The Qashqa’i of Iran. New Haven, CT: Yale University Press. Boas, F. 1940 Race, language and culture. Chicago, IL: Chicago University Press. Boyd, R. & Richerson, P. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press. Boyd, R., Borgerhoff Mulder, M., Durham, W. H. & Richerson, P. 1997 Are cultural phylogenies possible? In Human by nature (eds P. Weingart, S. D. Mitchell, P. J. Richerson & S. Maasen), pp. 355 –386. Mahwah, NJ: Lawrence Erlbaum. Brooks, D. R. & McLennan, D. A. 1991 Phylogeny, ecology and behavior. Chicago, IL: University of Chicago Press. Buchanan, B. & Collard, M. 2007 Investigating the peopling of North America through cladistic analyses of early Paleoindian projectile points. J. Anthropol. Archaeol. 26, 366 –393. (doi:10.1016/j.jaa.2007.02.005) Buchanan, B. & Collard, M. 2008 Phenetics, cladistics and the search for the Alaskan ancestors of the Paleoindians: a reassessment of the relationships among the Clovis, Nenana and Denali archaeological complexes. J. Archaeol. Sci. 35, 1683–1694. (doi:10.1016/j.jas. 2007.11.009) Cavalli-Sforza, L. L. & Feldman, M. 1981 Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press. Chaix, R., Quintana-Murci, L., Hegay, T., Hammer, M., Mobasher, Z., Austerlitz, F. & Heyer, E. 2007 From social to genetic structures in Central Asia. Curr. Biol. 17, 43–48. (doi:10.1016/j.cub.2006.10.058) Charleston, M. A. 1998 Jungles: a new solution to the host/ parasite phylogeny reconciliation problem. Math. Biosci. 149, 191 –223. (doi:10.1016/S0025-5564(97)10012-8) Collard, M., Shennan, S. J. & Tehrani, J. 
2006 Branching, blending and the evolution of cultural similarities and differences among human populations. Evol. Human Behav. 27, 169 –184. (doi:10.1016/j.evolhumbehav. 2005.07.003) Conze, E. 1980 A short history of Buddhism. London, UK: George Allen & Unwin. Doolittle, W. F. 1999 Phylogenetic classification and the universal tree. Science 284, 2124–2128. (doi:10.1126/ science.284.5423.2124) Durham, W. H. 1991 Co-evolution: genes, culture, and human diversity. Chicago, IL: Stanford University Press. Farris, J. S. 1989a The retention index and homoplasy excess. Syst. Zool. 38, 406 –407. (doi:10.2307/2992406) Farris, J. S. 1989b The retention index and the rescaled consistency index. Cladistics 5, 417– 419. (doi:10.1111/j. 1096-0031.1989.tb00573.x) Felsenstein, J. 1985 Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791. (doi:10.2307/2408678) Frye, R. 1960 Baluchistan. A geography and history. Encycl. Islam 1, 1005– 1006. Garthwaite, G. R. 1983 Khans and Shahs: a documentary analysis of the Bakhtiyari in Iran. Cambridge, UK: Cambridge University Press. Gray, R. & Atkinson, Q. 2003 Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426, 435– 439. (doi:10.1038/nature02029) Gray, R. F. & Jordan, F. 2000 Language trees support the express-train sequence of Austronesian expansion. Nature 405, 1052–1055. (doi:10.1038/35016575) Gray, R. D., Greenhill, S. J. & Ross, R. M. 2008 The pleasures and perils of Darwinizing culture (with Phil. Trans. R. Soc. B (2010)
3873
phylogenies). Biol. Theory 2, 360 –375. (doi:10.1162/ biot.2007.2.4.360) Gray, R. D., Drummond, A. J. & Greenhill, S. J. 2009 Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323, 479– 483. (doi:10.1126/science.1166858) Grimes, B. F. 2002 Ethnologue: languages of the world, 14th edn. Dallas, TX: Summer Institute of Linguistics. Hafner, M. & Nadler, S. 1988 Phylogenetic trees support the coevolution of parasites and their hosts. Nature 332, 258–259. (doi:10.1038/332258a0) Hafner, M. & Page, R. 1995 Molecular phylogenies and host –parasite cospeciation: gophers and lice as a model system. Phil. Trans. R. Soc. Lond. B 349, 77–83. (doi:10.1098/rstb.1995.0093) Henrich, J. 2004 Demography and cultural evolution. How adaptive cultural processes can produce maladaptive losses—the Tasmanian case. Am. Antiquity 69, 197–214. (doi:10.2307/4128416) Holden, C. J. 2002 Bantu language trees reflect the spread of farming across sub-Saharan Africa: a maximum-parsimony analysis. Proc. R. Soc. Lond. B 269, 793– 799. (doi:10.1098/rspb.2002.1955) Jordan, P. & Mace, T. 2006 Tracking culture-historical lineages: can ‘descent with modification’ be linked to ‘association by descent’? In Mapping our ancestors (eds C. Lipo, M. O’Brien, M. Collard & S. Shennan), pp. 149– 168. New Brunswick, NJ: Aldine Transaction. Jordan, P. & Shennan, S. J. 2003 Cultural transmission, language, and basketry traditions amongst the Californian Indians. J. Anthropol. Archaeol. 22, 42–74. (doi:10.1016/S0278-4165(03)00004-7) Kitchen, A., Ehret, C., Assefa, S. & Mulligan, C. J. 2009 Bayesian phylogenetic analysis of Semitic languages identifies an Early Bronze Age origin of Semitic in the Near East. Proc. R. Soc. B 276, 2703– 2710. (doi:10. 1098/rspb.2009.0408) Konieczny, M. G. 1979 Textiles of Baluchistan. London, UK: British Museum. Kroeber, A. 1948 Anthropology: race, language, culture, psychology and prehistory. New York, NY: Brace. Lipo, C., O’Brien, M., Collard, M. 
& Shennan, S. J. (eds) 2006 Mapping our ancestors: phylogenetic approaches in anthropology and prehistory. New Brunswick, NJ: Aldine Transaction. Lycett, S. 2007 Why is there a lack of Mode 3 Levallois technologies in East Asia? A phylogenetic test of the Movius–Schick hypothesis. J. Anthropol. Archaeol. 26, 541–575. (doi:10.1016/j.jaa.2007.07.003) Lycett, S. 2009 Are Victoria West cores ‘proto-Levallois’? A phylogenetic assessment. J. Human Evol. 56, 175– 191. (doi:10.1016/j.jhevol.2008.10.001) Mace, R., Holden, C. & Shennan, S. 2005 The evolution of cultural diversity—a phylogenetic approach. London, UK: UCL Press. Moore, J. H. 1994 Putting anthropology back together again: the ethnogenetic critique of cladistic theory. Am. Anthropol. 96, 925 –948. (doi:10.1525/aa.1994.96.4.02a00110) Nettle, D. & Harriss, L. 2003 Genetic and linguistic affinities between human populations in Eurasia and West Africa. Hum. Biol. 75, 331–344. (doi:10.1353/hub.2003.0048) Nunn, C. L., Arnold, C., Matthews, L. & Mulder, M. B. 2010 Stimulating trait evolution for cross-cultural comparison. Phil. Trans. R. Soc. B 365, 3807–3819. (doi:10. 1098/rstb.2010.0009) O’Brien, M. J. & Lyman, R. L. 2003 Cladistics and archaeology. Salt Lake City, UT: University of Utah Press. Oberling, P. 1974 The Qashqa’i nomads of fars. The Hague, The Netherlands: Mouton.
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
3874
J. J. Tehrani et al.
Cophylogeny of populations and cultures
Ohamgari, K. & Berkes, F. 1997 Transmission of indigeneous knowldege and bush skills among the western James Bay Cree women of subartic Canada. Human Ecol. 25, 197 –222. (doi:10.1023/A:1021922105740) Page, R. (ed.) 2003 Tangled trees: phylogeny, cospeciation and coevolution. Chicago, IL: University of Chicago Press. Page, R. & Charleston, M. 1998 Trees within trees: phylogeny and historical associations. Trends Ecol. Evol. 13, 356– 359. (doi:10.1016/S0169-5347(98) 01438-4) Page, R. & Charleston, M. 2002. TreeMap 2.0. Available to download at http://taxonomy.zoology.gla.ac.uk/%7emac/ treemap/index.html. Paterson, A. M., Wallis, G. P. & Gray, R. D. 1999 How frequently do avian lice miss the boat? Syst. Biol. 48, 214–223. (doi:10.1080/106351599260544) Pe´rez-Lezaun, A. et al. 1999 Sex-specific migration patterns in Central Asian populations, revealed by analysis of Y-chromosome short tandem repeats and mtDNA. Am. J. Human Genet. 65, 208 –219. (doi:10.1086/ 302451) Richerson, P. & Boyd, R. 2005 Not by genes alone: how culture transformed human evolution. Chicago, IL: University of Chicago Press. Riede, F. 2009 Tangled trees: modelling material culture evolution as host –associate co-speciation. In Pattern and process in cultural evolution (ed. S. Shennan), pp. 85– 99. Berkeley, CA: University of California Press. Rogers, D, Feldman, M. & Ehrlich, P. 2009 Inferring population histories using cultural data. Proc. R. Soc. B 276, 3835–3843. (doi:10.1098/rspb.2009.1088) Ronquist, F. 1996 Reconstructing the history of host – parasite associations using generalised parsimony. Cladistics 11, 73–89. (doi:10.1111/j.1096-0031.1995. tb00005.x) Ronquist, F. 1998 Phylogenetic approaches in coevolution and biogeography. Zool. Scripta 26, 313– 322. (doi:10. 1111/j.1463-6409.1997.tb00421.x)
Phil. Trans. R. Soc. B (2010)
Rudenko, S. 1970 Frozen tombs of Siberia: the Pazyryk burials of iron-age horsemen. Berkeley, CA: University of California Press.
Swofford, D. L. 1998 PAUP* 4. Phylogenetic analysis using parsimony (*and other methods), v. 4. Sunderland, MA: Sinauer.
Tanavoli, P. 1985 Shahsevan Iranian rugs and textiles. New York, NY: Rizzoli.
Tehrani, J. & Collard, M. 2002 Investigating cultural evolution through biological phylogenetic analyses of Turkmen textiles. J. Anthropol. Archaeol. 21, 443–463. (doi:10.1016/S0278-4165(02)00002-8)
Tehrani, J. & Collard, M. 2009a On the relationship between inter-individual cultural transmission and population-level cultural diversity: a case study of weaving in Iranian tribal populations. Evol. Human Behav. 30, 286–300. (doi:10.1016/j.evolhumbehav.2009.03.002)
Tehrani, J. & Collard, M. 2009b The evolution of cultural diversity among Iranian tribal populations. In Pattern and process in cultural evolution (ed. S. Shennan), pp. 99–111. Berkeley, CA: University of California Press.
Temkin, I. & Eldredge, N. 2007 Phylogenetics and material culture evolution. Curr. Anthropol. 48, 146–153. (doi:10.1086/510463)
Templeton, A. 1983 Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37, 221–224. (doi:10.2307/2408332)
Thompson, J. 1980 Turkmen carpet weavings. In Turkmen: tribal carpets and traditions (eds L. Mackie & J. Thompson), pp. 60–191. Washington, DC: The Textile Museum.
Thompson, J. 2002 The Baluch. In The nomadic peoples of Iran (eds R. Tapper & J. Thompson), pp. 298–304. London, UK: Thames and Hudson.
Tzavera, E. 1984 Rugs and carpets from Central Asia. Leningrad, USSR: Aurora Art Publishers.
Windfuhr, G. 2007 Iranian languages. London, UK: Routledge Curzon.
Phil. Trans. R. Soc. B (2010) 365, 3875–3888 doi:10.1098/rstb.2010.0092
Untangling cultural inheritance: language diversity and long-house architecture on the Pacific northwest coast
Peter Jordan1,2 and Sean O’Neill1,2,*
1 Department of Archaeology, University of Aberdeen, St Mary’s Building, Elphinstone Road, Aberdeen AB24 3UF, UK
2 AHRC Centre for the Evolution of Cultural Diversity, Institute of Archaeology, University College London, 31–34 Gordon Square, London WC1H 0PY, UK
Many recent studies of cultural inheritance have focused on small-scale craft traditions practised by single individuals, which do not require coordinated participation by larger social collectives. In this paper, we address this gap in the cultural transmission literature by investigating diversity in the vernacular architecture of the Pacific northwest coast, where communities of hunter–fisher–gatherers constructed immense wooden long-houses at their main winter villages. Quantitative analyses of long-house styles along the coastline draw on a range of models and methods from the biological sciences and are employed to test hypotheses relating to basic patterns of macro-scale cultural diversification, and the degree to which the transmission of housing traits has been constrained by the region’s numerous linguistic boundaries. The results indicate relatively strong branching patterns of cultural inheritance and also close associations between regional language history and housing styles, pointing to the potentially crucial role played by language boundaries in structuring large-scale patterns of cultural diversification, especially in relation to ‘collective’ cultural traditions like housing that require substantial inputs of coordinated labour.
Keywords: ethnogenesis; phylogenesis; architecture; cultural transmission; hunter–gatherers; Pacific northwest coast
* Author for correspondence ([email protected]). One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
1. INTRODUCTION
A growing body of empirical research is focusing on the inheritance and diversification of technological traditions. Many studies of material culture diversity now adopt an explicitly Darwinian perspective on ‘cultural transmission’, employing models, analytical methods and theory from evolutionary biology to study analogous processes in the cultural domain. At the core of these ‘descent with modification’ approaches is the observation that overall, for a variety of reasons, people tend to imitate others when acquiring cultural traditions rather than invent new skills and practices entirely by themselves, generating a tendency for historical continuity in cultural traditions, rather than radical change (Boyd & Richerson 1985).
While there is a series of well-understood ‘micro-scale’ processes by which individuals acquire practices within social groups, the large-scale outcomes of these processes are less well understood, especially at population levels (e.g. Collard et al. 2008 with references). In particular, vigorous debates about the most likely patterns of macro-scale cultural diversification are still focusing on two mutually exclusive and competing theoretical models, the first termed the ‘branching’ (phylogenesis) hypothesis and the second the ‘blending’ (ethnogenesis) hypothesis.
The branching model predicts that macro-scale cultural diversification takes place when initial populations demographically expand and then split into successive generations of daughter populations, each new population carrying a modified set of cultural traditions with it. In other settings, similar outcomes may be generated as interacting communities reduce the degree to which they borrow and blend their traditions with other groups, perhaps as a result of emerging hostilities, ideologies of exclusion, perceived cultural or ethnic identities and other factors. In these settings, local cultural traditions may eventually become ‘insulated’ from outside influences, ensuring strong patterns of vertical transmission within each community. Through time, the dominance of vertical transmission within populations ensures that cultural diversification proceeds in a branching manner akin to biological speciation, enabling patterns of historical relatedness among cultural traditions to be mapped as branching tree diagrams (for recent reviews see Mace et al. 2005; Lipo et al. 2006; Gray et al. 2007; Collard et al. 2008 with references; O’Brien 2008; but see also Borgerhoff Mulder et al. 2006; Temkin & Eldredge 2007).
According to the alternative blending or ‘ethnogenesis’ model, populations have rarely, if ever, been completely isolated from one another and have always engaged in a ready horizontal exchange of
This journal is © 2010 The Royal Society
traditions, ideas and practices. This tendency encourages a rapid and ceaseless blending of cultural traditions across time and space, creating a blur of hybrid forms whose patterns of descent are simply too chaotic for any kind of coherent historical signal to be maintained (Terrell 1987, 1988; Moore 1994, 2001).
More recent investigations employing a quantitative analytical approach have added new dimensions to these debates by demonstrating empirically that macro-scale cultural evolution varies enormously according to the culture–historical context, with branching predominating in some settings (e.g. Gray & Jordan 2000; Tehrani & Collard 2002, 2009a), and blending in others (Jordan & Shennan 2003; Jordan 2007). Further case studies have added new levels of complexity by exploring the degree to which a broad suite of material culture traditions and languages have been transmitted in tandem. For example, a number of individual cultural traditions may be characterized by similar patterns of branching descent, while other traditions practised by the same communities may deviate from this pattern, following a more hybridized pattern of inheritance (Jordan & Mace 2006, 2008; Jordan 2009; Jordan & Shennan 2009; for a useful range of summary models, see Boyd et al. 1997). Finally, more studies are now starting to address the major gap in our current understanding of the relationship between micro-level (inter-individual) transmission and population-level cultural diversity (Tehrani & Collard 2009b). As the range of published quantitative analyses of cultural transmission expands and diversifies in terms of subject matter and culture–historical setting, we are now moving away from theoretical models towards a fuller empirical sense of both the complexity and local variability that characterizes macro-scale cultural inheritance.
Understanding the specific processes that generate these patterns of variability will demand a renewed focus on local social settings in order to understand how and why individuals and populations interact, exchange, re-combine or withhold cultural traits in the myriad ways that they do. To date, most research on the transmission of material culture traditions has tended to focus on the dynamics of ‘small-scale’ portable crafts that are made by individual practitioners, such as textiles, baskets, clothing and other items in ethno-historic studies, and projectile points and pottery in archaeological analyses (Tehrani & Collard 2002; Jordan & Shennan 2003; chapters in Mace et al. 2005; Eerkens & Lipo 2005; Lipo et al. 2006; Buchanan & Collard 2007; Stark et al. 2008; O’Brien 2008; Jordan 2009). In contrast, much less research has been directed at understanding the dynamics of large-scale undertakings like communal architecture (but see Jordan 2007; Jordan & Mace 2008; Jordan & Shennan 2009). These larger cultural projects cannot be executed by one person working in isolation, are often undertaken less frequently than the day-to-day production of small craft items, and tend to require closely coordinated labour inputs from the wider social group. Given these different characteristics, the transmission and diversification of ‘collective’ material
culture traditions, like housing, are likely to possess their own set of dynamics, although these remain poorly understood.
In this paper, we aim to make two contributions to the cultural transmission literature: first, we test basic models about probable patterns of macro-scale cultural diversification in vernacular ‘long-house’ architecture on the Pacific northwest coast; second, we test whether diversity in long-house styles has been constrained by language boundaries. Two central questions are addressed:
Q1: Have northwest coast long-house traditions been characterized by branching or blending patterns of community-scale transmission?
Q2: Have housing styles been transmitted in tandem with languages, with linguistic boundaries serving to ‘canalize’ the vertical inheritance of architectural styles within local communities?
In line with similar case studies, the present analysis introduces the ethno-history of the study area, explores local long-house traditions and then employs multiple quantitative methods, each based on different assumptions, to cross-check insights into macro-scale cultural diversification (see Jordan & Shennan 2009).
2. COMPLEX HUNTER–GATHERERS OF THE PACIFIC NORTHWEST COAST
The rich ethno-historic record of the Pacific northwest coast has fascinated anthropologists for many generations (Suttles 1990): stretching from Yakutat Bay in Alaska down to northern California, this narrow but tremendously rich and varied ecozone was occupied at the time of European contact by a string of distinctive hunter–fisher–gatherer cultures who spoke a multitude of different languages, practised salmon-based storage economies, occupied permanent winter villages and were organized into highly stratified kin-groups who owned resource sites, houses and other properties (Jorgensen 1980; Carlson 1983; Suttles 1990; Matson & Coupland 1995; Ames & Maschner 1999).
One striking feature of these unique coastal hunter–gatherer societies was their elaborate decorative art based mainly on woodworking (Boas 1955; Drucker 1955; Inverarity 1971; Hawthorn 1979; Jonaitas 1981; Stewart 1984; Emmons 1991). This included the construction of immense wooden long-houses, which were built according to strikingly different styles on different reaches of the coast (Drucker 1955; Vastokas 1966; Nobokov & Easton 1989; Suttles 1990). Long-houses were generally located at the main winter villages, and served as storage points and ritual settings, as well as primary dwellings for multiple families organized into different lines of descent and usually led by a house chief. Construction and decoration of new houses was an immense logistical operation involving sustained work by a coordinated pool of labour, which was supervised and directed by chiefs and specialist builders as they endeavoured to follow process-based ‘recipes’ of house construction (see O’Brien et al. 2010).
Table 1. Central and northern Pacific northwest coast: ethno-linguistic communities (names, codes and linguistic affinities; after Drucker 1950).
No.  Ethno-linguistic community (a)         Code (current paper)   CED code (a)  Language, local (b)  Branch (b)      Family (b)
1    Chilkat                                Tlingit 1              LC            Tlingit              Tlingit         Tlingit
2    Sanyakwan                              Tlingit 2              LS            Tlingit              Tlingit         Tlingit
3    Skidegate                              Haida 1                HS            Haida                Haida           Haida
4    Massett                                Haida 2                HM            Haida                Haida           Haida
5    Gitskan (Kispiyox division)            Nass-Gitskan           GK            Nass-Gitskan         Tsimshian       Tsimshian
6    Tsimshian Proper (Gilutsa division)    Tsimshian 1            TG            Coast Tsimshian      Tsimshian       Tsimshian
7    Southern Tsimshian (Kitqata division)  Tsimshian 2            TH            Coast Tsimshian      Tsimshian       Tsimshian
8    Xaisla (Kitimat)                       Xaisla                 KX            Xaisla               Kwakiutlan      Wakashan
9    Xaihais (China Hat)                    Heiltsuk-Oowekyala 1   KC            Heiltsuk-Oowekyala   Kwakiutlan      Wakashan
10   Bella Bella (Oyalit division)          Heiltsuk-Oowekyala 2   KO            Heiltsuk-Oowekyala   Kwakiutlan      Wakashan
11   Bella Coola                            Bella Coola            BC            Bella Coola          Bella Coola     Salishan
12   Wikeno                                 Heiltsuk-Oowekyala 3   KW            Heiltsuk-Oowekyala   Kwakiutlan      Wakashan
13   Koskimo                                Kwakwaka’wakw 1        KK            Kwakwaka’wakw        Kwakiutlan      Wakashan
14   Kwexa                                  Kwakwaka’wakw 2        KR            Kwakwaka’wakw        Kwakiutlan      Wakashan
15   Clayoquot                              Nuu-chah-nulth 1       NC            Nuu-chah-nulth       Nuu-chah-nulth  Wakashan
16   Tsishaat                               Nuu-chah-nulth 2       NT            Nuu-chah-nulth       Nuu-chah-nulth  Wakashan
17   Hupachisat                             Nuu-chah-nulth 3       NH            Nuu-chah-nulth       Nuu-chah-nulth  Wakashan
(a) After Drucker (1950).
(b) After Thompson & Kincade (1990).
Mobilizing this social effort would have required a high level of power and wealth. Generally, the planning and the building of a new residential house was initiated by the pre-arrangement of an elite marriage, and construction usually took many years of preparation, starting with the procurement of raw materials from forest stands owned by a consenting chief, and their subsequent preparation by large groups of skilled craftsmen working over several seasons. Construction was supervised by men (Stewart 1984, p. 61) and it is now believed that often as many as 200 people would have worked simultaneously under the direction of a chief during the more intensive construction phases, for example, while raising the massive timber frames, which consisted of logs some 2 m in diameter, some of which had to be perfect-joined with each other while hanging in mid-air (Stewart 1984, pp. 61–63). Consisting of massive timbers, these houses could survive decades with only minor repairs (MacDonald 1983a,b; Stewart 1984; Nobokov & Easton 1989; Samuels 1991). Once built, they saw collective use by multiple families, each occupying a designated section of the vast interiors. Building a new structure would therefore represent a tremendously important social statement, providing a focus for new household identities, and reflecting and accommodating the needs, expectations and aspirations of extended kinship structures.
3. MATERIALS: NORTHWEST COAST HOUSING TRAITS
There is a well-established typology of Pacific northwest coast housing styles dating to the later nineteenth century ethno-historic ‘present’ (Drucker 1955; Vastokas 1966; Nobokov & Easton 1989; Suttles 1990), and growing archaeological understanding of developments in long-house architecture prior to this (Samuels 1991; Matson & Coupland 1995; Coupland 1996; Ames & Maschner 1999; Matson 2003). In general, northern houses (e.g. found among the Tlingit, Haida and Tsimshian) were built to a precise rectangular ground plan, with mortice and tenon joints supporting a high gabled roof. Once erected, these buildings could not be extended further without being completely dismantled and rebuilt. They housed substantial communities, consisting of nobles, commoners and slaves (but see Matson & Coupland 1995). Hereditary titles, wealth and status were inherited down the matrilineal line. In contrast, shed-roof houses were built further to the south, and involved simpler construction consisting of support posts and rafter beams. This fixed framework was clad with removable planks, which enabled the building to be adapted or extended according to the size of the community it sheltered during any one season. These mutable structures could be seen to reflect a flexible and egalitarian system of the reward of title and inheritance from one generation to the next, based on relatively meritocratic and inclusive social traditions, and clearly manifested in numerous aspirational potlatching events (Rosman & Rubel 1971, pp. 176–200).
Drucker (1950) systematically recorded these variations in housing styles among 17 communities (table 1) inhabiting the central and northerly sections of the classic ‘Northwest Coast Culture Area’ (e.g. Jorgensen 1980, p. 19). These data form the basis of the present case study, and we also follow Drucker (1950) in focusing only on housing from Chilkat in the north down to Vancouver Island; at this stage, we do not include housing from the Gulf of Georgia
Table 2. Trait-based documentation of house-building traditions of the Pacific northwest coast (edited from Drucker 1950).
Description of house traits
General categories (in order): house pits; pilings; wall planks; posts; roof construction; wall support; roof boards; floors; fireplaces; sleeping platforms; storage shelves; partitions; doorways; house facades; furniture; wall lining.
Trait number  Trait description
1   excavated central pit
2   series of steps into the pit
3   pit walls plank-lined
4   house built on pilings
5   wall planks detachable for move to summer houses
6   framework and wall planks inseparable
7   round posts
8   squared posts
9   zoomorphic relief carvings on posts
10  two-pitch roof
11  one-pitch roof (‘shed roof’)
12  single ridgepole
13  ridgepole as lintel directly on posts
14  ridgepole on cross-lintel
15  double ridgepole
16  intermediate beams
17  roof plates and sills
18  slots for wall sheathing
19  wall sheathing horizontal
20  supported between vertical stakes
21  overlapping clapboard
22  wall sheathing vertical
23  roof of boards
24  roof of bark
25  overlapping peak
26  ridge cover: dugout pole
27  ridge cover: horizontal boards
28  earth floor
29  board floor
30  corner fireplaces
31  central fireplace: for rituals only
32  central fireplace: for everyday use
33  fire on floor level
34  fire in pit
35  roof boards moved to allow smoke escape
36  central smokehole
37  adjustable smokehole shield
38  sleeping platform around walls
39  sleeping platform made of boards
40  sleeping platform segmented
41  high shelves for storage
42  private sleeping cubicles
43  partitions between spaces
44  doorway in gable end
45  doorway rectangular
46  door oval or round
47  entry directly through portal pole
48  door wooden
49  door propped against opening
50  door suspended at top
51  façade of house painted
52  individual backrests or settees
53  above items painted
54  wood stools
55  walls at sleeping places lined with mats
Documentation of housing traditions (traits 1–55 in order; 1, present; 0, absent)
Tlingit 1             1110010101000010111001100100100101011000110110010100000
Tlingit 2             1110010111000010110001111000100101011110111111110110000
Haida 1               1110011101000010110001010000100101011111110101110111101
Haida 2               1110011111000011110001110000100101011111111111110111101
Nass-Gitskan          0000011011000011110001010001000110011000100110110011111
Tsimshian 1           1111011011000011110001110011000110011110010111010111111
Tsimshian 2           0001011011000010110001101001000110011110010111110011101
Xaisla                1111011011000010110001101001000110011111001110111011101
Heiltsuk-Oowekyala 1  1110011011000010110001101001000110011111010111110111101
Heiltsuk-Oowekyala 2  1110011011000010110001101001000110011110110111010011101
Bella Coola           1111011011100010111110101001011010011110111110111011101
Heiltsuk-Oowekyala 3  1111011011000010111111100011011010011111011110111011101
Kwakwaka’wakw 1       1111101011011110001110101001011010100100110110010111100
Kwakwaka’wakw 2       1110101011011110001110101001011010100110111111011011100
Nuu-chah-nulth 1      0000101011011100001110100011011001100110101110000010000
Nuu-chah-nulth 2      0000101011100000001110101001011010100111100010011010000
Nuu-chah-nulth 3      0000101010100000001110100001011010100111100010010010000
Salish (Barnett 1939; Jordan & Mace 2008) or other areas further to the south (i.e. down to California). Moreover, in contrast to Jordan & Mace’s (2006) earlier case study, our current focus is strictly on understanding diversification in the style of ‘dwelling houses’ (i.e. ‘rectangular plank houses’; Drucker 1950, pp. 178–180), and not on diversity in Drucker’s broader general category of ‘structures’, which includes bark-houses, earth lodges, storehouses, caches, stockades and sweathouses as well as dwelling houses (Drucker 1950, pp. 180–181). Each of these forms of vernacular architecture could potentially have been affected by a wide range of different transmission processes; a sharper focus on long-houses enables us, in the current paper, to concentrate on understanding the inheritance of a single coherent cultural tradition at the heart of daily community life.
Drucker (1950) records variations in long-house architecture in terms of distinct traits that are systematically recorded as being ‘present’ or ‘absent’ across the 17 ethno-linguistic communities (table 2). For example, the survey captures the major distinctions between ridgepole/gable-roofed structures found in the north, through to the shed-roof structures found further to the south, as well as the more subtle gradations between these idealized types in the intervening communities (e.g. Vastokas 1966). Several rows of Drucker’s original dataset contained missing information; we retained only rows with full sets of data, which generated a binary data matrix of 55 cultural traits for the 17 communities (table 2).
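The filtering step described in the preceding paragraph can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure actually used to prepare table 2: the function name `complete_traits`, the use of `None` to mark a missing entry and the toy values are all hypothetical.

```python
from typing import Dict, List, Optional


def complete_traits(raw: Dict[str, List[Optional[int]]]) -> Dict[str, List[int]]:
    """Keep only traits (rows) recorded as 0 or 1 in every community."""
    return {
        trait: [int(v) for v in values]
        for trait, values in raw.items()
        if all(v in (0, 1) for v in values)
    }


# Hypothetical toy survey for three communities; None marks a missing entry.
raw_survey = {
    "two-pitch roof": [1, 1, 0],
    "earth floor": [0, None, 1],  # dropped: one community is unrecorded
    "fire in pit": [1, 0, 0],
}

matrix = complete_traits(raw_survey)
print(sorted(matrix))  # ['fire in pit', 'two-pitch roof']
```

Dropping whole traits rather than whole communities keeps all 17 communities in the analysis, at the cost of a shorter trait list.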
4. MODELS AND METHODS
Because imitation, innovation and imperfect replication are central mechanisms in cultural inheritance, we draw heuristic parallels between these cultural processes and a range of analogous processes operating in biological evolution (Cavalli-Sforza & Feldman 1981; Boyd & Richerson 1985; Durham 1992; Shennan 1997, 2002, 2004; Collard & Shennan 2008; Collard et al. 2008). While there are many fundamental differences in cultural and biological evolution (e.g. humans have only two biological parents while their cultural traditions may be acquired from multiple sources in both older and contemporary generations), both can usefully be understood as systems of information transmission that operate along principles of ‘descent with modification’. Recognizing and exploring both the positive and negative analogies between cultural and biological systems of inheritance also opens the way for application of quantitative analytical tools developed by evolutionary biologists in more rigorous analyses of cultural transmission and cumulative diversification.
(a) Tree-based methods
Phylogenetic analysis is employed by biologists to reconstruct the genealogies of organisms, and rests on the axiom that evolutionary relationships can be represented by a branching tree diagram (Hennig 1966; Forey et al. 1992; Kitching et al. 1998). The key principle involves defining traits and then identifying the presence or absence of these traits across a range of taxa; descent relationships can be reconstructed by determining which similarities are derived from shared common ancestry (homologies), and which are a result of other processes, including lateral borrowing and hybridization (homoplasies; Hennig 1966; Forey et al. 1992). Given these goals, biologists have tended to regard homologies as the most important signal for discovering branching evolutionary relationships, whereas signals for homoplastic convergences between lineages tend to be regarded as background noise, which obscures attempts to reconstruct deeper evolutionary relationships (Forey et al. 1992, p. 3). As a range of studies has shown, cultural differences between communities can also be recorded in terms of the presence and absence of particular traits, but in contrast to biologists, anthropologists are equally interested in identifying signals for lateral hybridization as well as common ancestry, because either process may have predominated in a given culture–historical setting (see Holden & Shennan 2005; Gray et al. 2007; Collard et al. 2008 for recent reviews).
In applying phylogenetic analyses to cultural data, the relative proportions of homology and homoplasy in any given dataset can be measured statistically, quantifying the closeness of fit between the patterns in a data matrix, and a tree model derived from that data. If the statistical measures indicate that the data fit the tree model closely, it can be argued that branching transmission has predominated. If the fit is poor, then it can be argued that processes other than branching have dominated.
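One common way to quantify how well a single trait fits a given tree is to count the minimum number of state changes the tree requires for it. The sketch below uses Fitch's bottom-up parsimony pass for that purpose; it is offered only as an illustration, since the present study relies on PAUP* rather than on code of this kind. The four-community topology is a hypothetical assumption, and the two characters are columns 7 and 9 of the table 2 matrix (round posts; zoomorphic relief carvings on posts) for the two Tlingit and two Haida communities.

```python
from typing import Dict, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_steps(tree: Tree, states: Dict[str, int]) -> Tuple[Set[int], int]:
    """Return (candidate ancestral states, minimum number of changes) for one
    character on a rooted binary tree, using Fitch's bottom-up pass."""
    if isinstance(tree, str):  # tip: the state is observed directly
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    shared = left_set & right_set
    if shared:  # the two subtrees can agree: no extra change needed
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


# Hypothetical topology, chosen only for illustration.
toy_tree: Tree = (("Tlingit 1", "Tlingit 2"), ("Haida 1", "Haida 2"))

round_posts = {"Tlingit 1": 0, "Tlingit 2": 0, "Haida 1": 1, "Haida 2": 1}
post_carvings = {"Tlingit 1": 0, "Tlingit 2": 1, "Haida 1": 0, "Haida 2": 1}

print(fitch_steps(toy_tree, round_posts)[1])    # 1 change: consistent with the tree
print(fitch_steps(toy_tree, post_carvings)[1])  # 2 changes: one extra, homoplastic step
```

A character that needs only one change on the tree is fully consistent with branching descent; each additional change is a step that the tree cannot explain by common ancestry alone.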
Figure 1. Location map of various ethno-linguistic communities on the Pacific northwest coast (after Drucker 1950). Filled squares, Tlingit; open squares, Haida; filled circles, Kwakiutlan; open circles, Tsimshian; filled triangles, Nuu-chah-nulth; open triangles, Salishan. Scale bar, 300 km.
In the present analysis, a general heuristic search was performed using the PAUP* 4.0b10 phylogenetic software (Swofford 1998) with the following settings: optimality criterion as parsimony; starting trees obtained via stepwise addition and the branch swapping algorithm set as tree-bisection-reconnection. The results were interpreted using the outgroup method (Watrous & Wheeler 1981; Farris 1982; Clark & Curran 1986), which is commonly used to root the tree (Smith 1994, pp. 55–58; Kitching et al. 1998). We selected the Salish-speaking Bella Coola as the outgroup, on the basis that they are a linguistic isolate in the region, whereas all other communities are aligned with the coast’s larger language families (Tlingit, Haida, Tsimshian and Wakashan; see Thompson & Kinkade 1990; table 1, figures 1 and 2).
A further descriptive statistic was calculated to test for relative degrees of branching and blending in the housing dataset. Computer algorithms will construct tree diagrams from random data, making it important to directly measure the strength of the phylogenetic signal in any given dataset. In the present study, we employ the ‘Retention Index’ (RI; Farris 1989a,b), which calculates the amount of homoplasy as a fraction of the maximum possible homoplasy (Forey et al. 1992, p. 75). The RI ranges in principle from 0 to 1.0, with a high RI taken as being consistent with vertical transmission, and hence a branching pattern of phylogenesis. In addition, the RI is a useful goodness-of-fit measure because it is not affected by either the number of taxa or the number of characters, enabling
Figure 2. Pacific northwest coast language tree (after Thompson & Kincade 1990 and based on qualitative assessment of linguistic diversity; for further explanation, see text). Tips, in the order shown: Bella Coola, Tlingit 1, Tlingit 2, Haida 1, Haida 2, Nass-Gitskan, Tsimshian 1, Tsimshian 2, Xaisla, Heiltsuk-Oowekyala 1, Heiltsuk-Oowekyala 2, Heiltsuk-Oowekyala 3, Kwakwaka’wakw 1, Kwakwaka’wakw 2, Nuu-chah-nulth 2, Nuu-chah-nulth 1, Nuu-chah-nulth 3.
results to be compared across a range of case studies. For example, Collard et al. (2006a,b) employed RIs to examine the relative strength of branching signals across a broad range of biological and cultural datasets (Collard et al. 2006a, p. 57). Recent simulation work by Nunn et al. (2010) has also tested the robustness of RI measures, and has concluded that RI values greater than 0.6 do indicate low levels of horizontal transmission, high degrees of vertical transmission and hence the predominance of ‘phylogenesis’ over ethnogenesis.
Finally, it is important to specifically identify which sections of the tree diagram are well supported by the existence of hierarchical structures in the data matrix (Smith 1994, p. 48). Bootstrap analysis (Smith 1994, p. 50) is a random sampling program that calculates percentage levels of support for each branch in the tree (e.g. Forey et al. 1992, p. 76), and a level of support over 50 per cent should be interpreted as a highly conservative measure of the accuracy of tree structure (Smith 1994, p. 51). Bootstrap supports were calculated with 1000 replications in PAUP* 4.0b10 (Swofford 1998), and only tree branches with over 70 per cent bootstrap support were retained.
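The two statistics described above can be sketched in a few lines of Python. The sketch assumes the usual formulation of the ensemble RI for binary characters, RI = (G - S) / (G - M), where S is the number of changes observed on the tree, M the minimum possible on any tree and G the maximum possible on any tree; the function names are hypothetical, and the observed step counts are assumed to come from a separate tree search (such as PAUP*), which is not shown. The bootstrap function only produces one pseudo-replicate by resampling characters with replacement; each replicate would still need its own tree search.

```python
import random
from typing import List, Sequence


def retention_index(columns: Sequence[Sequence[int]],
                    observed_steps: Sequence[int]) -> float:
    """Ensemble RI = (G - S) / (G - M) for binary (0/1) characters."""
    if len(columns) != len(observed_steps):
        raise ValueError("need one observed step count per character")
    g_total = m_total = 0
    for col in columns:
        ones = sum(col)
        zeros = len(col) - ones
        g_total += min(ones, zeros)            # most changes any tree could need
        m_total += 1 if ones and zeros else 0  # fewest changes any tree could need
    s_total = sum(observed_steps)
    if g_total == m_total:
        raise ValueError("no parsimony-informative variation in the data")
    return (g_total - s_total) / (g_total - m_total)


def bootstrap_columns(columns: Sequence[Sequence[int]],
                      rng: random.Random) -> List[Sequence[int]]:
    """Resample characters with replacement, keeping their number fixed."""
    return [rng.choice(columns) for _ in columns]


# Two toy characters scored for four communities, needing 1 and 2 changes
# on some tree: G = 2 + 2, M = 1 + 1, S = 1 + 2.
toy_columns = [[0, 0, 1, 1], [0, 1, 0, 1]]
print(retention_index(toy_columns, [1, 2]))  # 0.5

replicate = bootstrap_columns(toy_columns, random.Random(0))
print(len(replicate))  # 2
```

An RI of 1.0 would mean that every character changes only as often as it must, and 0 that every character changes as often as it possibly could.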
(b) Network-based methods
In addition to using tree-based models, biologists have begun to develop network-based methods to explore more complex evolutionary relationships characterized by potentially higher levels of lateral transfer (Bryant et al. 2005, p. 80). The NeighborNet technique (Bryant & Moulton 2004; and see Bryant et al. 2005 for a recent application to the analysis of linguistic history; see also Gray et al. 2007 for further discussion) starts by calculating a distance matrix from the dataset; these distances are then used to generate a series of ‘splits’ in the data, using an agglomerative clustering algorithm, which progressively combines clusters into larger and larger overlapping clusters. Weights are then calculated for these splits, which are represented in the form of a network diagram known as a ‘split graph’ (see Bryant et al. 2005, pp. 68–69, 74–79). Each split graph (or plot) contains two kinds of information: the splits, which represent the groupings in the data; and the branch lengths, which indicate the degree of separation for each split (Bryant et al. 2005, p. 77). For example, where phylogenesis has been the dominant process of cultural diversification, the split graph will closely resemble a tree diagram, as cultural descent with modification will have
proceeded in a strict branching manner. Conversely, if borrowing and hybridization have been widespread, then the diagram will be much more complex, with conflicting signals represented as ‘box-like’ sections in the graph. These conflicting signals may, in extreme cases, be so strong that the split graph includes multiple boxes that reflect frequent instances of lateral borrowing and recombination. In the current study, we employed NeighborNet (Bryant & Moulton 2004) as incorporated into SplitsTree v. 4beta10 (Huson & Bryant 2006).
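Only the first step of this procedure, the distance matrix, is simple enough to sketch here. The example below computes the proportion of mismatching traits (a normalized Hamming distance) between three rows copied from table 2. This choice of distance is an assumption made for illustration; the measure used inside SplitsTree may differ, and the clustering and split-weighting steps of NeighborNet are not reproduced.

```python
from itertools import combinations
from typing import Dict, Tuple

# Three rows of the binary trait matrix, copied from table 2.
ROWS: Dict[str, str] = {
    "Tlingit 1": "1110010101000010111001100100100101011000110110010100000",
    "Tlingit 2": "1110010111000010110001111000100101011110111111110110000",
    "Nuu-chah-nulth 3": "0000101010100000001110100001011010100111100010010010000",
}


def hamming(a: str, b: str) -> float:
    """Proportion of traits on which two communities differ."""
    if len(a) != len(b):
        raise ValueError("rows must score the same number of traits")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(rows: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Symmetric pairwise distances, with zeros on the diagonal."""
    dist = {(name, name): 0.0 for name in rows}
    for (n1, s1), (n2, s2) in combinations(rows.items(), 2):
        dist[(n1, n2)] = dist[(n2, n1)] = hamming(s1, s2)
    return dist


dist = distance_matrix(ROWS)
for pair, value in sorted(dist.items()):
    print(pair, round(value, 3))
```

On these rows, the two Tlingit communities differ on fewer traits than Tlingit 1 and Nuu-chah-nulth 3 do, which is the kind of contrast that the later clustering steps turn into splits.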
(c) Testing for co-transmission of housing and language Tree- and network-based analytical methods rest on different assumptions, and their combined application to a single cultural dataset enables the results to be cross-checked. Where branching processes do appear to have generated distinct cultural lineages, a further series of hypotheses can be tested, for example, the fidelity with which cultural traditions and language history have tracked each other through time, or the extent to which several cultural traditions have been co-transmitted (see Jordan & Mace 2006; Jordan & Shennan 2009). Analogous processes of co-transmission are recorded in biological evolution, and a range of methods is now available for identifying the patterns of ‘co-speciation’ that arise from closely shared evolutionary histories (Page 2003; Page & Charleston 1998; and see Tehrani et al. 2010). An analogous culture–historical scenario would involve close association between independent material culture lineage(s) and/or language history (e.g. Boyd et al. 1997; Jordan & Mace 2006; Jordan & Shennan 2009). Were northwest coast long-house traditions transmitted in tandem with language, with language frontiers serving to ‘constrain’ the exchange of housing traits between adjacent communities? These hypotheses can be tested in sequence: first, do the network- and tree-based methods indicate that housing styles had been subjected to branching processes of diversification? If so, does the branching tree of housing styles share statistically significant structural similarities with the tree of northwest coast languages? For this second hypothesis, we employed a language tree (figure 2) based on the qualitative classification of local languages into phylum/family, branch and language presented by Thompson & Kincade (1990, pp. 30, 34–35); these groupings and ancestral relationships reflect current consensus among linguists working in the region.
The descriptive classifications were used to manually construct a language tree in MACCLADE 4.05 (Maddison & Maddison 2000), which formed the basis for the tests described below. The strength of the historical association between language history and the housing tree was tested in COMPONENT 2.0 (Page 2003). Unlike PAUP* (Swofford 1998), the software does not infer trees from the data but rather requires that pre-existing trees be entered into the program, after which
comparison methods can be applied. In these analyses, a single strict consensus tree was generated for northwest coast housing in PAUP* 4.0b10. This was imported into COMPONENT 2.0, along with the language tree described above, in order to calculate an overall measure of similarity between the housing and language trees; this can be estimated by breaking down each tree into sets of simpler structures. We employed the ‘triplet’ measure; a triplet (three taxa, two of which are grouped to the exclusion of the third) is the smallest possible informative sub-tree on a rooted tree. If the two trees are very similar, then only a few of these sub-trees will be resolved differently, giving a low overall score. In contrast, if the structure of the two trees is very different, then a large number of triplets will be resolved differently, generating a higher score. As any tree can easily be reduced to its triplet scores, comparison of the overall similarities and differences between large numbers of trees becomes quite straightforward (see Jordan & Mace 2006). However, COMPONENT 2.0 generates only an overall measure of similarity between trees; it is therefore important to identify the point at which apparent similarity between trees actually becomes statistically significant. COMPONENT 2.0 generates sets of random trees as the basis for these statistical tests: if the triplet measure of difference between the language and housing trees falls below the range of measures for a randomly generated set of trees, then it can be assumed that association between the language and housing trees is greater than would be the result of chance alone, and that a substantial degree of co-transmission has taken place. Where substantial co-transmission is demonstrated on the basis of the triplet results, a further set of tests can assess whether the housing and language trees are, in fact, identical owing to perfect co-transmission.
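The triplet comparison can be sketched as follows. This is an illustrative reimplementation, not COMPONENT 2.0's own code: trees are represented as nested Python tuples (an assumption made here for convenience), and only triplets that are resolved in both trees but grouped differently are counted; how COMPONENT treats triplets left unresolved in one tree is not covered. The two four-taxon trees are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Set of tip labels below a node; tips are plain strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree):
    """Leaf sets defined by every internal node of a nested-tuple tree."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clusters(child)
    return found


def resolve(cluster_sets, a, b, c):
    """Return the pair grouped together for taxa a, b, c, or None if unresolved."""
    for pair, out in (((a, b), c), ((a, c), b), ((b, c), a)):
        if any(pair[0] in s and pair[1] in s and out not in s for s in cluster_sets):
            return frozenset(pair)
    return None


def triplet_difference(tree1, tree2):
    """Count triplets resolved in both trees but grouped differently."""
    if leaves(tree1) != leaves(tree2):
        raise ValueError("trees must share the same set of tips")
    c1, c2 = clusters(tree1), clusters(tree2)
    differing = 0
    for a, b, c in combinations(sorted(leaves(tree1)), 3):
        r1, r2 = resolve(c1, a, b, c), resolve(c2, a, b, c)
        if r1 is not None and r2 is not None and r1 != r2:
            differing += 1
    return differing


# Hypothetical trees: they disagree only on how A, B and C are grouped.
language = ((("A", "B"), "C"), "D")
housing = ((("A", "C"), "B"), "D")
score = triplet_difference(language, housing)
```

Identical trees give a score of zero, and the score rises as more three-taxon groupings conflict, which is why a low score relative to random trees is read as evidence of shared history.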
The Kishino–Hasegawa test, modified to fit PAUP* 4.0b10 (Kishino & Hasegawa 1989; see Jordan & Shennan 2003, 2009 for applications to cultural datasets), measures the difference between the initial best-fit tree for housing, and a second tree, using parsimony as the optimality criterion. The second tree is generated by a heuristic search in PAUP* 4.0b10, which has been artificially constrained by the structure of the language tree in order to test the hypothesis that vertical transmission of housing traditions had been closely ‘canalized’ by a tree mapping language history. If there is no statistically significant difference between the original best-fit tree for housing, and the tree constrained by language history, then a hypothesis of perfect co-transmission can be accepted; in contrast, if the two trees are significantly different then the hypothesis of perfect co-transmission can be rejected.
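The logic of the constrained-versus-unconstrained comparison can be sketched numerically. Under parsimony, a Kishino–Hasegawa-type test is commonly framed as a paired comparison of the number of steps each character requires on the two trees; the sketch below assumes that framing and computes only the paired t statistic. It does not reproduce the modifications made in PAUP* 4.0b10, and the per-character step counts are invented.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def paired_t_statistic(steps_a: Sequence[int], steps_b: Sequence[int]) -> float:
    """Paired t statistic for per-character step differences (tree B minus tree A)."""
    if len(steps_a) != len(steps_b):
        raise ValueError("both trees must be scored on the same characters")
    diffs = [b - a for a, b in zip(steps_a, steps_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("no variation in step differences")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical steps per character on each tree (illustration only).
unconstrained = [1, 1, 2, 1, 3, 1, 2, 1]   # best-fit housing tree
constrained = [1, 2, 2, 2, 3, 1, 3, 2]     # tree constrained by language
t = paired_t_statistic(unconstrained, constrained)
```

A large positive value means that characters systematically require more steps on the language-constrained tree; the statistic would then be referred to a t distribution with n - 1 degrees of freedom to obtain a p-value.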
5. RESULTS: MAPPING DIVERSITY IN NORTHWEST COAST LONG-HOUSES (a) Branching versus blending? The northwest coast housing dataset was initially examined in the NeighborNet program to generate some general insights into the strength of branching
Figure 3. Diversity in northwest coast housing traditions: NeighborNet split graphs. [Split graph not reproduced; taxon labels: Haida 1, Haida 2, Tlingit 1, Tlingit 2, Tsimshian 1, Tsimshian 2, Nass-Gitskan, Xaisla, Bella Coola, Heiltsuk-Oowekyala 1–3, Kwakwaka’wakw 1–2, Nuu-chah-nulth 1–3.]
versus blending signals. As the NeighborNet plot illustrates (figure 3), the length of the individual branches appears to indicate considerable underlying differences in housing styles along the coast, with the most southerly groups pulled out to the bottom right and more northerly groups to the left and upper right; the degree to which the individual housing styles are pulled apart suggests some hierarchical structuring in the dataset, which is consistent with an underlying pattern of branching descent. At the same time, the ‘boxed’ sections indicate a degree of conflict in the data owing to hybridization between styles. As noted above, the NeighborNet plots provide only a basic visual exploration of the degree of vertical structure in a dataset; tree-based methods, on the other hand, generate more robust quantitative
measures of the degree of branching versus blending in a dataset. The housing data were analysed in PAUP* 4.0b10 (Swofford 1998), employing Bella Coola as the outgroup (see above). A heuristic search generated four trees with a length of 112; these trees were converted into a strict consensus tree, which was then bootstrapped at 1000 replicates. Only clades with over 70 per cent support were retained, resulting in a clearly branching tree diagram (figure 4). The RI for the tree was 0.64, indicating a strong signal for vertical transmission, and hence phylogenesis; this enabled the hypothesis of branching descent to be accepted. Using RI to measure the strength of this signal also enabled the results to be contextualized against other works assessing the relative degrees of branching and blending in both biological and cultural datasets. For example, in a
Figure 4. Diversity in northwest coast housing traditions: most parsimonious cladogram (all clades with over 70% bootstrap support). [Cladogram not reproduced; tip labels in plotted order: Bella Coola, Tlingit 1, Tlingit 2, Haida 1, Haida 2, Nass-Gitskan, Tsimshian 1, Tsimshian 2, Heiltsuk-Oowekyala 1, Heiltsuk-Oowekyala 2, Xaisla, Heiltsuk-Oowekyala 3, Kwakwaka’wakw 1, Kwakwaka’wakw 2, Nuu-chah-nulth 1, Nuu-chah-nulth 2, Nuu-chah-nulth 3.]
recent comparative analysis, Collard et al. (2006a) assembled 21 biological datasets and nine cultural datasets. These biological datasets (ranging from DNA data for lizards, lagomorphs and carnivores to morphological data for fossil hominids, seals and ungulates) had all been used to reconstruct relationships among species and higher-level taxa. Datasets pertaining to simple organisms (e.g. viruses, bacteria) or subspecies of complex organisms were not included in the study on the grounds that they had possibly been affected by blending processes. Cultural datasets included basketry, prehistoric pottery, projectile points, textiles and other forms of material culture (Collard et al. 2006a, p. 58). RIs were calculated for trees derived from all individual datasets in order to assess whether biological datasets tended to fit a branching tree better than the cultural datasets. The results indicated
that, overall, there was little difference in the fit of tree models to biological and cultural data (Collard et al. 2006a, pp. 57–58). In addition, not only were the average RIs similar across biological and cultural datasets, but the ranges were also comparable: for example, mean, minimum and maximum RIs for biological data were 0.60, 0.35 and 0.94, respectively; and for cultural data, they were 0.60, 0.35 and 0.93. The results of Collard and co-workers provide a useful comparative framework against which the current RI value for northwest coast housing can be evaluated further. This RI of 0.64 falls above the mean for both the biological and cultural datasets included in the above study, and is close to the RI values for ungulate morphology (0.69), Phalacrocoracidae bird mtDNA (0.65) and phocid seal morphology (0.60) (Collard et al. 2006a, p. 58). These broader
comparisons strengthen the conclusion that housing styles on the northwest coast had been influenced by strongly branching processes of descent with modification. (b) Co-transmission of long-house styles and language? Having accepted the first hypothesis that northwest coast housing diversity was largely the outcome of phylogenesis, we can now move on to testing whether housing styles and languages had, in fact, been co-transmitted, or whether they had separate descent histories. In order to test the degree of general association between the descent histories of language and housing, the northwest coast language tree (figure 2) and bootstrapped consensus tree for housing styles (figure 4) were imported into COMPONENT 2.0 so that triplet measures of difference between the two trees could be calculated. The analysis indicated that 164 triplets had been resolved differently in the language and housing trees. As noted above, however, all trees can be reduced to triplet scores, enabling distances between extremely different trees to be reduced to a single measure. As a result, it was important to identify whether the degree of similarity between the housing and language trees was greater than would be expected by chance alone. A further 1000 trees were randomly generated in COMPONENT 2.0. The triplet measures of difference between these randomly generated trees ranged between 186 and 504, with a mean of 372.673 (s.d. = 42.726). The triplet measure of difference between the housing and language trees (164) therefore fell below the range of triplet distances between the 1000 randomly generated trees (186–504). These results indicated that the trees for housing and language were more similar than would be expected by chance alone, and that they shared a broadly similar pattern of branching descent.
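The comparison with the random trees can be restated as simple arithmetic on the figures reported above. The sketch below checks that the observed score lies below the random range and, as an additional illustration that is not part of the original analysis, expresses the gap in standard-deviation units (which assumes the random scores are roughly symmetric about their mean). The function name is hypothetical.

```python
from typing import Tuple


def compare_to_null(observed: float, null_min: float,
                    null_mean: float, null_sd: float) -> Tuple[bool, float]:
    """Return (observed below the random range?, standardized gap from the random mean)."""
    below_range = observed < null_min
    z = (observed - null_mean) / null_sd
    return below_range, z


# Values reported in the text: observed 164; random trees 186-504,
# mean 372.673, s.d. 42.726.
below, z = compare_to_null(164, 186, 372.673, 42.726)
# below is True; z is roughly -4.88
```

On these figures the observed score lies outside the random range and almost five standard deviations below the random mean, which is consistent with the interpretation that the two trees are more similar than chance would predict.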
Building on these results, a Kishino–Hasegawa test in PAUP* 4.0b10 was also performed, using the language tree to constrain the search for a new best-fit housing tree. The test generated a new tree that had been constrained by the hypothesis that there had been perfect co-transmission between housing and language. However, this new tree was significantly different from the original housing tree (p < 0.05), indicating that perfect co-transmission of housing and languages had not taken place. On balance, these overall results indicate, first, that housing styles had been subjected to branching patterns of descent; and second, that the trees for housing and language share more similarity than would be expected by chance alone, indicating some association by descent and suggesting that language boundaries had, in part, constrained the horizontal diffusion and hybridization of housing traits between communities. At the same time, this co-transmission of housing and language had not been perfect: while their descent histories had been similar, they had not been identical. 6. DISCUSSION: LONG-HOUSES, LANGUAGE AND SOCIAL INSTITUTIONS This study has examined patterns of cultural inheritance on the Pacific northwest coast, focusing on
variability in long-house architecture, and testing for possible associations between the transmission of language and long-house styles. The results indicate a substantial branching signal in the long-house dataset, which largely supports the phylogenetic model of cultural diversification (see Q1, §1). In addition, there appears to be significant association between regional language history and the community-based transmission of long-house styles (see Q2, §1). With generations of ethnographers highlighting the role of long-houses as the most important focus of the local community’s social and cultural reproduction, it is perhaps predictable that there should be clear evidence of a branching signal in housing styles, and also that housing diversity should, in part, be associated with regional language history. Beyond these general patterns of cultural and linguistic diversification, some interesting small-scale patterns also emerge. For example, figures 3 and 4 indicate that the overall branching signal is much stronger among ‘southern’ groups located south of the Bella Coola (figure 1). In the NeighborNet plot this is shown by the longer branch lengths and the limited degree of boxing to the bottom right of the plot; in the tree diagram, it is revealed by the progressive splitting of the ‘southern’ branches, all of which were well supported by bootstrapping. In contrast, there appears to have been a more substantial degree of hybridization among groups north of the Bella Coola. This is demonstrated by the increased boxing in the upper half of the NeighborNet plot, indicating transmission signals that conflict with strictly branching processes of descent. In the tree diagram, a greater degree of local hybridization in this part of the coast is indicated by the fact that the bootstrapping returned low levels of support (<70%) for the ‘northern’ splits; as a result, these branches have been collapsed into a more ‘bush-like’ structure.
Together, these patterns suggest that there has been a greater degree of ‘spillage’ in long-house traditions between northern communities, while southern communities retain the strongest signal for strictly vertical transmission of housing styles. With each long-house being the outcome of a carefully choreographed process of collective and coordinated effort, what social factors might have generated these contrasting patterns of diversification? Were there significant local differences in the structure of inter- and intra-community interactions and social networks along these different stretches of the coast? Drucker reports that along these reaches of the northwest coast, autonomous kin groups were organized by contrasting matrilineal and patrilineal descent reckoning; moreover, each system had a mutually exclusive geographical distribution (Drucker 1955, p. 46). For example, the more northerly groups were matrilineal and avunculocal, and the more southerly groups were organized according to patrilineal and patrilocal descent. Interestingly, these distributions appear to correlate spatially with the stronger branching signals in southern housing and the greater degree of hybridization in the north. On the northwest coast, these descent rules were crucially important because they structured the
transfer of property, status and privilege, and were also linked to postmarital residence rules for both adult men and women. For example, Kwakwaka’wakw, Nuu-chah-nulth and Salish communities had a patrilineal system of inheritance, with a virilocal rule: a new wife would take up residence with her husband’s family, either within the same settlement or by moving to another settlement. In this way, ‘southern’ husbands continued to live and work in the same long-houses and villages as their fathers and grandfathers, eventually inheriting their titles, properties and privileges (Rosman & Rubel 1971, pp. 176–200). In contrast, the Tlingit, Haida and Tsimshian practised a matrilineal system, whereby status and privileges were inherited through the female line. Marriage residence rules were based on an avunculocal rule, so that at around 10 years of age, a boy would take up residence with his maternal uncle for apprenticeship and preparation for marriage. Generally, there was a cross-cousin preference: for instance, a man would marry his maternal uncle’s daughter, and eventually rise to take over his maternal uncle’s status and privileges when the uncle died. In this way, the relationship with one’s maternal uncle was more important in northerly communities than that with one’s own father (Rosman & Rubel 1971, pp. 10–25). Given the striking geographical correlation between these contrasting matrilineal and patrilineal kinship systems and the spatial distributions of the ‘northern blending’ versus ‘southern branching’ signals in long-house architecture (figures 3 and 4), it is tempting to link the two phenomena together. For example, it is clear from ethnographic accounts that houses were built by men on all stretches of the coastline. However, the greater tendency for men to stay within their own households and villages in the south may have generated the stronger branching signal for vertical transmission of long-house styles within each ethnolinguistic community.
In contrast, the greater movement of men between households and villages in northerly areas may be associated with the greater tendency for the borrowing and exchange of housing styles between communities, especially if the men carried ideas about construction style with them. This may have led to a cumulative horizontal flow of long-house styles between communities, eroding any sharp stylistic differences. At present, these interpretations remain descriptive; fuller investigation of the potential links between kinship and diversification in long-house styles remains beyond the scope of this paper, but these patterns certainly point to intriguing directions for future work, both on the Pacific northwest coast and beyond.
7. CONCLUSION This paper has attempted to address a number of key debates in the current cultural transmission literature by undertaking a quantitative analysis of diversity in long-house styles on the Pacific northwest coast, employing models and methods from the biological sciences.
Focusing on the cultural transmission of the ‘collective’ tradition of communal architecture, rather than ‘small-scale’ crafts practised by single individuals, we have attempted to identify whether stylistic diversity has been a result of ‘branching’ or ‘blending’ processes and the degree to which material culture traditions can be linked with regional language diversity. The results indicate that, overall, branching processes of inheritance have dominated, and that vertical transmission of housing styles has at least partially been constrained by the region’s numerous linguistic boundaries. Looking more closely at the results (figures 3 and 4), it is also clear that the strength of the branching signal varies along different stretches of the northwest coast, with the strongest signals in the south, and a greater degree of blending in the north. Returning to the wider ethnographic literature to interpret these results, it appears that these differences correlate with important geographical variations in primary descent and postmarital residence rules. Additional research could investigate how diversification of housing styles might be affected by variability in key social institutions. We would like to thank the UK’s Arts and Humanities Research Council (AHRC) for Sean O’Neill’s PhD Studentship (RC/APN111956) and for supporting the Centre for the Evolution of Cultural Diversity (CECD) under whose auspices this research was carried out. Special thanks to Dr Tom Currie, Dr Fiona Jordan and Dr Jeff Oliver, and to three anonymous reviewers whose useful feedback greatly improved an earlier draft of this paper. All errors remain our own.
REFERENCES Ames, K. M. & Maschner, H. D. G. 1999 Peoples of the northwest coast: their archaeology and prehistory. London, UK: Thames and Hudson. Barnett, H. G. 1939 The coast Salish of British Columbia. Portland, OR: University of Oregon Monographs. Bentley, R. A., Maschner, H. D. G. & Chippendale, C. (eds) 2008 Handbook of archaeological theories. Plymouth, UK: Rowman & Littlefield Publishers. Boas, F. 1955 Primitive art. New York, NY: Dover Publications. Borgerhoff Mulder, M., Nunn, C. L. & Towner, M. C. 2006 Cultural macroevolution and the transmission of traits. Evol. Anthropol. 15, 52–64. Boyd, R. & Richerson, P. J. 1985 Culture and the evolutionary process. Chicago, IL: Chicago University Press. Boyd, R., Borgerhoff-Mulder, M., Durham, W. H. & Richerson, P. J. 1997 Are cultural phylogenies possible? In Human by nature: between biology and the social sciences (eds P. Weingart, S. D. Mitchell, P. J. Richerson & S. Maasen), pp. 355–386. London, UK: Erlbaum. Bryant, D. & Moulton, V. 2004 NeighborNet: an agglomerative method for the construction of planar phylogenetic networks. Mol. Biol. Evol. 21, 255–265. Bryant, D., Filimon, F. & Gray, R. D. 2005 Untangling our past: languages, trees, splits and networks. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. Holden & S. J. Shennan), pp. 67–85. London, UK: University College London Press. Buchanan, B. & Collard, M. 2007 Investigating the peopling of North America through cladistic analyses of early Paleoindian projectile points. J. Anthropol. Archaeol. 26, 59–76.
Carlson, R. L. 1983 Indian art traditions of the northwest coast. Burnaby, BC: Archaeology Press of Simon Fraser University. Cavalli-Sforza, L. L. & Feldman, M. W. 1981 Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press. Clark, C. & Curran, D. J. 1986 Outgroup analysis, homoplasy, and global parsimony: a response to Maddison, Donohue, and Maddison. Syst. Zool. 35, 422–426. (doi:10.2307/2413393) Collard, M. & Shennan, S. J. 2008 Patterns, processes and parsimony: studying cultural evolution with analytical techniques from evolutionary biology. In Cultural transmission and material culture: breaking down boundaries (eds M. Stark, B. J. Bowser & L. Horne), pp. 17–33. Tucson, AZ: University of Arizona Press. Collard, M., Shennan, S. J. & Tehrani, J. 2006a Branching versus blending in macro-scale cultural evolution. In Mapping our ancestors. Phylogenetic approaches in anthropology and prehistory (eds C. Lipo, M. J. O’Brien, M. Collard & S. J. Shennan), pp. 53–63. London, UK: AldineTransaction. Collard, M., Shennan, S. J. & Tehrani, J. 2006b Branching, blending and the evolution of cultural similarities and differences among human populations. Evol. Hum. Behav. 27, 169–184. (doi:10.1016/j.evolhumbehav.2005.07.003) Collard, M., Shennan, S. J., Buchanan, B. & Bentley, R. A. 2008 Evolutionary biological methods and cultural data. In Handbook of archaeological theories (eds R. A. Bentley, H. D. G. Maschner & C. Chippendale), pp. 203–224. Plymouth, UK: Rowman & Littlefield Publishers. Coupland, G. 1996 The evolution of multi-family households on the northwest coast of North America. In People who lived in big houses: archaeological perspectives on large domestic structures (eds G. Coupland & E. B. Banning). Monographs in World Archaeology no. 27. Madison, WI: Prehistory Press. Drucker, P. 1950 Cultural element distributions: XXVI, Northwest Coast. Anthropol. Rec. 9, 157–294. Drucker, P.
1955 Indians of the northwest coast. New York, NY: The American Museum of Natural History. Durham, W. H. 1992 Applications of evolutionary culture theory. Ann. Rev. Anthropol. 21, 331–355. (doi:10.1146/annurev.an.21.100192.001555) Eerkens, J. W. & Lipo, C. P. 2005 Cultural transmission, copying errors, and the generation of variation in material culture and the archaeological record. J. Anthropol. Archaeol. 24, 316–334. Emmons, G. T. & de Laguna, F. 1991 The Tlingit Indians. New York, NY: The American Museum of Natural History. Farris, J. S. 1982 Outgroups and parsimony. Syst. Zool. 31, 328–334. (doi:10.2307/2413239) Farris, J. S. 1989a The retention index and homoplasy excess. Syst. Zool. 38, 406–407. (doi:10.2307/2992406) Farris, J. S. 1989b The retention index and the rescaled consistency index. Cladistics 5, 417–419. (doi:10.1111/j.1096-0031.1989.tb00573.x) Forey, P. L., Humphries, C. J., Kitching, I. J., Scotland, R. W., Siebert, D. J. & Williams, D. M. 1992 Cladistics. A practical course in systematics. Oxford, UK: Clarendon Press. Gray, R. D. & Jordan, F. 2000 Language trees support the express-train sequence of Austronesian expansion. Nature 405, 1052–1055. (doi:10.1038/35016575) Gray, R., Greenhill, S. J. & Ross, R. M. 2007 The pleasures and perils of Darwinizing culture (with phylogenies). Biol. Theory 2, 360–375. (doi:10.1162/biot.2007.2.4.360) Hawthorn, A. 1979 Kwakiutl art. Seattle, WA: University of Washington Press. Hennig, W. 1966 Phylogenetic systematics. Urbana, IL: University of Illinois Press.
Holden, C. & Shennan, S. J. 2005 Introduction to part 1: how treelike is cultural evolution? In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. Holden & S. J. Shennan). London, UK: University College London Press. Huson, D. H. & Bryant, D. 2006 Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267. Inverarity, R. B. 1971 Art of the northwest coast Indians. Berkeley, CA: University of California Press. Jonaitis, A. 1981 Tlingit halibut hooks: an analysis of the visual symbols of a rite of passage. Anthropol. Pap. Mus. Nat. Hist. 57, 3–48. Jordan, P. D. 2007 Continuity and change in different domains of culture: an emerging approach to understanding diversity in technological traditions. In The model-based archaeology of socionatural systems (eds T. A. Kohler & E. Van der Leeuw), pp. 13–39. Santa Fe, NM: School for Advanced Research Press. Jordan, P. D. 2009 Linking pattern to process in evolution: explaining material culture diversity among the Northern Khanty of Northwest Siberia. In Pattern and process in cultural evolution (ed. S. J. Shennan), pp. 61–83. Berkeley, CA: University of California Press. Jordan, P. D. & Mace, T. 2006 Tracking culture-historical lineages: can ‘descent with modification’ be linked to ‘association by descent’? In Mapping our ancestors. Phylogenetic approaches in anthropology and prehistory (eds C. Lipo, M. J. O’Brien, M. Collard & S. J. Shennan). London, UK: AldineTransaction. Jordan, P. & Mace, T. 2008 ‘Gendered’ technology, kinship and cultural transmission amongst Salish-speaking communities on the Pacific Northwest Coast: a preliminary investigation. In Breaking down boundaries: anthropological approaches to cultural transmission and material culture (eds M. Stark, B. Bowser & L. Horne), pp. 34–62. Tucson, AZ: University of Arizona Press. Jordan, P. D. & Shennan, S. J. 2003 Cultural transmission, language, and basketry traditions amongst the California Indians.
J. Anthropol. Archaeol. 22, 42–74. (doi:10.1016/S0278-4165(03)00004-7) Jordan, P. D. & Shennan, S. J. 2009 Diversity in hunter–gatherer technological traditions: mapping trajectories of cultural ‘descent with modification’ in northeast California. J. Anthropol. Archaeol. 30, (doi:10.1016/j.jaa.2009.05.004) Jorgensen, J. G. 1980 Western Indians: comparative environments, languages, and cultures of 172 western American Indian tribes. San Francisco, CA: W.H. Freeman and Company. Kishino, H. & Hasegawa, M. 1989 Evaluation of maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29, 170–179. (doi:10.1007/BF02100115) Kitching, I. J., Forey, P. L., Humphries, C. J. & Williams, D. M. 1998 Cladistics: the theory and practice of parsimony analysis. Oxford, UK: Oxford University Press. Lipo, C., O’Brien, M. J., Collard, M. & Shennan, S. J. 2006 Mapping our ancestors. Phylogenetic approaches in anthropology and prehistory. London, UK: Aldine Transaction. MacDonald, G. F. 1983a Haida monumental art. Villages of the Queen Charlotte Islands. Vancouver, Canada: University of British Columbia Press. MacDonald, G. F. 1983b Ninstints. Haida world Heritage site. Vancouver, Canada: University of British Columbia Press. Mace, R., Holden, C. J. & Shennan, S. J. (eds) 2005 The evolution of cultural diversity. London, UK: UCL Press.
Maddison, D. R. & Maddison, W. P. 2000 MacClade 4: analysis of phylogeny and character evolution. Sunderland, MA: Sinauer. Matson, R. G. 2003 The Coast Salish house: lessons from shingle point, Valdes Island, British Columbia. In Emerging from the mist: studies in northwest coast culture history (eds R. G. Matson, G. Coupland & Mackie), pp. 76–104. Vancouver, Canada: UBC Press. Matson, R. G. & Coupland, G. 1995 The prehistory of the northwest coast. San Diego, CA: Academic Press. Moore, J. H. 1994 Putting anthropology back together again: the ethnogenetic critique of cladistic theory. Am. Anthropol. 96, 925–948. (doi:10.1525/aa.1994.96.4.02a00110) Moore, J. H. 2001 Ethnogenetic patterns in native North America. In Archaeology, language and history: essays on culture and ethnicity (ed. J. E. Terrell). Westport, CT: Bergin and Garvey. Nabokov, P. & Easton, R. 1989 Native American architecture. Oxford, UK: Oxford University Press. Nunn, C. L., Arnold, C., Matthews, L. & Mulder, M. B. 2010 Simulating trait evolution for cross-cultural comparison. Phil. Trans. R. Soc. B 365, 3807–3819. (doi:10.1098/rstb.2010.0009) O’Brien, M. J. (ed.) 2008 Cultural transmission and archaeology. Issues and case studies. Washington, DC: Society for American Archaeology. O’Brien, M. J., Lyman, R. L., Mesoudi, A. & VanPool, T. L. 2010 Cultural traits as units of analysis. Phil. Trans. R. Soc. B 365, 3797–3806. (doi:10.1098/rstb.2010.0012) Page, R. 2003 Introduction. In Tangled trees: phylogeny, cospeciation and coevolution (ed. R. Page), pp. 1–21. Chicago, IL: University of Chicago Press. Page, R. & Charleston, M. 1998 Trees within trees: phylogeny and historical associations. Tree 13, 356–359. Rosman, A. & Rubel, P. G. 1971 Feasting with mine enemy: rank and exchange among Northwest Coast societies. New York, NY: Columbia University Press. Samuels, S. R. 1991 Ozette archaeological project research reports. House structure and floor midden, vol. 1.
WSU Department of Anthropology Reports of Investigations 63. Pullman, WA: Washington State University. Shennan, S. J. 1997 Quantifying archaeology. Edinburgh, UK: Edinburgh University Press. Shennan, S. J. 2002 Genes, memes and human history. Darwinian archaeology and cultural evolution. London, UK: Thames & Hudson. Shennan, S. J. 2004 An evolutionary perspective on agency in archaeology. In Agency uncovered (ed. A. Gardner). London, UK: University College London Press. Smith, A. B. 1994 Systematics and the fossil record: documenting evolutionary patterns. Oxford, UK: Blackwell Science.
Phil. Trans. R. Soc. B (2010)
Stark, M., Bowser, B. J. & Horne, L. (eds) 2008 Cultural transmission and material culture: breaking down the barriers. Tucson, AZ: University of Arizona Press.
Stewart, H. 1984 Cedar, tree of life to the northwest coast Indians. Vancouver, Canada: Douglas and McIntyre.
Suttles, W. (ed.) 1990 Handbook of North American Indians, vol. 7. Northwest coast. Washington, DC: Smithsonian Institution.
Swofford, D. L. 1998 PAUP*: phylogenetic analysis using parsimony (* and other methods), version 4. Sunderland, MA: Sinauer.
Tehrani, J. & Collard, M. 2002 Investigating cultural evolution through biological phylogenetic analyses of Turkmen textiles. J. Anthropol. Archaeol. 21, 443–463. (doi:10.1016/S0278-4165(02)00002-8)
Tehrani, J. & Collard, M. 2009a The evolution of material culture diversity among Iranian tribal populations. In Pattern and process in cultural evolution (ed. S. J. Shennan), pp. 99–111. Berkeley, CA: University of California Press.
Tehrani, J. & Collard, M. 2009b On the relationship between inter-individual cultural transmission and population-level cultural diversity: a case-study of weaving in Iranian tribal populations. Evol. Hum. Behav. 30, 286–300. (doi:10.1016/j.evolhumbehav.2009.03.002)
Tehrani, J. J., Collard, M. & Shennan, S. J. 2010 The cophylogeny of populations and cultures: reconstructing the evolution of Iranian tribal craft traditions using trees and jungles. Phil. Trans. R. Soc. B 365, 3865–3874. (doi:10.1098/rstb.2010.0020)
Tëmkin, I. & Eldredge, N. 2007 Phylogenetics and material culture evolution. Curr. Anthropol. 48, 146–153. (doi:10.1086/510463)
Terrell, J. 1987 Comment on ‘History, phylogeny and evolution in Polynesia’, by Kirch, P.V. and Green, R.C. Curr. Anthropol. 28, 447–448.
Terrell, J. 1988 History as a family tree, history as an entangled bank: constructing images and interpretations of prehistory in the South Pacific. Antiquity 62, 642–657.
Thompson, L. C. & Kincade, M. D. 1990 Languages. In Handbook of North American Indians, vol. 7. Northwest coast (ed. W. Suttles), pp. 30–51. Washington, DC: Smithsonian Institution.
Vastokas, J. M. 1966 Architecture of the northwest coast Indians of America. PhD thesis, Columbia University, New York, NY.
Watrous, L. E. & Wheeler, Q. D. 1981 The outgroup method of character analysis. Syst. Zool. 30, 1–11. (doi:10.2307/2992297)
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
Phil. Trans. R. Soc. B (2010) 365, 3889–3902 doi:10.1098/rstb.2010.0091
Phylogenetic analyses of Lapita decoration do not support branching evolution or regional population structure during colonization of Remote Oceania

Ethan E. Cochrane1,2,* and Carl P. Lipo3

1 International Archaeological Research Institute, 2081 Young Street, Honolulu, HI 96826, USA
2 AHRC Centre for the Evolution of Cultural Diversity, 31–34 Gordon Square, London WC1H 0PY, UK
3 Department of Anthropology and the Institute for Integrated Research in Materials, Environments and Society, California State University Long Beach, 1250 Bellflower Boulevard, Long Beach, CA 90840, USA
Intricately decorated Lapita pottery (3100–2700 BP) was made and deposited by the prehistoric colonizers of Pacific islands, east of the main Solomon’s chain. For decades, analyses of this pottery have focused on the ancestor–descendant relationships of populations and the relative degree of interaction across the region to explain similarities in Lapita decoration. Cladistic analyses, increasingly used to examine the evolutionary relationships of material culture assemblages, have not been conducted on Lapita artefacts. Here, we present the first cladistic analysis of Lapita pottery and note the difficulties in using cladistics to investigate datasets where a high degree of horizontal transmission and non-branching evolution may explain observed variation. We additionally present NeighborNet and phenetic distance network analyses to generate hypotheses that may account for Lapita decorative similarity.

Keywords: networks; cladistics; Lapita; Oceania; pottery
1. INTRODUCTION

Cladistic techniques have demonstrated their potential to investigate patterns of human cultural, biological and linguistic relatedness in an elegant statistical manner (Mace & Pagel 1994; O’Brien et al. 2001; Tehrani & Collard 2002; McMahon & McMahon 2005; Kimbel et al. 2006; Lipo et al. 2006; Skelton 2008). However, particular historical circumstances may be difficult to investigate using cladistics (cf. Tëmkin & Eldredge 2007). For example, excessive rates of horizontal transmission may adversely affect the ability of cladistics to resolve an accurate population history (Nunn et al. 2010). Borgerhoff Mulder et al. (2006; see also Nunn et al. 2010) have examined this problem through simulations and determined that with increasing horizontal transmission relative to vertical transmission, geographical distance, not phylogenetic distance, is a better predictor of cultural trait variation. When it seems likely that horizontal transmission structures variation to a large degree, some scholars have relied upon phylogenetic techniques that allow for and attempt to identify reticulation (e.g. Hurles et al. 2003; Gray et al. 2007). Gray et al. (2007) note that horizontal transmission can be investigated using techniques such as NeighborNet (Bryant & Moulton 2004), and that incongruent transmission histories for different traits or suites of traits can be investigated using Bayesian multiple topology mixture models (Pagel & Meade 2004). These potential problems may also be addressed by comparing results from cladistic analysis and from non-phylogenetic phenetic distance analysis.

In this paper, we explain variation in archaeological materials by combining phenetic distance networks, cladistics and NeighborNet. We use these techniques to analyse variation in the decorative motifs of the Lapita pottery tradition, the earliest (3100–2700 BP) prehistoric pottery in Remote Oceania (figure 1). Our cladistic and NeighborNet analyses indicate that decorative variation across pottery assemblages may be largely explained by horizontal transmission between contemporary populations, with several spatially overlapping local populations (demes) identified, some previously recognized through traditional archaeological approaches. Network analysis of phenetic distances in the same data identifies similar local populations, but also several phenetic similarity-based links between assemblages that help define hypotheses of population structure in the southwest Pacific during the first several hundred years of human occupation (3100–2700 BP).

The next section briefly compares the approaches of cladistics, NeighborNet and phenetic distance networks. Following this, we present our analyses within the context of previous archaeological research and discuss methodological issues in the phylogenetic analysis of pottery assemblages.

* Author for correspondence ([email protected]).
One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
This journal is © 2010 The Royal Society
E. E. Cochrane & C. P. Lipo
Phylogenetic analyses of Lapita pottery
Figure 1. Some of the archipelagos of Near and Remote Oceania with Lapita provinces noted. Analysed assemblages are labelled using small capital letters. [Map labels: Near Oceania and Remote Oceania; New Guinea, Bismarck Archipelago, Solomons, Vanuatu, Fiji; far western, western, southern and eastern Lapita provinces; analysed assemblages Southeast Solomons/northern Vanuatu, New Caledonia, Natunuku, Naigani, Yanuca, Lakeba, Tonga and Samoa.]
2. PHENETIC DISTANCE NETWORKS, CLADOGRAMS AND NEIGHBORNETS

Our description of phenetic distance network methods draws largely on the work of quantitative sociologists and others including Carrington et al. (2005), Hage & Harary (1996), Lipo (2006), Scott (2000) and Wasserman & Faust (1994). Nodes represent taxa that are connected by edges describing the quantitative difference between taxa. To quantify taxon similarity, we calculate Hamming distances (Hamming 1980), the number of character state differences between taxa. For example, two taxa described by four binary characters as 0001 and 1011 differ by a Hamming distance of two. We call the resulting phenetic distance network a mini-max graph, because it displays only those edges with Hamming distances equal to, or less than, the highest value required to connect all nodes in the graph to at least one other node using a distance-minimizing algorithm. This graph, therefore, employs a parsimony criterion as it accounts for the greatest number of character similarities among all nodes in the simplest way within the rules of the method. Node position as depicted visually is a product of multi-dimensional scaling (MDS) performed on the data matrix. Edge thickness corresponds to the number of character states shared between nodes. Thicker edges indicate a greater number of character states shared and thus a lower Hamming distance.

The matrix of pairwise Hamming distances is calculated using simple Microsoft Excel macros, and the graphical depiction of the resulting network is obtained using NetDraw, a program within UCINET 6 for Windows (Borgatti et al. 2010).

Table 1 gives a set of character states for a theoretical dataset with five taxa and seven characters. Figure 2 compares the different depictions of taxon similarity and assumed relatedness using cladistics, the mini-max graph method and NeighborNet.
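The Hamming distance and mini-max edge rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' Excel macros or the NetDraw procedure. It assumes one literal reading of the rule: the threshold is the largest of the nearest-neighbour distances, so that every taxon keeps at least one edge. The function names are hypothetical, and the example matrix is the theoretical dataset of table 1, read with taxa a to e as rows and characters C1 to C7 as columns.

```python
from itertools import combinations

# Theoretical dataset of table 1: taxa a-e scored on binary characters C1-C7.
TAXA = {
    "a": "0000000",
    "b": "1100111",
    "c": "1011101",
    "d": "1111000",
    "e": "1001010",
}


def hamming(x, y):
    """Count the characters at which two equally long state strings differ."""
    if len(x) != len(y):
        raise ValueError("taxa must be scored on the same characters")
    return sum(s != t for s, t in zip(x, y))


def pairwise_distances(taxa):
    """Map every unordered pair of taxon names to its Hamming distance."""
    return {
        (p, q): hamming(taxa[p], taxa[q])
        for p, q in combinations(sorted(taxa), 2)
    }


def minimax_edges(taxa):
    """Return (threshold, edges) under the assumed reading of the rule.

    The threshold is the largest nearest-neighbour distance, i.e. the
    smallest cut-off at which no taxon is left without an edge.
    """
    dist = pairwise_distances(taxa)
    nearest = {
        t: min(d for pair, d in dist.items() if t in pair) for t in taxa
    }
    threshold = max(nearest.values())
    edges = {pair: d for pair, d in dist.items() if d <= threshold}
    return threshold, edges


if __name__ == "__main__":
    print("0001 vs 1011:", hamming("0001", "1011"))
    threshold, edges = minimax_edges(TAXA)
    print("threshold:", threshold)
    for pair, d in sorted(edges.items(), key=lambda item: item[1]):
        print(pair, d)
```

Other readings of "distance-minimizing algorithm" are possible, for example a cut-off taken from a minimum spanning tree, and would keep a different set of edges.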
Table 1. Example dataset with seven characters and five taxa.

class   C1   C2   C3   C4   C5   C6   C7
a       0    0    0    0    0    0    0
b       1    1    0    0    1    1    1
c       1    0    1    1    1    0    1
d       1    1    1    1    0    0    0
e       1    0    0    1    0    1    0
Figure 2a is a consensus cladogram (CI = 0.63, RI = 0.33) based on five trees generated from this character matrix using PAUP* 4.0. The relationships between classes a, d and e are apparently unresolved. Figure 2b is a mini-max graph of the same character matrix. Finally, for comparison, NeighborNet output obtained for the same data using SplitsTree4 (Huson & Bryant 2006) is shown in figure 2c and displays two conflicting splits (or clades). The split including taxa ‘c’ and ‘e’ (bold lines) conflicts with the split including taxa ‘a’ and ‘c’ (dashed lines).

Although the cladistic depiction of relatedness is based on the distribution of derived characters, the mini-max graph depiction is based on the minimum phenetic distances required to link all the taxa under consideration. This is not the same use of phenetic distance as in numerical taxonomy (e.g. Sokal & Sneath 1963) or other statistical clustering approaches, where all pairwise phenetic distances contribute to a final hierarchical arrangement of taxa or similarity matrix order. Instead, assuming that our taxa descriptions track similarity that mainly results from common cultural transmission pathways, the phenetic distance measures used in our mini-max graphs are the minimal transmission pathways necessary to connect a set of taxa, but without specifying ancestor–descendant relationships as part of the ordering algorithm.
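For readers unfamiliar with the two fit statistics quoted for the consensus cladogram, the sketch below uses the standard ensemble definitions: the consistency index is CI = m/s and the retention index is RI = (g − s)/(g − m), where m is the minimum and g the maximum number of steps the characters could require on any tree, and s is the number of steps on the tree at hand. The values of m and g are computed from the table 1 matrix. The tree length of 11 steps is an assumption made for illustration and is not reported in the text. The function names are hypothetical.

```python
# Theoretical dataset of table 1: taxa a-e scored on binary characters C1-C7.
TABLE1 = {
    "a": "0000000",
    "b": "1100111",
    "c": "1011101",
    "d": "1111000",
    "e": "1001010",
}


def character_columns(taxa):
    """Turn taxon rows into one state string per character."""
    return ["".join(states) for states in zip(*taxa.values())]


def min_steps(column):
    """Fewest changes a character needs on any tree: states minus one."""
    return len(set(column)) - 1


def max_steps(column):
    """Most changes a character can need: taxa not in the commonest state."""
    return len(column) - max(column.count(state) for state in set(column))


def consistency_index(m, s):
    """CI = m / s for a tree of length s."""
    return m / s


def retention_index(m, s, g):
    """RI = (g - s) / (g - m) for a tree of length s."""
    return (g - s) / (g - m)


columns = character_columns(TABLE1)
M_TOTAL = sum(min_steps(col) for col in columns)
G_TOTAL = sum(max_steps(col) for col in columns)
ASSUMED_LENGTH = 11  # assumption for illustration, not a reported value

if __name__ == "__main__":
    print("m =", M_TOTAL, "g =", G_TOTAL)
    print("CI =", round(consistency_index(M_TOTAL, ASSUMED_LENGTH), 3))
    print("RI =", round(retention_index(M_TOTAL, ASSUMED_LENGTH, G_TOTAL), 3))
```

Under that assumed length the sketch gives 7/11 and 2/6, close to the CI and RI reported for figure 2a. Phylogenetic software may also report variants of these indices (for example with uninformative characters excluded), so exact agreement should not be expected.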
Figure 2. Three graphic representations of relatedness among the same set of taxa using different methodological assumptions: (a) consensus tree, (b) a mini-max graph with node position determined through metric MDS and line thickness indicating Hamming distance, (c) a neighbornet with conflicting splits indicated by bold and dashed lines, respectively.
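What it means for two splits to conflict can be checked mechanically. A split divides the taxa into two sides, and two splits are compatible (can sit on the same tree) when at least one of the four pairwise intersections of their sides is empty. The Python sketch below applies that standard test to the two splits named in the text; the function names are hypothetical and the code is not part of SplitsTree4.

```python
ALL_TAXA = "abcde"


def make_split(side, taxa=ALL_TAXA):
    """Return a split as a pair of frozensets: the named side and the rest."""
    side = frozenset(side)
    return side, frozenset(taxa) - side


def compatible(split_1, split_2):
    """True if at least one of the four side-by-side intersections is empty."""
    return any(not (x & y) for x in split_1 for y in split_2)


if __name__ == "__main__":
    ce = make_split("ce")  # split separating taxa c and e from a, b and d
    ac = make_split("ac")  # split separating taxa a and c from b, d and e
    print("ce vs ac compatible:", compatible(ce, ac))
```

Here every intersection is non-empty ({c}, {e}, {a} and {b, d}), so the two splits cannot both appear on a single tree, which is why NeighborNet draws them as a box-like network.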
3. WHAT WAS THE POPULATION STRUCTURE OF THE LAPITA COLONIZERS OF REMOTE OCEANIA?

(a) Previous research on the cultural relatedness of Lapita colonists

Remote Oceania was colonized from Near Oceania approximately 3100 BP (figure 1). This is about 400 years after Lapita pottery was first deposited in Near Oceania, a region inhabited for over 40 000 years. The similarity of complex Lapita designs across Near and Remote Oceania suggests that design variation is a product of cultural relatedness. Lapita pottery comprises diverse vessel forms and an intricate decorative system that has been the subject of intense archaeological study for the last few decades (summarized in Kirch 1997), including multiple classifications of the Lapita decorative system, all with the intent of measuring homologous similarity (Mead et al. 1973; Anson 1983; Poulsen 1987; Siorat 1990; Chiu 2003). Mead et al. (1973) observed that the decorations on Lapita pots were made using a limited number of individual dentate stamps and shaped tools to create ‘design elements’ (DEs). DEs were identified as the smallest or most exclusive components of decoration that frequently occurred on pottery sherds across archaeological sites (Mead et al. 1973, p. 20) and include, for example, a dashed crescent line placed vertically (DE 1) and a dashed oval with pointed ends (DE 4). DEs may appear by themselves or be combined in patterned ways to form motifs that are repeated across the surface of vessels. Hundreds of motifs are recognized, with new motifs still being identified (e.g. Chiu 2003).

Figure 3. A sample of Lapita motifs from Green (1979) with the motif number from the Mead system noted.

Motifs are the primary analytical unit used in comparative research examining the cultural relatedness of Lapita communities. Green (1979), for example, suggested that transmission across the Lapita population in Remote Oceania was spatially structured, or at least became so soon after colonization. In support of this, Green identified an inventory of early (ca 3100–3000 BP) motifs recorded from across Remote Oceania (figure 3) and noted that soon thereafter the spatial distribution of many motifs became restricted to different regions: far western, western, southern and eastern (figure 1). These geographical groupings, except for the southern region added later, were mostly determined by comparing motif inventories at archaeological sites using Jaccard correlation coefficients (Green 1979). Archaeologists have largely confirmed these regional groupings in subsequent
analyses using other correlation measures and ordination techniques (e.g. Best 1984; Kirch 1988; Summerhayes 2001).

Archaeologists have also analysed Lapita motif distributions to determine the pattern of colonization within Remote Oceania and the frequency of post-colonization interaction between island populations. This is relevant for research that attempts to explain the origins of distinct cultural lineages in Remote Oceania that probably evolved from the earlier, possibly more culturally homogeneous, colonizing population (Green 1995). Of particular importance to many scholars is the evolution of Ancestral Polynesian Society, conceived as a distinct cultural lineage with its origins in the Tonga–Samoa region that diverged from a common cultural ancestor shared with Fijian populations (Kirch 1984; Kirch & Green 1987, 2001; cf. Boyd et al. 1997; Terrell et al. 1997).

Addressing questions about the patterns of colonization, Burley & Dickinson (2001, 2009) have identified Nukuleka (site Tonga1 in table 2) as the likely earliest colonization site in Tonga. The oldest Lapita sherds at Nukuleka bear similar motifs to those found in the western Lapita province, specifically northern Vanuatu. Burley & Dickinson argue that Nukuleka was the initial settlement of migrants from northern Vanuatu and that these migrants may have bypassed Fiji. Additionally, Nukuleka may have ‘served as the initial staging point for population expansion within western Polynesia . . . to other parts of Tonga and into Samoa’ (Burley & Dickinson 2001, p. 11830). This suggests that cladogenesis may explain some material culture variation between archipelago populations in this part of Remote Oceania (see also Burley et al. 2002).

Clark & Murray (2006) also examined the pattern of colonization in Fiji and Tonga, and post-colonization interaction, through the distribution of Lapita motifs. Using the assumptions of ‘distance-decay’ (Green 1979) and unbiased transmission (Boyd & Richerson 1985; Neiman 1995; Bentley & Shennan 2003) to understand motif distribution, Clark & Murray argue that in any particular area, the oldest motifs will be the most abundant and the youngest will be least abundant. When a population colonizes a new area, primarily only the oldest, most abundant motifs will enter the archaeological record of the colonized area, with the newer, less-established motifs dropping out of the decorative system. Clark & Murray (2006, pp. 114–115) argue that their analysis of motif ranks suggests that east Fiji was colonized by populations from both Tonga and west Fiji and that there was no significant transmission of motif variants between newly arrived Fiji–Tonga populations and populations to the west in Vanuatu and New Caledonia.

These analyses and others (e.g. Kirch 1988; Sand 2007) have come to different, though not necessarily mutually exclusive, conclusions about colonization and transmission history in western Remote Oceania. One probable reason for these different conclusions is the different methods used by researchers to assess cultural relatedness. The new analyses presented here build upon this earlier work and improve our understanding of Remote Oceanic Lapita population structure and colonization history through methods designed to tease out ancestor–descendant relationships and horizontal transmission (see also O’Brien et al. 2001; Cochrane 2009, ch. 2).
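The Jaccard comparison of motif inventories mentioned above for Green (1979) can be illustrated with a short Python sketch. The Jaccard coefficient of two inventories is the number of motifs they share divided by the number of motifs present in either. The site names and motif inventories below are invented for illustration and are not taken from Green's data, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(inventory_a, inventory_b):
    """Shared motifs divided by motifs present in either inventory."""
    a, b = set(inventory_a), set(inventory_b)
    union = a | b
    if not union:
        raise ValueError("both inventories are empty")
    return len(a & b) / len(union)


# Hypothetical motif inventories (motif labels only mimic the Mead system).
INVENTORIES = {
    "site_x": {"M1", "M2", "M5", "M16"},
    "site_y": {"M1", "M2", "M6"},
    "site_z": {"M30", "M34"},
}


def jaccard_table(inventories):
    """Jaccard coefficient for every unordered pair of sites."""
    return {
        (p, q): jaccard(inventories[p], inventories[q])
        for p, q in combinations(sorted(inventories), 2)
    }


if __name__ == "__main__":
    for pair, value in jaccard_table(INVENTORIES).items():
        print(pair, round(value, 2))
```

Because joint absences do not enter the coefficient, two sites that both lack a motif are not counted as more similar, which matters when inventories are small and many motifs are rare.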
(b) Methodological concerns in phylogenetic analyses of Lapita motifs in Remote Oceania

Even though phylogenetic models have been used to investigate variation in Pacific Island material culture for several decades (e.g. Kirch 1984; Kirch & Green 1987, 2001), it is only in the last few years that quantitative cladistic analyses have been applied to artefact datasets in the Pacific (e.g. Cochrane 2004, 2008, 2009; Shennan & Collard 2005; Tolstoy 2008). At least two methodological issues arise in the application of cladistics, NeighborNet and phenetic distance network analysis to Lapita decorative variation in Remote Oceania.

First, when taxa comprise artefact assemblages or types, what does a cladogram, rooted tree, neighbornet or phenetic distance network represent in terms of human population history? Setting aside the depositional and taphonomic processes that may structure artefact assemblages, assemblage-based cladograms, neighbornets and phenetic distance networks may depict the homologies shared by assemblages and thus the population structure of the potters who transmitted these traditions, while a rooted assemblage tree, as argued by Tolstoy (2008), might also represent more general ancestor–descendant relationships among local populations (Collard & Shennan 2000). General ancestor–descendant relationships among human populations have of course been hypothesized from language trees (e.g. Gray & Jordan 2000; Rexova et al. 2003; cf. Borgerhoff-Mulder 2001; Terrell 2001). A problem with using artefact assemblage trees to reconstruct population history is that it is unlikely that variation in one, or even a few, specialized artefact types will correspond closely to the general pattern of genetic or linguistic relatedness (see Guglielmino et al. 1995; Jordan & Shennan 2003), although many would argue that Lapita pottery is a special case where just this correspondence would have occurred (Spriggs 1984; Kirch 1997; Green 2003).

A second methodological problem concerns classification. Whereas the similarity of Lapita motifs across Remote Oceania is certainly a result of cultural transmission, there have been no attempts to estimate the degree to which motifs are shared between assemblages because of common cultural descent, as opposed to chance convergence or mechanical constraints on motif application. Specht (2004) has discussed similar issues, noting that simple comparisons of Lapita assemblages based on various measures of motif similarity might not accurately estimate transmission history (see also Best 2002, p. 94). One possible solution is to weight characters in Lapita motif datasets. Alternatively, Pocklington (2006, pp. 25–27) argues that if similarities in an analytical unit such as a design motif indicate shared cultural histories, then two or more subunits within the motif should generate assemblage dissimilarity matrices that are correlated when evaluated by a
1 0 1 1
1 0 1
1 0 0 0 0 1 1 0 1
1
1 1 0 0 0 0 0 0 1 0 0 0 1 1 1 1 0
1 0 1 1
1 0 1
1 1 0 0 0 0 1 0 1
0
1 0 1 0 1 0 0 0 1 0 0 0 0 1 1 0 1
DE1 (m) M99 (d) DE2 (m) P25, P26 (p) DE2.4 (m) M1.5 (d) A25, A26 (p) DE3 (m) DE4 (m) DE4.2 (m) 393 (a) DE9 (m) DE10 (m) M1 (m) 9 (a) B11, B12, B16 (p) B13, B14, B15 (p) M2 (m) M2.5 (m) M3 (m) M85 (d) M5 (m) M5.6 (m) M6 (m) M7 (m) M8 (m) M10 (m) M11 (m) L3,L4 (b) M13 (m) M15 (m) M16 (m) K8, K9 (p) M16.1a (m)
Tonga2
Tonga1
motifa
1 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 1
0
1 0 0 0 0 1 1 0 0
1 0 0
1 0 1 0
Tonga3
1 1 1 0 0 0 0 0 1 0 0 0 0 1 1 0 1
1
0 0 1 1 0 0 1 0 0
1 0 0
1 0 1 0
Lakeba1
1 0 1 0 0 0 0 0 1 0 0 1 0 1 1 0 1
0
0 0 1 1 0 0 1 0 0
0 0 0
1 0 1 0
Lakeba2
1 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 1
0
0 0 1 1 0 0 1 0 0
0 0 0
1 0 1 0
Lakeba3
1 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 1
0
0 0 0 0 0 0 1 0 1
0 0 0
1 0 1 0
Yanuca1
1 0 1 0 1 0 0 1 1 1 1 0 0 1 1 0 1
0
0 0 1 0 0 0 1 0 1
0 0 0
1 0 1 0
Yanuca2
1 0 1 0 0 0 1 0 1 0 0 0 0 1 0 0 1
0
0 1 0 0 0 0 1 0 0
0 0 0
1 0 1 0
Yanuca3
1 0 0 0 1 0 1 0 1 1 1 0 0 1 1 0 1
0
1 1 0 1 1 0 1 0 0
1 0 0
1 0 1 0
Natunuku1
1 0 0 0 1 0 1 0 1 0 1 0 0 1 1 0 1
0
1 0 0 0 0 0 1 0 0
0 0 0
1 0 0 0
Natunuku2
1 0 0 0 0 0 1 0 1 1 1 1 0 1 0 0 1
0
1 1 0 1 1 0 1 0 0
0 0 0
1 0 0 0
Natunuku3
1 0 1 1 1 1 0 0 1 1 0 0 0 1 1 0 0
0
1 1 0 1 0 0 1 1 0
1 1 0
1 1 1 0
Naigani1
1 0 0 0 1 0 0 1 1 0 0 0 0 1 1 0 0
0
1 0 0 0 0 0 1 0 0
0 0 0
1 0 0 0
Naigani2
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0
1 1 0 0 0 0 1 0 0
0 0 0
1 0 1 0
Samoa
1 0 0 1 1 1 1 1 0 0 0 0 0 1 1 0 1
1
1 0 1 0 0 0 1 0 1
1 0 0
1 0 1 0
1 1 0 0 1 0 1 0 1 0 0 0 0 1 0 0 0
0
1 0 0 0 0 0 1 0 1
0 0 0
1 0 1 0
New Caledonia
SE Solomons
Table 2. Presence–absence matrix of Lapita motifs at selected Remote Oceanic sites. Data from Best (1984, table 9.3). Numbers after islands indicate early (1), middle (2) or late (3) assemblages as described in the text.
1
0 0 1 1 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1
1
0 1 1
1
0 1 1 1 1 0 1 0 0 0 0 1 1 1 0 1 1 0 1 0 1 1 1 0 0 1 0 0 1
1
0 1 1
D10, D11, D12, D14 (p) 169 (a) P30 (p) M17 (m) M18 (m) M19 (m) M20 (m) K6 (p) 209 (a) M21.1a (m) M22 (m) M23 (m) M24 (m) M34 (m) M34.2 (m) M25 (m) M26 (m) N4 (p) 313 (a) M37 (m) M44 (m) M39 (m) N6 (p) N7 (p) M27 (m) M28 (m) M29 (m) M30 (m) M45 (m) P13, P14 (p) P15, P16 (p) 283 (a) P36 (p) P1-6 (p)
Tonga2
Tonga1
0 0 1
0 1 1
0
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0
1
Lakeba1
0 0 1
0
0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
0
Lakeba2
0 0 1
0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0
Lakeba3
0 1 0
0
0 1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 0 1 0 0 0 0 1 1 1 1 0 0
0
Yanuca1
0 0 0
0
0 0 1 1 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0
0
Yanuca2
0 0 0
0
0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0
0
Yanuca3
1 0 0
0
0 0 1 1 1 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0
0
Natunuku1
0 1 0
0
0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0
0
Natunuku2
0 0 0
0
0 0 1 0 1 0 0 0 0 0 0 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0
0
Natunuku3
0 0 1
0
1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 0 1 1 0 1 0 1 0 1 1 0 0 0
1
Naigani1
0 0 1
0
0 0 0 1 1 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0
1
Naigani2
0 1 0
0
0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
0
Samoa
0 0 0
0
1 0 0 1 1 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1
SE Solomons
0 0 0
0
0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0
New Caledonia
1
0 0 0 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
0
Tonga3
motifa
Table 2. (Continued.)
a
1 0 0 0 0 0 1 1 1 1 1
0 0 1 0
1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 0
1 1 1 0
0 0 1 0 0 1 1 1 1 1 1 1 0 0 0 0 0 1 0 0 0 0 0
0 1 1 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0
1 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
1 0 0 0 0 0 0 0 1 1 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
1 0 0 0 0 0 0 0 1 1 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0
0 0 0 0
0 0 0 1 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
1 0 0 0 1 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0
0 1 0 0 1 1 0 1 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0
0 0 0 0 0 1 0 0 0 0 0
Sources for motif descriptions in brackets: m = Mead et al. 1973; p = Poulsen 1987; a = Anson 1983; d = Donovan 1973; k = Kay 1984.
P32 (p) L14 (b) M31 (m) M32 (m) M35 (m) M1-4 (p) M42 (m) E1-4 (p) M46 (m) M12.2 (m) M50, M51 (m) K17 (p) K5 (p) K4 (p) M14(2).10 (d) K15 (p) K16 (p) L1-6 (p) N8 (p) D15 (p) D16 (p) P7-10 (p) P23 (p) P17-19 (p) A31,A32 (p) M76 (m) E5 (p) 272, 273 (a) M65.3 (d) 100 (a) 345 (a) 41 (k) M33 (m) M84 (d) M74 (d) 102 (k) 104 (b) 327 (a) 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0
0 0 0 0
0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 0 1 1 1 1 1
0 0 0 1
0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 1
0 0 0 0
0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0
0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 1 1 0 1 0
0 0 0 1
0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
0 0 0 0 0 0 1 0 1 0 0
Mantel test. Such an analysis requires motif definitions to be formed from combinations of similarly scaled subunits, such as in a paradigmatic classification (Dunnell 1971), but this has not been done for Lapita motifs. As a workaround, if Lapita motifs are shared owing to shared cultural transmission histories, then we can expect some general spatial patterns of motif similarity when comparing early and late assemblages. Abundant research (see Kirch 1997) indicates that the frequency of transmission between local populations in Remote Oceania declined during the first 500 years. Thus, all else being equal, correlations between pairwise distance and motif similarity matrices for early sites should be greater than for late sites, because as transmission between local populations declines, these populations will probably diverge in their production and use of stylistic motifs owing to drift (Dunnell 1978; Rogers & Ehrlich 2008).
4. MATERIAL AND METHODS

(a) Lapita pottery dataset

The dataset of Lapita motifs used here (table 2) is generated from Best’s (1984) archaeological research on Lakeba, an island in eastern Fiji (figure 1). To compare the Lakeba Lapita assemblage with others in the region, Best compiled the abundance of 106 motifs in 17 assemblages through direct analysis of pottery assemblages, examination of motif drawings, and previously published datasets. These assemblages are either from single-component sites with relatively homogeneous cultural deposits such as Mulifanua in Samoa, or from grouped single-component sites in the cases of two sites from New Caledonia (Ile des Pins and Site 13) and four sites from the southeast Solomons (northern Vanuatu, sites SE-RF 2 and 6, SE-SZ 8 and 45). Best placed the single-component sites in an early period based on radiocarbon dates and motif inventories (see also Sand 1997; Green et al. 2008; Rieth & Hunt 2008). Four sites from Tonga are grouped into three temporal assemblages: early, middle and late time periods (see also Burley 1998; Burley & Dickinson 2001; Burley & Connaughton 2007). Three sites from Fiji (Lakeba, Yanuca and Natunuku) are each considered separate assemblages and are divided into early, middle and late periods by Best (see also Clark & Anderson 2009). One site from Fiji, Naigani, is considered a separate assemblage divided into early and middle periods by Best. Precise provenience information (e.g. excavation squares and layers) for each assemblage is given by Best (1984, p. 619, table 9.2). Best’s raw abundance data for motifs are transformed into presence/absence data in table 2.

Oceanic archaeologists will recognize the weakness of this dataset: much archaeological work in Remote Oceania over the last 25 years has produced larger ceramic assemblages with many more recognized motifs, although this recent work lacks any detailed concordance data for the different motif classification systems used. If the research presented here proves worthwhile, the next step is to use a single, comprehensive classificatory system (e.g. Chiu 2003) to quantify all recovered Lapita assemblages in Remote Oceania for phylogenetic analysis.
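The conversion of abundance counts to presence/absence scores described above is simple, but it is the step that fixes the character coding for every later analysis. The Python sketch below shows one way to do it. The assemblage names and counts are invented for illustration and are not Best's data, and the function name is hypothetical.

```python
def to_presence_absence(abundance):
    """Score each motif 1 if its count is above zero, otherwise 0."""
    return {
        assemblage: {motif: int(count > 0) for motif, count in counts.items()}
        for assemblage, counts in abundance.items()
    }


# Hypothetical sherd counts per motif in two assemblages.
ABUNDANCE = {
    "assemblage_1": {"M1": 12, "M2": 0, "M5": 3},
    "assemblage_2": {"M1": 1, "M2": 4, "M5": 0},
}

PRESENCE = to_presence_absence(ABUNDANCE)

if __name__ == "__main__":
    for assemblage, row in PRESENCE.items():
        print(assemblage, row)
```

Note that this coding treats a motif seen once the same as one seen many times, so differences in assemblage size can still affect which motifs are recorded as present.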
(b) Methods: Mantel matrix test

To assess the ability of Lapita motifs to measure transmission, a Euclidean distance dissimilarity matrix was constructed from presence–absence data on the early Lapita motif assemblages at the archaeological sites of Tonga1, Lakeba1, Yanuca1 and Natunuku1 (table 2) and a second dissimilarity matrix was constructed for the late assemblages (Tonga3, Lakeba3, etc.) from these same sites. These are the only sites or areas in the dataset analysed here that contain both early and late Lapita assemblages. The geographical distances between sites were placed in matrices for both straight-line distance and estimated sailing distance (also straight-line, but not over land) and converted to Euclidean dissimilarities. Separate Mantel tests of the correlations between the early and late Lapita motif matrices and both geographical distance matrices were carried out using XLSTAT, a statistical analysis add-in for Microsoft Excel.

(c) Methods: cladistic, NeighborNet and phenetic distance network analyses

To examine how cultural transmission may structure Lapita assemblage similarity, the dataset in table 2 was initially analysed using PAUP* 4.0 (Swofford 2001). To investigate the importance of reticulation events, the same dataset was also analysed using both NeighborNet, as implemented in SplitsTree4 (Huson & Bryant 2006), and our phenetic distance network procedures (see above).

5. RESULTS

Separate Mantel tests of the correlations between the early Lapita motif matrices and both geographical distance matrices indicate that these matrices are correlated (straight-line distance: r = 0.845, two-tailed p value = 0.043; sailing distance: r = 0.855, two-tailed p value = 0.04). Mantel tests of the correlation between the late Lapita motif matrix and both geographical distance matrices indicate that these matrices are not correlated (straight-line distance: r = 0.409, two-tailed p value = 0.425; sailing distance: r = 0.428, two-tailed p value = 0.386).
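The logic of a Mantel test can be sketched in plain Python. The sketch below is not the XLSTAT procedure used for the results above, whose permutation scheme and options may differ. It computes the Pearson correlation between the above-diagonal entries of two distance matrices and, because four sites allow only 24 relabellings, obtains a two-tailed p value by enumerating every permutation. The motif scores and site coordinates are invented for illustration, and all function names are hypothetical.

```python
from itertools import permutations
from math import sqrt


def euclidean_matrix(rows):
    """Pairwise Euclidean distances between equally long numeric vectors."""
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(u, v))) for v in rows]
        for u in rows
    ]


def upper_triangle(matrix, order):
    """Above-diagonal entries after relabelling the sites by `order`."""
    n = len(order)
    return [
        matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)
    ]


def pearson(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_exact(matrix_1, matrix_2):
    """Return (r, two-tailed p) using every relabelling of matrix_2."""
    n = len(matrix_1)
    fixed = upper_triangle(matrix_1, range(n))
    r_observed = pearson(fixed, upper_triangle(matrix_2, range(n)))
    total = extreme = 0
    for order in permutations(range(n)):
        total += 1
        r = pearson(fixed, upper_triangle(matrix_2, order))
        if abs(r) >= abs(r_observed) - 1e-12:
            extreme += 1
    return r_observed, extreme / total


# Hypothetical presence-absence scores (rows are sites, columns are motifs)
# and hypothetical site coordinates in arbitrary units.
MOTIFS = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
]
COORDS = [(0, 0), (1, 0), (3, 0), (4, 0)]

if __name__ == "__main__":
    r, p = mantel_exact(euclidean_matrix(MOTIFS), euclidean_matrix(COORDS))
    print("r =", round(r, 3), "p =", round(p, 3))
```

With only four sites the smallest p value this exact scheme can return is 1/24, so small differences in p between early and late assemblages should be read with that coarse resolution in mind.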
The tests support previous research indicating a decline over time in the frequency of transmission between local populations in Remote Oceania and therefore suggest Lapita motifs measure (through presence–absence) the degree to which assemblage similarity is a product of cultural transmission. Cladistic parsimony analysis of all assemblages (taxa) described by Lapita motifs (characters, unordered) and divided into two time periods produced two unrooted cladograms (length = 225, CI = 0.45, RI = 0.49) that differ only in their resolution of the relationships between the three Lakeba assemblages. The 50 per cent majority-rule consensus cladogram (figure 4a) depicts several relationships between assemblages also identified through other archaeological research. These relationships include the high relative similarity and putative cultural relatedness of the Naigani assemblages and assemblages from archipelagos to the west (Best 1987), and the high similarity shared between Tonga and Samoa
Figure 4. Consensus cladograms (50% majority rule) produced through cladistic analysis of the matrix in table 2. Numbers indicate the frequency of clades defined by the dot appearing in 10 000 replicate matrices. Clades without dots appear at less than 5% frequency in replicate matrices. (a) All assemblages divided by time periods, (b) all assemblages collapsed into single time period, (c) early assemblages and those that are not divided into periods, (d) late assemblages and those that are not divided into periods.
(particularly the late Tongan assemblage), and between Lakeba and Yanuca, respectively. However, regardless of outgroup choice, eastern Fiji (represented by Lakeba) is more closely related to Yanuca, a western Fijian assemblage, than to Tonga. This is not expected from research suggesting east Fijian populations are most closely related to Tongan populations (Burley & Dickinson 2001; Burley et al. 2002). Tree support statistics (CI, RI) for this cladogram are, however, not strong and a bootstrap analysis of 10 000 replicate matrices performed using PAUP* 4.0 default settings shows almost no support for the Samoa–Tonga clade, as it appears in less than 5 per cent of the replicate matrices. Additionally, the bootstrap analysis only weakly supports clades that contain both Yanuca and Lakeba assemblages, and in general, the higher bootstrap percentages towards the tips more strongly support clades comprised of temporally divided assemblages from the same site or island than clades combining assemblages from different regions of Remote Oceania. When temporally divided assemblages are collapsed into a single assemblage, and the presence–absence of motifs re-tabulated, the resulting analysis produces a consensus cladogram from 11 equal length cladograms (length = 152, CI = 0.59, RI = 0.35) that again does not strongly support a
set of branching relationships for Lapita assemblages in Remote Oceania, except for the sister-taxa status of the Naigani and southeast Solomons assemblages (figure 4b). Additionally, the relationship of Tonga to other Remote Oceania Lapita assemblages is still ambiguous. Finally, consensus cladograms generated from matrices including either predominantly early Lapita assemblages (figure 4c; length = 152, CI = 0.59, RI = 0.35) or late Lapita assemblages (figure 4d; length = 101, CI = 0.71, RI = 0.38) suggest different geographical patterns of relatedness in different time periods. Resolved clades in the early Lapita period cladogram (figure 4c) include New Caledonia, the southeast Solomons and Naigani, along with a Samoa–Lakeba clade. Resolved clades in the late Lapita cladogram (figure 4d) include Naigani and the southeast Solomons, Natunuku and Yanuca, and again Samoa and Lakeba. In the late Lapita assemblages, Natunuku and Yanuca in western Fiji are separated from Lakeba in eastern Fiji, as well as Tonga and Samoa. Bootstrap analyses, however, do not strongly support the early and late Lapita cladograms. The poorly resolved consensus cladograms and generally low bootstrap values in these analyses suggest that horizontal transmission may explain similarities between assemblages.
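The resampling step behind the bootstrap values can be illustrated with a short sketch. This is not the PAUP* parsimony bootstrap: it only shows how replicate matrices are built by drawing characters (motif columns) with replacement, and it swaps in a much simpler distance-based proxy for clade support, namely how often two assemblages are each other's nearest neighbour by Hamming distance. The function names and the proxy itself are assumptions made for illustration.

```python
import numpy as np


def hamming_matrix(matrix):
    """Number of characters at which each pair of taxa differs."""
    m = np.asarray(matrix)
    return (m[:, None, :] != m[None, :, :]).sum(axis=2)


def bootstrap_pair_support(matrix, a, b, replicates=1000, seed=0):
    """Share of replicates in which taxa a and b are mutual nearest neighbours.

    A crude stand-in for clade support, not a parsimony bootstrap.
    Ties are broken by row order, which is an arbitrary choice.
    """
    m = np.asarray(matrix)
    rng = np.random.default_rng(seed)
    n_chars = m.shape[1]
    hits = 0
    for _ in range(replicates):
        # Resample characters (columns) with replacement.
        cols = rng.integers(0, n_chars, size=n_chars)
        d = hamming_matrix(m[:, cols]).astype(float)
        np.fill_diagonal(d, np.inf)  # a taxon is not its own neighbour
        if d[a].argmin() == b and d[b].argmin() == a:
            hits += 1
    return hits / replicates
```

A pair that is grouped only because of a few characters will lose support quickly under this kind of resampling, which is the intuition behind reading low bootstrap percentages as weak evidence for a clade.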
To investigate this horizontal transmission hypothesis, network analyses using both NeighborNet, as implemented in SplitsTree4 (Huson & Bryant 2006), and our phenetic distance network procedures were conducted. A neighbornet of all assemblages divided by time periods (figure 5a) places the Lakeba assemblages (east Fiji), Samoa, and the Tonga assemblages into a split separate from west Fijian assemblages, New Caledonia, and the southeast Solomons. A conflicting split groups the Lakeba and Yanuca (west Fiji) assemblages. Samoa can also be added to the Lakeba–Yanuca group, but this increases the phenetic distance linking all taxa in a Lakeba–Yanuca–Samoa group. The neighbornet in figure 5b shows temporally divided assemblages collapsed into single assemblages, and here any split containing Samoa and Tonga must also contain Lakeba and, importantly, Yanuca, suggesting no simple eastern Fiji/western Fiji population structure when examining the relatedness of Fijian, Tongan and Samoan populations (as seen through their ceramics). A nearly identical neighbornet (not shown) is produced when examining the early Lapita assemblages. A neighbornet of predominantly late Lapita assemblages (figure 5c) produces several conflicting splits including Lakeba–Samoa–Tonga or Samoa–Lakeba–Yanuca, although late Lapita Tonga and Samoa share the greatest similarity. These neighbornets demonstrate that when looking at assemblages across Remote Oceania, and irrespective of focusing on early, late or combined Lapita assemblages, conflicting pictures of population structure may be obtained. However, divisions separating east Fiji–Tonga–Samoa from west Fiji (and New Caledonia) are certainly identifiable, particularly in the late Lapita assemblages. Combined with the cladograms and neighbornets, can the mini-max graph method help us generate clear hypotheses about evolutionary relationships in Remote Oceania?
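As a rough guide to what a mini-max graph contains, the sketch below follows one reading of the description in the figure 6 caption: keep every edge whose Hamming distance is no greater than the smallest threshold at which all assemblages are connected. It is an illustrative assumption and not the authors' implementation; the function names and the toy presence/absence vectors are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two presence/absence vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def is_connected(nodes, edges):
    """Depth-first search check that every node can be reached."""
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = nodes[0]
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(nodes)


def minimax_graph(assemblages):
    """Return (threshold, edges) for the smallest connecting threshold.

    `assemblages` maps a name to a 0/1 sequence; `edges` maps a name
    pair to its Hamming distance.
    """
    names = sorted(assemblages)
    dist = {(u, v): hamming(assemblages[u], assemblages[v])
            for u, v in combinations(names, 2)}
    for threshold in sorted(set(dist.values())):
        edges = {pair: d for pair, d in dist.items() if d <= threshold}
        if is_connected(names, edges):
            return threshold, edges
    raise ValueError("need at least two assemblages")


# Hypothetical toy data, not taken from table 2.
toy = {"A": [1, 1, 0, 0], "B": [1, 1, 0, 1],
       "C": [0, 1, 1, 1], "D": [0, 0, 1, 1]}
threshold, edges = minimax_graph(toy)
print(threshold, edges)
```

Because all edges up to the threshold are kept, the result can contain cycles, which is what allows such a graph to display reticulate relationships that a tree cannot.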
The mini-max graph with assemblages collapsed into a single time period (figure 6a) indicates that Tonga is the most weakly connected assemblage in the network, whereas Samoa is similarly connected to eastern (Lakeba) and western (Yanuca) Fijian assemblages. Density in network analysis is the ratio of edges present to the maximum possible edges in a network or some defined group within a network (Wasserman & Faust 1994), and a cohesion index measures the extent to which edges are concentrated within a group relative to between groups (Bock & Husain 1950; Wasserman & Faust 1994). A cohesion index value of 1 indicates no difference in the relative concentration of edges, whereas values above 1 indicate greater within-group concentration of edges and values less than 1 indicate greater between-group concentration of edges. The density of edges in both a Tonga–Lakeba–Samoa group and a Naigani–Yanuca–Natunuku (west Fiji) group is 0.67 while the cohesion indices for these groups are 0.25 and 0.22, respectively. These low cohesion indices indicate greater between-group connections than within groups and suggest that, considered as a whole (figure 6a), transmission during the Lapita period in Remote Oceania was relatively unstructured, at least relative to often recognized groups such as west Fiji and east Fiji–Tonga–Samoa.

Figure 6b presents the early assemblages in a mini-max graph that is largely unchanged from the temporally collapsed mini-max graph in figure 6a. The only difference is the relatively weaker connections between Natunuku and both the southeast Solomons and Naigani. The mini-max graph in figure 6c consists of only the late Lapita assemblages and those not divided by time periods. In this graph, the density of edges in the Tonga–Lakeba–Samoa group is 1.0 (each is connected to the other) and the cohesion index is 0.5. In the Naigani–Yanuca–Natunuku group (west Fiji) density is 0.67 and cohesion is 0.3. The greater density index for the east Fiji–Tonga–Samoa group compared with the west Fiji group is owing to the edge connecting Tonga and Samoa and suggests a relatively greater frequency of transmission within this group in the late Lapita period. The cohesion indices for both groups still indicate greater between-group than within-group transmission and suggest that population structure during the late Lapita period may be more complicated than a division between west Fiji and east Fiji–Tonga–Samoa. The generally higher Hamming distances in the late Lapita period mini-max graph mirror expectations of lower levels of transmission between local populations towards the end of Lapita.

The mini-max graphs identify New Caledonia as consistently connected to the early and late Lapita assemblages throughout Fiji and Samoa, and to Tonga during the late Lapita period. These similarities are not readily apparent in the cladograms and neighbornets and suggest that New Caledonian populations may be more closely related to Fijian and west Polynesian populations throughout the Lapita period than is generally recognized (Sand 2001, 2007; Clark & Murray 2006; cf. Matisoo-Smith & Robins 2004). Interestingly, post-Lapita ceramic surface treatments in New Caledonia and Fiji are also similar, consisting of carved-paddle impressed designs. Best (2002, pp. 29–30) argues that these post-Lapita similarities are explained by cultural transmission. The mini-max graph analysis leads us to hypothesize that New Caledonia shares a relatively high degree of similarity with both west Fijian and east Fijian populations throughout the Lapita period and that this similarity is explained by relatively high levels of horizontal transmission.
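The density values quoted for the three-member groups follow directly from the definition given earlier (edges present divided by the maximum possible edges). The sketch below is a minimal illustration of that ratio only; it does not attempt the cohesion index. The function name is hypothetical, and the edge lists are assumptions inferred from the statement that the Tonga–Samoa edge accounts for the higher late-period density.

```python
from itertools import combinations


def density(group, edges):
    """Edges present within `group` divided by the maximum possible.

    `edges` is a set of frozenset pairs for an undirected graph.
    """
    members = sorted(group)
    if len(members) < 2:
        raise ValueError("density needs at least two members")
    possible = len(members) * (len(members) - 1) / 2
    present = sum(1 for u, v in combinations(members, 2)
                  if frozenset((u, v)) in edges)
    return present / possible


group = {"Tonga", "Lakeba", "Samoa"}

# Assumed edges: two of the three possible links are present.
collapsed = {frozenset(("Lakeba", "Samoa")), frozenset(("Lakeba", "Tonga"))}
print(round(density(group, collapsed), 2))

# Adding a Tonga-Samoa edge makes every member connected to every other.
late = collapsed | {frozenset(("Tonga", "Samoa"))}
print(density(group, late))
```

With only three members, a single additional edge moves the density from two-thirds to one, so differences of this size between small groups should be read cautiously.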
6. CONCLUSION
Cladistic analyses of Remote Oceanic assemblages indicate that a nested hierarchy based on ancestral and derived traits, and therefore possibly a branching mode of evolutionary change, does not account for variation in the presence or absence of Lapita motifs. Lapita motif variation is more probably explained by the rapid colonization of this region and post-colonization transmission between local populations for 200 or more years. NeighborNet analyses also indicate no unambiguous grouping or population structure in the motif data, although groups, such as east Fiji–Tonga–Samoa, identified through other research are visible.
Figure 5. Neighbornets of assemblages described by the presence/absence of Lapita motifs. In (a) all assemblages are divided by time periods and a split containing the Lakeba and Tonga assemblages along with Samoa is highlighted by removing the edges that connect this split to the rest of the neighbornet. A conflicting split grouping Lakeba, Samoa and Yanuca is highlighted by light grey edges. In (b) all assemblages are collapsed into a single time period and to group Tonga and Samoa in the same split, Yanuca must be included (indicated by grey edges). (c) Late assemblages and those that are not divided into periods.
This analysis confirms that horizontal transmission is the best explanation for variation in Lapita motifs in Remote Oceania (cf. Summerhayes 2001). The mini-max graph method, like NeighborNet, explores phenetic similarity and identifies New Caledonia as sharing relatively high similarity with other Remote Oceanic Lapita populations. These same graphs indicate a low level of similarity between Tonga and other populations, a finding also supported by a high bootstrap frequency (72%) for two clades, one including the Tongan assemblages, and another clade containing all others. We propose that variation within Remote Oceanic Lapita material culture may be profitably analysed by including New Caledonian datasets to a greater degree, especially when investigating the putative cultural differences between prehistoric west Fiji and east Fiji–Tonga–Samoa, as well as the Lapita origins of Polynesian society. Finally, a dimensional or paradigmatic (Dunnell 1971) classification of Lapita motif variation across Remote Oceania is required to carry these analyses forward.

This paper has benefited from comments provided by the participants at the AHRC CECD Theme B conference at Missenden Abbey, December 2008. The AHRC CECD provided funding for conference participation and software used in this analysis. The advice and editorial suggestions of Roger Green, John Terrell and two journal reviewers greatly improved the paper. We dedicate this paper to the memory of Roger Green.

Figure 6. Mini-max graphs showing assemblages and edges of the highest Hamming distance or less required to connect all assemblages to the graph. Node positions are a schematic of geographical positions. Edges are grouped into three categories: dashed for highest Hamming distance (i.e. least similar), solid-thin for medium Hamming distance and solid-thick for lowest Hamming distance (i.e. most similar). Bin ranges for these categories were determined by generating a histogram of all Hamming distance scores with three bins and testing the fit of this distribution to a normal distribution (via Kolmogorov–Smirnov test). (a) All assemblages are collapsed into a single time period; (b) early assemblages and those that are not divided into periods; (c) late assemblages and those that are not divided into periods.

REFERENCES
Anson, D. 1983 Lapita pottery of the Bismarck archipelago and its affinities. Sydney, Australia: University of Sydney.
Bentley, R. A. & Shennan, S. J. 2003 Cultural transmission and stochastic network growth. Am. Antiquity 68, 459–485. (doi:10.2307/3557104)
Best, S. B. 1984 Lakeba: the prehistory of a Fijian island. PhD thesis, University of Auckland, New Zealand.
Best, S. 1987 Long-distance obsidian travel and possible implications for the settlement of Fiji. Archaeol. Oceania 22, 31–32.
Best, S. 2002 Lapita: a view from the east. Auckland, New Zealand: Archaeological Association.
Bock, R. D. & Husain, S. Z. 1950 An adaptation of Holzinger’s B-coefficients for the analysis of sociometric data. Sociometry 13, 146. (doi:10.2307/2784941)
Borgatti, S., Everett, M. & Freeman, L. 2010 UCINET 6 for Windows. Lexington, KY: Analytic Technologies.
Borgerhoff-Mulder, M. 2001 Using phylogenetically based comparative methods in anthropology: more questions than answers. Evol. Anthropol. 10, 99–111. (doi:10.1002/evan.1020)
Borgerhoff Mulder, M., Nunn, C. N. & Towner, M. C. 2006 Cultural macroevolution and the transmission of traits. Evol. Anthropol. 15, 52–64. (doi:10.1002/evan.20088)
Boyd, R. & Richerson, P. J. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.
Boyd, R., Borgerhoff-Mulder, M., Durham, W. H. & Richerson, P. J. 1997 Are cultural phylogenies possible? In Human by nature (eds P. Weingart, S. D. Mitchell, P. J. Richerson & S. Maasen). Mahwah, NJ: Lawrence Erlbaum.
Bryant, D. & Moulton, V. 2004 NeighborNet: an agglomerative algorithm for the construction of planar phylogenetic networks. Mol. Biol. Evol. 21, 255–265. (doi:10.1093/molbev/msh018)
Burley, D. V. 1998 Tongan archaeology and the Tongan past. J. World Prehist. 12, 337–392. (doi:10.1023/A:1022322303769)
Burley, D. V. & Connaughton, S. 2007 First Lapita settlement and its chronology in Vava’u, Kingdom of Tonga. Radiocarbon 49, 131–137.
Burley, D. V. & Dickinson, W. R. 2001 Origin and significance of a founding settlement in Polynesia. Proc. Natl Acad. Sci. USA 98, 11 829–11 831. (doi:10.1073/pnas.181335398)
Burley, D. V. & Dickinson, W. R. 2009 Among Polynesia’s first pots. J. Archaeol. Sci. 37, 1020–1026. (doi:10.1016/j.jas.2009.12.002)
Burley, D. V., Storey, A. & Witt, J. 2002 On the definition and implication of eastern Lapita ceramics in Tonga. In Fifty years in the field. Essays in honour and celebration of Richard Shutler Jr’s archaeological career (eds S. Bedford, C. Sand & D. V. Burley). Auckland, New Zealand: Archaeological Association.
Carrington, P. J., Scott, J. & Wasserman, S. (eds) 2005 Models and methods in social network analysis. Cambridge, UK: Cambridge University Press.
Chiu, S. 2003 The socio-economic functions of Lapita ceramic production and exchange: a case study from Site WKO013A, Koné, New Caledonia. PhD thesis, University of California, Berkeley, CA.
Clark, G. R. & Anderson, A. 2009 Site chronology and review of radiocarbon dates from Fiji. In The early prehistory of Fiji (eds G. Clark & A. Anderson). Canberra, Australia: Australian National University E Press.
Clark, G. R. & Murray, T. 2006 Decay characteristics of the eastern Lapita design system. Archaeol. Oceania 41, 107–117.
Cochrane, E. E. 2004 Explaining cultural diversity in ancient Fiji: the transmission of ceramic variability. DPhil. dissertation. University of Hawaii, Honolulu, HI.
Cochrane, E. E. 2008 Migration and cultural transmission: investigating human movement as an explanation for Fijian ceramic change. In Cultural transmission in archaeology: issues and case studies (ed. M. J. O’Brien). Washington, DC: Society for American Archaeology.
Cochrane, E. E. 2009 The evolutionary archaeology of ceramic diversity in ancient Fiji. Oxford, UK: Archaeopress.
Collard, M. & Shennan, S. 2000 Processes of culture change in prehistory: a case study from the European Neolithic. In Archaeogenetics: DNA and the population prehistory of Europe (eds C. Renfrew & K. Boyle). Oxford, UK: McDonald Institute for Archaeological Research.
Donovan, L. J. 1973 A study of the decorative system of the Lapita potters in reefs and Santa Cruz islands. Masters essay, University of Auckland, New Zealand.
Dunnell, R. C. 1971 Systematics in prehistory. New York, NY: The Free Press.
Dunnell, R. C. 1978 Style and function: a fundamental dichotomy. Am. Antiquity 43, 192–202. (doi:10.2307/279244)
Gray, R. D. & Jordan, F. M. 2000 Language trees support the express-train sequence of Austronesian expansion. Nature 405, 1052–1055. (doi:10.1038/35016575)
Gray, R. D., Greenhill, S. J. & Ross, R. M. 2007 The pleasures and perils of Darwinizing culture (with phylogenies). Biol. Theory 2, 360–375. (doi:10.1162/biot.2007.2.4.360)
Green, R. C. 1979 Lapita. In The prehistory of Polynesia (ed. J. D. Jennings). Cambridge, MA: Harvard University Press.
Green, R. C. 1995 Linguistic, biological, and cultural origins of the original inhabitants of Remote Oceania. N. Z. J. Archaeol. 17, 5–27.
Green, R. C. 2003 The Lapita horizon and traditions – signature for one set of Oceanic migrations. In Pacific archaeology: assessments and prospects (ed. C. Sand). Nouméa, New Caledonia: Service des Musées et du Patrimoine de Nouvelle-Calédonie.
Green, R. C., Jones, M. & Sheppard, P. 2008 The reconstructed environment and absolute dating of SE-SZ-8 Lapita site on Nendö, Santa Cruz, Solomon Islands. Archaeol. Oceania 43, 49–61.
Guglielmino, C. R., Viganotti, C. & Cavalli-Sforza, L. L. 1995 Cultural variation in Africa: role of mechanisms of transmission and adaptation. Proc. Natl Acad. Sci. USA 92, 7585–7589. (doi:10.1073/pnas.92.16.7585)
Hage, P. & Harary, F. 1996 Island networks: communication, kinship, and classification structures in Oceania. Cambridge, UK: Cambridge University Press.
Hamming, R. 1980 Coding and information theory. Upper Saddle River, NJ: Prentice-Hall.
Hurles, M. E., Matisoo-Smith, E., Gray, R. D. & Penny, D. 2003 Untangling Oceanic settlement: the edge of the knowable. Trends Ecol. Evol. 18, 531–540. (doi:10.1016/S0169-5347(03)00245-3)
Huson, D. H. & Bryant, D. 2006 Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267. (doi:10.1093/molbev/msj030)
Jordan, P. & Shennan, S. 2003 Cultural transmission, language, and basketry traditions amongst the California Indians. J. Anthropol. Archaeol. 22, 42–74. (doi:10.1016/S0278-4165(03)00004-7)
Kay, R. 1984 Analysis of archaeological material from Naigani. Masters thesis, University of Auckland, New Zealand.
Kimbel, W. H., Lockwood, C. A., Ward, C. V., Leakey, M. G., Rak, Y. & Johanson, D. C. 2006 Was Australopithecus anamensis ancestral to A. afarensis? A case of anagenesis in the hominin fossil record. J. Hum. Evol. 51, 134–152. (doi:10.1016/j.jhevol.2006.02.003)
Kirch, P. 1997 The Lapita peoples. Oxford, UK: Blackwell.
Kirch, P. V. 1984 The evolution of the Polynesian chiefdoms. Cambridge, UK: Cambridge University Press.
Kirch, P. V. 1988 Niuatoputapu, the prehistory of a Polynesian chiefdom. Seattle, WA: Thomas Burke Memorial Washington State Museum.
Kirch, P. V. & Green, R. C. 1987 History, phylogeny, and evolution in Polynesia. Curr. Anthropol. 28, 431–456. (doi:10.1086/203547)
Kirch, P. V. & Green, R. C. 2001 Hawaiki, ancestral Polynesia: an essay in historical anthropology. Cambridge, UK: Cambridge University Press.
Lipo, C. P. 2006 The resolution of cultural phylogenies using graphs. In Mapping our ancestors: phylogenetic methods in anthropology and prehistory (eds C. P. Lipo, M. J. O’Brien, M. Collard & S. Shennan). New York, NY: Aldine de Gruyter.
Lipo, C. P., O’Brien, M. J., Collard, M. & Shennan, S. (eds) 2006 Mapping our ancestors: phylogenetic methods in anthropology and prehistory. New York, NY: Aldine de Gruyter.
Mace, R. & Pagel, M. 1994 The comparative method in anthropology. Curr. Anthropol. 35, 549–564. (doi:10.1086/204317)
Matisoo-Smith, E. & Robins, J. H. 2004 Origins and dispersals of Pacific peoples: evidence from mtDNA phylogenies of the Pacific rat. Proc. Natl Acad. Sci. USA 101, 9167–9172. (doi:10.1073/pnas.0403120101)
McMahon, A. & McMahon, R. 2005 Language classification by numbers. Oxford, UK: Oxford University Press.
Mead, S. M., Birks, L., Birks, H. & Shaw, E. 1973 The Lapita pottery style of Fiji and its associations. Auckland, New Zealand: The Polynesian Society.
Neiman, F. 1995 Stylistic variation in evolutionary perspective: inferences from decorative diversity and interassemblage distance in Illinois woodland ceramic assemblages. Am. Antiquity 60, 7–36. (doi:10.2307/282074)
Nunn, C. L., Arnold, C., Matthews, L. & Mulder, M. B. 2010 Simulating trait evolution for cross-cultural comparison. Phil. Trans. R. Soc. B 365, 3807–3819. (doi:10.1098/rstb.2010.0009)
O’Brien, M. J., Darwent, J. & Lyman, R. L. 2001 Cladistics is useful for reconstructing archaeological phylogenies: palaeoIndian points from the southeastern United States. J. Archaeol. Sci. 28, 1115–1136. (doi:10.1006/jasc.2001.0681)
Pagel, M. & Meade, A. 2004 A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Syst. Biol. 53, 571–581. (doi:10.1080/10635150490468675)
Pocklington, R. 2006 What is a culturally transmitted unit, and how can we find one? In Mapping our ancestors: phylogenetic methods in anthropology and prehistory (eds C. P. Lipo, M. J. O’Brien, M. Collard & S. Shennan). New York, NY: Aldine de Gruyter.
Poulsen, J. 1987 Early Tongan prehistory: the Lapita period on Tongatapu and its relationships. Canberra, Australia: Australian National University.
Rexova, K., Frynta, D. & Zrzavy, J. 2003 Cladistic analysis of languages: Indo-European classification based on lexicostatistical data. Cladistics 19, 120. (doi:10.1111/j.1096-0031.2003.tb00299.x)
Rieth, T. M. & Hunt, T. H. 2008 A radiocarbon chronology for Sāmoan prehistory. J. Archaeol. Sci. 35, 1901–1927. (doi:10.1016/j.jas.2007.12.001)
Rogers, D. S. & Ehrlich, P. R. 2008 Natural selection and cultural rates of change. Proc. Natl Acad. Sci. USA 105, 3416–3420. (doi:10.1073/pnas.0711802105)
Sand, C. 1997 The chronology of Lapita ware in New Caledonia. Antiquity 71, 539–547.
Sand, C. 2001 Evolutions in the Lapita cultural complex: a view from the southern Lapita province. Archaeol. Oceania 36, 65–76.
Sand, C. 2007 Looking at the big motifs: a typology of the central band decorations of the Lapita ceramic tradition of New Caledonia (Southern Melanesia) and preliminary regional comparisons. In Oceanic explorations: Lapita and western Pacific settlement (eds S. Bedford, C. Sand & S. P. Connaughton). Canberra, Australia: Australian National University E Press.
Scott, J. 2000 Social network analysis: a handbook. London, UK: Sage.
Shennan, S. & Collard, M. 2005 Investigating processes of cultural evolution on the north coast of New Guinea with multivariate and cladistic analyses. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. Shennan). London, UK: UCL Press.
Siorat, J. P. 1990 A technological analysis of Lapita pottery decoration. In Lapita design, form, and composition (ed. M. Spriggs). Canberra, Australia: Australian National University.
Skelton, C. 2008 Methods of using phylogenetic systematics to reconstruct the history of the Linear B script. Archaeometry 50, 158–177.
Sokal, R. R. & Sneath, P. H. A. 1963 Principles of numerical taxonomy. London, UK: W. H. Freeman.
Specht, J. 2004 Lapita, the Solomons, and similarity measures. J. Polynesian Soc. 113, 369–376.
Spriggs, M. 1984 The Lapita cultural complex: origins, distribution, contemporaries, and successors. J. Pac. Hist. 19, 202–223.
Summerhayes, G. R. 2001 Far western, western, and eastern Lapita: a re-evaluation. Asian Perspect. 39, 109–138. (doi:10.1353/asi.2000.0013)
Swofford, D. L. 2001 PAUP*: Phylogenetic analysis using parsimony and other methods, 4.0 edn. Sunderland, MA: Sinauer Associates.
Tehrani, J. & Collard, M. 2002 Investigating cultural evolution through biological phylogenetic analyses of Turkmen textiles. J. Anthropol. Archaeol. 21, 443–463. (doi:10.1016/S0278-4165(02)00002-8)
Tëmkin, I. & Eldredge, N. 2007 Phylogenetics and material culture evolution. Curr. Anthropol. 48, 146–153. (doi:10.1086/510463)
Terrell, J. E. (ed.) 2001 Archaeology, language, and history. Westport, CT: Bergin and Garvey.
Terrell, J., Hunt, T. L. & Gosden, C. 1997 The dimensions of social life in the Pacific: human diversity and the myth of the primitive isolate. Curr. Anthropol. 38, 155–195. (doi:10.1086/204604)
Tolstoy, P. 2008 Barkcloth, Polynesia and cladistics: an update. J. Polynesian Soc. 117, 15–57.
Wasserman, S. & Faust, K. 1994 Social network analysis: methods and applications. Cambridge, UK: Cambridge University Press.
Phil. Trans. R. Soc. B (2010) 365, 3903–3912 doi:10.1098/rstb.2010.0014
Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits
Thomas E. Currie1,2,*, Simon J. Greenhill3 and Ruth Mace1
1 Human Evolutionary Ecology Group, Department of Anthropology, University College London, London WC1H 0BW, UK
2 Hasegawa Laboratory, Evolutionary Cognitive Science Research Center 1F, Building 17, Department of Cognitive and Behavioral Science, Graduate School of Arts and Sciences, University of Tokyo, Tokyo 153-8902, Japan
3 Department of Psychology, University of Auckland, Auckland 1142, New Zealand

Phylogenetic comparative methods (PCMs) provide a potentially powerful toolkit for testing hypotheses about cultural evolution. Here, we build on previous simulation work to assess the effect horizontal transmission between cultures has on the ability of both phylogenetic and non-phylogenetic methods to make inferences about trait evolution. We found that the mode of horizontal transmission of traits has important consequences for both methods. Where traits were horizontally transmitted separately, PCMs accurately reported when trait evolution was not correlated even at the highest levels of horizontal transmission. By contrast, linear regression analyses often incorrectly concluded that traits were correlated. Where simulated trait evolution was not correlated and traits were horizontally transmitted as a pair, both methods inferred increased levels of positive correlation with increasing horizontal transmission. Where simulated trait evolution was correlated, increasing rates of separate horizontal transmission led to decreasing levels of inferred correlation for both methods, but increasing rates of paired horizontal transmission did not. Furthermore, the PCM was also able to make accurate inferences about the ancestral state of traits. These results suggest that under certain conditions, PCMs can be robust to the effects of horizontal transmission. We discuss ways that future work can investigate the mode and tempo of horizontal transmission of cultural traits.

Keywords: horizontal transmission; phylogenetic comparative methods; cultural evolution; cultural phylogenetics; Galton’s problem; ancestral state
1. INTRODUCTION Phylogenetic methods offer a potentially powerful toolkit with which to test hypotheses about cultural evolution (Mace & Pagel 1994; Mace & Holden 2005; Gray et al. 2007). Broadly speaking, phylogenetic analyses can be classified into two categories: (i) data can be used to construct phylogenetic trees and information about the structure of these trees can be used to make inferences about the sequence and timing of divergence of the units under investigation and (ii) traits can be mapped onto phylogenetic trees to make comparative inferences about trait evolution. In this paper, we are concerned with the second kind of analysis, and we examine the application of phylogenetic comparative methods (PCMs) to the study of cultural evolution. PCMs were originally developed in the biological sciences to control for the problem of non-independence of taxa owing to their historical
* Author for correspondence ([email protected]). One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
relatedness (Harvey & Pagel 1991). A similar problem exists when performing cross-cultural comparisons where two or more traits may co-occur across cultures because they have inherited the traits from a common ancestor and not because they are functionally linked. Performing cross-cultural comparisons without correcting for this non-independence may lead to spurious results. This was noted by Sir Francis Galton early in the history of anthropology, and is now known as Galton’s problem. PCMs provide a solution to Galton’s problem while still allowing the comparison of closely related societies (Mace & Pagel 1994). Furthermore, by explicitly accounting for similarity owing to common ancestry, PCMs allow us to answer questions that are simply not possible with more traditional techniques (e.g. contingency tables or simple correlations). Such questions include: What is the ancestral state of a particular cultural trait? Are traits X and Y coevolving? Is a particular pathway of trait evolution more likely than another? What is the rate of change of a certain cultural trait? Analyses involving PCMs therefore provide a valuable complement to archaeological investigations, and may be particularly
This journal is © 2010 The Royal Society
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
T. E. Currie et al.
Horizontal transmission simulation
important when there is a lack of evidence from the archaeological record. As with any analytical approach, the application of PCMs to cultural data involves a number of assumptions. The units of analysis in such studies are populations that are marked by differences in culturally acquired symbols (e.g. dress, language, rituals etc.; Barth 1969; McElreath et al. 2003). For clarity and consistency of expression, in this paper we will refer to these groups as cultures, while recognizing that in practice the identification of such units (as is the case with the identification of biological species) is by no means straightforward. Phylogenetic trees used in these studies reflect the hypothesized historical relationships between cultures, with the internal nodes of these trees representing hypothesized ancestral cultures (Holden & Shennan 2005; Jordan et al. 2009). Traits analysed in these studies are cultural behaviours, beliefs or practices that are defined according to the analysis we wish to conduct, and generally the variation in these traits is greater between rather than within cultures (Holden & Shennan 2005). In previous studies, these traits have predominantly been related to social structure (e.g. an estimation of ancestral states of residence in Austronesian-speaking societies; Jordan et al. 2009) or subsistence practices (e.g. an analysis of the coevolution of cattle and patriliny in Bantu-speaking populations; Holden & Mace 2003). These traits can vary either discretely (e.g. matrilocal versus patrilocal, pastoralist versus non-pastoralist) or more rarely continuously (e.g. sex ratio at birth; Mace & Jordan 2005). One reason this phylogenetic comparative approach has been advocated as a promising method for investigating cultural diversity is because of similarities between biological and cultural evolution (Mace & Holden 2005; Gray et al. 2007). 
Human cultural and linguistic evolution, like biological evolution, is a process of descent with modification: traits are passed on from one generation to the next and can change over time. Furthermore, new human groups are often formed by the splitting of a population to form daughter populations in a manner that is analogous to biological speciation (Mace 2005). Over time, these processes lead to variation between cultures and languages. When analysing real-world data, the historical relationships are usually inferred using phylogenetic trees built from linguistic data, although other forms of data could be used (Rogers et al. 2009). Previous studies using PCMs on cultural data have largely been concentrated on three large language groupings (Austronesian: Jordan et al. 2009; Bantu: Holden & Mace 2003; Indo-European: Pagel & Meade 2005). Each of these groupings is generally thought by linguists to descend from a common ancestor in a tree-like manner (Joseph 2003), and has been hypothesized to result from agriculturally driven population expansions (Diamond 1997). There has been considerable debate as to the extent to which different aspects of human populations (culture, genes and language) follow the same patterns of inheritance (O’Brien & Lyman 2005). It has been argued that processes such as frequency dependence, conformism and other transmission-isolating mechanisms can constrain Phil. Trans. R. Soc. B (2010)
language and culture to be predominantly inherited from one generation to the next within populations (Durham 1992; Holden & Shennan 2005; Jordan et al. 2009). In support of this idea, an analysis of 277 African cultures by Guglielmino et al. (1995) found that linguistic affiliation was a strong predictor of the variation in many traits in Murdock’s Ethnographic Atlas. While some traits also showed associations with geographical distance and environment, variation in family and kinship traits was only associated with language affiliation. It is possible, however, for cultural and linguistic traits to be transmitted horizontally between cultures (e.g. writing systems are thought to have been borrowed from culture to culture from a handful of independent origins; Diamond 1997). Such cultural borrowing violates the assumptions of phylogenetic methods, which some have argued makes these kinds of analyses unsuitable for cultural systems (Moore 1994; Temkin & Eldredge 2007). A recent simulation study by Nunn et al. (2006) seemed to confirm these suspicions. Nunn et al. simulated the evolution of cultures bearing continuous cultural traits, where different degrees of horizontal trait transmission between cultures could occur. The results of these simulations were analysed using a PCM known as independent contrasts (Felsenstein 1985), and a Mantel test, which is a statistical procedure for accounting for shared ancestry and geographical proximity (Mantel 1967). These were compared with the results of performing a linear regression on the raw simulated data. When there was no horizontal transmission of cultural traits, the phylogenetic method showed fewer incidences of type I errors than the simple regression analysis. However, when horizontal transmission did occur, both phylogenetic and nonphylogenetic methods showed similar levels of positive correlation. 
These results led the authors to conclude that phylogeny-based methods should be used ‘only under restrictive conditions when the traits in question are transmitted vertically’. (Nunn et al. 2006, p. 201). We agree that such simulation studies are important if we are to understand cultural evolutionary processes more completely. In another paper, we have used simulations to explore the separate but related question of how horizontal transmission affects our ability to reconstruct phylogenetic trees from cultural data (Greenhill et al. 2009). Our results suggest that building phylogenetic trees from cultural data is robust to even quite high levels of horizontal transmission. There are several reasons to be sceptical about the strong conclusion drawn by Nunn and colleagues that any degree of horizontal transmission invalidates the application of PCMs. First, in their simulation, the PCM did not perform worse than standard regression, even under very high levels of horizontal transmission. Second, the simulation contains a number of parameters that must be defined by the user. In such simulations it is important to use a range of values to show the effect that each parameter can have on the outcome. While the study varies a number of these parameters, other parameters are left unchanged, and we note that the final number of cultures to be analysed is fixed at 36. The number of
cultures is a particularly important parameter as phylogenetic methods, like other statistical methods, are more likely to detect a true correlation between traits when a larger number of cases is analysed (Cohen 1988). Furthermore, while the horizontal transmission rate begins at a low absolute value, it is not stated what this translates into in terms of the actual number of transmission events. Finally, and most importantly, although two forms of horizontal transmission were simulated (traits transmitted separately or as a pair), only the results from paired transmission were presented and discussed. In this study, we use the same simulation program used by Nunn et al. (2006) to explore what effect altering these parameters has on the results of analyses conducted using PCMs. As PCMs allow us to make inferences about trait evolution that are not possible in a traditional analysis, we also examined the effect horizontal transmission has on the ability of a PCM to accurately reconstruct the ancestral state of the simulated traits.

2. METHODS
We used the same simulation program developed by Nunn and colleagues to enable direct comparisons between our results and those of the previous study. Here, we provide a brief outline of the simulation set-up (see also Nunn et al. 2006). The simulation consists of four main processes: cultures can diversify to create new cultures, cultures can go extinct, continuous cultural traits can change over time and the cultural traits can be transmitted to other adjacent contemporary cultures. The simulation starts with an empty grid of a user-defined number of columns and rows. Each cell of this grid is able to contain one ‘culture’. A single culture is placed in the middle cell of the first column. This culture contains two continuous traits, X and Y. New cultures are created via a splitting process.
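The grid set-up and the diversification and extinction processes listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program of Nunn et al. (2006): the function names, the dictionary representation of a culture, the use of orthogonal (four-neighbour) adjacency, the random choice among empty neighbouring cells and the row chosen as the 'middle cell' on a grid with an even number of rows are all hypothetical choices made for the example. The edges are treated as bounded, as stated for the grids used in this study.

```python
import random


def make_grid(n_rows, n_cols):
    """Return an empty grid (None = empty cell) holding one founder culture."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    # Assumption: the founder sits in row n_rows // 2 of the first column,
    # with both traits starting at 0.
    grid[n_rows // 2][0] = {"X": 0.0, "Y": 0.0}
    return grid


def neighbours(row, col, n_rows, n_cols):
    """Orthogonally adjacent cells, clipped at the grid edges (no torus)."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [(row + dr, col + dc) for dr, dc in steps
            if 0 <= row + dr < n_rows and 0 <= col + dc < n_cols]


def diversify(grid, p_diversify, rng):
    """Each culture next to an empty cell may copy itself into one such cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    occupied = [(r, c) for r in range(n_rows) for c in range(n_cols)
                if grid[r][c] is not None]
    for r, c in occupied:
        empty = [(nr, nc) for nr, nc in neighbours(r, c, n_rows, n_cols)
                 if grid[nr][nc] is None]
        if empty and rng.random() < p_diversify:
            nr, nc = rng.choice(empty)
            # The daughter culture starts with the parent's trait values.
            grid[nr][nc] = dict(grid[r][c])


def go_extinct(grid, p_extinct, rng):
    """Each culture disappears with probability p_extinct, emptying its cell."""
    for row in grid:
        for c, cell in enumerate(row):
            if cell is not None and rng.random() < p_extinct:
                row[c] = None
```

Calling `diversify` and then `go_extinct` once per generation, with a seeded `random.Random` instance as `rng`, gives a reproducible toy version of the population process. Recording the tree of surviving cultures is left out of this sketch.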
In each generation, if a culture is adjacent to an empty grid cell, then it can diversify into that cell with a certain probability (i.e. a new culture, with identical values for X and Y as the parent culture, is created and placed into the cell). This situation approximates the kind of population expansions that are thought to have led to the present-day distribution of many widespread language families (Diamond 1997). In each generation of the simulation, a culture also has a certain probability of going extinct. In this simulation, this is assumed to represent the death of all individuals in a particular culture, but such cultural extinction could also result from individuals switching their cultural traits to those of another culture (Nunn et al. 2006). If a culture goes extinct, the cell it was inhabiting becomes empty. The program runs for 60 generations and the history of the surviving cultures in each simulation run is recorded, and is represented as a single bifurcating tree.

(a) Trait change and horizontal transmission
At the end of each generation, the two continuously distributed traits change under a model of Brownian motion with a specified variance of trait change per
generation. The two traits change with a user-specified degree of correlation (r) that can range from 0 (no correlation) to 1 (completely correlated). We follow Nunn et al. in using values of trait correlation of 0, 0.3 and 0.6. The traits may also undergo horizontal transmission where the value of the trait in the recipient culture is replaced by the value of that trait in the donor culture. Traits can either be horizontally transmitted separately (e.g. trait X can be borrowed without trait Y) or as a pair (if trait X is borrowed then so is trait Y). Therefore, there are four scenarios for trait evolution: (i) trait change is correlated (r = 0.3 or 0.6) and traits are horizontally transmitted as a pair, (ii) trait change is correlated but traits are horizontally transmitted separately, (iii) trait change is not correlated (r = 0) and traits are horizontally transmitted as a pair, and (iv) trait change is not correlated but traits are horizontally transmitted separately. In their previous study, Nunn and colleagues concentrated on the results from the analyses of traits that have undergone paired horizontal transfer. Previous studies using PCMs have focused on the evolution of features related to social organization, and we are not aware of any anthropological context in which such traits are likely to be evolving independently and yet be consistently borrowed together. Therefore, in this paper, we present the results of both these forms of horizontal trait transmission. We explored the effect that three parameters have on the ability of a PCM to make accurate inferences about trait evolution: (i) the degree and nature of horizontal transmission, (ii) the extinction rate of cultures, and (iii) the final number of cultures.
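The trait-change and borrowing rules described above can be illustrated with a short Python sketch. It is an illustration only, not the code of the simulation program: the function names are hypothetical, the correlated increments are generated from two independent standard normal draws, and in the 'separate' mode it is assumed that each borrowing event copies exactly one trait chosen at random.

```python
import math
import random


def brownian_step(x, y, r, variance, rng):
    """Advance traits X and Y by one generation of correlated Brownian motion.

    Both increments have the given variance and their expected
    correlation is r (0 <= r <= 1).
    """
    sd = math.sqrt(variance)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    dx = sd * z1
    # Mixing z1 and z2 in this way gives corr(dx, dy) = r.
    dy = sd * (r * z1 + math.sqrt(1.0 - r * r) * z2)
    return x + dx, y + dy


def horizontal_transfer(recipient, donor, mode, rng):
    """Overwrite trait value(s) in the recipient with those of the donor.

    mode="paired": both X and Y are copied together.
    mode="separate": one trait, chosen at random, is copied (an assumption).
    """
    if mode == "paired":
        recipient["X"], recipient["Y"] = donor["X"], donor["Y"]
    elif mode == "separate":
        trait = rng.choice(["X", "Y"])
        recipient[trait] = donor[trait]
    else:
        raise ValueError("mode must be 'paired' or 'separate'")
```

With `r = 0` and `mode="paired"` this reproduces scenario (iii): the traits drift independently within a lineage, yet every borrowing event moves them as a unit.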
(b) Horizontal transmission probability One of our aims was to assess the effect of horizontal transmission on phylogenetic methods at levels lower than those investigated by Nunn and colleagues, who used horizontal transmission probabilities between 0.004 and 0.15 per generation, which translates to approximately 25 – 900 horizontal transmission events per simulation for 36 cultures (approx. 75 – 2400 events for 100 cultures). We simulated data with horizontal transmission probabilities of 0, 0.0008, 0.0016, 0.0024 and 0.0032 per culture per generation (approx. 0, 5, 10, 15 and 20 horizontal transmission events per simulation for 36 cultures, and 0, 15, 30, 45 and 60 for 100 cultures). In order to provide a direct comparison with the previous study, we also conducted simulations with horizontal transmission probabilities within the range used by Nunn and colleagues (0.0048, 0.0064, 0.008, 0.02, 0.06, 0.15). (c) Extinction probability We conducted simulations under two extinction regimes using the largest (0.32) and smallest (0.02) values of the extinction parameter used by Nunn and colleagues (i.e. each culture has a 32% or 2% chance of going extinct each generation). The main effect this has in relation to the current study is to produce two very different types of tree shape (figure 1). Under the low-extinction condition, the trees are characterized by a burst of diversification near the
Figure 1. Examples of the types of trees for 36 taxa that are generated in the simulation program under (a) the low extinction probability (0.02) and (b) the high extinction probability (0.32).
root and very long branch lengths. In the high-extinction trees, much of the diversification of the extant cultures occurs closer to the tips of the tree, which should lead to higher degrees of autocorrelation between trait values and therefore more false positives when analyses are performed without controlling for phylogeny.

(d) Number of cultures
It is a condition of the simulation program created by Nunn et al. for the grid to be full at the end of the simulation in order for the results to be analysed. This was implemented to ensure that the final number of cultures was kept constant across simulations. Therefore, the final number of cultures is the same as the number of cells in the grid, which in the case of all simulations conducted by Nunn et al. was 36. Here, we conduct simulations that result in a final number of 36 cultures, but also expand the final number of cultures to 100. Both set-ups are conducted on square grids (i.e. 36 cultures on a 6 × 6 grid, 100 cultures on a 10 × 10 grid). The grids are bounded at the edges and therefore do not form a torus. The larger number of cases that can be analysed, as with other statistical procedures, should increase the statistical power of phylogenetic methods to accurately detect true correlations between traits (Cohen 1988). Parameter values are shown in table 1, with differences between parameter values used in the present study and those used by Nunn et al. highlighted.

(e) Analysis
For each simulation, the values of the traits X and Y were analysed in two ways. First, a regression analysis was performed on the raw values of the traits. Second, the data were analysed using the PCM CONTINUOUS (Pagel 1997, 1999) as implemented in the package BAYESTRAITS (available from www.evolution.rdg.ac.uk; Pagel & Meade 2005). CONTINUOUS estimates the values of parameters of a specified model of trait evolution that maximize the likelihood of observing the data, given the phylogenetic tree.
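The expectation stated in (d), that analysing 100 rather than 36 cultures should increase statistical power, can be illustrated with a textbook approximation. The sketch below uses the Fisher z transformation for a correlation estimated from n independent observations. It is not a calculation from this study and it ignores phylogenetic structure and horizontal transmission entirely; the function names and the fixed critical value are assumptions made for the example.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def approx_power(r, n, z_crit=1.959964):
    """Approximate two-sided power to detect a true correlation r.

    Assumes n independent observations and the Fisher z approximation;
    z_crit is the two-sided 5 per cent critical value of the normal.
    """
    shift = math.atanh(r) * math.sqrt(n - 3)
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


# Compare the two sample sizes used in the simulations.
for n_cultures in (36, 100):
    for r in (0.3, 0.6):
        print(n_cultures, r, round(approx_power(r, n_cultures), 3))
```

Under these simplifying assumptions, a true correlation of 0.3 falls short of the conventional power threshold of 0.8 with 36 cases but exceeds it with 100, which is the direction of effect anticipated in the text.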
To test whether traits are coevolving or not, we specify a model that estimates the covariation between the traits, and then compare the maximum likelihood of the data under this model with the maximum likelihood of the data under a model in which the
Table 1. Parameter values used in the present study. Parameter values that differ from those used by Nunn et al. (2006) are in bold.

simulation parameter                      values used in this study
grid size                                 6 × 6, 10 × 10
probability of diversification            0.96
probability of extinction                 0.02, 0.32
number of generations                     60
number of simulations                     1000
correlation between traits                0, 0.3, 0.6
variance of trait change                  0.02
probability of horizontal transmission    0, 0.0008, 0.0016, 0.0024, 0.0032, 0.0048, 0.0064, 0.008, 0.02, 0.06, 0.15
mode of horizontal transmission           paired, separate
covariation parameter is constrained to be zero. The likelihood of the two models can be compared via the likelihood ratio (LR) test, which is defined as

$$\mathrm{LR} = -2 \log_e \left[ \frac{H_0}{H_1} \right],$$

where $H_0$ represents the likelihood of the simpler (null) model of uncorrelated trait evolution and $H_1$ represents that of the alternative model of trait coevolution. These models are nested, as the no-covariation model is a simpler version of the covariation model. Therefore, the LR statistic approximates a $\chi^2$ distribution with degrees of freedom equal to the difference in the number of parameters between the two models. In this case the difference is 1, as only the covariation parameter is different between the dependent and independent models of trait evolution. CONTINUOUS has a number of advantages over the independent contrasts method used by Nunn et al., and allows for a more complete exploration of evolutionary processes in real-world datasets (Freckleton 2009). However, under the conditions in which it is applied in this study, CONTINUOUS will produce results that are equivalent to those produced by independent contrasts (Pagel 1999), so differences in results cannot be attributed to differences in the particular PCM being used. Across our range of simulations, we assessed the ability of both the phylogenetic and the
non-phylogenetic procedures to correctly determine whether the two traits are correlated or not. We calculated the proportion of simulations in which the two methods (non-phylogenetic: regression; and phylogenetic: CONTINUOUS) detect a significant correlation between the traits. Under the condition where traits were simulated with no correlation between them, we assessed the proportion of type I errors (α), i.e. false positives: simulations for which a significant association between traits was incorrectly detected. Following standard statistical testing procedures, a significance level of p = 0.05 was set, which means we would expect 5 per cent of our simulations to produce false positives simply by chance. Where a correlation between traits was simulated, we assessed the proportion of type II errors (β), i.e. false negatives: simulations for which, incorrectly, no significant association between the variables was found. A suitable level of statistical power (1 − β) is conventionally set at 0.8 (Cohen 1988); therefore, if the statistical procedures used in this study are performing satisfactorily, they should correctly detect a correlation between traits in at least 80 per cent of the simulations. We could then compare the error rates of the non-phylogenetic and phylogenetic analyses. In addition to testing the ability of CONTINUOUS to detect correlated evolution, we also assessed how well it was able to estimate the value of trait X at the root of each tree (i.e. the ancestral value of X). All simulations start with a value for both traits of 0, so we wanted to see how well CONTINUOUS is able to reconstruct this value and to assess what effect horizontal transmission has on this estimation.
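The likelihood ratio test and the tallying of significant results described in this section can be sketched as follows. This is a hedged illustration rather than the procedure implemented in BAYESTRAITS: the function names are hypothetical, the inputs are assumed to be maximized log-likelihoods of the two nested models, and the tail probability uses the closed form that holds for a chi-squared distribution with one degree of freedom.

```python
import math


def lr_statistic(lnl_null, lnl_alt):
    """LR = -2 ln(L0 / L1), written in terms of log-likelihoods."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 d.f."""
    if x <= 0.0:
        return 1.0
    # For 1 d.f., P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def proportion_significant(lr_values, alpha=0.05):
    """Fraction of simulations in which the covariation model is preferred."""
    hits = sum(1 for lr in lr_values if chi2_sf_df1(lr) < alpha)
    return hits / len(lr_values)
```

An LR of about 3.84 corresponds to p = 0.05 with one degree of freedom. Applied to simulations with r = 0, `proportion_significant` estimates the type I error rate; applied to simulations with r > 0, it estimates statistical power, and one minus that value is the type II error rate.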
3. RESULTS
(a) Assessing trait correlations
The proportion of simulations for which a positive association between traits X and Y was found is shown in figure 2. Under the condition in which traits were simulated with zero correlation (r = 0), strikingly different results were produced depending on whether traits were horizontally transmitted separately or as a pair. Where traits were transmitted separately, the phylogenetic method showed fewer false positives than the non-phylogenetic method under all levels of horizontal transmission and under both extinction rates. Although some parameter combinations show false positives for CONTINUOUS above the acceptable level of 0.05, the difference is small and, importantly, false positives are not increasing systematically with the level of horizontal transmission. These results indicate that, when traits are transmitted separately, PCMs are robust to high levels of horizontal transmission. As predicted, the simple regression analysis on data generated under the high extinction rate gives a much higher proportion of false positives than on data generated under the low extinction rate. Where traits were horizontally transmitted as a pair, both phylogenetic and non-phylogenetic methods displayed increasing levels of positive correlations as the degree of horizontal transmission increased. The performance of both methods is somewhat similar under the low-extinction condition, but the phylogenetic method returned markedly fewer positive
correlations than the non-phylogenetic method under the high-extinction condition. Where traits were simulated as coevolving (r = 0.3, r = 0.6), the statistical power of both phylogenetic and non-phylogenetic methods is clearly affected by a number of factors. The relationship between increasing horizontal transmission and the proportion of false negatives shows different patterns depending on the mode of horizontal transmission (figure 2b). Where traits were transmitted separately, the general pattern is for both phylogenetic and non-phylogenetic methods to return increasing levels of false negatives as horizontal transmission increases. The proportion of false negatives increases more rapidly under the low extinction rate. Where traits were transmitted as a pair, the pattern of false negatives and horizontal transmission is more complex, but generally analyses performed using 100 cultures show a lower proportion of errors, with the phylogenetic analysis showing fewer false negatives than the non-phylogenetic analysis. Where traits were simulated under a higher correlation coefficient (r = 0.6), both methods were better able to detect this correlation, particularly at lower levels of horizontal transmission. CONTINUOUS generally gave fewer false negatives than the regression analysis, especially under the high-extinction condition (figure 2c). Again, the mode of horizontal transmission has the biggest effect on the results. Where traits were horizontally transmitted separately, the proportion of analyses suggesting no correlation between traits begins to increase substantially at the higher levels of horizontal transmission. Where traits were horizontally transmitted as a pair, the increase in false negatives was less marked, and the proportion of both types of analyses suggesting that traits are correlated is above 0.8 for all parameter combinations.
(b) Estimation of ancestral value of traits The mean and standard deviation of the estimate of the ancestral value of trait X are shown in figure 3. Importantly, the mean estimates of the value at the root of the tree do not show any consistent pattern with increasing horizontal transmission (figure 3b). Furthermore, the standard deviation of this estimate does not increase to a large degree until the very highest levels of horizontal transmission (figure 3a). Accuracy in the root estimate (as measured by the standard deviation) is increased under the low extinction rate compared with the high extinction rate. Ancestral state estimates are also more accurate with 100 cultures compared with 36 cultures within each extinction rate. 4. DISCUSSION In this study, we used simulations to assess the impact of horizontal transmission on the ability of PCMs to answer questions about cultural evolution. We have shown that horizontal transmission does not necessarily lead PCMs to draw inaccurate inferences about trait evolution. This result is in contrast to conclusions drawn by Nunn et al. (2006), who only examined
Figure 2. Comparison of the performance of phylogenetic (filled circles) and non-phylogenetic (linear regression; open circles) methods under simulated correlation coefficients (r) of (a) 0, (b) 0.3 and (c) 0.6 and differing final numbers of cultures (solid lines, 36 cultures; dashed lines, 100 cultures). For r = 0, the horizontal bar represents an acceptable proportion of type I errors of 0.05 (points above this line indicate elevated levels of type I errors). For r = 0.3 and 0.6, the bar represents an acceptable level of statistical power of 0.8 (points below this line indicate elevated levels of type II errors). Grey areas indicate values of horizontal transmission simulated in Nunn et al. (2006). [Panels: traits horizontally transmitted separately and as a pair, at extinction rates of 0.02 and 0.32. Horizontal axis: horizontal transmission level (0 to 0.15). Vertical axis: proportion of significant correlations (0 to 1.0).]
Figure 3. (a) Standard deviation and (b) mean of estimates of the ancestral value of trait X at the root of the tree. Grey areas indicate values of horizontal transmission simulated in Nunn et al. (2006). (N.B. Values at the tips of the tree were in the approximate range of ±3.) Number of cultures: solid line, 36; dashed line, 100. Extinction rate: filled circles, 0.32; open circles, 0.02. [Horizontal axis in both panels: horizontal transmission level (0 to 0.15). Vertical axes: standard deviation (approx. 0.2 to 0.8) and mean (approx. −0.10 to 0.10).]
situations in which traits were horizontally transferred as a pair. When traits were horizontally transferred separately, different results were obtained. Under these conditions, the phylogenetic method correctly determined when two traits had been evolving independently, while the non-phylogenetic method regularly returned spurious correlations. This difference in performance was more marked in analyses involving the high-extinction-rate trees. While separate horizontal transmission did lead to a decrease in the ability of PCMs to detect correlated traits, non-phylogenetic methods also show a decrease in performance. Our results suggest that horizontal transmission is not necessarily more likely to lead to erroneous conclusions of correlated trait evolution, and that PCMs outperform methods that do not account for shared ancestry. The phylogenetic method was able to make accurate estimations of the ancestral state under all levels of horizontal transmission. The error in these estimates is noticeably larger in the more structured, high-extinction trees. Additionally, a larger number of cultures led to lower estimation errors within each extinction condition. Interestingly, horizontal transmission did not lead to estimates of the value of the ancestral state to be consistently higher or lower than their true value. Furthermore, the variation in estimates was only substantially increased at the very highest levels of horizontal transmission. This kind of ‘virtual archaeology’, the ability to make inferences about the past states of cultural traits based on their present day distribution, is one of the key benefits to using phylogenetic techniques and is especially valuable if the traits of interest have left no mark in the archaeological record, or where finding historical or archaeological evidence is difficult or impractical. 
Fortunato & Jordan (2010) describe in more detail their work in using PCMs to reconstruct ancestral states of residence in Austronesian-speaking and Indo-European-speaking cultures. As with other statistical techniques, increasing the number of observations (in this case, ‘extant’ cultures)
gives more statistical power. In our simulations, increasing the final number of cultures from 36 to 100, while keeping all other parameters constant, led to reduced numbers of false negatives for both techniques. Previous studies using PCMs have used samples of 135 (Jordan et al. 2009), 68 (Holden & Mace 2003), 74 (Mace & Jordan 2005) and 35 cultures (Borgerhoff Mulder et al. 2001). Our results suggest that it is advisable to analyse as many cultures as possible in order to reduce the potentially negative effect that horizontal transmission has on the ability of PCMs to correctly detect true correlations between traits, and to reduce error in ancestral state estimations. In these simulations, different patterns of results were found depending on whether traits were horizontally transmitted separately or as a pair. These differences are due to the role the different forms of horizontal transmission have in making traits appear correlated or in breaking up traits that have been evolving together. In these simulations, there are three processes that lead to values of traits appearing correlated: (i) correlated trait change owing to traits coevolving (defined by the parameter ‘r’), or non-independence of observations owing to (ii) vertical inheritance from a common ancestor or (iii) paired horizontal transmission. If traits are not really correlated (r = 0) and horizontally transmitted separately, then the only source of false correlations is inheritance from a common ancestor. This is controlled for in the analyses by the PCM but not the linear regression. If traits are not really correlated but are horizontally transmitted as a pair, then both vertical and horizontal transmission lead to traits appearing correlated. At low levels of horizontal transmission, the cultural phylogeny and the trait phylogeny will be largely congruent and the PCM can control for the non-independence introduced by vertical transmission.
However, the PCM picks up the correlation introduced by paired horizontal transmission, which comes to dominate as horizontal transmission increases. If traits are really correlated (r > 0) and
horizontally transmitted separately, then horizontal transmission will break up traits that have been evolving together. At the highest levels of horizontal transmission, traits may spend so little time together that their ability to coevolve would in reality diminish, so it is not clear whether the results characterized as false negatives are really ‘false’. Finally, if traits are really correlated and horizontally transmitted as a pair, then horizontal transmission will not break up traits that have been evolving together. If two traits are consistently borrowed together, then this is suggestive of some kind of functional association between the traits. The results presented here show that PCMs can detect this linkage. However, the models in which trait change is correlated, but horizontal transmission is separate, and in which trait change is not correlated, but horizontal transmission is paired, are probably unrealistic, particularly in relation to the kind of cultural traits that have previously been the subject of analyses employing PCMs. Nunn et al. (2006) themselves suggest that future work could apply a more realistic model of horizontal trait transmission so that the probability of the different modes of horizontal transfer is proportional to the degree of correlation between the traits. Under such a model, traits simulated with no correlation between them would have a higher probability of being horizontally transferred separately rather than as a pair. Conversely, this model would give a higher probability of paired horizontal transmission where trait evolution is correlated. Our results show that these are the conditions under which PCMs are likely to be most accurate. The fact that techniques such as regression incorrectly interpret phylogenetic clustering in data as correlation is why PCMs were developed and continue to be used routinely in comparative studies in evolutionary biology (Freckleton 2009).
We show here that the enhanced performance of PCMs over non-phylogenetic analyses holds true even at the highest rates of horizontal transmission. Despite Galton’s problem being a well-known issue in cross-cultural research (Mace & Pagel 1994), it appears its importance is not always fully appreciated. For example, Younger (2008) recently performed a series of regression analyses on data from Polynesian cultures, which have long been known to be related phylogenetically (Kirch & Green 1997), without performing any form of phylogenetic correction. The results of the present study show that we cannot have confidence in results from such studies that do not attempt to control for phylogeny. In this study, the historical relationships between cultures were simulated as a bifurcating phylogenetic tree and were known without error. However, in real-world analyses, there is often uncertainty associated with any single phylogenetic tree. This may be due to practical issues such as a lack of sufficient data with which to infer relationships (Holden et al. 2005) or non-tree-like processes in population history such as dialect continua (Gray et al. 2010). The strength of support for any hypothesis about the phylogenetic relationships between societies can be assessed by examining the strength of support for nodes in the
trees (Holden & Shennan 2005) or by viewing network representations that show the presence of conflicting signal in the data (Gray et al. 2010). Furthermore, the latest Bayesian phylogenetic methods produce a posterior sample of most-likely trees rather than a single tree, which represents our uncertainty about the historical relationships between cultures (Holden et al. 2005). This uncertainty can be taken into account in PCM analyses by integrating the analysis across the whole-tree sample to produce posterior distributions of parameter estimates (Pagel & Meade 2005). This approach is quite conservative, as it is likely to reduce the confidence associated with any particular estimation of a parameter in a PCM analysis. For example, Jordan et al. (2009) estimated the ancestral states of residence in Austronesian societies at various nodes in a posterior sample of most-likely trees, multiplying the probability associated with an estimated ancestral state by the proportion of trees in the sample for which that node existed. Therefore, there is no reason to suspect that incorporating this uncertainty into PCM analyses is more likely to lead to erroneous conclusions that two traits have coevolved when they have not, or to provide strong evidence in favour of an incorrect ancestral state. The results of these simulations show that under certain conditions the inferences from PCMs are robust to the occurrence of horizontal transmission. An important issue that needs to be addressed is how often horizontal transmission occurs and what form it takes in real-world cultural systems. A distinction should be made between horizontal transmission within and between cultures. Even if cultural traits have very high rates of horizontal transmission within cultures, as long as rates of transmission between cultures are sufficiently low, then population-level differences between lineages can emerge and be maintained, making the application of phylogenetic techniques appropriate. 
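The weighting attributed above to Jordan et al. (2009), in which the probability of an ancestral state is multiplied by the proportion of sampled trees that contain the node, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `combined_support` is hypothetical, and averaging the per-tree state probabilities over the trees that contain the node is an assumption of this sketch rather than a detail given in the text.

```python
from statistics import mean

def combined_support(state_probs, n_trees_in_sample):
    """Weight an ancestral-state probability by node support.

    state_probs holds one probability of the focal state per sampled tree in
    which the node exists; trees lacking the node contribute nothing.
    """
    if n_trees_in_sample <= 0:
        raise ValueError("the tree sample must contain at least one tree")
    if not state_probs:
        return 0.0  # node absent from every sampled tree
    node_support = len(state_probs) / n_trees_in_sample
    # Assumption of this sketch: summarise the per-tree probabilities by their mean.
    return mean(state_probs) * node_support

# Hypothetical numbers: the node appears in 3 of 4 sampled trees.
support = combined_support([0.9, 0.8, 0.7], n_trees_in_sample=4)
```

Because the node-support factor can never exceed one, the combined value is never larger than the average state probability, which is the sense in which the approach is conservative.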
In real-world human populations, processes such as frequency-dependent selection and conformism can act to reduce horizontal transmission of cultural traits between populations even in the face of physical migration between groups (Mace 2005). We note that appreciation of this idea has led to a resurgence of interest in group-level evolutionary processes in cultural evolution (Boyd & Richerson 1985; Boyd et al. 2003). These kinds of processes rely on there being widespread mechanisms that maintain cultural differences between groups. The ability to detect phylogenetic signal in cultural data may provide some supporting evidence for this view. At present, we know surprisingly little about the rate and mode of transmission of cultural traits between cultures, and we concur with Nunn et al. (2006) that there is a great need for researchers to assess this issue empirically. So-called ‘Jungle’ methods, which can estimate the number of horizontal transfer events required to reconcile two incongruent trees (Page & Charleston 1998), may provide one technique for addressing this question. A recent study by Temkin & Eldredge (2007) employed this method and estimated 12 such events in the phylogenetic history of 38 brass instruments over a time period of about 175 years. Future work could employ such methods and other appropriate techniques to make
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
Figure 4. Different modes of horizontal transmission of cultural traits. (a) Adaptive traits may spread quickly among some cultures, whereas (b) other traits may move between cultures at a relatively constant rate, as in the simulations used in this study.
empirical estimates of rates of horizontal transmission and to ascertain to what extent traits get transferred as a package. It is also important to point out that most empirical studies employing cultural PCMs have modelled traits as discretely, rather than continuously, distributed. We suggest future efforts are best directed at exploring the cultural transmission of discrete traits as this is where the majority of theoretical and empirical studies of cultural phylogenetics are concentrated. There are good reasons to think that different types of cultural traits will show drastically different rates of horizontal transfer between cultures. We feel the term ‘diffusion’, which is often used to describe such transfer, carries the connotation of a passive process that does not capture the difficulty or scale of the changes that might occur were certain traits to be borrowed. Traits are adopted or not adopted for different reasons and the rate of transmission of some traits cannot be generalized to all traits. Fads or fashions with little impact on fitness or with few costs associated with adoption might move between cultures readily, but such traits are rarely the subject of coevolutionary anthropological hypotheses. Other traits might be borrowed at a high rate because they are strongly functional innovations. For example, some aspects of technology for which the benefits of adopting the novel trait from another culture are readily apparent may show such high rates of adoption by neighbouring cultures owing to what is termed ‘direct bias’ (Boyd & Richerson 1985). However, in some cases, traits may well be beneficial if adopted but cultural processes such as conformity may prevent these traits from being transmitted between cultures. Rogers (1995) describes how attempts by a public health worker over a 2-year period to encourage the boiling of water in a Peruvian village failed because of cultural beliefs that linked hot foods to illness. 
Traits such as those that are used as badges of group identity, those for which the benefits are not readily ascertainable or which are only beneficial in certain ecological situations may show low rates of horizontal transfer. In an analysis of the global distribution of traits relating to social structure, Murdock concluded that ‘the forms of social organization seem singularly impervious to diffusion’ (Murdock 1949, p. 196).
T. E. Currie et al.
In some cases, horizontal transmission may actually help us to detect functional relationships between cultural traits. Mace & Pagel (1994) use the example of camels being adopted by different pastoralist groups as a classic case of horizontal transmission, which in fact provides us with a great opportunity to seek correlated evolution. Holden & Mace (2003) show that when cattle are adopted by Bantu groups (borrowed in all cases), then the society is more likely to become patrilineal—the high rate of horizontal transmission of cattle-keeping provides the many changes in the subsistence system on the phylogenetic tree that then lead to opportunities to observe changes in the social system, hence helping generate the evidence for the coevolution of the two traits. These different processes described above will lead to fundamentally different patterns of horizontal transmission. In the simulations presented here, both traits are simply transmitted between adjacent cultures unsystematically and at the same rate, i.e. they are not borrowed because of their inherent superiority or because they are more suitable for a certain environment. Such a process would lead to the kind of patterns of borrowing shown in figure 4b. However, if beneficial traits are likely to be taken up by surrounding societies, this will lead to the kind of pattern of horizontal transmission shown in figure 4a. While the rate of transmission may be high over a short time frame, the rate of change over the entire phylogeny may be quite low. Modern likelihood techniques allow for variation in the rate of trait change across the different parts of the tree (Pagel 1999), which may help to resolve this particular problem. Clearly, the mode of horizontal transmission is an important issue, and future simulations should explore the effect different processes of horizontal transmission discussed above have on the application of PCMs to cultural data. 
In summary, the simulations reported here suggest that PCMs can make accurate estimations about cultural evolutionary processes even when traits have been transmitted horizontally. Contrary to the conclusions drawn by Nunn et al., who only analysed cases where traits were transmitted horizontally as a pair, when traits were horizontally transmitted separately, the PCM did not show inflated levels of false positives, whereas non-phylogenetic methods produced more false positives. Additionally, horizontal transmission did not produce systematic errors in the ancestral state estimation. It is this ability of PCMs to provide answers to questions that are not possible with more traditional analyses that should be most exciting to cross-cultural researchers and those interested in prehistory, and we strongly recommend that researchers explore phylogenetic approaches to investigating cultural diversity.
The authors would like to thank Monique Borgerhoff Mulder and Charles Nunn for making the simulation code available. We would also like to thank Charles Nunn and two anonymous reviewers for helpful comments on earlier versions of this manuscript. T.E.C. was funded by NERC/ESRC Interdisciplinary Research Studentship and a JSPS Post-doctoral Fellowship. S.J.G. was supported by the Royal Society of New Zealand Marsden Fund.
REFERENCES
Barth, F. 1969 Introduction. Ethnic groups and boundaries. London, UK: Allen and Unwin.
Boyd, R. & Richerson, P. 1985 Culture and the evolutionary process. Chicago, IL: Chicago University Press.
Boyd, R., Gintis, H., Bowles, S. & Richerson, P. J. 2003 The evolution of altruistic punishment. Proc. Natl Acad. Sci. USA 100, 3531–3535. (doi:10.1073/pnas.0630443100)
Cohen, J. 1988 Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates.
Diamond, J. 1997 Guns, germs and steel. London, UK: Jonathan Cape.
Durham, W. H. 1992 Applications of evolutionary culture theory. Annu. Rev. Anthropol. 21, 331–355. (doi:10.1146/annurev.an.21.100192.001555)
Felsenstein, J. 1985 Phylogenies and the comparative method. Am. Nat. 125, 1–15.
Fortunato, L. & Jordan, F. 2010 Your place or mine? A phylogenetic comparative analysis of marital residence in Indo-European and Austronesian societies. Phil. Trans. R. Soc. B 365, 3913–3922. (doi:10.1098/rstb.2010.0017)
Freckleton, R. P. 2009 The seven deadly sins of comparative analysis. J. Evol. Biol. 22, 1367–1375. (doi:10.1111/j.1420-9101.2009.01757.x)
Gray, R. D., Greenhill, S. J. & Ross, R. M. 2007 The pleasures and perils of Darwinizing culture (with phylogenies). Biol. Theory 2, 360–375. (doi:10.1162/biot.2007.2.4.360)
Gray, R. D., Bryant, D. & Greenhill, S. J. 2010 On the shape and fabric of human history. Phil. Trans. R. Soc. B 365, 3923–3933. (doi:10.1098/rstb.2010.0162)
Greenhill, S. J., Currie, T. E. & Gray, R. D. 2009 Does horizontal transmission invalidate cultural phylogenies? Proc. R. Soc. B 276, 2299–2306. (doi:10.1098/rspb.2008.1944)
Guglielmino, C. R., Viganotti, C., Hewlett, B. & Cavalli-Sforza, L. L. 1995 Cultural variation in Africa: role of mechanisms of transmission and adaptation. Proc. Natl Acad. Sci. USA 92, 7585–7589. (doi:10.1073/pnas.92.16.7585)
Harvey, P. H. & Pagel, M. D. 1991 The comparative method in evolutionary biology. New York, NY: Oxford University Press.
Holden, C. J. & Gray, R. D. 2006 Rapid radiation borrowing and dialect continua in the Bantu languages. In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew). Cambridge, UK: McDonald Institute for Archaeological Research.
Holden, C. J. & Mace, R. 2003 Spread of cattle led to the loss of matrilineal descent in Africa: a coevolutionary analysis. Proc. R. Soc. Lond. B 270, 2425–2433. (doi:10.1098/rspb.2003.2535)
Holden, C. J. & Shennan, S. 2005 Introduction to part I: how tree-like is cultural evolution? In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. Shennan). London, UK: UCL Press.
Holden, C. J., Meade, A. & Pagel, M. 2005 Comparison of maximum parsimony and Bayesian Bantu language trees. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. Shennan). London, UK: UCL Press.
Jordan, F. M., Gray, R. D., Greenhill, S. J. & Mace, R. 2009 Matrilocal residence is ancestral in Austronesian societies. Proc. R. Soc. B 276, 1957–1964. (doi:10.1098/rspb.2009.0088)
Joseph, B. D. 2003 Historical linguistics. In The handbook of linguistics (eds M. Aronoff & J. Rees-Miller). Oxford, UK: Blackwell Publishers Ltd.
Phil. Trans. R. Soc. B (2010)
Kirch, P. V. & Green, R. C. 1997 History, phylogeny, and evolution in Polynesia. Curr. Anthropol. 33, 161–186. (doi:10.1086/204023)
Mace, R. 2005 A phylogenetic approach to the evolution of cultural diversity. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. Shennan). London, UK: UCL Press.
Mace, R. & Holden, C. J. 2005 A phylogenetic approach to cultural evolution. Trends Ecol. Evol. 20, 116–121. (doi:10.1016/j.tree.2004.12.002)
Mace, R. & Jordan, F. M. 2005 The evolution of human sex-ratio at birth: a biocultural analysis. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. Shennan). London, UK: UCL Press.
Mace, R. & Pagel, M. 1994 The comparative method in anthropology. Curr. Anthropol. 35, 549–564. (doi:10.1086/204317)
Mantel, N. 1967 The detection of disease clustering and a generalized regression approach. Cancer Res. 27, 209.
McElreath, R., Boyd, R. & Richerson, P. J. 2003 Shared norms and the evolution of ethnic markers. Curr. Anthropol. 44, 122–130. (doi:10.1086/345689)
Moore, J. H. 1994 Putting anthropology back together again: the ethnogenetic critique of cladistic theory. Am. Anthropol. 96, 925–948. (doi:10.1525/aa.1994.96.4.02a00110)
Mulder, M. B., George-Cramer, M., Eshleman, J. & Ortolani, J. 2001 A study of East African kinship and marriage using phylogenetically controlled comparison. Am. Anthropol. 103, 1059–1082.
Murdock, G. P. 1949 Social structure. New York, NY: MacMillan.
Nunn, C. L., Mulder, M. B. & Langley, S. 2006 Comparative methods for studying cultural trait evolution: a simulation study. Cross-Cult. Res. 40, 177–209. (doi:10.1177/1069397105283401)
O’Brien, M. J. & Lyman, R. L. 2005 Cultural phylogenetic hypotheses in archaeology: some fundamental issues. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. Holden & S. Shennan). London, UK: UCL Press.
Page, R. D. M. & Charleston, M. A. 1998 Trees within trees: phylogeny and historical associations. Trends Ecol. Evol. 13, 356–359. (doi:10.1016/S0169-5347(98)01438-4)
Pagel, M. 1997 Inferring evolutionary processes from phylogenies. Zool. Scripta 26, 331–348. (doi:10.1111/j.1463-6409.1997.tb00423.x)
Pagel, M. 1999 Inferring the historical patterns of biological evolution. Nature 401, 877–884. (doi:10.1038/44766)
Pagel, M. & Meade, A. 2005 Bayesian estimation of correlated evolution across cultures: a case study of marriage systems and wealth transfer at marriage. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. Shennan). London, UK: UCL Press.
Rogers, E. M. 1995 Diffusion of innovations. New York, NY: Free Press.
Rogers, D. S., Feldman, M. W. & Ehrlich, P. R. 2009 Inferring population histories using cultural data. Proc. R. Soc. B 276, 3835–3843. (doi:10.1098/rspb.2009.1088)
Temkin, I. & Eldredge, N. 2007 Phylogenetics and material cultural evolution. Curr. Anthropol. 48, 146–153. (doi:10.1086/510463)
Younger, S. M. 2008 Conditions and mechanisms for peace in precontact Polynesia. Curr. Anthropol. 49, 927–934. (doi:10.1086/591276)
Phil. Trans. R. Soc. B (2010) 365, 3913–3922 doi:10.1098/rstb.2010.0017
Your place or mine? A phylogenetic comparative analysis of marital residence in Indo-European and Austronesian societies
Laura Fortunato1,2,*,† and Fiona Jordan1,2,‡
1 Department of Anthropology, University College London, 14 Taviton Street, London WC1H 0BW, UK
2 AHRC Centre for the Evolution of Cultural Diversity, Institute of Archaeology, University College London, 31–34 Gordon Square, London WC1H 0PY, UK
Accurate reconstruction of prehistoric social organization is important if we are to put together satisfactory multidisciplinary scenarios about, for example, the dispersal of human groups. Such considerations apply in the case of Indo-European and Austronesian, two large-scale language families that are thought to represent Neolithic expansions. Ancestral kinship patterns have mostly been inferred through reconstruction of kin terminologies in ancestral proto-languages using the linguistic comparative method, and through geographical or distributional arguments based on the comparative patterns of kin terms and ethnographic kinship ‘facts’. While these approaches are detailed and valuable, the processes through which conclusions have been drawn from the data fail to provide explicit criteria for systematic testing of alternative hypotheses. Here, we use language trees derived using phylogenetic tree-building techniques on Indo-European and Austronesian vocabulary data. With these trees, ethnographic data and Bayesian phylogenetic comparative methods, we statistically reconstruct past marital residence and infer rates of cultural change between different residence forms, showing Proto-Indo-European to be virilocal and Proto-Malayo-Polynesian uxorilocal. The instability of uxorilocality and the rare loss of virilocality once gained emerge as common features of both families. Keywords: phylogenetic comparative methods; cultural phylogenetics; post-marital residence; Indo-European; Austronesian; human social organization
1. INTRODUCTION Marital residence norms are an important determinant of human kinship organization. By regulating the movement of people, these norms shape the pattern of genetic variation within and across populations. Knowledge of this feature of social organization is therefore crucial to our understanding of human demographic history (e.g. Seielstad et al. 1998; Wilkins & Marlowe 2006). Until recently, our ability to specify the behavioural strategies of people in prehistory was speculative at best. Ancestral kinship patterns have been inferred (i) through reconstruction of kin terminologies in ancestral proto-languages using the linguistic comparative method and (ii) through geographical or distributional arguments based on the comparative patterns of kin terms and ethnographic kinship ‘facts’ (e.g. Murdock 1949; Blust 1980; Mallory 1997). We stress that these approaches are
* Author for correspondence ([email protected]).
† Present address: Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA.
‡ Present address: Evolutionary Processes in Language and Culture, Max Planck Institute for Psycholinguistics, PB 310, Nijmegen, The Netherlands.
Electronic supplementary material is available at http://dx.doi.org/10.1098/rstb.2010.0017 or via http://rstb.royalsocietypublishing.org.
One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.
detailed and carefully described, and valuable to the study of prehistory. However, the processes through which conclusions have been drawn from the data fail to provide explicit criteria for systematic testing of alternative hypotheses. Accurate reconstruction of prehistoric social organization is important if we are to put together satisfactory multidisciplinary scenarios about, for example, the dispersal of human groups. Such considerations apply in the case of Indo-European (IE) and Austronesian (AN), two large-scale language families that are thought to be Neolithic expansions associated with new domestication technologies (Diamond & Bellwood 2003). In this paper, we discuss the results of phylogenetic comparative analyses of marital residence in societies speaking IE and AN languages, and show how this approach progresses speculation into a testable scientific framework. Work in both language families illustrates the importance of correctly inferring ancestral social organization. For example, Gimbutas (1991) excluded Anatolia and southeastern and central Europe as potential homelands of the IE language family on the grounds that the Neolithic societies of these regions were ‘matrifocal’, while early IE society was reconstructed as practising virilocality (i.e. residence of married couples with or near the husband’s kin) and patrilineality (Mallory 1997). However, interpretations of the linguistic evidence seem to be strongly biased towards virilocality by the purportedly
This journal is © 2010 The Royal Society
L. Fortunato & F. Jordan Indo-European and Austronesian residence
‘male-centred’ structure of early IE society (Clackson 2007; e.g. Anthony 2007). The prevalence of virilocality among the historically attested IE societies is used to bolster these interpretations (e.g. Mallory 1997), but the ethnographic evidence can also be used to support alternative scenarios. For example, based on cross-cultural variation in social systems, Murdock (1949, p. 349) reconstructed ‘an Eskimo type of social structure in the prehistory of the Indo-European peoples’, which is characterized in its typical form by monogamous marriage, independent nuclear families and neolocal residence (Murdock 1949). While diverse in terms of marital residence norms, AN societies are noted for their flexibility (Lane 1961) and their ‘matricentric orientation’, that is, a theme of uxorilocality (i.e. residence of married couples with or near the wife’s kin) and matrilineality (Burton et al. 1996). Pacific scholars have debated the nature of early AN social organization for many years with little apparent consensus (Van Wouden 1935 [1968]; Murdock 1949; Blust 1980). As with IE, inferences have relied on linguistic reconstructions and inferences from comparative ethnography. More recently, Hage (1998) and Marck (Hage & Marck 2003; Marck 2008) hypothesized that uxorilocality characterized ancestral Oceanic society (the branch of the family including Polynesian and other Remote Oceanic societies). A matri-biased social organization in ancestral Oceanic peoples would therefore have restricted female genetic diversity while increasing male diversity as non-AN men married in. Uxorilocality is thus consistent with the divergent mtDNA and Y-chromosome patterns seen in Pacific human genetics, and some geneticists are beginning to work within this paradigm (e.g. Kayser et al. 2008; for a review see Hurles et al. 2003). 
These examples illustrate how putative ancestral kinship patterns are invoked to constrain hypotheses (as in IE) or to explain conflicting evidence (as in AN) about the past. Resolving questions about past social structure will thus play a large part in correctly describing population prehistory. Phylogenetic comparative methods offer a rigorous statistical framework for reconstructing the pattern of change in cultural traits, and provide insights into features of social organization that are not preserved in the archaeological or historical records (Mace & Pagel 1994). Even within a small literature, phylogenetic comparative analyses of human cultural traits have concentrated on aspects of kinship (Fortunato 2008). The fact that kinship systems are organized in a restricted set of all the combinatorial possibilities available (e.g. Nerlove & Romney 1967) suggests that selective forces are at work to optimize these features of human social behaviour. Because they involve a co-dependent set of individuals, at the group level, kinship norms are likely to be stable for many generations, particularly if they represent effective behavioural strategies (e.g. Guglielmino et al. 1995). It follows that if human societies have shared trajectories of cultural evolution, kinship is a likely locus for us to discover such commonalities. Here, we focus our discussion on comparison of the inferred patterns of change in marital residence across the two ethno-linguistic groups; the reconstructions for the two ethno-linguistic groups are discussed in detail in Fortunato (in press) for IE and in Jordan et al. (2009) for AN. Following Richards (1950), who noted that matrilineal systems involve tensions between male authority and the female focus of kinship relations, we suggest that one predicted commonality might be higher rates of change away from uxorilocal residence to other forms.
2. THE PHYLOGENETIC COMPARATIVE APPROACH Phylogenetic comparative methods work by reconstructing evolutionary pathways that are likely to have produced the observed distribution of traits of interest across a sample of taxa. This requires a phylogenetic tree representing the historical relationships among the taxa, and a model of how the traits have evolved on the tree (Felsenstein 1985; Harvey & Pagel 1991). In a cross-cultural framework, the model of trait evolution is inferred statistically from ethnographic comparative data mapped onto phylogenetic trees derived from genetic or linguistic data (Mace & Pagel 1994; e.g. Holden & Mace 1997, 2003; Fortunato et al. 2006). Here, we reconstruct ancestral states of marital residence across samples of societies speaking IE and AN languages, using comparative data from ethnographic sources and phylogenetic trees derived from linguistic data. We use phylogenetic comparative methods in a Bayesian Markov chain Monte Carlo (MCMC) framework (Pagel et al. 2004; Pagel & Meade 2005, 2006) to estimate the posterior probability distributions of parameters of interest to the comparative question (e.g. ancestral state probabilities at internal nodes, rates of trait change). The posterior probability of a parameter value is a quantity proportional to its likelihood of having produced the observed data, and represents the probability of the parameter value given the data and model of trait evolution (Huelsenbeck et al. 2001; Lewis 2001).
Because posterior probabilities cannot feasibly be computed analytically, posterior probability distributions are inferred instead using an MCMC sampling algorithm. This distributional approach provides information about the degree of statistical uncertainty in the cultural trait reconstructions. Relatedly, this approach makes it possible to account for the effect of uncertainty in the phylogenetic tree model representing population history, a non-trivial consideration in the study of cultural traits as a single branching tree is unlikely to accurately represent human population history (Boyd et al. 1997): the estimation of parameters over a probability sample of trees yields estimates that are not dependent on any specific phylogenetic hypothesis. Finally, parameters can be estimated over different models of trait evolution, and this yields estimates that are not dependent on any specific model of how the cultural traits have evolved. (a) Tree samples We used the posterior probability samples of language trees published in Pagel et al. (2007) for IE and in Jordan et al. (2009) for AN. These samples were
themselves obtained from phylogenetic tree-building analyses of binary matrices showing the presence/absence of cognate terms from the 200-word list of basic vocabulary for 87 IE languages (data from Dyen et al. 1992) and the 210-word list for 400 AN languages (data from Greenhill et al. 2008; Gray et al. 2009), using the Bayesian MCMC method implemented in BAYESPHYLOGENIES (Pagel & Meade 2004). This method generates a sample of phylogenetic trees in which trees appear in proportion to their posterior probability. The samples included 750 trees for IE, and 1000 for AN; the size of the tree sample is arbitrarily large, and is determined by the specifications (e.g. length of the MCMC chain and sampling period) of the tree-building analysis. We pruned the parent trees in the samples to retain only those taxa for which we had corresponding cultural data in the ethnographic sources used (n = 27 plus outgroup Hittite for IE and n = 135 for AN); the criteria used for matching the linguistic and cultural taxa are outlined in §2b.
(i) Outgroups and proto-societies Outgroup taxa are used in tree-building analyses for determining ancestor–descendant relationships; they provide information on the direction of change in the data (in this case, in the linguistic data), by virtue of being distantly related to the taxa under investigation, the ingroup taxa. The tree-building analysis for IE used Hittite, Tocharian A and Tocharian B as outgroups (Pagel et al. 2007). Hittite belongs to the extinct sister group of the IE languages, the Anatolian clade; together, the Anatolian and IE clades form the Indo-Hittite language family. The two known dialects of Tocharian, A and B, are speech varieties representing an extinct IE clade (Ruhlen 1991).
We use the term ‘Proto-Indo-Hittite’ for the hypothetical ancestor of Indo-Hittite languages, and ‘Proto-Indo-European’ (PIE) for the hypothetical ancestor of IE languages, and for the hypothetical ‘proto-societies’ that spoke them. For consistency with previous work (Fortunato et al. 2006; Fortunato & Mace 2009), Hittite was retained in the tree sample, but was assigned no marriage strategy data for the purpose of the comparative analysis (§2b). In AN, the languages used as outgroup taxa in the tree-building analysis (Gray et al. 2009) were not retained in the tree sample as no corresponding cultural data could be found. Thus, the root of the tree corresponds to ‘Proto-Austronesian’ (PAN), whereas ‘Proto-Malayo-Polynesian’ corresponds to the hypothetical ancestor of all non-Formosan AN languages (Gray et al. 2009). For each set of analyses, we present a consensus tree summarizing the tree sample, but the comparative analyses were performed over the two tree samples in their entirety.
(b) Coding residence data We matched languages to ethnographic data on marital residence using the geographical and descriptive information on societies in the anthropological literature. The IE analyses used data from Murdock’s (1967) Ethnographic Atlas, based on the updated electronic version by Gray (1999). Only societies located in Eurasia were included in the sample, corresponding
to the geographical range of IE languages before 1492 CE (Diamond & Bellwood 2003). In addition to the Ethnographic Atlas, the AN analyses used data from ethnographic encyclopaedias (LeBar 1975; Levinson 1990) and relevant ethnographic literature or fieldworkers (Jordan et al. 2009). The Ethnographic Atlas scores societies separately for prevailing and alternative modes of marital residence, the latter defined as ‘culturally patterned alternatives to, or significant deviations from, the prevailing profile’ (Murdock 1967, p. 48). For AN, the data from additional sources were coded consistently with the Ethnographic Atlas. In order to give higher weight to the prevailing mode of residence, each society was assigned three columns of data: two identical columns specifying the prevailing pattern, and a third column specifying the alternative pattern; the prevailing mode was used at all three columns for societies scored as not presenting an alternative mode. For both the prevailing and alternative patterns, we coded societies as practising neolocality (i.e. residence apart from the kin of either spouse; state N), uxorilocality (i.e. residence with or near the wife’s kin; state U) or virilocality (i.e. residence with or near the husband’s kin; state V). Ambilocal societies, where married couples take residence optionally with (or near) the kin of either spouse, and with approximately equal frequency, were assigned the dual state UV. Consistently, in the comparative analysis, these societies are treated as taking either state with equal probability (§2c). Missing information was coded as such (§2c). Below, we discuss results focusing on the changes in the prevailing residence pattern across the two tree samples.
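The coding scheme described above can be illustrated with a minimal Python sketch. It is not the authors' data-preparation code: the function names, the input label "A" for ambilocal societies and the "-" symbol for missing information are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

STATES = ("N", "U", "V")  # neolocality, uxorilocality, virilocality
MISSING = "-"             # hypothetical symbol for missing information


def recode(mode: Optional[str]) -> str:
    """Map one residence score to a state code.

    "A" (ambilocal, an assumed input label) becomes the dual state "UV";
    None becomes MISSING.
    """
    if mode is None:
        return MISSING
    if mode == "A":
        return "UV"
    if mode in STATES:
        return mode
    raise ValueError(f"unknown residence mode: {mode!r}")


def code_society(prevailing: Optional[str],
                 alternative: Optional[str] = None) -> List[str]:
    """Return the three data columns (sites) for one society.

    Sites 1 and 2 repeat the prevailing mode, giving it double weight;
    site 3 holds the alternative mode, or the prevailing mode again when
    no alternative is scored.
    """
    prev = recode(prevailing)
    alt = prev if alternative is None else recode(alternative)
    return [prev, prev, alt]


def state_probabilities(code: str) -> Tuple[float, float, float]:
    """Probabilities over (N, U, V) implied by a coded site.

    A single state gets probability 1; a dual state such as "UV" is split
    equally between its members; missing data is spread over all states.
    """
    allowed = STATES if code == MISSING else tuple(code)
    return tuple(1.0 / len(allowed) if s in allowed else 0.0 for s in STATES)
```

For example, `code_society("V", "U")` yields the columns V, V, U, so the prevailing mode counts twice and the alternative once, while an ambilocal society is coded UV at all three sites.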
(c) Estimation of ancestral states
We used the phylogenetic comparative method implemented in BAYESMULTISTATE, available as part of the BAYESTRAITS package from http://www.evolution.rdg.ac.uk/BayesTraits (Pagel et al. 2004; Pagel & Meade 2005, 2006). Given the comparative data and tree sample, BAYESMULTISTATE uses a continuous-time Markov model to describe the evolution of the trait of interest along the branches of a phylogeny. Under this model, the trait ‘residence’ can switch repeatedly between its three states, N, U and V, in any of the branches of a tree. In analyses with multiple sites (in this case, the three columns of data specifying the prevailing and alternative modes of residence), BAYESMULTISTATE uses information from the sites simultaneously to estimate a single set of rate parameters specifying the model of trait evolution. Three states require six rate parameters specifying the possible transitions: in this case, qNU, qNV, qUN, qUV, qVN and qVU. These parameters measure the instantaneous rates of change from one state to another, and are used to define the probabilities of these changes, the character states at internal nodes on a tree and the likelihood of the data (Pagel 1994, 1997, 1999). In the likelihood calculations, BAYESMULTISTATE treats taxa that are assigned multiple states, like the ambilocal societies (§2b), as taking those states with equal probability at the relevant site; similarly, it treats taxa with missing data as taking any state with equal probability.
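To make the role of the six rate parameters concrete, the following Python sketch builds the instantaneous rate matrix Q for the three residence states and derives the transition probabilities along a branch of length t as P(t) = exp(Qt). This is a generic illustration of a continuous-time Markov model, not the BAYESMULTISTATE implementation; the function names and the Taylor-series routine are assumptions chosen to keep the example dependency-free.

```python
import math

STATES = ("N", "U", "V")


def rate_matrix(rates):
    """Build the 3 x 3 instantaneous rate matrix Q from six rates.

    `rates` maps "NU", "NV", "UN", "UV", "VN", "VU" to non-negative
    values; each diagonal entry makes its row sum to zero.
    """
    q = [[0.0] * 3 for _ in STATES]
    for i, a in enumerate(STATES):
        for j, b in enumerate(STATES):
            if i != j:
                q[i][j] = float(rates[a + b])
        q[i][i] = -math.fsum(q[i])
    return q


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t, terms=20):
    """Return P(t) = exp(Q t) by scaling and squaring a Taylor series."""
    n = len(q)
    norm = max(sum(abs(x) for x in row) for row in q) * t
    squarings = 0
    while norm > 0.5:          # halve Q t until the series converges fast
        norm /= 2.0
        squarings += 1
    scaled = [[x * t / (2 ** squarings) for x in row] for row in q]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[r + x for r, x in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    for _ in range(squarings):  # undo the scaling: exp(A) = exp(A/2)^2
        result = matmul(result, result)
    return result
```

Entry P[i][j] is the probability that a lineage in state i at the start of the branch is in state j at its end. Setting a rate to zero, as the reversible-jump procedure's zero bin does, removes the corresponding direct transition from the model.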
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
The Bayesian MCMC implementation of BAYESMULTISTATE estimates the posterior probability distributions of rate parameters and ancestral character states (Pagel et al. 2004; Pagel & Meade 2005). All analyses used the program in reversible-jump mode, which additionally estimates the posterior probability distribution of the possible models of trait evolution specified by the six rate parameters (Pagel & Meade 2006). The reversible-jump procedure outputs a model string describing the rate parameters such that rates are assigned to classes denoted by ordered integers or a ‘zero bin’ depicted by Z. For example, the model string 00011Z assigns qNU, qNV and qUN all to an internally equivalent rate class (0) that is slower than class (1) to which qUV and qVN are assigned. Rate qVU is set to zero. The means of the posterior probability distribution of ancestral states at internal nodes on the consensus tree are combined with the posterior probability of each node, which represents the probability that the node exists (Lewis 2001), and is denoted as p(node). For example, for a given node, BAYESMULTISTATE may return a posterior probability distribution with a mean of 0.8 ± s.d. for virilocality; this is denoted p(V | node) ± s.d. If the node is present in all trees, i.e. p(node) = 1.00, we accept the 0.8 value as the posterior probability of virilocality at that node. However, if the node is only present in 60 per cent of the trees, i.e. p(node) = 0.60, we report the ‘combined probability’ for virilocality, p(V) = p(V | node) × p(node) = 0.8 × 0.6 = 0.48. A value of 0.7 for the combined probabilities represents an acceptable value of certainty for an ancestral state at a node (M. Pagel 2006, personal communication). The MCMC chain specifications were determined by examining the results of preliminary maximum-likelihood and MCMC runs.
Different numbers of taxa and the probabilistic nature of the tree samples meant that the same specifications could not be applied to both datasets; however, where possible, we used comparable values (see electronic supplementary material, table S1, for details).
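The model-string notation and the combined probability lend themselves to a short worked example. The Python sketch below is illustrative only: the function names are hypothetical, and the ordering of rate parameters within the string is assumed from the 00011Z example in the text rather than taken from the BAYESTRAITS documentation.

```python
# Assumed ordering of rates within a model string (from the 00011Z example).
RATE_ORDER = ("qNU", "qNV", "qUN", "qUV", "qVN", "qVU")


def parse_model_string(model):
    """Map each rate parameter to its rate class.

    Digits are rate-class labels; "Z" marks the zero bin and is returned
    as None.
    """
    if len(model) != len(RATE_ORDER):
        raise ValueError("expected one symbol per rate parameter")
    classes = {}
    for name, symbol in zip(RATE_ORDER, model):
        if symbol == "Z":
            classes[name] = None
        elif symbol.isdigit():
            classes[name] = int(symbol)
        else:
            raise ValueError(f"unexpected symbol {symbol!r}")
    return classes


def combined_probability(p_state_given_node, p_node):
    """Return p(state) = p(state | node) x p(node)."""
    for p in (p_state_given_node, p_node):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_state_given_node * p_node


def is_supported(p_combined, threshold=0.7):
    """Apply the 0.7 acceptance value used for combined probabilities."""
    return p_combined >= threshold
```

With the values from the example in the text, `combined_probability(0.8, 0.6)` gives approximately 0.48, which falls below the 0.7 threshold, whereas the same conditional probability at a node present in all trees would be accepted.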
(d) Testing
We tested the ancestral state reconstructions at the root (Proto-Indo-Hittite for IE and PAN for AN) and a historically significant basal node in each family (PIE for IE and Proto-Malayo-Polynesian for AN). These tests ‘fixed’ each node to be one of the three possible states (N, V and U), in turn. BAYESMULTISTATE does not allow sites to be fossilized separately; therefore, each run fixed all three sites to the same state. We determined which fossilized state had relatively higher support at a given node using Bayes factors. Following Pagel & Meade (2006), we took twice the difference in the logarithm of the harmonic mean of the likelihoods for pairs of runs; the resulting values represent a summary of the evidence for one state over another at a given node. Based on Raftery’s (1996) logarithmic scale for interpretation of the Bayes factors, values between 0 and 2 are barely worth mentioning, values between 2 and 5 represent positive evidence, values between 5 and 10 strong
evidence and values greater than 10 very strong evidence.

3. RESULTS
Figures 1 and 2 present the consensus trees summarizing the two phylogenies. The IE phylogeny in figure 1 is shown with all nodes and reconstructions. The AN phylogeny (figure 2) is depicted in condensed form with clades collapsed; the size of a clade is proportional to the number of daughter societies (see electronic supplementary material, table S2, for a listing of societies within each clade). Clade triangles terminate in a bar shaded proportionately to reflect the frequency of residence patterns within each clade. Below, we report the major findings of ancestral state reconstruction for the two families, followed by a comparative discussion on the relative rates of change in residence.

(a) Indo-European
For the prevailing mode of residence, nodes Proto-Indo-Hittite and PIE reconstructed as virilocal with posterior probabilities of p(V | node) = p(V) = 0.64 ± 0.14 and p(V | node) = p(V) = 0.90 ± 0.12, respectively (figure 3). Virilocality reconstructed with high posterior probabilities through to nodes A, B and C (in all cases p(V | node) ≥ 0.85), but the confidence that can be placed in these inferences is limited by the degree of phylogenetic uncertainty at these nodes. Uncertainty in the reconstructions at the base of the tree means that a host of scenarios can explain the observed distribution of states at the tips. Node D (the common ancestor of societies speaking Italic, Germanic and Celtic languages) yielded posterior probabilities of p(V | node) = 0.40 ± 0.10 for virilocality and p(N | node) = 0.40 ± 0.15 for neolocality; additionally, this node is found in only 78 per cent of trees in the sample, i.e. p(node) = 0.78. However, neolocality reconstructed with high posterior probabilities within the Italic clade.
Nodes E (the common ancestor of societies speaking Indian and Iranian languages) and F (the common ancestor of societies speaking Baltic and Slavic languages) reconstructed as virilocal with posterior probabilities of p(V | node) = p(V) = 0.87 ± 0.10 and p(V | node) = p(V) = 0.92 ± 0.08, respectively. Virilocality reconstructed with high posterior probabilities within the Indo-Iranian and Balto-Slavic clades. In agreement with the ancestral state reconstructions, at node PIE the fossilization analyses returned strong evidence for virilocality over uxorilocality (Bayes factor 7.51), positive evidence for virilocality over neolocality (Bayes factor 4.36) and positive evidence for neolocality over uxorilocality (Bayes factor 3.15). The pattern was weaker at node Proto-Indo-Hittite, with values of the Bayes factor less than 2 in all cases.

(b) Austronesian
Uxorilocality is securely reconstructed for Proto-Malayo-Polynesian (p(U | node) = p(U) = 0.96 ± 0.06), and many daughter subgroups and societies in the Island Southeast Asian region (e.g. Proto-Philippines, Sumatran societies) still retain this pattern. The root
[Figure 1: tree image not reproduced. Tip labels as extracted: Panjabi ST, Hindi, Gujarati, Bengali, Kashmiri, Singalese, Afghan, Waziri, Persian, Ossetic, Byelorussian, Russian, Czech, Lithuanian ST, Bulgarian, Serbocroatian, Ukranian, Walloon, Rumanian List, Italian, Dutch List, Irish B, Portugese ST, Spanish, Albanian G, Greek Md, Armenian Md, Hittite. Labelled internal nodes: A, B, C, D, E, F, PIE, PIH.]
Figure 1. Fifty per cent majority rule consensus phylogeny (including compatible groupings) summarizing the sample of 750 trees for 27 IE languages plus the outgroup Hittite. The colour of the dots at the tips depicts a society’s prevailing mode of marital residence: white, neolocality; grey, uxorilocality; black, virilocality; black and grey, ambilocality. The value above each node is the node’s posterior probability, as a percentage. The dots at the nodes indicate the ancestral states of residence for the node (black, p(V) ≥ 0.70; grey, p(N) ≥ 0.70); nodes with no dots have combined probability less than 0.70 for all states. PIH, Proto-Indo-Hittite; PIE, Proto-Indo-European.
PAN cannot be reconstructed with certainty in this three-state analysis that considers both prevailing and alternate residence forms, in comparison to Jordan et al. (2009) who considered only the prevailing mode of residence and two states (uxorilocal/virilocal). Here, for prevailing mode of residence, PAN is uxorilocal (p(U | node) = p(U) = 0.42 ± 0.10) more than virilocal (p(V | node) = p(V) = 0.38 ± 0.09) or neolocal (p(N | node) = p(N) = 0.20 ± 0.10), but there is considerable overlap in the distributions of these probabilities, leaving PAN an ambiguously reconstructed node (figure 4a). Alternative mode of residence at PAN is more securely uxorilocal (p(U | node) = p(U) = 0.67 ± 0.13). Early AN societies thus do have a bias towards uxorilocality, suggesting that virilocality was a later development in the AN family as a whole. Switches to prevailing virilocality occur in many societies surrounding the island of New Guinea (clades such as Oceanic 1–3, South Halmahera–West New Guinea and Central Malayo-Polynesian 1 and 2), though some retain uxorilocality, especially as an
alternative strategy. The well-defined Proto-Oceanic node is present in all trees and reconstructs with prevailing virilocality (p(V | node) = p(V) = 0.72 ± 0.18) and alternative uxorilocality (p(U | node) = p(U) = 0.77 ± 0.16). No ancestral nodes robustly reconstruct as neolocal, and over all trees p(N | node) is rarely greater than 0.33. When tested using the fossilization procedure, there is positive evidence in favour of PAN uxorilocality over virilocality (Bayes factor 3.88) and neolocality (Bayes factor 3.44), and very strong evidence for Proto-Malayo-Polynesian uxorilocality over virilocality (Bayes factor 16.42) and neolocality (Bayes factor 15.04).
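The Bayes factors reported for the fossilization tests follow the recipe given in §2d: twice the difference in the log harmonic mean of the likelihoods between two runs, read against Raftery's (1996) scale. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the harmonic mean is computed from a list of sampled log-likelihoods, and the handling of values that fall exactly on a category boundary is a choice made here, not one specified in the text.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of likelihoods, from their logarithms.

    Uses log HM = log n - logsumexp(-logL), evaluated stably so that very
    negative log-likelihoods do not underflow.
    """
    neg = [-x for x in log_likelihoods]
    m = max(neg)
    lse = m + math.log(sum(math.exp(v - m) for v in neg))
    return math.log(len(log_likelihoods)) - lse


def bayes_factor(log_hm_first, log_hm_second):
    """Twice the difference in log harmonic mean likelihood."""
    return 2.0 * (log_hm_first - log_hm_second)


def interpret(bf):
    """Label a Bayes factor using the categories quoted in the text."""
    if bf < 0:
        return "favours the second state"
    if bf < 2:
        return "barely worth mentioning"
    if bf < 5:
        return "positive"
    if bf < 10:
        return "strong"
    return "very strong"
```

Applied to the values reported above, `interpret(3.88)` returns "positive" and `interpret(16.42)` returns "very strong", matching the wording used for PAN and Proto-Malayo-Polynesian.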
(c) Rates of change
Table 1 presents the mean values of the rate parameters over all model categories sampled by the chains; because we cannot directly compare parameter values across the two ethnolinguistic groups, we have scaled them against qVU = 1. For each group, the
[Figure 2: tree image not reproduced. Collapsed clades as extracted: Polynesia, Micronesia, Oceanic 3, Oceanic 2, Oceanic 1, SH-WNG, CMP 2, CMP 1, WMP, Sulawesi 2, Sulawesi 1, Java-Bali, Sumatra, Borneo, Philippine, Formosan. Labelled internal nodes: POC, PMP, PAN.]
Figure 2. Fifty per cent majority rule consensus phylogeny (including compatible groupings) summarizing the sample of 1000 trees for 135 Austronesian languages. Collapsed clades are proportional in size to number of taxa, and terminate in a bar shaded proportional to residence patterns within that clade (white, neolocality; light grey, uxorilocality; black, virilocality; dark grey, ambilocality). The value above each node is the node’s posterior probability, as a percentage. Three major nodes only (Proto-Austronesian, PAN; Proto-Malayo-Polynesian, PMP; Proto-Oceanic, POC) are shown shaded according to the ancestral state reconstruction. CMP, Central Malayo-Polynesian; SH-WNG, South Halmahera–West New Guinea; WMP, Western Malayo-Polynesian.
table also includes the string representing the model category sampled most frequently by the chain. In IE, the top reversible-jump model category accounts for 17 per cent of all points sampled by the chain. Here, rate parameter qVU was always assigned to the zero bin; in fact, it was assigned to the zero bin in the top 81 per cent of points sampled by the chain. The mean value of qVU over all sampled points was an order of magnitude smaller than the mean values of the other five rate parameters (0.2 compared with mean values ranging from 3.9 for qVN to 8.5 for qNV). The distribution of states at the tips of the tree and the reconstructed ancestral states indicate that transitions from viri- to uxorilocality in the prevailing mode of residence are rare, occurring only in the branch leading to Byelorussian and possibly in the branch leading to Dutch (figure 1); only three societies (Armenian, Italian and Singhalese) practise alternative uxorilocality. This means that the
acquisition of uxorilocality is more likely to have occurred through neolocality than through virilocality throughout the history of IE-speaking societies. In AN, the top reversible-jump model category accounts for 23 per cent of all sampled points and captures the overall dynamics of how residence changes in these societies. In this, and in the first 94 per cent of sampled points, qVN is assigned to the zero bin. The value of qVN over all sampled points is effectively zero (0.6 compared with mean values greater than 27.6): thus, changes from viri- to neolocality are rare. Further, rates from viri- to uxorilocality and from uxori- to neolocality are always assigned to the slow rate category. Other transitions vary equally between the slower and faster rates, reflected in their mean values. The general pattern emerges that in AN residence, changes towards neolocality are uncommon, and transitions towards virilocality happen frequently. Uxorilocal societies are 1.5 times more likely to switch to virilocality (qUV = 40.4)
than virilocal societies are to switch to uxorilocality (qVU = 28.6); additionally, there are no instances where qVU > qUV. Overall, the analyses suggest that, in both ethnolinguistic families, the dynamics of evolutionary change in the residence strategy can be described by a model of trait evolution based on a small number of non-zero rate classes (the mean number of non-zero rate parameters is 1.76 for IE and 2.04 for AN). More generally, the results suggest that in both IE and AN the loss of virilocality is a rare event, as indicated by the relative values of the rate parameters capturing these transitions (qVU and qVN). It is especially the case that changes from uxori- to virilocality (specified by parameter qUV) occur at a higher rate than the reverse transition (specified by parameter qVU): qUV is over 30 times higher than qVU in IE, and about one and a half times higher in AN (table 1).

[Figure 3: histogram panels (a) and (b) not reproduced; x-axis, probability (0 to 1.0); y-axis, frequency (× 10^4).]
Figure 3. Posterior probability distributions of reconstructed ancestral states for nodes corresponding to (a) Proto-Indo-Hittite and (b) PIE. Bar colours match the residence codings in figure 1: white, neolocality; grey, uxorilocality; black, virilocality.

[Figure 4: histogram panels (a) and (b) not reproduced; x-axis, probability (0 to 1.0); y-axis, frequency (× 10^4).]
Figure 4. Posterior probability distributions of reconstructed ancestral states for nodes corresponding to (a) PAN and (b) Proto-Malayo-Polynesian. White, neolocality; grey, uxorilocality; black, virilocality.
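The scaling against qVU = 1 used to compare rates across the two families is a simple division, sketched below in Python with the mean values from table 1. The dictionary layout and function name are assumptions for illustration; the numbers are the published means, which are themselves rounded, so the results match the scaled values in the table only approximately.

```python
# Mean rate values as reported in table 1 (rounded in the source).
MEAN_RATES = {
    "IE": {"qNU": 5.9, "qNV": 8.5, "qUN": 8.0,
           "qUV": 6.4, "qVN": 3.9, "qVU": 0.2},
    "AN": {"qNU": 43.0, "qNV": 83.0, "qUN": 27.6,
           "qUV": 40.4, "qVN": 0.6, "qVU": 28.6},
}


def scale_rates(rates, reference="qVU"):
    """Express every rate as a multiple of the reference rate."""
    ref = rates[reference]
    if ref == 0:
        raise ValueError("reference rate must be non-zero")
    return {name: value / ref for name, value in rates.items()}


if __name__ == "__main__":
    for family, rates in MEAN_RATES.items():
        scaled = scale_rates(rates)
        summary = ", ".join(f"{k}={v:.2f}" for k, v in scaled.items())
        print(family, summary)
```

Scaling in this way makes the asymmetry between qUV and qVU directly comparable: the ratio is about 32 in IE and about 1.4 in AN.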
4. DISCUSSION
Using phylogenetic comparative methods and ethnolinguistic information on two large cultural families, we have reconstructed an important aspect of the social structure of peoples who lived over 5000 years ago. The reconstruction of early IE virilocality is in line with the prevalent scenario derived from the linguistic evidence (Mallory 1997); as noted above, however, reconstructions of virilocality based on the linguistic evidence are plagued by substantial bias in interpretation, and several alternatives are at least equally plausible (Clackson 2007). The uncertainty in the reconstruction for Proto-Indo-Hittite reflects disagreements in the literature about the earliest residence pattern of IE peoples (Clackson 2007) and suggests that, for this point in time, we can place limited confidence in inferences about this aspect of social organization drawn from cross-cultural data. The reconstruction of early IE virilocality concurs with recent archaeological evidence based on strontium isotope analyses of Neolithic burials in Germany, which indicate the migration of females in adulthood (Price et al. 2001; Bentley et al. 2002; Haak et al. 2008; see discussion in Fortunato in press). In AN, early uxorilocality appears to be robustly supported in Proto-Malayo-Polynesian and as an alternate option in PAN. This is in line with some interpretations of PAN and Proto-Malayo-Polynesian kinship terminologies (Blust 1980), but, as with IE, here we provide independent confirmation from cross-cultural data. More recent work attempting to reconcile the different patterns of uniparental genetic markers seen in the Pacific has suggested that uxorilocality was a later development in AN, i.e. in Proto-Oceanic (Hage 1998; Hage & Marck 2003; Kayser et al. 2008).
However, our findings suggest that this period of uxorilocality was earlier in time; our comparative methods may not be able to reconstruct this form of residence for Proto-Oceanic because many daughter societies have, while retaining an uxorilocal option, since switched to virilocality as the prevailing mode, perhaps because of cultural contact with nearby non-AN (‘Papuan’) societies (Jordan et al. 2009). Further work is required to identify independent ‘markers’ for contact that might allow us to systematically address hypotheses about cultural borrowings. The inferred model of trait evolution shows that in both IE and AN changes from uxori- to virilocality occur at a higher rate than the reverse transition. This may reflect the instability of ‘matricentric’
Table 1. Summary of the model of trait evolution for residence strategy. N, neolocality; U, uxorilocality; V, virilocality.

rate         mean value (a)    scaled value (a)    rate class (b)
parameter    IE      AN        IE      AN          IE     AN
qNU          5.9     43        29.5    1.5         0      0
qNV          8.5     83        42.5    2.9         0      1
qUN          8.0     27.6      40      0.9         0      0
qUV          6.4     40.4      32      1.4         0      0
qVN          3.9     0.6       19.5    0.02        0      Z
qVU          0.2     28.6      1       1           Z      0

(a) Mean and scaled values over all model categories sampled by the chain. The mean values are scaled by setting qVU = 1.
(b) Rate class to which the rate parameter was assigned in the model sampled most frequently by the chain. ‘Z’ denotes rate parameters assigned to the zero bin (see §2c).
systems (e.g. systems involving matrilineal descent) as observed by Richards (1950) for African societies. In a phylogenetic comparative analysis of the coevolution of descent systems and cattle-keeping, Holden & Mace (2003) found evidence that Bantu matriliny was only sustained under certain socio-ecological conditions, i.e. the presence of horticulture and the absence of pastoralist subsistence systems. In this framework, both the prevalence of virilocality in ethnographically attested IE societies and the near-zero rate of switching from viri- to uxorilocality inferred by our evolutionary model are consistent with the pastoral and intensive agricultural subsistence economies ascribed to early IE societies (Mallory 1997). The matricentric character of AN societies (Burton et al. 1996) suggests a different evolutionary dynamic, that is, the loss of early (but perhaps widespread) uxorilocality. The origin and/or maintenance of uxorilocality has been linked to a ‘male absence’ factor (Keegan & Machlaclan 1989; Hage 1999). Many features of AN societies suggest this as a plausible hypothesis, including the unpredictable ecological features of oceanic environments; the voyaging traditions of seafaring people (both exploratory and trading-related); and subsistence systems that include deep-sea fishing but not pastoralism as practised on large continental landmasses. If variation in residence is indeed linked to a society’s subsistence pattern and ecological niche, the types of analysis we present here offer good support for, and avenues for testing, the suspicions long held by anthropologists that human social life is not infinitely varied but rather is constrained by local environments. Asking the same questions in different ethnographic regions heralds a useful step forward in our ability to infer the general mechanisms of cultural evolutionary change, that is, the identification of lineage-specific processes within global domains (cf. Evans & Levinson 2009).
Investigating the evolution of cross-cultural diversity, in kinship or otherwise, involves an explicit choice about how to statistically approach hierarchically related human populations (Mace & Pagel 1994). As well as controlling for the effects of historical relatedness, phylogenetic comparative methods let us drill down into the specifics of ancestral states and processes of cultural change in a way that no other statistical methods currently available will allow. This is not, however, a statement that all human cultural
processes follow strict ‘vertical’ or phylogenetic transmission dynamics (contra Borgerhoff Mulder et al. 2006). Rather we suggest, following Mace (2005), that questions about the degree to which cultural traits are transmitted ‘vertically’ from parent to daughter populations or ‘horizontally’ across populations only make sense within a phylogenetic framework. In this context, phylogenetic comparative methods have been shown to outperform non-phylogenetic methods under realistic scenarios and levels of horizontal transmission (Nunn et al. 2006; Currie et al. 2010). Ultimately, as with any other methodological approach, as long as the assumptions of the comparative analysis are made clear, the conclusions can be sustained or refuted by different data or analytical approaches. Because the reconstructions of AN and IE prehistory are active fields of interdisciplinary scholarship, our findings have important implications for the interpretation of current ethnographic, archaeological, genetic and linguistic data; for example, fashionable statements in molecular anthropology about the impact of social structure on genetic diversity are largely used as post hoc narratives to explain incongruous findings. We believe there is obvious global utility in the methods and approaches presented here, and hope future research is stimulated by the promise of reconstructing the social lives of our ancestors.

We thank M. Pagel, A. Meade and Q. Atkinson for the Indo-European tree sample; R. Gray and S. Greenhill, and all contributors to the Austronesian Basic Vocabulary Database for the Austronesian tree sample; and R. Mace for discussion. M. Pagel and A. Meade provided software and valuable computing assistance through the Center for Advanced Computing and Emerging Technologies (ACET) at the University of Reading.
REFERENCES
Anthony, D. W. 2007 The horse, the wheel, and language: how Bronze-Age riders from the Eurasian steppes shaped the modern world. Princeton, NJ: Princeton University Press.
Bentley, R. A., Price, T. D., Lüning, J., Gronenborn, D., Wahl, J. & Fullagar, P. D. 2002 Prehistoric migration in Europe: strontium isotope analysis of Early Neolithic skeletons. Curr. Anthropol. 43, 799–804. (doi:10.1086/344373)
Blust, R. 1980 Early Austronesian social organization: the evidence of language. Curr. Anthropol. 21, 205–247. (doi:10.1086/202430)
Boyd, R., Richerson, P. J., Borgerhoff-Mulder, M. & Durham, W. H. 1997 Are cultural phylogenies possible? In Human by nature: between biology and the social sciences (eds P. Weingart, P. J. Richerson, S. D. Mitchell & S. Maasen), pp. 355–386. Mahwah, NJ: Lawrence Erlbaum Associates.
Borgerhoff Mulder, M., Nunn, C. L. & Towner, M. C. 2006 Cultural macroevolution and the transmission of traits. Evol. Anthropol. 15, 52–64. (doi:10.1002/evan.20088)
Burton, M. L., Moore, C. C., Whiting, J. W. M. & Romney, A. K. 1996 Regions based on social structure. Curr. Anthropol. 37, 87–123. (doi:10.1086/204474)
Clackson, J. 2007 Indo-European linguistics: an introduction. Cambridge, UK: CUP.
Currie, T. E., Greenhill, S. J. & Mace, R. 2010 Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits. Phil. Trans. R. Soc. B 365, 3903–3912. (doi:10.1098/rstb.2010.0014)
Diamond, J. & Bellwood, P. 2003 Farmers and their languages: the first expansions. Science 300, 597–603. (doi:10.1126/science.1078208)
Dyen, I., Kruskal, J. B. & Black, P. 1992 An Indo-European classification: a lexicostatistical experiment. Trans. Am. Phil. Soc. 82, 1–132.
Evans, N. & Levinson, S. C. 2009 The myth of language universals: language diversity and its importance for cognitive science. Behav. Brain Sci. 32, 429–492. (doi:10.1017/S0140525X0999094X)
Felsenstein, J. 1985 Phylogenies and the comparative method. Am. Nat. 125, 1–15. (doi:10.1086/284325)
Fortunato, L. In press. Reconstructing the history of residence strategies in Indo-European-speaking societies: neo-, uxori-, and virilocality. Hum. Biol.
Fortunato, L. 2008 A phylogenetic approach to the history of cultural practices. In Early human kinship: from sex to social reproduction (eds N. J. Allen, H. Callan, R. Dunbar & W. James), pp. 189–199. Malden, MA: Blackwell Publishing Ltd.
Fortunato, L. & Mace, R. 2009 Testing functional hypotheses about cross-cultural variation: a maximum-likelihood comparative analysis of Indo-European marriage practices. In Pattern and process in cultural evolution (ed. S. Shennan), pp. 235–249. Berkeley, CA: University of California Press.
Fortunato, L., Holden, C. & Mace, R. 2006 From bridewealth to dowry? A Bayesian estimation of ancestral states of marriage transfers in Indo-European groups. Hum. Nat. 17, 355–376. (doi:10.1007/s12110-006-1000-4)
Gimbutas, M. 1991 The civilization of the goddess. San Francisco, CA: HarperSanFrancisco.
Gray, J. P. 1999 A corrected Ethnographic Atlas. World Cult. 10, 24–136.
Gray, R. D., Drummond, A. J. & Greenhill, S. J. 2009 Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323, 479–483. (doi:10.1126/science.1166858)
Greenhill, S. J., Blust, R. & Gray, R. D. 2008 The Austronesian basic vocabulary database: from bioinformatics to lexomics. Evol. Bioinform. 4, 271–283.
Guglielmino, C. R., Viganotti, C., Hewlett, B. & Cavalli-Sforza, L. L. 1995 Cultural variation in Africa: role of mechanisms of transmission and adaptation. Proc. Natl Acad. Sci. USA 92, 7585–7589. (doi:10.1073/pnas.92.16.7585)
Haak, W. et al. 2008 Ancient DNA, strontium isotopes, and osteological analyses shed light on social and kinship organization of the Later Stone Age. Proc. Natl Acad. Sci. USA 105, 18 226–18 231. (doi:10.1073/pnas.0807592105)
Hage, P. 1998 Was Proto-Oceanic society matrilineal? J. Polynesian Soc. 107, 365–379.
Hage, P. 1999 Reconstructing ancestral Oceanic society. Asian Perspect. 38, 200–227.
Hage, P. & Marck, J. 2003 Matrilineality and the Melanesian origin of Polynesian Y chromosomes. Curr. Anthropol. 44, S121–S127. (doi:10.1086/379272)
Harvey, P. H. & Pagel, M. D. 1991 The comparative method in evolutionary biology. Oxford, UK: Oxford University Press.
Holden, C. & Mace, R. 1997 Phylogenetic analysis of the evolution of lactose digestion in adults. Hum. Biol. 69, 605–628.
Holden, C. J. & Mace, R. 2003 Spread of cattle led to the loss of matrilineal descent in Africa: a coevolutionary analysis. Proc. R. Soc. Lond. B 270, 2425–2433. (doi:10.1098/rspb.2003.2535)
Huelsenbeck, J. P., Ronquist, F., Nielsen, R. & Bollback, J. P. 2001 Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294, 2310–2314. (doi:10.1126/science.1065889)
Hurles, M. E., Matisoo-Smith, E., Gray, R. D. & Penny, D. 2003 Untangling Oceanic settlement: the edge of the knowable. Trends Ecol. Evol. 18, 531–540. (doi:10.1016/S0169-5347(03)00245-3)
Jordan, F. M., Gray, R. D., Greenhill, S. J. & Mace, R. 2009 Matrilocal residence is ancestral in Austronesian societies. Proc. R. Soc. B 276, 1957–1964. (doi:10.1098/rspb.2009.0088)
Kayser, M., Lao, O., Saar, K., Brauer, S., Wang, X., Nürnberg, P., Trent, R. J. & Stoneking, M. 2008 Genome-wide analysis indicates more Asian than Melanesian ancestry of Polynesians. Am. J. Hum. Genet. 82, 194–198. (doi:10.1016/j.ajhg.2007.09.010)
Keegan, W. F. & Machlaclan, M. D. 1989 The evolution of avunculocal chiefdoms: a reconstruction of Taino kinship and politics. Am. Anthropol. 91, 613–630. (doi:10.1525/aa.1989.91.3.02a00050)
Lane, R. B. 1961 A reconsideration of Malayo-Polynesian social organization. Am. Anthropol. 63, 711–720. (doi:10.1525/aa.1961.63.4.02a00030)
LeBar, F. M. (ed.) 1975 Ethnic groups of insular southeast Asia. New Haven, CT: HRAF Press.
Levinson, D. (ed.) 1990 Encyclopedia of world cultures. Boston, MA: G. K. Hall & Co.
Lewis, P. O. 2001 Phylogenetic systematics turns over a new leaf. Trends Ecol. Evol. 16, 30–37. (doi:10.1016/S0169-5347(00)02025-5)
Mace, R. 2005 On the use of phylogenetic comparative methods to test co-evolutionary hypotheses across cultures. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. Shennan), pp. 235–256. London, UK: UCL Press.
Mace, R. & Pagel, M. 1994 The comparative method in anthropology. Curr. Anthropol. 35, 549–564. (doi:10.1086/204317)
Mallory, J. P. 1997 Residence. In Encyclopedia of Indo-European culture (eds J. P. Mallory & D. Q. Adams), pp. 483–484. London, UK: Fitzroy Dearborn.
Marck, J. 2008 Proto Oceanic society was matrilineal. J. Polynesian Soc. 117, 345–382.
Murdock, G. P. 1949 Social structure. New York, NY: MacMillan.
Murdock, G. P. 1967 Ethnographic Atlas. Pittsburgh, PA: University of Pittsburgh Press.
Nerlove, S. & Romney, S. K. 1967 Sibling terminology and cross-sex behavior. Am. Anthropol. 69, 179–187. (doi:10.1525/aa.1967.69.2.02a00050)
Nunn, C. L., Borgerhoff Mulder, M. & Langley, S. 2006 Comparative methods for studying cultural trait evolution: a simulation study. Cross-Cult. Res. 40, 177–209. (doi:10.1177/1069397105283401)
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
L. Fortunato & F. Jordan Indo-European and Austronesian residence
Pagel, M. 1994 Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proc. R. Soc. Lond. B 255, 37–45. (doi:10.1098/rspb.1994.0006)
Pagel, M. 1997 Inferring evolutionary processes from phylogenies. Zool. Scripta 26, 331–348. (doi:10.1111/j.1463-6409.1997.tb00423.x)
Pagel, M. 1999 The maximum likelihood approach to reconstructing ancestral character states on phylogenies. Syst. Biol. 48, 612–622.
Pagel, M. & Meade, A. 2004 A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Syst. Biol. 53, 571–581.
Pagel, M. & Meade, A. 2005 Bayesian estimation of correlated evolution across cultures: a case study of marriage systems and wealth transfer at marriage. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. Shennan), pp. 235–256. London, UK: UCL Press.
Pagel, M. & Meade, A. 2006 Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. Am. Nat. 167, 808–825. (doi:10.1086/503444)
Pagel, M., Meade, A. & Barker, D. 2004 Bayesian estimation of ancestral character states on phylogenies. Syst. Biol. 53, 673–684. (doi:10.1080/10635150490522232)
Phil. Trans. R. Soc. B (2010)
Pagel, M., Atkinson, Q. D. & Meade, A. 2007 Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature 449, 717–720. (doi:10.1038/nature06176)
Price, T. D., Bentley, R. A., Lüning, J., Gronenborn, D. & Wahl, J. 2001 Prehistoric human migration in the Linearbandkeramik of Central Europe. Antiquity 75, 593–603.
Raftery, A. E. 1996 Hypothesis testing and model selection. In Markov chain Monte Carlo in practice (eds W. R. Gilks, S. Richardson & D. J. Spiegelhalter), pp. 163–187. London, UK: Chapman & Hall/CRC.
Richards, A. R. 1950 Some types of family structure amongst the central Bantu. In African systems of kinship and marriage (eds A. R. Radcliffe-Brown & C. D. Forde), pp. 207–251. London, UK: Oxford University Press.
Ruhlen, M. 1991 A guide to the world’s languages: classification. London, UK: Edward Arnold.
Seielstad, M. T., Minch, E. & Cavalli-Sforza, L. L. 1998 Genetic evidence for a higher female migration rate in humans. Nat. Genet. 20, 278–280. (doi:10.1038/3088)
Van Wouden, F. A. E. 1935 [1968] Types of social structure in eastern Indonesia. The Hague, The Netherlands: Nijhoff.
Wilkins, J. F. & Marlowe, F. 2006 Sex-biased migration in humans: what should we expect from genetic data? Bioessays 28, 290–300. (doi:10.1002/bies.20378)
Phil. Trans. R. Soc. B (2010) 365, 3923–3933 doi:10.1098/rstb.2010.0162
On the shape and fabric of human history
Russell D. Gray1,*, David Bryant1,2 and Simon J. Greenhill1
1 Department of Psychology, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
2 Department of Mathematics and Statistics, University of Otago, PO Box 56, Dunedin 9054, New Zealand
In this paper we outline two debates about the nature of human cultural history. The first focuses on the extent to which human history is tree-like (its shape), and the second on the unity of that history (its fabric). Proponents of cultural phylogenetics are often accused of assuming that human history has been both highly tree-like and consisting of tightly linked lineages. Critics have pointed out obvious exceptions to these assumptions. Instead of a priori dichotomous disputes about the validity of cultural phylogenetics, we suggest that the debate is better conceptualized as involving positions along continuous dimensions. The challenge for empirical research is, therefore, to determine where particular aspects of culture lie on these dimensions. We discuss the ability of current computational methods derived from evolutionary biology to address these questions. These methods are then used to compare the extent to which lexical evolution is tree-like in different parts of the world and to evaluate the coherence of cultural and linguistic lineages.
Keywords: cultural evolution; linguistic evolution; phylogenetics; networks; delta plots; Q-residual
1. INTRODUCTION
The only figure in Darwin’s (1859) On the Origin of Species is an evolutionary tree. This tree reflects Darwin’s vision of descent with modification from a common ancestor. Today phylogenetic methods or ‘tree-thinking’ (O’Hara 1997) form the foundation of inferences in evolutionary biology (Harvey & Pagel 1991; Huelsenbeck & Rannala 1997; Felsenstein 2004). However, biologists are not alone, nor even first, in their use of trees to represent histories of descent with modification. There is a long parallel tradition of using trees to study linguistic and cultural genealogies (Spielman et al. 1974; Cavalli-Sforza et al. 1988; Atkinson & Gray 2005; Hunley et al. 2007, 2008).

There is also a lengthy history of scepticism about the applicability of evolutionary analogies to culture. The influential American anthropologist Kroeber (1948) explicitly contrasted Darwin’s idea of a ‘tree of life’ with that of a ‘tree of cultures’. Kroeber argued that the tree of cultures entwines around itself, with frequent borrowing and diffusion of traits between cultures. In this scenario, information not only flows vertically from parent to daughter cultures but, just as importantly, horizontally between them too.

There is a constant branching-out but the branches also grow together again, wholly or partially, all the time. Culture diverges, but it syncretizes and anastomoses too. . . . The tree of culture . . . is a ramification of such coalescences, assimilations, or acculturations. (Kroeber 1948, pp. 260–261)
* Author for correspondence ([email protected]).
One contribution of 14 to a Theme Issue ‘Cultural and linguistic diversity: evolutionary approaches’.

The late palaeontologist Stephen Jay Gould was also a vocal critic of phylogenetic approaches to culture. In his 1987 book, An Urchin in the Storm, he proclaimed that:

Human cultural evolution proceeds along paths outstandingly different from the ways of genetic change. . . Biological evolution is constantly diverging; once lineages become separate, they cannot amalgamate (except in producing new species by hybridization, a process that occurs very rarely in animals). Trees are correct topologies of biological evolution. . . In human cultural evolution, on the other hand, transmission and anastomosis are rampant. Five minutes with a wheel, a snowshoe, a bobbin, or a bow and arrow may allow an artisan of one culture to capture a major achievement of another. (Gould 1987, p. 70)
Put bluntly, the obvious inference is that while phylogenetic methods work well in the biological realm, in studies of cultural evolution they are doomed to failure because cultural change is governed by completely different principles. Gould was not alone in holding this view (see Terrell 1988; Moore 1994 for total rejections of a phylogenetic approach to cultural evolution). Borgerhoff Mulder et al. (2006, p. 55) espouse the more moderate view that ‘. . . tree building is a powerful method and provides considerable insight, particularly when based on maximum likelihood and Bayesian inference procedures. However, without principled methods designed to uncover horizontal transmission, there is a danger of biasing findings towards vertical transmission if we only use tree-building methods’. They conclude their review with the cautionary statement that our ‘Current understanding of the relative importance of horizontal and vertical transmission is shaky, to say the least’ (Borgerhoff Mulder et al. 2006, p. 62).

A similar, if rather less polemical, debate exists about the coherence or fabric of cultural evolution. In an insightful article, Boyd et al. (1997) lay out a range of possibilities for the fabric of cultural
This journal is © 2010 The Royal Society
evolution. First, culture could evolve as (vertebrate) species do. Factors such as shared worldview, cultural group selection and demographic events might act to ensure that cultures are coherent and tightly integrated systems with little horizontal transmission between cultures. Mace & Holden (2005, p. 117) argue that ‘population dynamics can lead to group-level selection occurring in human cultural evolution . . . Such processes could maintain the identity of discrete cultural groups even when genetic distinctions are more blurred or even absent’. The main pathway of information flow in such cases would be vertical, between generations, and hence phylogenetic methods should work well. Pagel & Mace (2004) and Mace & Holden (2005) defend something close to this viewpoint. Second, cultures could be hierarchically integrated systems. Here, cultures comprise ‘core traditions’ that are inherited vertically. Horizontal transmission occurs, but only affects peripheral traits and not the core of the system. In this scenario, phylogenetic methods will work well for the core traditions, but not for the peripheral traits. In the case of linguistic evolution, basic vocabulary trees might be highly congruent with trees based on innovations in morphology and phonology (e.g. Gray et al. 2009), but much less congruent with trees based on a sampling of the entire lexicon or typological features (Greenhill et al. 2010). A third possibility is that cultures are assemblages of coherent clusters. These clusters are tightly integrated and vertical change occurs inside each cluster, but each cluster can be transmitted horizontally and may thus have a quite distinct evolutionary trajectory. In this case, phylogenetic methods will only work on a cluster-by-cluster basis, and only if the boundaries of each cluster can be identified. Finally, if horizontal transmission is the predominant mode of cultural change, then cultures could just be collections of ephemeral entities.
In this situation, there is no coherent cultural system beyond a non-structured set of highly diffusible traits. This could be the outcome when cultural evolution is too rapid, when cultural selection is too strict (such that alternate variants die out almost immediately), or when the constraints on culture are severe (i.e. there is only one way to build a mousetrap).

We believe that the current polarized debates about the shape and fabric of human history are not particularly productive. The way forward is not to be found by charging onward building trees in a blinkered and unreflective fashion. Reticulate cultural evolution and multiple cultural histories are real, if sometimes overemphasized. However, simply giving up at the first sign of horizontal transmission or an incongruent tree is no solution either. Despite the concerns about the tree-likeness and coherence of cultural evolution, computational phylogenetic methods have had considerable success recently in answering questions about cultural history ranging from the origin of Indo-European languages (Gray & Atkinson 2003) to the social impact of adopting pastoralism in Africa (Holden & Mace 2003). In this paper, we suggest that further progress can be achieved through a combination of conceptual reframing, new methods for quantifying the tree-likeness and coherence of cultural evolution, and most crucially, empirical research.
[Figure 1 image: three-dimensional plot with axes labelled rate (Rv), shape (Rh) and fabric (1/C); points labelled morpho-syntax, basic vocabulary and total lexicon.]
Figure 1. This figure positions linguistic traits on three dimensions. Rv is the rate of change of vertically inherited cultural traits, Rh is the rate of horizontal transmission and C is the degree of cultural cohesion (adapted from Gray et al. (2007)). In this hypothetical example, morpho-syntactical traits evolve slowly, are relatively rarely borrowed and are tightly bound together. In contrast, a random sampling of the total lexicon evolves rapidly, has lots of borrowing and reflects many different cultural histories.
2. A REFRAMING
Rather than dichotomous disputes about the validity of cultural phylogenetics, we suggest that the debates are better conceptualized as involving positions along three continuous dimensions (figure 1). The first dimension we propose is Rv, the rate of change in characters transmitted vertically between generations. If this rate is very slow relative to the time period being studied, then there will be too little character change to allow the construction of cultural phylogenies. If Rv is too fast then the trace left by ‘descent with modification’ will be erased. The second dimension is Rh, the rate of horizontal transmission. At low values of Rh, the estimated phylogenies will be good estimates of the cultural history. A recent simulation study by Greenhill et al. (2009) showed that phylogenetic tree estimates can be quite robust under realistic borrowing scenarios and moderate levels of undetected borrowing (e.g. less than 20% per 1000 years). At high values of Rh, the estimated phylogenies will become increasingly inaccurate and poor summaries of the overall history. The third dimension is C, a measure of the extent to which different aspects of culture are coupled together. The challenge for empirical research is therefore to determine where particular aspects of culture lie on these dimensions. Methods exist to quantify the relative and absolute rates of change in cultural traits (Pagel et al. 2007; Greenhill et al. 2010). What we need are methods that enable us to quantify the shape and fabric of cultural evolution.
3. THE SHAPE OF CULTURAL EVOLUTION
Imagine a dataset (either biological or cultural) that contains comparative information on a range of taxa. For the sake of simplicity, let us assume that each taxon has been assigned a discrete character state for a number of characters (e.g. the nucleotide present
at a specific point on a DNA sequence or the presence or absence of a cognate word). For each character, the taxa can be partitioned into a group that shares a specific character state and those that do not. In phylogenetic jargon this is termed a ‘split’. The more characters that group the taxa in the same way, the stronger the support for that split. When the splits are compatible (none of the splits group the taxa in contradictory ways), we can represent a set of splits derived from the whole dataset in a tree. The branches of the tree represent the splits and the branch lengths indicate the split weights.

When the splits are incompatible, we can use a split graph. A split graph is a graphical representation of a collection of weighted splits (Bandelt & Dress 1992). In a tree, each split corresponds to a single branch. Removing that branch partitions the taxa set into two parts making up the split. In a split graph, each split corresponds to a collection of parallel edges, all with length equal to the weight of the split. Removing those edges partitions the graph, and therefore taxa set, into the two parts making up the split.

There are a number of methods for obtaining the set of splits to represent in a split graph (reviewed in Huson & Bryant 2006). One method that has proved useful in analysing conflicting signal in biological datasets is the NeighborNet algorithm (Bryant & Moulton 2002, 2004; Bryant et al. 2005; Kennedy et al. 2005). NeighborNet closely resembles agglomerative clustering algorithms like the single and average linkage methods. It constructs splits by progressively combining clusters in a way that allows overlap. The resulting graph provides a useful visualization of the extent to which the data is tree-like. A program that calculates NeighborNets and displays split graphs, SPLITSTREE4, can be downloaded from http://www.ab.informatik.uni-tuebingen.de/software/splitstree4.
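The notion of compatible and incompatible splits can be made concrete with a small sketch. The Python below is illustrative only and is not part of SPLITSTREE4: the function names and the toy cognate data are hypothetical. It relies on the standard criterion that two splits of the same taxon set are compatible exactly when at least one of the four pairwise intersections of their sides is empty.

```python
from itertools import combinations


def split_of(character, taxa):
    """Return the split induced by a binary character as a pair of frozensets."""
    present = frozenset(t for t in taxa if character[t] == 1)
    absent = frozenset(taxa) - present
    return present, absent


def compatible(split1, split2):
    """Two splits are compatible iff one of the four side-by-side intersections is empty."""
    return any(not (a & b) for a in split1 for b in split2)


def all_compatible(splits):
    """True if every pair of splits is compatible, i.e. the set fits on a single tree."""
    return all(compatible(s, t) for s, t in combinations(splits, 2))


# Hypothetical presence/absence coding of three cognate sets in four taxa.
taxa = ["A", "B", "C", "D"]
cognate_1 = {"A": 1, "B": 1, "C": 0, "D": 0}  # groups A with B
cognate_2 = {"A": 1, "B": 0, "C": 1, "D": 0}  # groups A with C
cognate_3 = {"A": 1, "B": 0, "C": 0, "D": 0}  # found in A only

s1 = split_of(cognate_1, taxa)
s2 = split_of(cognate_2, taxa)
s3 = split_of(cognate_3, taxa)

print(compatible(s1, s2))          # the AB|CD and AC|BD splits conflict
print(compatible(s1, s3))          # a single-taxon split never conflicts
print(all_compatible([s1, s2, s3]))
```

In this toy example the first two cognate sets cannot both be branches of one tree, which is the situation a split graph is designed to display.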
Phylogenetic networks, such as the split graphs produced by the NeighborNet algorithm, give a broad brushstroke picture of conflicting signal within a dataset. The next step is to explore and measure aspects of the data that do not fit well into a tree, determine where the conflicting signal arises and find which taxa are involved. For this, we have found the delta score (Holland et al. 2002) to be useful. The method scores individual taxa from 0 to 1 according to how much each taxon is involved in conflicting signals. The scores returned are defined in terms of quartets, or subsets of four taxa selected from the complete set of taxa. Each quartet is given a score, and the score for a taxon is the average over all quartets that contain it. To determine the score for a quartet, e.g. the quartet containing i, j, k and l, we compute the three sums of path lengths in the quartet, $d_{ij} + d_{kl}$, $d_{ik} + d_{jl}$ and $d_{il} + d_{jk}$, where $d$ denotes the distance between taxa in the quartet. For example, in figure 2, $d_{ij}$ equals the sum of the lengths of branches a, b and c. Let $m_1$ be the maximum of these three values, let $m_2$ be the second largest value, and let $m_3$ be the smallest. The score assigned to that quartet is then $(m_1 - m_2)/(m_1 - m_3)$, or zero if the denominator is zero. The rationale behind this score is that it equals zero if the distances between the four taxa exactly fit a tree; otherwise, the score ranges
Figure 2. A quartet containing the taxa i, j, k and l. The path length from taxon i to taxon j is the sum of branches a, b and c.
[Figure 3 image: split graph with taxon labels Flemish, Afrikaans, Frisian, Dutch, German_ST, Sranan, English, Danish, Swedish, Riksmal, Icelandic and Faroese; two splits are marked (a) and (b).]
Figure 3. A split graph showing the results of a NeighborNet analysis of 12 Indo-European languages. The graph shows strong conflicting signal for the positioning of Sranan. The split labelled (a) with the short-dashed line groups Sranan most closely with English, while the other one labelled (b) with the long-dashed line groups Sranan with Dutch and other closely related Germanic languages. Scale bar, 0.01.
between 0 and 1. In practice, we find that dividing by the normalization constant $(m_1 - m_3)$ obscures some of the signal. Instead, we find that the simpler score $(m_1 - m_2)^2$ for the quartet (called a Q-residual score in SPLITSTREE4) is a more accurate measure of departures from a strict tree and provides a value much closer to the residual in standard statistics. Note that scaling distances by some constant has no effect on the delta score, but it does affect the Q-residual scores. For this reason, we rescale all of the distances before computing Q-residual scores so that the average of the distances between the taxa is 1. Once the scores are computed for each quartet, an overall estimate of the tree-likeness of the dataset can be obtained by summing the scores for all the quartets and dividing that sum by the total number of quartets (for n taxa there are $n(n - 1)(n - 2)(n - 3)/24$ quartets). The score for a specific taxon is simply the average over all the quartets that contain it. Hence, if there are n taxa, the score for an individual taxon is an average over $(n - 1)(n - 2)(n - 3)/6$ quartets.
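The quartet calculations described above can be sketched in a few lines of Python. This is a minimal illustration written from the verbal description in the text, not the SPLITSTREE4 implementation; the function name `tree_likeness`, the returned dictionary keys and the two toy distance matrices are assumptions introduced here. The sketch averages over the $\binom{n}{4}$ four-taxon subsets of the data and, for each taxon, over the $\binom{n-1}{3}$ quartets that contain it.

```python
from itertools import combinations
from math import comb


def tree_likeness(dist):
    """Average delta and Q-residual scores for a symmetric distance matrix.

    dist is a list of lists with at least four taxa. Distances are rescaled
    so that the mean pairwise distance is 1; this leaves delta unchanged
    but fixes the scale of the Q-residuals.
    """
    n = len(dist)
    pairs = list(combinations(range(n), 2))
    mean_dist = sum(dist[i][j] for i, j in pairs) / len(pairs)
    d = [[x / mean_dist for x in row] for row in dist]

    taxon_delta = [0.0] * n
    taxon_q = [0.0] * n
    total_delta = 0.0
    total_q = 0.0

    for i, j, k, l in combinations(range(n), 4):
        # The three ways of pairing up four taxa, sorted largest first.
        m1, m2, m3 = sorted(
            (d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]),
            reverse=True,
        )
        delta = (m1 - m2) / (m1 - m3) if m1 > m3 else 0.0
        q_res = (m1 - m2) ** 2
        total_delta += delta
        total_q += q_res
        for t in (i, j, k, l):
            taxon_delta[t] += delta
            taxon_q[t] += q_res

    n_quartets = comb(n, 4)      # all four-taxon subsets
    per_taxon = comb(n - 1, 3)   # quartets containing one given taxon
    return {
        "delta": total_delta / n_quartets,
        "q_residual": total_q / n_quartets,
        "taxon_delta": [x / per_taxon for x in taxon_delta],
        "taxon_q_residual": [x / per_taxon for x in taxon_q],
    }


# Toy data: distances that fit the tree ((A,B),(C,D)) exactly ...
tree_metric = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]
# ... and a version with conflicting signal (A-D and B-C pulled closer).
conflicting = [
    [0, 2, 4, 3],
    [2, 0, 3, 4],
    [4, 3, 0, 2],
    [3, 4, 2, 0],
]

tree_scores = tree_likeness(tree_metric)
conflict_scores = tree_likeness(conflicting)
print(tree_scores["delta"], tree_scores["q_residual"])
print(conflict_scores["delta"], conflict_scores["q_residual"])
```

For the exact tree metric both scores are zero, whereas for the conflicting matrix the three pairings give sums of 8, 6 and 4 before rescaling, so the delta score is (8 − 6)/(8 − 4) = 0.5. Enumerating every quartet is adequate for datasets of this size but grows with the fourth power of the number of taxa.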
[Figure 4 image: split graph of the Polynesian languages, with taxon labels for the individual languages (including the Fijian dialects, Rotuman, Pukapuka and Hawaiian) and group labels Eastern Polynesian, Tahitic, Marquesic and Tongic.]
Figure 4. A split graph showing the results of NeighborNet analyses of the Polynesian lexical data. The network has three main regions: Fijian dialects plus Rotuman, western Polynesian and Eastern Polynesian. There is substantial conflicting signal within each region consistent with the break-up of a dialect chain. Scale bar, 0.1.
The delta score was introduced by Holland et al. (2002) primarily as a tool for data exploration. As such, there is little indication of how the statistical significance of various delta scores might be determined. We have implemented and tested a number of schemes for assessing the significance of delta score and Q-residual values, including non-parametric and parametric bootstrapping. Unfortunately, and curiously, none have proven to be sufficiently powerful and robust. Until such tests are available, we will continue to use delta scores and Q-residuals as indicators of the extent of tree-likeness.

Let us see how the combination of NeighborNets, delta scores and Q-residual scores might be put into practice in analysing the shape of linguistic evolution. We will start with a simple example, where the history is known to be more complex than a single tree. Sranan is a creole language developed by African slaves in Surinam on the northern coast of South America. The English established Surinam in 1651 as a slave colony but Dutch has been the official language since 1667 (McWhorter 2001). Sranan thus has words derived from both English and Dutch. Figure 3 shows a NeighborNet based on cognate-coded basic vocabulary for 12 Indo-European languages including Sranan, English and Dutch. The data, consisting of 2355 cognate sets, were derived from Dyen et al. (1992, 1997). Borrowings identified and removed by Dyen and co-workers were included in the analysis (see Bryant et al. 2005). Gene content distances were used in the NeighborNet analysis. This is an appropriate distance transformation for lexical data as it is equivalent to the stochastic Dollo model developed by Nicholls & Gray (2006, 2008) in which cognates can evolve only once but be lost multiple times. As NeighborNet can overfit the data, splits with small weights (less than 0.005) were filtered from the split graph. As might be expected given the hybrid history of Sranan, the split graph shows strong conflicting signal for the positioning of Sranan. One split labelled (a) groups Sranan most closely with English, while another one labelled (b) groups Sranan with Dutch and other closely related Germanic languages. The average delta score for this dataset is 0.23 and the average Q-residual is 0.03. Overall, this suggests that the data is moderately tree-like. This is not surprising given that basic vocabulary is known to be much less likely to be borrowed than a sampling of the total lexicon (Embleton 1986). However, Sranan stands out as having the highest taxon-specific scores, reflecting its hybrid history (delta score = 0.29, Q-residual = 0.05).

What can these methods reveal about the shape of lexical evolution on a much broader scale? It might be expected that factors such as geographical isolation and recent population expansions would promote relatively tree-like evolution, while ancient connections and geographical proximity would lead to more network-like patterns. If that were the case, then lexical evolution in the Polynesian language family should be far more tree-like than that of Indo-European. The far-flung Polynesian islands have only been settled in the last 3000 years (Spriggs 2010), whereas the Indo-European languages started to disperse across continental Europe approximately 8500 years ago, with the major radiation of the language families occurring around 6000 years BP (Gray &
[Figure 5 image: split graph of the Indo-European languages, with taxon labels for the individual languages and group labels Indo-Iranian (Indic, Iranian), Balto-Slavic (Baltic, Slavic), Armenian, Greek, Celtic, Albanian, Italic and Germanic.]
Figure 5. A split graph showing the results of NeighborNet analyses of the Indo-European lexical data. Scale bar, 0.1.
Atkinson 2003; Atkinson et al. 2005; Nicholls & Gray 2008). Figures 4 and 5 show the results of NeighborNet analyses of comparable basic vocabulary datasets for Polynesian and Indo-European languages. The Polynesian cognate set data were extracted from our Austronesian Basic Vocabulary Database (Greenhill et al. 2008; http://www.language.psy.auckland.ac.nz/austronesian/). The Indo-European data came from Dyen et al. (1997). Known borrowings were included in the analyses. Gene content distances were used in the NeighborNet analysis and splits with small weights (less than 0.005) were filtered from the split graph. The split graphs and the associated delta scores and Q-residual scores reveal that the expectation that Polynesian languages would be more tree-like is wrong. For Polynesian, the average delta score was 0.41 and the average Q-residual value was 0.02. The respective figures for Indo-European were 0.22 and 0.002. It would be difficult to ascribe this difference to statistical sampling error.

Why is the evolution of even basic vocabulary in Polynesian so strikingly non-tree-like? There are a number of factors that may have jointly contributed to this pattern. There is increasing evidence that, far from being the consequence of chance voyages, the settlement of the Pacific required relatively complex
sailing technology and considerable navigational skill. This is especially the case for the rapid settlement of the eastern and southern margins of Polynesia (Irwin 2008). Thus, the voyaging skills of the Polynesians meant that the substantial ocean distances were not necessarily a barrier to ongoing contact. In fact, both archaeological and linguistic evidence attest to substantial ongoing contact (Walter & Sheppard 1996; Weisler & Kirch 1996; Weisler 1998; Geraghty 2004). The lack of social and ecological resources on small islands may have also contributed to this (Irwin 1998). On the basis of linguistic evidence, Pawley (1996) has argued that the settlement of Polynesia involved the establishment and break-up of a series of dialect chains. Figure 6 shows how the break-up of dialect chains can produce conflicting character distributions. According to Pawley, an initial Proto Central Pacific dialect chain broke up into a dialect chain consisting of Rotuman and Western and Central Fijian in the west and a Tokelau–Fijian and Polynesian dialect chain further to the east. This latter dialect chain subsequently split into northern and southern clusters, with the southern cluster ultimately becoming the Tongic subgroup and the northern cluster giving rise to Proto Nuclear Polynesian. Finally, Proto Nuclear
Polynesian split into Proto Eastern Polynesian and a non-monophyletic western group of languages. After this split, Eastern Polynesian split into a Marquesic and a Tahitic subgroup and there was substantial borrowing between parts of western and eastern Polynesia. For example, the western Polynesian language Pukapuka is known to have borrowed extensively from eastern Polynesia (Clark 1980).

The sequential break-up of Proto Central Pacific dialect chains described by Pawley is consistent with the network-like evolution seen in figure 4. One region of the network separates off Fijian dialects and Rotuman. The lower right side of the network shows considerable conflicting signal within the western Polynesian languages including the Tongic subgroup. The upper left side of the figure shows strong support for the Eastern Polynesian subgroup, within which there is again substantial conflicting signal. The network also shows some conflicting signal between eastern and western Polynesia, with Pukapuka placed in an intermediate position. Within the eastern Polynesian part of the network, the Marquesic and the Tahitic groups do not form clean clusters. The hybrid history of Hawaiian is the likely cause of this local conflicting signal. Archaeological evidence suggests that Hawaii was initially settled from the Marquesas around AD 800–900, but its language and culture were subsequently influenced by contact with Tahiti (Spriggs 2010). The taxon-specific delta and Q-residual scores support the idea that the main source of conflicting signal in the Polynesian data has been the process of dialect chain formation and break-up. Dialect chain break-up should smear that conflicting signal across the whole dialect chain, e.g. within Eastern Polynesia. In contrast, if just a few taxa are involved in some relatively discrete borrowing, then those taxa should be picked out by the taxon-specific delta and Q-residual scores. This is not the case (table 1).
Why is the evolution of Indo-European basic vocabulary relatively tree-like? One possibility is that the socio-linguistic situation in Europe was markedly different. Instead of the far-flung islands linked by kin connections in the Pacific, the relatively high population densities and thus intense competition in continental Europe and Asia may have meant that small linguistic differences became markers of cultural group identity and hence barriers to lexical diffusion. Alternatively, it might be the case that dialect chain formation and break-up are actually the dominant mode of lexical evolution around the globe. Holden & Gray (2006) argue that this has been the case for Bantu languages and Garrett (2006) advances a similar argument for Indo-European. The other obvious difference between Polynesian and Indo-European is time depth. According to the recent phylogenetic estimates (Gray & Atkinson 2003; Nicholls & Gray 2008), the initial divergence of Indo-European languages dates back approximately 8500 years, whereas Polynesian languages date back only 3000 years (Gray et al. 2009; Spriggs 2010). One possibility, discussed by Garrett (2006), is that over time networks get pruned by language extinction to appear more tree-like. If this were true, then older language families around the globe should be more tree-like. This is a possibility that deserves broader comparative testing.
4. THE FABRIC OF CULTURAL EVOLUTION
It is often claimed that language must function as an inter-related system with strong dependencies between components: ‘un système où tout se tient’ (attributed variously to Antoine Meillet and Ferdinand de Saussure; see Peeters 1990). If these dependencies are very strong, then different aspects of language should all have similar histories and thus be similar in the extent to which their evolution is tree-like. To test this, we compared the evolution of basic vocabulary with that of typological linguistic features (Greenhill et al. 2010). We selected 20 Austronesian and 20 Indo-European languages for which both good lexical and good typological information were available. The Austronesian lexical data were sourced from the Austronesian Basic Vocabulary Database (Greenhill et al. 2008), and the Indo-European lexical data from Dyen et al. (1997). Typological information about these languages (e.g. information about word order, number of consonants, syllable structures, conjunctions, possessives, tenses, etc.) was obtained from the World Atlas of Language Structures (Haspelmath et al. 2005). The networks built from these datasets using the NeighborNet algorithm in SPLITSTREE v. 4.10 are shown in figure 7. The networks clearly show that the typological evolution is far less tree-like than that of the basic vocabulary. This difference is also reflected in the delta scores and Q-residuals (figure 7), where the delta scores for the structural information are much larger (twice as large in the Indo-European case), and the Q-residuals are at least two orders of magnitude larger. This supports the view that typological features diffuse relatively easily between neighbouring languages (Matras et al. 2006), while basic vocabulary is less prone to diffusion.
For example, although over 50 per cent of the total English lexicon comes from Romance languages post the Norman conquest, this figure falls to around 6 per cent for basic vocabulary, such as the Swadesh 200 word list (Embleton 1986). So far from language being ‘un syste`me ou` tout se tient’, different aspects of language can have quite different histories, some of which are relatively tree-like and others that are not. It could be argued that linguistic evolution is a rather special case of cultural evolution. Despite the typological results discussed above, it could be claimed that the transmission mechanisms and social role of language mean that its evolution is likely to be much more coherent and tree-like than other aspects of culture. First, children mainly learn language from their parents, and this enforced vertical transmission tends to maintain intergenerational consistency (Labov 2007). Second, language change is strongly constrained by the need to communicate with others. So, while languages do change rapidly, they cannot change completely overnight. In contrast, many aspects of culture do not share these intergeneration and communicative stabilizing constraints. As Gould (1987) argued, all it takes is 5 min with a bobbin or
Downloaded from rstb.royalsocietypublishing.org on November 11, 2010
The shape and fabric of human history R. D. Gray et al.
Table 1. The taxon-specific delta and Q-residual scores for the Polynesian lexical data, ranked from the lowest Q-residual score to the highest.
Figure 6. A diagram showing the problem dialect chains cause for the construction of bifurcating trees. The dialects A, B and C are initially all mutually intelligible (note the permeable boundaries between the dialects). Innovations evolve in these dialects (filled circles; filled triangles) and diffuse through the network. However, if a dialect splits off from the network (e.g. the split between C and the other two languages), and this diffusion is only partially complete, then conflicting character histories can result. The filled circle characters support topology 1, whereas the filled triangle characters support topology 2. So, under the Dialect Chain/Network-Breaking model, areas where dialect chains were present should be poorly resolved in a phylogenetic analysis, and are better represented by a network diagram rather than a tree.
a bow and arrow for cultural transmission to occur. So, there can be cultural, but not linguistic, revolutions. While we think that these arguments are plausible, we maintain that the extent to which linguistic evolution is unique is an issue that is best addressed empirically, rather than through armchair speculation.

Phylogenetic research on material culture is not common but includes studies of weaving motifs in Turkmen carpets (Collard & Tehrani 2005), basketry traditions in northern California (Jordan & Shennan 2003) and Palaeoindian projectile points (Darwent & O’Brien 2006). However, these studies rarely include an independent estimate of the population history with which to compare the material culture history. A recent study of the cultural evolution of canoe design in the Pacific (Rogers & Ehrlich 2008; Rogers et al. 2009) affords us the opportunity to assess the extent to which the evolution of this aspect of material culture mirrors the settlement history. Rogers et al. (2009) analysed 134 canoe design traits. Of these traits, 96 were classified as ‘functional’ and 38 as ‘symbolic’. Functional traits were those aspects of canoe design that affected canoe sailing performance and hence the prospect of surviving long oceanic voyages. Symbolic traits were ‘esthetic, social, and spiritual decorations that presumably have no differential effect on survival from group to group’ (Rogers & Ehrlich 2008, p. 3417). They claimed that population histories could be inferred from the canoe design data and that functional aspects of canoe design provided a stronger reflection of population history. Boldly they
language                delta score   Q-residual
Fijian (Bau)            0.33          0.015
Sikaiana                0.40          0.016
West Fijian (Navosa)    0.34          0.016
Luangiua                0.40          0.016
Anuta                   0.41          0.016
Kapingamarangi          0.41          0.016
Rotuman                 0.37          0.016
Maori                   0.35          0.016
Hawaiian                0.33          0.017
Tahitian                0.32          0.017
Vaeakau-Taumako         0.40          0.017
Niue                    0.42          0.018
Tuvalu                  0.39          0.018
Bellona                 0.40          0.019
Nukuoro                 0.43          0.019
Tikopia                 0.41          0.019
Tongan                  0.41          0.020
Rurutuan                0.34          0.021
Manihiki                0.39          0.021
Penrhyn                 0.38          0.021
Rapanui                 0.40          0.022
Fijian (Suva)           0.36          0.022
Emae                    0.41          0.023
Samoan                  0.44          0.024
Tuamotu                 0.41          0.025
Futuna-Aniwa            0.45          0.026
East Uvea               0.41          0.027
Rennellese              0.45          0.029
Pukapuka                0.46          0.030
Takuu                   0.44          0.030
Marquesan               0.41          0.031
Rarotongan              0.41          0.031
Ifira-Mele              0.46          0.036
Marquesan (Nukuhiva)    0.38          0.038
East Futuna             0.44          0.040
West Uvea               0.50          0.042
Tokelau                 0.49          0.043
Mangareva               0.44          0.046
suggest that this history may have included Maori sailing the 7000 km from Hawaii to Aotearoa/New Zealand.

To assess these claims, we calculated site-specific likelihoods for each canoe trait. We estimated the relative fit of functional and symbolic traits on a language tree for the 11 societies analysed by Rogers et al. (2009). The tree was constructed from lexical data in the Austronesian Basic Vocabulary Database (Greenhill et al. 2008). Following Gray et al. (2009), cognate sets were binary-coded. Obvious borrowings were eliminated from the analysis. A model with a single substitution rate for cognate gains and losses, gamma-distributed rate heterogeneity and a strict clock was implemented in the phylogenetic programme BEAST v. 1.5.4 (Drummond & Rambaut 2007). To ensure that the language trees matched the population history as closely as possible, and to minimize the impact of undetected borrowing, we constrained the topologies in accordance with independent phonological and morphological
[Figure 7: four split graphs, one per language family (Austronesian, Indo-European) and data type (typology, lexicon). Austronesian languages shown: Chamorro, Drehu, Fijian, Hawaiian, Iaai, Indonesian, Kilivila, Kiribati, Malagasy, Maori, Mokilese, Paamese, Paiwan, Palauan, Pohnpeian, Rapanui, Samoan, Tagalog, Tigak, Yapese. Indo-European languages shown: Albanian, Armenian (eastern), Bulgarian, Dutch, English, French, German, Hindi, Irish, Italian, Kashmiri, Latvian, Lithuanian, Modern Greek, Persian, Polish, Romanian, Russian, Spanish, Swedish.]

Figure 7. Split graphs showing the results of NeighborNet analyses of the lexical and typological data. The analyses used Hamming distances and splits were filtered to a threshold of 0.001. For Austronesian basic vocabulary, the average delta score was 0.33 and the average Q-residual = 0.0020. The average delta score for Austronesian typological data was 0.44 and the average Q-residual = 0.05. The respective figures for Indo-European were 0.21 and 0.001 (basic vocabulary) and 0.40 and 0.04 (typology). Known subgroups within each language family are colour-coded. Scale bar, 0.01.
evidence (Pawley 1966, 1996). From the posterior sample of trees, we constructed a maximum clade credibility tree (figure 8), and then mapped the canoe data onto this tree using MESQUITE v. 2.72 (Maddison & Maddison 2010). We calculated the site-specific likelihoods of each character under a 1-rate parameter Markov model.

If the claims of Rogers et al. (2009) are correct, then both datasets should fit the language tree in figure 8 well, with the functional data fitting best. Neither prediction is supported by our analyses. Both datasets fit poorly (close to a random distribution), and if anything the functional traits fit the worst (figure 9).

Why might this be the case? The trajectory of technological evolution does not need to be tightly tied to population history, especially for functional traits (Dunnell 1978). The global distribution of mobile phones across all kinds of cultural boundaries shows just how quickly useful technology can spread. This is likely to have been the case with functional aspects of canoe design. The large double-hulled drua canoes constructed in Fiji in the late eighteenth century derived their design and handling methods from Tonga and Uvea, while their fore-and-aft rig was Micronesian in origin (D’Arcy 2006).
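The per-character likelihood calculation described above can be illustrated with a minimal sketch. It assumes binary (presence/absence) traits, a symmetric one-rate Markov model with equal root-state probabilities, and a fixed tree with known branch lengths. It is not the MESQUITE implementation: the tree encoding, the function names and the toy data are hypothetical, and the coarse grid search over rates is only a stand-in for a proper maximum-likelihood estimate.

```python
import math

# Hypothetical tree encoding: a tip is a string; an internal node is a tuple
# of (child, branch_length) pairs.
TOY_TREE = (
    ((("A", 1.0), ("B", 1.0)), 1.0),
    ((("C", 1.0), ("D", 1.0)), 1.0),
)


def partial_likelihoods(node, states, rate):
    """Return [L0, L1]: the likelihood of the data below `node`
    given that `node` is in state 0 or state 1 (Felsenstein pruning)."""
    if isinstance(node, str):
        observed = states[node]
        return [1.0 if observed == 0 else 0.0, 1.0 if observed == 1 else 0.0]
    partials = [1.0, 1.0]
    for child, branch_length in node:
        child_partials = partial_likelihoods(child, states, rate)
        # Symmetric two-state model: probability of ending a branch in the
        # same state, or in the other state, after time `branch_length`.
        p_same = 0.5 + 0.5 * math.exp(-2.0 * rate * branch_length)
        p_diff = 1.0 - p_same
        for parent_state in (0, 1):
            partials[parent_state] *= (
                p_same * child_partials[parent_state]
                + p_diff * child_partials[1 - parent_state]
            )
    return partials


def site_log_likelihood(tree, states, rate):
    """Log-likelihood of one character, with root states weighted equally."""
    l0, l1 = partial_likelihoods(tree, states, rate)
    return math.log(0.5 * l0 + 0.5 * l1)


def best_site_log_likelihood(tree, states, rates):
    """Highest log-likelihood over a grid of candidate rates (an assumption
    made for brevity; a numerical optimizer would normally be used)."""
    return max(site_log_likelihood(tree, states, rate) for rate in rates)


if __name__ == "__main__":
    grid = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
    congruent = {"A": 1, "B": 1, "C": 0, "D": 0}    # matches the (A,B) clade
    conflicting = {"A": 1, "B": 0, "C": 1, "D": 0}  # cuts across both clades
    print(best_site_log_likelihood(TOY_TREE, congruent, grid))
    print(best_site_log_likelihood(TOY_TREE, conflicting, grid))
```

In this toy example, the trait whose distribution matches a clade receives a log-likelihood closer to zero than the trait that cuts across clades, which is the sense in which scores close to zero indicate a good fit to the tree.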
[Figure 8: language tree with tips Fijian, Marquesan, Hawaiian, Societies, Australs, Manihiki, New Zealand, Tuamotuan, Cooks, Samoan and Tongan.]

Figure 8. Maximum clade credibility language tree for the 11 societies analysed by Rogers et al. The tree is constructed from basic vocabulary data with the analyses constrained on the basis of phonological and morphological innovations. To match languages to cultures, we assumed that Societies = Tahitian, Australs = Rurutuan, Cooks = Rarotongan.
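The basic vocabulary data behind such a tree are binary-coded cognate sets. The sketch below shows one common way of producing such a coding, under stated assumptions: each (meaning, cognate set) pair becomes one presence/absence character, and a language with no recorded word for a meaning is scored as missing ('?') for all characters of that meaning. The language names, meanings and cognate-set numbers are hypothetical placeholders, and the details of the coding used in the analyses may differ.

```python
def binary_code(cognates):
    """Turn cognate judgements into a binary character matrix.

    `cognates` maps language -> {meaning: set of cognate-set ids, or None
    when no word is recorded}. Returns the ordered list of characters and
    a dict mapping each language to a string over '0', '1' and '?'.
    """
    languages = sorted(cognates)

    # Every (meaning, cognate set) pair attested anywhere is one character.
    characters = set()
    for language in languages:
        for meaning, sets in cognates[language].items():
            if sets:
                for cognate_set in sets:
                    characters.add((meaning, cognate_set))
    characters = sorted(characters)

    matrix = {}
    for language in languages:
        row = []
        for meaning, cognate_set in characters:
            observed = cognates[language].get(meaning)
            if not observed:
                row.append("?")  # no data for this meaning
            else:
                row.append("1" if cognate_set in observed else "0")
        matrix[language] = "".join(row)
    return characters, matrix


# Hypothetical toy data: three languages, two meanings.
TOY_COGNATES = {
    "lang_A": {"hand": {1}, "eye": {1}},
    "lang_B": {"hand": {1}, "eye": {2}},
    "lang_C": {"hand": {2}, "eye": None},
}

if __name__ == "__main__":
    chars, coded = binary_code(TOY_COGNATES)
    print(chars)
    for name, row in coded.items():
        print(name, row)
```

Coding each cognate set as its own character lets gains and losses of cognates be modelled as changes between two states.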
NeighborNet analyses reveal that the evolution of functional aspects of canoe design is indeed strikingly non-tree-like (figure 10). Not only is it clear that Pacific peoples borrowed good aspects of
[Figure 9: four histograms, panels (a) to (d); x-axis: site-specific likelihood; y-axis: frequency.]

Figure 9. Histograms showing the distribution of likelihood scores for (a) basic vocabulary, (b) functional aspects of canoe design, (c) symbolic aspects of canoe design and (d) randomization of the canoe data on the language tree. Likelihood scores close to zero indicate a good fit. The basic vocabulary data fit the tree the best (mean = −2.89, median = −2.89, s.d. = 2.31). Both the functional and symbolic aspects of canoe design are close to the random distribution (functional: mean = −6.64, median = −7.36, s.d. = 1.28; symbolic: mean = −6.13, median = −6.34, s.d. = 1.37; random: mean = −6.30, median = −6.92, s.d. = 1.45).
canoe design, they also borrowed, traded and exchanged both canoes (Rolett 2002) and canoe builders (D’Arcy 2006). For example, the drua canoes built in the Lau Group of Fiji were constructed by the Lemaki. The Lemaki were a Tongan and Samoan clan of specialist canoe builders renowned for their extremely watertight method of joining wooden planks without numerous holes and lashings (D’Arcy 2006).

While Polynesians readily borrowed functional aspects of canoe design, the symbolic aspects of canoe design might be more closely tied to cultural identity and history. The prows of Maori waka were typically carved in a regional style (Hiroa 1949). This would explain why the symbolic traits fit the language trees slightly better than the functional traits.

The canoe data reveal that, at least when it comes to highly functional aspects of material culture, the fabric of cultural evolution is rather different from the evolution of genes in vertebrate species. Different aspects of culture can have quite different evolutionary histories. One challenge for future research is to characterize the processes that promote the tight coupling of cultural lineages and those that lead the different threads to follow separate paths.
[Figure 10: two split graphs, panels (a) and (b), with tips labelled by the 11 societies.]

Figure 10. Split graphs showing the results of NeighborNet analyses of (a) the functional and (b) the symbolic aspects of canoe design. For functional traits, the average delta score was 0.46 and the average Q-residual = 0.03. For symbolic traits, the average delta score was 0.37 and the average Q-residual = 0.05. Scale bar, 0.01.
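For readers who want to see how a delta score of the kind reported in the figure captions is obtained, the following is a minimal sketch. It follows the quartet-based definition of Holland et al. (2002): for each set of four taxa, the three pairwise distance sums are ordered as m1 ≤ m2 ≤ m3 and delta = (m3 − m2)/(m3 − m1). A taxon's score is then the mean over all quartets that contain it. Distances are assumed here to be normalized Hamming distances on binary characters. The function names and toy data are hypothetical, and this is not the code used for the analyses.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Proportion of characters at which two equal-length state vectors differ."""
    if len(a) != len(b):
        raise ValueError("state vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(data):
    """Pairwise distances for every ordered pair of taxa, keyed by (taxon, taxon)."""
    return {(p, q): hamming_distance(data[p], data[q]) for p in data for q in data}


def quartet_delta(d, x, y, u, v):
    """Delta for one quartet: 0 if the distances fit a tree exactly,
    larger (up to 1) as the signal becomes more conflicting."""
    m1, m2, m3 = sorted(
        [d[x, y] + d[u, v], d[x, u] + d[y, v], d[x, v] + d[y, u]]
    )
    if m3 == m1:
        return 0.0  # all three sums equal: no resolution, no conflict
    return (m3 - m2) / (m3 - m1)


def taxon_delta_scores(d, taxa):
    """Mean quartet delta over all quartets containing each taxon."""
    if len(taxa) < 4:
        raise ValueError("at least four taxa are needed")
    totals = {t: 0.0 for t in taxa}
    counts = {t: 0 for t in taxa}
    for quartet in combinations(taxa, 4):
        score = quartet_delta(d, *quartet)
        for t in quartet:
            totals[t] += score
            counts[t] += 1
    return {t: totals[t] / counts[t] for t in taxa}


# Hypothetical toy data. TREE_LIKE has one character grouping A with B plus
# one unique character per taxon; CONFLICTING has one character grouping
# A with B and one grouping A with C.
TREE_LIKE = {
    "A": [1, 1, 0, 0, 0],
    "B": [1, 0, 1, 0, 0],
    "C": [0, 0, 0, 1, 0],
    "D": [0, 0, 0, 0, 1],
}
CONFLICTING = {"A": [1, 1], "B": [1, 0], "C": [0, 1], "D": [0, 0]}

if __name__ == "__main__":
    for label, data in (("tree-like", TREE_LIKE), ("conflicting", CONFLICTING)):
        d = distance_matrix(data)
        print(label, taxon_delta_scores(d, sorted(data)))
```

Because every taxon occurs in the same number of quartets, the mean of the taxon-specific scores equals the mean over all quartets, so one routine yields both the per-taxon and the average delta score.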
5. CONCLUSION

In this paper we have argued that we need to move beyond dichotomous disputes about the validity of cultural phylogenetics. Instead, we have suggested that the debate is better conceptualized as involving positions along continuous dimensions. The challenge for empirical research is to determine how tree-like and how tightly coupled the evolution of particular aspects of culture is. Both critics and proponents of cultural phylogenetics need to become ‘evidence-based’ in their claims about cultural evolution. Using new network methods derived from evolutionary biology, we have outlined how such investigations can reveal some surprising results: the far-flung Polynesian islands in the Pacific are a hotbed of horizontal lexical and cultural evolution.

Properly characterizing the shape and fabric of human cultural history will no doubt require further methodological innovations. For example, it would be very useful to be able to test for significant differences in the degree of tree-likeness. However, the most fundamental requirement for further progress is the collection of more high-quality comparative cultural data. The days when all a study of cultural evolution required was a quick trawl through the Ethnographic Atlas (Murdock 1967) are rapidly drawing to an end. It is time for anthropologists to roll their sleeves up and get serious about gathering comparative data again. We can only echo the sentiments expressed by Shennan (2008,
p. 3176) when he noted, ‘the creation of comparable sets of data across time and space has not been the tradition in either anthropology or archaeology, especially in these postmodern times ... If cultural evolutionary studies are to progress, this situation needs to change’.

We thank Roger Green for his advice and enthusiastic support of phylogenetic studies of cultural evolution. He is sadly missed. We would like to thank Deborah Rogers for providing the canoe data, Barbara Holland for the original delta-score code and James Steele and Fiona Jordan for their useful comments on the manuscript.
REFERENCES

Atkinson, Q. D. & Gray, R. D. 2005 Curious parallels and curious connections: phylogenetic thinking in biology and historical linguistics. Syst. Biol. 54, 513–526. (doi:10.1080/10635150590950317)
Atkinson, Q. D., Nicholls, G., Welch, D. & Gray, R. D. 2005 From words to dates: water into wine, mathemagic or phylogenetic inference? Trans. Philol. Soc. 103, 193–219. (doi:10.1111/j.1467-968X.2005.00151.x)
Bandelt, H. & Dress, A. W. M. 1992 Split decomposition: a new and useful approach to phylogenetic analysis of distance data. Mol. Phylogenet. Evol. 1, 242–252. (doi:10.1016/1055-7903(92)90021-8)
Borgerhoff Mulder, M., Nunn, C. L. & Towner, M. C. 2006 Cultural macroevolution and the transmission of traits. Evol. Anthropol. 15, 52–64. (doi:10.1002/evan.20088)
Boyd, R., Borgerhoff Mulder, M., Durham, W. H. & Richerson, P. J. 1997 Are cultural phylogenies possible? In Human by nature, between biology and the social sciences (eds P. Weingart, P. J. Richerson, S. D. Mitchell & S. Maasen), pp. 355–386. Mahwah, NJ: Lawrence Erlbaum Associates.
Bryant, D. & Moulton, V. 2002 NeighborNet: an agglomerative method for the construction of planar phylogenetic networks. Lect. Notes Comp. Sci. 2452, 375–391. (doi:10.1007/3-540-45784-4_28)
Bryant, D. & Moulton, V. 2004 NeighborNet, an agglomerative algorithm for the construction of phylogenetic networks. Mol. Biol. Evol. 21, 255–265. (doi:10.1093/molbev/msh018)
Bryant, D., Filimon, F. & Gray, R. D. 2005 Untangling our past: languages, trees, splits and networks. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. J. Shennan), pp. 67–83. London, UK: UCL Press.
Cavalli-Sforza, L. L., Piazza, A., Menozzi, P. & Mountain, J. L. 1988 Reconstruction of human evolution: bringing together genetic, archaeological, and linguistic data. Proc. Natl Acad. Sci. USA 85, 6002–6006. (doi:10.1073/pnas.85.16.6002)
Clark, R. 1980 East Polynesian borrowings in Pukapukan. J. Polynesian Soc. 89, 259–265.
Collard, M. & Tehrani, J. 2005 Phylogenesis versus ethnogenesis in Turkmen cultural evolution. In The evolution of cultural diversity: a phylogenetic approach (eds R. Mace, C. J. Holden & S. J. Shennan), pp. 109–132. London, UK: UCL Press.
D’Arcy, P. 2006 The people of the sea: environment, identity, and history in Oceania. Honolulu, HI: University of Hawai’i Press.
Darwent, J. & O’Brien, M. J. 2006 Using cladistics to construct lineages of projectile points from northeastern Missouri. In Mapping our ancestors: phylogenetic approaches in anthropology and prehistory (eds C. Lipo, M. J. O’Brien, M. Collard & S. J. Shennan), pp. 185–208. New Brunswick, NJ: Aldine Transactions.
Darwin, C. 1859 On the origin of species. London, UK: Murray.
Drummond, A. J. & Rambaut, A. 2007 BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214. (doi:10.1186/1471-2148-7-214)
Dunnell, R. C. 1978 Style and function: a fundamental dichotomy. Am. Antiquity 43, 192–202. (doi:10.2307/279244)
Dyen, I., Kruskal, J. B. & Black, P. 1992 An Indoeuropean classification: a lexicostatistical experiment. Trans. Am. Phil. Soc. 82, iii–132. (doi:10.2307/1006517)
Dyen, I., Kruskal, J. B. & Black, P. 1997 FILE IE-DATA1. Available online at http://www.ntu.edu.au/education/langs/ielex/IE-DATA1.
Embleton, S. M. 1986 Statistics in historical linguistics. Bochum: Studienverlag Brockmeyer.
Felsenstein, J. 2004 Inferring phylogenies. Sunderland, MA: Sinauer Associates, Inc.
Garrett, A. 2006 Convergence in the formation of Indo-European subgroups: phylogeny and chronology. In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew), pp. 139–151. Cambridge, UK: McDonald Institute for Archaeological Research.
Geraghty, P. 2004 Borrowed plants in Fiji and Polynesia: some linguistic evidence. In Borrowing: a Pacific perspective (eds J. Tent & P. Geraghty), pp. 65–98. Canberra: Pacific Linguistics.
Gould, S. J. 1987 An urchin in the storm. New York, NY: W. W. Norton.
Gray, R. D. & Atkinson, Q. D. 2003 Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426, 435–439. (doi:10.1038/nature02029)
Gray, R. D., Greenhill, S. J. & Ross, R. M. 2007 The pleasures and perils of Darwinizing culture (with phylogenies). Biol. Theory 2, 360–375. (doi:10.1162/biot.2007.2.4.360)
Gray, R. D., Drummond, A. J. & Greenhill, S. J. 2009 Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323, 479–483. (doi:10.1126/science.1166858)
Greenhill, S. J., Blust, R. & Gray, R. D. 2008 The Austronesian basic vocabulary database: from bioinformatics to lexomics. Evol. Bioinform. 4, 271–283.
Greenhill, S. J., Currie, T. E. & Gray, R. D. 2009 Does horizontal transmission invalidate cultural phylogenies? Proc. R. Soc. B 276, 2299–2306. (doi:10.1098/rspb.2008.1944)
Greenhill, S. J., Atkinson, Q. D., Meade, A. & Gray, R. D. 2010 The shape and tempo of language evolution. Proc. R. Soc. B 277, 2443–2450. (doi:10.1098/rspb.2010.0051)
Harvey, P. H. & Pagel, M. 1991 The comparative method in evolutionary biology. Oxford, UK: Oxford University Press.
Haspelmath, M., Dryer, M., Gil, D. & Comrie, B. 2005 The world atlas of language structures. Oxford, UK: Oxford University Press.
Hiroa, T. R. 1949 The coming of the Māori. Māori Purposes Fund Board. Christchurch, New Zealand: Whitcombe & Tombs.
Holden, C. J. & Gray, R. D. 2006 Exploring Bantu linguistic relationships using trees and networks. In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew), pp. 19–31. Cambridge, UK: The McDonald Institute for Archaeological Research.
Holden, C. J. & Mace, R. 2003 Spread of cattle led to the loss of matrilineal descent in Africa: a coevolutionary hypothesis. Proc. R. Soc. Lond. B 270, 2425–2433. (doi:10.1098/rspb.2003.2535)
Holland, B. R., Huber, K. T., Dress, A. & Moulton, V. 2002 δ Plots: a tool for analyzing phylogenetic distance data. Mol. Biol. Evol. 19, 2051–2059.
Hunley, K. L., Cabana, G. S., Merriwether, D. A. & Long, J. C. 2007 A formal test of linguistic and genetic coevolution in native Central and South America. Am. J. Phys. Anthropol. 132, 622–631. (doi:10.1002/ajpa.20542)
Hunley, K., Dunn, M., Lindström, E., Reesink, G., Terrill, A., Healy, M. E., Koki, G., Friedlaender, F. R. & Friedlaender, J. S. 2008 Genetic and linguistic coevolution in Northern Island Melanesia. PLoS Genet. 4, e1000239. (doi:10.1371/journal.pgen.1000239)
Huelsenbeck, J. P. & Rannala, B. 1997 Phylogenetic methods come of age: testing hypotheses in an evolutionary context. Science 276, 227–232. (doi:10.1126/science.276.5310.227)
Huson, D. H. & Bryant, D. 2006 Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267. (doi:10.1093/molbev/msj030)
Irwin, G. J. 1998 The colonisation of the Pacific Plate: chronological, navigational and social issues. J. Polynesian Soc. 107, 111–143.
Irwin, G. J. 2008 Pacific seascapes, canoe performance, and a review of Lapita voyaging with regard to theories of migration. Asian Perspect. 47, 12–27. (doi:10.1353/asi.2008.0002)
Jordan, P. & Shennan, S. J. 2003 Cultural transmission, language and basketry traditions amongst the California Indians. J. Anthropol. Archaeol. 22, 42–74. (doi:10.1016/S0278-4165(03)00004-7)
Kennedy, M. R., Holland, B. R., Gray, R. D. & Spencer, H. G. 2005 Untangling long branches: identifying conflicting phylogenetic signals using spectral analysis, Neighbor-Net, and consensus networks. Syst. Biol. 54, 620–633. (doi:10.1080/106351591007462)
Kroeber, A. L. 1948 Anthropology. New York, NY: Harcourt (Revised edition).
Labov, W. 2007 Transmission and diffusion. Language 83, 344–387. (doi:10.1353/lan.2007.0082)
Mace, R. & Holden, C. J. 2005 A phylogenetic approach to cultural evolution. Trends Ecol. Evol. 20, 116–121. (doi:10.1016/j.tree.2004.12.002)
Maddison, W. P. & Maddison, D. R. 2010 Mesquite: a modular system for evolutionary analysis. Version 2.72. http://mesquiteproject.org
Matras, Y., McMahon, A. & Vincent, N. 2006 Linguistic areas: convergence in historical and typological perspective. New York, NY: Palgrave.
McWhorter, J. 2001 The power of Babel. New York, NY: Henry Holt/Times.
Moore, J. H. 1994 Putting anthropology back together again: the ethnogenetic critique of cladistic theory. Am. Anthropol. 96, 925–948. (doi:10.1525/aa.1994.96.4.02a00110)
Murdock, G. P. 1967 Ethnographic Atlas: a summary. Ethnology 6, 109–236. (doi:10.2307/3772751)
Nicholls, G. K. & Gray, R. D. 2006 Quantifying uncertainty in a stochastic Dollo model of vocabulary evolution. In Phylogenetic methods and the prehistory of languages (eds P. Forster & C. Renfrew), pp. 161–171. Cambridge, UK: The McDonald Institute for Archaeological Research.
Nicholls, G. K. & Gray, R. D. 2008 Dated ancestral trees from binary trait data and its application to the diversification of languages. J. R. Stat. Soc. B 70, 545–566. (doi:10.1111/j.1467-9868.2007.00648.x)
O’Hara, R. J. 1997 Population thinking and tree thinking in systematics. Zool. Scripta 26, 323–330. (doi:10.1111/j.1463-6409.1997.tb00422.x)
Pagel, M. & Mace, R. 2004 The cultural wealth of nations. Nature 428, 275–278. (doi:10.1038/428275a)
Pagel, M., Atkinson, Q. D. & Meade, A. 2007 Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature 449, 717–720. (doi:10.1038/nature06176)
Pawley, A. K. 1966 Internal relationships of Polynesian languages and dialects. J. Polynesian Soc. 75, 39–64.
Pawley, A. 1996 On the Polynesian subgroup as a problem for Irwin’s continuous settlement hypothesis. In Oceanic culture history: essays in honour of Roger Green (eds J. M. Davidson, G. Irwin, B. F. Leach, A. Pawley & D. Brown), pp. 387–410. Auckland, New Zealand: New Zealand Journal of Archaeology Special Publication.
Peeters, B. 1990 Encore une fois ‘où tout se tient’. Historiograph. Linguist. 17, 427–436.
Rogers, D. S. & Ehrlich, P. R. 2008 Natural selection and cultural rates of change. Proc. Natl Acad. Sci. USA 105, 3416–3420. (doi:10.1073/pnas.0711802105)
Rogers, D. S., Feldman, M. W. & Ehrlich, P. R. 2009 Inferring population histories using cultural data. Proc. R. Soc. B 276, 3835–3843. (doi:10.1098/rspb.2009.1088)
Rolett, B. 2002 Voyaging and interaction in ancient East Polynesia. Asian Perspect. 41, 182–194. (doi:10.1353/asi.2003.0009)
Shennan, S. 2008 Canoes and cultural evolution. Proc. Natl Acad. Sci. USA 105, 3175–3176. (doi:10.1073/pnas.0800666105)
Spielman, R. S., Migliazza, E. C. & Neel, J. V. 1974 Regional linguistic and genetic differences among Yanomama Indians. Science 184, 637–644. (doi:10.1126/science.184.4137.637)
Spriggs, M. 2010 ‘I was so much older then, I’m younger than that now’: why the dates keep changing for the spread of Austronesian languages. In A journey through Austronesian and Papuan linguistic and cultural space: papers in honour of Andrew K. Pawley (eds J. Bowden, N. Himmelmann & M. Ross). Canberra, Australia: Pacific Linguistics.
Terrell, J. E. 1988 History as a family tree, history as an entangled bank: constructing images and interpretations of prehistory in the South Pacific. Antiquity 62, 642–657.
Walter, R. K. & Sheppard, P. 1996 The Ngati Tiare adze cache: further evidence of prehistoric contact between West Polynesia and the southern Cook Islands. Archaeol. Oceania 31, 33–39.
Weisler, M. 1998 Hard evidence for prehistoric interaction in Polynesia. Curr. Anthropol. 39, 521–532. (doi:10.1086/204768)
Weisler, M. I. & Kirch, P. V. 1996 Interisland and interarchipelago transfer of stone tools in prehistoric Polynesia. Proc. Natl Acad. Sci. USA 93, 1381–1385. (doi:10.1073/pnas.93.4.1381)
GUIDANCE FOR AUTHORS
Editor Professor Georgina Mace Publishing Editor Joanna Bolesworth Editorial Board Neuroscience and Cognition Dr Brian Billups Dr Andrew Glennerster Professor Bill Harris Professor Trevor Lamb Professor Tetsuro Matsuzawa Professor Andrew Whiten Cell and developmental biology Professor Makoto Asashima Dr Buzz Baum Professor Martin Buck Dr Louise Cramer Dr Anne Donaldson Professor Laurence Hurst Professor Fotis Kafatos Professor Elliot Meyerowitz Professor Dale Sanders Dr Stephen Tucker
Publishing Editor: Joanna Bolesworth (tel: +44 (0)20 7451 2602; fax: +44 (0)20 7976 1837; [email protected]) Production Editor: Jessica Mnatzaganian 6–9 Carlton House Terrace, London SW1Y 5AG, UK rstb.royalsocietypublishing.org
Organismal, environmental and evolutionary biology Professor Spencer Barrett Professor Nick Barton Dr Will Cresswell Professor Georgina Mace Professor Yadvinder Malhi Professor Manfred Milinski Professor Peter Mumby Professor Karl Sigmund Health and Disease Professor Zhu Chen Professor Mark Enright Professor Michael Malim Professor Angela McLean Professor Nicholas Wald Professor Joanne Webster
Publishing format Phil. Trans. R. Soc. B articles are published regularly online and in print issues twice a month. Along with all Royal Society journals, we are committed to archiving and providing perpetual access. The journal also offers the facility for including Electronic Supplementary Material (ESM) to papers. Contents of the ESM might include details of methods, derivations of equations, large tables of data, DNA sequences and computer programs. However, the printed version must include enough detail
to satisfy most non-specialist readers. Supplementary data up to 10Mb is placed on the Society's website free of charge. Larger datasets must be deposited in recognised public domain databases by the author.
Conditions of publication Articles must not have been published previously, nor be under consideration for publication elsewhere. The main findings of the article should not have been reported in the mass media. Like many journals, Phil. Trans. R. Soc. B employs a strict embargo policy where the reporting of a scientific article by the media is embargoed until a specific time. The Executive Editor has final authority in all matters relating to publication.
Electronic Submission details For full submission guidelines and access to all journal content please visit the Phil. Trans. R. Soc. B website at rstb.royalsocietypublishing.org.
AIMS AND SCOPE Each issue of Phil. Trans. R. Soc. B is devoted to a specific area of the biological sciences. This area will define a research frontier that is advancing rapidly, often bridging traditional disciplines. Phil. Trans. R. Soc. B is essential reading for scientists working across the biological sciences. In particular, the journal is focused on the following four cluster areas: neuroscience and cognition; organismal and evolutionary biology; cell and developmental biology; and health and disease. As well as theme issues, the journal publishes papers from the Royal Society’s biological discussion meetings. For information on submitting a proposal for a theme issue, consult the journal‘s website at rstb.royalsocietypublishing.org.
ISBN: 978-0-85403-854-1
Copyright © 2010 The Royal Society Except as otherwise permitted under the Copyright, Designs and Patents Act, 1988, this publication may only be reproduced, stored or transmitted, in any form or by any other means, with the prior permission in writing of the publisher, or in the case of reprographic reproduction, in accordance with the terms of a licence issued by the Copyright Licensing Agency. In particular, the Society permits the making of a single photocopy of an article from this issue (under Sections 29 and 38 of this Act) for an individual for the purposes of research or private study. SUBSCRIPTIONS In 2011 Phil. Trans. R. Soc. B (ISSN 0962-8436) will be published twice a month. Full details of subscriptions and single issue sales may be obtained either by contacting our journal fulfilment agent, Portland Customer Services, Commerce Way, Colchester CO2 8HP; tel: +44 (0)1206 796351; fax: +44 (0)1206 799331; email: [email protected] or by visiting our website at http://royalsocietypublishing.org/info/subscriptions. The Royal Society is a Registered Charity No. 207043.
Selection criteria The criteria for selection are scientific excellence, originality and interest across disciplines within biology. The Editors are responsible for all editorial decisions and they make these decisions based on the reports received from the referees and/or Editorial Board members. Many more good proposals and articles are submitted to us than we have space to print, we give preference to those that are of broad interest and of high scientific quality.
The Royal Society, the national academy of science of the UK and the Commonwealth, is at the cutting edge of scientific progress. We support many top young scientists, engineers and technologists, influence science policy, debate scientific issues with the public and much more. We are an independent, charitable body and derive our authoritative status from over 1400 Fellows and Foreign Members. During 2010, we are celebrating the Royal Society’s 350th anniversary. As part of this, there will be an exciting programme of activities – exhibitions, lectures, conferences, a new book, a vast science festival on the South Bank in London, television and radio broadcasting and much more besides. Our mission: to expand knowledge and further the role of science and engineering in making the world a better place.
Subscription prices, 2011 calendar year

                                         Europe        USA & Canada   All other countries
Electronic access only                   £2145/€2788   $4058          £2317/US$4153
Printed version plus electronic access   £2574/€3345   $4869          £2780/US$4983

For further information on the Society’s activities, please contact the following departments on the extensions listed by dialling +44 (0)20 7839 5561, or visit the Society’s Web site (www.royalsociety.org).

Research Support (UK grants and fellowships)
  Research Appointments (Fellowships): 2542
  Research Grants: 2223
  International travel Grants: 2555
  Newton International Fellowships: 2559
Science Advice
  Science Policy Centre: 2550
Science Communication
  General enquiries: 2573
Library and Information Services
  Library/archive enquiries: 2606

The Royal Society’s strategic priorities are to:
• invest in future scientific leaders and in innovation,
• influence policymaking with the best scientific advice,
• invigorate science and mathematics education,
• increase access to the best science internationally, and
• inspire an interest in the joy, wonder and excitement of scientific discovery.

Typeset in India by Techset Composition Limited, Salisbury, UK. Printed by Latimer Trend, Plymouth. This paper meets the requirements of ISO 9706:1994(E) and ANSI/NISO Z39.48-1992 (Permanence of Paper) effective with volume 335, issue 1273, 1992. Philosophical Transactions of the Royal Society B (ISSN: 0962-8436) is published twice a month for $4058 per year by the Royal Society, and is distributed in the USA by Agent named Air Business, C/O Worldnet Shipping USA Inc., 149-35 177th Street, Jamaica, New York, NY11434, USA. US Postmaster: Send address changes to Philosophical Transactions of the Royal Society B, C/O Air Business Ltd, C/O Worldnet Shipping USA Inc, 149-35 177th Street Jamaica, New York, NY11414.
Cover image: A split graph showing the results of NeighborNet analyses of the Polynesian lexical data. The network has three main regions: Fijian dialects plus Rotuman, Western Polynesian and Eastern Polynesian. There is substantial conflicting signal within each region consistent with the break-up of a dialect chain. Scale bar, 0.1. (See article by Russell D. Gray, David Bryant and Simon J. Greenhill, pp. 3923–3933.)
In this Issue
Cultural and linguistic diversity: evolutionary approaches Papers of a Theme issue compiled and edited by James Steele, Peter Jordan and Ethan Cochrane
3829
Historical linguistics in Australia: trees, networks and their implications C. Bowern
On the shape and fabric of human history R. D. Gray, D. Bryant & S. J. Greenhill
Phil. Trans. R. Soc. B | vol. 365 no. 1559 pp. 3779–3933 | 12 Dec 2010
ISSN 0962-8436
The world’s first science journal
rstb.royalsocietypublishing.org 12 December 2010
Published in Great Britain by the Royal Society, 6–9 Carlton House Terrace, London SW1Y 5AG.
See further with the Royal Society in 2010 – celebrate 350 years